We are looking into some unusual NFS behavior changes as we migrate from a
2.6.8 kernel to a 2.6.24 kernel.
We have two server farms, each having a few dozen machines accessing a
central NAS via NFS. For testing purposes we have a 2.6.8 machine and a
2.6.24 machine, both running webmail and therefore making frequent NFS accesses
to small files. When we look at the traffic from the 2.6.8 kernel we see
what we consider to be a reasonably low amount of NFS traffic. The traffic
also follows what appears to be a reasonable pattern: LOOKUP calls, ACCESS
calls, READDIRPLUS calls, an occasional GETATTR call. I'd estimate that
GETATTR calls make up 10-15% of the total NFS traffic.
The same webmail app running on the 2.6.24 kernel generates a lot more NFS
traffic, and it's not nearly as intuitive a pattern: dozens of GETATTR
calls, an occasional LOOKUP, ACCESS, REMOVE, etc. I'd estimate that the
GETATTR calls account for easily 90% of the total NFS traffic. Machines
running this new kernel are placing a much higher load on the NAS and the
internal network the NAS runs on than the machines running the older
kernel do.
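
For anyone who wants to compare the call mix on the two kernels without a
packet capture, here is a rough sketch of how the percentages could be
measured from the client-side counters in /proc/net/rpc/nfs (the same
counters that nfsstat -c reports). It assumes the mounts are NFSv3, which
the READDIRPLUS calls suggest but which is not confirmed here, and that the
"proc3" line lists the procedures in RFC 1813 order. The function names
(parse_proc3, op_share, sample) are made up for this illustration.

```python
import time

# NFSv3 procedure names in RFC 1813 order; assumed to match the order of
# the counters on the "proc3" line of /proc/net/rpc/nfs.
NFS3_PROCS = [
    "NULL", "GETATTR", "SETATTR", "LOOKUP", "ACCESS", "READLINK", "READ",
    "WRITE", "CREATE", "MKDIR", "SYMLINK", "MKNOD", "REMOVE", "RMDIR",
    "RENAME", "LINK", "READDIR", "READDIRPLUS", "FSSTAT", "FSINFO",
    "PATHCONF", "COMMIT",
]


def parse_proc3(text):
    """Return {procedure: cumulative call count} from the 'proc3' line."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "proc3":
            # fields[1] is the number of counters that follow it.
            counts = [int(x) for x in fields[2:2 + len(NFS3_PROCS)]]
            return dict(zip(NFS3_PROCS, counts))
    raise ValueError("no proc3 line found")


def op_share(before, after):
    """Percentage of calls per procedure between two snapshots."""
    delta = {name: after[name] - before[name] for name in after}
    total = sum(delta.values())
    if total == 0:
        return {}
    # Drop procedures that were not called during the interval.
    return {name: 100.0 * n / total for name, n in delta.items() if n}


def sample(interval=60, path="/proc/net/rpc/nfs"):
    """Take two snapshots 'interval' seconds apart and return the shares."""
    with open(path) as f:
        first = parse_proc3(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_proc3(f.read())
    return op_share(first, second)
```

Running sample() on the 2.6.8 box and the 2.6.24 box under comparable
webmail load should give GETATTR shares that can be compared directly,
rather than relying on estimates from eyeballing the traffic.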
We've reviewed the kernel change logs and noted a few comments on minor
changes in the NFS code, but we haven't seen any comments that seem to
explain this kind of change in NFS performance. Can anyone point us to a
source of info on this? We'd love to migrate all the farm machines to the
newer kernel, but until we get a handle on the change in NFS behavior we
can't really move forward.