Wandering Thoughts archives

2009-06-22

Solaris 10 NFS server parameters that we change and why

One of the ways that Solaris does not make me happy is that various system defaults do not seem to have changed since, oh, 1996, when machines were much smaller than they are now. As a result, we have accumulated a set of NFS server parameters that we have had to change in order to get decent performance and functionality.

(This set is not particularly novel, which is part of the irritation; pretty much everyone winds up making many of these changes sooner or later. But instead of the system shipping with sensible defaults, you are left to discover them on your own, or not discover them and wonder why your theoretically powerful and modern Solaris NFS server is performing pathetically badly. Or why it is exploding.)

Unless mentioned otherwise, all of these parameters are set (or changed, really) in /etc/default/nfs:

  • NFSD_SERVERS, from 16 to 512

    The maximum number of concurrent NFS requests. The default is too low to get decent performance under load, and has been for years. This is one of the standard tuneables that everyone says you should change, but beware: the usual advice on the Internet is to set it to 1024, and with our fileserver configuration having that many NFS server threads locked up my test system (which was running on reasonably beefy hardware).

    (Apparently the NFS server threads are very high priority, and if you have too many of them they will happily eat all of your CPU.)

  • LOCKD_SERVERS, from 20 to 128
    LOCKD_LISTEN_BACKLOG, from 32 to 256

    The maximum number of simultaneous NFS lock requests. We saw NFS locking failures under production load that were cured by doing this. I believe that LOCKD_SERVERS is the important one, but we haven't tested this.

  • NFS_SERVER_VERSMAX, from 4 to 3

    The maximum NFS protocol version that the server will use.

    We're wimps. NFS v4 is peculiar and we've never tested it, and I have no desire to find out all the ways that Linux and Solaris don't get along about it. So even if machines think that they're capable of doing it, we don't want them to.

  • set nfssrv:nfs_portmon = 1, which is set in /etc/system.

    Require NFS requests to come from reserved ports. In theory you might be able to change this on a live system with mdb -kw, but really, just schedule a reboot.

As a cautionary note on Solaris 10 x86, remember to update the boot archive with 'bootadm update-archive' every time you change /etc/system. I don't think that changing /etc/default/nfs requires updating the boot archive, but it can't hurt to run the command anyways.
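Pulled together, the changes listed above might look something like the following. This is a sketch rather than a copy of our actual files; it assumes the usual KEY=value format of /etc/default/nfs, and the comments are added purely for illustration.

```shell
# /etc/default/nfs: values from the list above.
# Default is 16.
NFSD_SERVERS=512
# Default is 20.
LOCKD_SERVERS=128
# Default is 32.
LOCKD_LISTEN_BACKLOG=256
# Default is 4; 3 keeps the server from offering NFS v4.
NFS_SERVER_VERSMAX=3

# /etc/system has its own syntax and is not a KEY=value file.
# The line that goes there is:
#   set nfssrv:nfs_portmon = 1
# On Solaris 10 x86, follow any /etc/system change with:
#   bootadm update-archive
```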

Necessary disclaimer: these work for us but may not work for you. Always test your system.

solaris/SolarisNFSServerTuning written at 23:57:18

Why GNU tools are sometimes not my favorite programs

Presented in the traditional illustrated form:

; cat a
root
cks
; cat b
root
cks
abc
; comm -13 a b >/dev/null
comm: file 2 is not in sorted order

If your comm doesn't do this, don't be surprised; this behavior is new in the very latest version of comm, from coreutils 7.2 (as installed on Fedora 11; Fedora 10 didn't have it). This behavior is turned off by the new --nocheck-order option, although the manpage contains scary warnings about this not being supported.

Congratulations, GNU coreutils maintainers. You have just broken any number of scripts that were using comm to obtain differences between ordered files; all of these scripts now print an extra complaint to standard error, which is bad. Worse, fixing this will make the scripts unportable, since not even previous versions of GNU comm understand the new --nocheck-order option.

(Yes, yes, technically this behavior is allowed by the Single Unix Specification. But in real life, this is false; the true specification is not whatever is allowed by the letter of standards, it is what everything does and what people write to.)

Also, this is utterly the wrong way to change behavior like this. The correct way is to first introduce the necessary command line switches without emitting the warning by default, along with a note that the default will change in X amount of time. Then several versions later you can start to think about changing the default, since people have had a chance to add the new options to their scripts. (You will still fail, because people don't even look at perfectly working scripts, much less update them, but at least you will have made vague motions towards doing the right thing instead of being an asshole.)
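For scripts that have to run against both old and new versions of comm, one possible workaround is to probe for the option at run time. This is only a sketch: the name comm_unchecked is made up for illustration, and it assumes that a comm without --nocheck-order rejects the unknown option with a non-zero exit status, which is worth verifying on your own systems.

```shell
# comm_unchecked is a hypothetical wrapper name, not part of coreutils.
# It passes --nocheck-order only when the installed comm accepts it.
comm_unchecked() {
    # Probe with two empty inputs; a comm that does not know the
    # option is assumed to fail here with a non-zero exit status.
    if comm --nocheck-order /dev/null /dev/null >/dev/null 2>&1; then
        comm --nocheck-order "$@"
    else
        comm "$@"
    fi
}
```

With this defined, 'comm_unchecked -13 a b' on the two files shown earlier should print just 'abc', without the complaint about sort order.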

sysadmin/GnuCommMisfeature written at 00:30:37


