A performance gotcha with syslogd

August 3, 2008

Stated simply: many versions of syslogd will fsync() a logfile after writing a message to it, in an attempt to make sure that the message makes it to disk if something happens to the system immediately afterwards (a crash, a power loss, etc). This can obviously have an impact (sometimes a significant one) on any other IO activity going on at the time.
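
To get a rough feel for the cost, here is a minimal sketch in Python that appends the same messages to a file with and without an fsync() after each write. It is not how any particular syslogd is implemented; the function name, the message text, and the message count are all made up for illustration, and the numbers you get will depend heavily on your disks and filesystem.

```python
import os
import tempfile
import time

# An arbitrary example message; real syslog lines vary in length.
LINE = b"Aug  3 23:48:23 host demo: an example log message\n"


def write_messages(path, count, sync_each):
    """Append `count` copies of LINE to `path`, optionally fsync()ing each one.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for _ in range(count):
            os.write(fd, LINE)
            if sync_each:
                os.fsync(fd)  # force this message out to stable storage
    finally:
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        synced = write_messages(os.path.join(tmp, "synced.log"), 200, True)
        unsynced = write_messages(os.path.join(tmp, "unsynced.log"), 200, False)
    print(f"fsync per message: {synced:.3f}s, no fsync: {unsynced:.3f}s")
```

On a busy machine the interesting effect is less the raw time than what those forced flushes do to everything else that is trying to use the same disk.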

On some but not all systems with this feature, you can turn this off for specific syslog files by sticking a '-' in front of the filename in the syslog configuration; this is especially handy for high volume, low importance log files, such as ones you're just using for statistical analysis. (For example, one system around here has a relatively active nameserver that syslogs every query. You can bet that we have fsync() turned off for that logfile, and when we accidentally left it on, we noticed right away.)
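
As an illustration of what the change looks like, here is a small Python sketch that rewrites traditional-style syslog.conf text so that one logfile gets the '-' prefix. It assumes the simple 'selector, whitespace, action' rule layout; the `disable_sync` helper, the `local0` facility, and the file paths are hypothetical, and you should check your own syslogd's manpage before relying on the prefix.

```python
import re

# Matches "selector <whitespace> action"; comment lines (starting with '#')
# and blank lines do not match and are passed through untouched.
RULE = re.compile(r"^(\s*[^#\s]\S*\s+)(\S+)\s*$")


def disable_sync(conf_text, logfile):
    """Prefix `logfile` with '-' wherever it is the action of a rule."""
    out = []
    for line in conf_text.splitlines():
        m = RULE.match(line)
        if m and m.group(2) == logfile:
            # Keep the selector and its whitespace, mark the file as unsynced.
            line = m.group(1) + "-" + logfile
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical configuration, not taken from any real system.
    conf = "local0.*\t/var/log/named-queries.log\nauth.*\t/var/log/auth.log"
    print(disable_sync(conf, "/var/log/named-queries.log"))
```

A rule that already has the '-' is left alone, so running this twice does no harm.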

(From the moderate amount of poking I've done, Solaris always does this and has no option to turn it off, FreeBSD only does this for kernel messages and can turn it off, and Linux's traditional syslog daemon always does this and can turn it off. I don't know about the new syslog daemon in Fedora. OpenBSD doesn't say anything in its manpages, but appears to always fsync().)

As a side note, if you really need syslog messages to be captured, I recommend also forwarding them to a remote syslog server. That way you have a much higher chance of capturing messages like 'inconsistency detected in /var, turning it read-only' (which has happened to us), and you have a certain amount of insurance against the clock on the machine going crazy.
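
Forwarding is normally something you configure in syslogd itself, but a program can also send its own important messages straight to a remote server. Here is a minimal sketch using the Python standard library's `logging.handlers.SysLogHandler` over UDP. The `make_remote_logger` helper and the logger name are made up for illustration, and UDP delivery is best effort, so this raises your odds of capturing a message rather than guaranteeing it.

```python
import logging
import logging.handlers


def make_remote_logger(host, port=514, name="emergency"):
    """Return a logger that sends each record to a syslog server over UDP.

    Call this once per logger name; every call attaches another handler.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(host, port),  # a (host, port) tuple selects a UDP socket
        facility=logging.handlers.SysLogHandler.LOG_DAEMON,
    )
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

You would call something like `make_remote_logger("loghost.example.com")` (a placeholder hostname) and then use `.critical(...)` on the result for the messages you least want to lose.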

(A central syslog server is also a convenient place to watch all of your systems at once and easily correlate events across them.)

Comments on this page:

From at 2008-08-04 03:45:19:

Just a footnote-like comment about the central remote syslog server idea: IETF is working on a new standard for this, the main new feature being TLS transports instead of the current de facto UDP.[1] I personally look forward to this since I have always been somewhat cautious when doing remote syslog'ing in (partially) untrusted local area networks.



[1] http://tools.ietf.org/wg/syslog/

By cks at 2008-08-04 15:20:27:

I think that TLS-based transports are okay (even good) for ordinary syslog messages, but not exactly as simple as I would like for emergency messages that may be the last thing you get before a machine goes down; there's a lot of code (and state) involved in that process.

Written on 03 August 2008.


Last modified: Sun Aug 3 23:48:23 2008