The standard format for Unix error messages

April 15, 2010

As a sysadmin, I have developed strong and quite fixed opinions about how Unix programs should format their error messages for maximum useful readability by harried sysadmins. Today I feel like boiling it down to what I will grandly call the standard format Unix error message, which is the format you should use unless you have a really good reason to do otherwise.

The standard format message should be written to standard error, and looks like:

program: general problem: specific error message

A typical example would be:

mycat: cannot open 'nothing': No such file or directory

You can have multiple levels of specificity; two is just the usual case, since it fits well with the common sort of failure (as we see in the mycat example).

There are a number of slightly subtle aspects and rules to this form:

  • error messages should be printed as a single line, even if they're long. If you have to print multiple lines, put the program name at the start of every line.

    (If you are just reporting multiple problems at once, report each problem independently in the standard format.)

  • please do not be inventive with the ordering of the parts of the error message, where in the message you put the filename, and so on, and especially don't use different ordering for different messages in your program.

    (I do not want to have to pay attention to your specific error messages and read them carefully; I want to skim them, and I should always be able to read an error message from left to right and get it from more general to more specific.)

  • if the problem involves a file, always tell me what the filename is. 'Could not read configuration file' is less useful than 'Could not read configuration file /etc/somefile'. (You remember where your program's configuration file is, because you developed it. I don't.)

  • similarly, please put explicit quotes around the filename so that it's clear that it's a filename and not part of your error message. (Consider the ambiguity of 'cannot open nothing', for example.)

  • I want to know exactly what went wrong, not what you think went wrong, so if you're reporting on a failure that sets errno (such as a failed system call), use strerror() (or %m in glibc's printf()), or your language's equivalent, to print the standard system error text for the errno value. As a sysadmin I will be happy if you also give me the errno number itself, because vendors keep changing the wording of the standard errors.

    (Do not give me just the errno number, or I will be very grumpy. Looking things up in tables is what computers are for.)
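As a rough illustration of the rules above, here is a minimal C sketch. The helper names format_errmsg() and try_open() are made up for this example and are not from any library; the exact way the errno number is appended ("(errno N)") is just one possible choice, not a convention.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a library function): builds
 *   prog: what 'name': <strerror text> (errno N)
 * in buf, going from general to specific, with the name quoted.
 * Returns whatever snprintf() returns. */
int format_errmsg(char *buf, size_t size, const char *prog,
                  const char *what, const char *name, int err)
{
    return snprintf(buf, size, "%s: %s '%s': %s (errno %d)",
                    prog, what, name, strerror(err), err);
}

/* Hypothetical helper: try to open path for reading. On failure,
 * write exactly one line to standard error and return -1. */
int try_open(const char *prog, const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int saved = errno;  /* save errno before other calls can change it */
        char msg[1024];
        format_errmsg(msg, sizeof msg, prog, "cannot open", path, saved);
        fprintf(stderr, "%s\n", msg);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

Calling try_open("mycat", "nothing") when there is no file called 'nothing' would print a line starting with "mycat: cannot open 'nothing': ", followed by the system's error text and the errno number. To report several problems, call it once per problem so that each one gets its own complete line.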

While it's tempting to use perror() in C programs, perror() only prints the string you hand it followed by the system error text, so by itself it doesn't give you enough for good messages and it tempts people into excessively minimal error messages. GNU libc has error(3), which seems to have all of the features you could need for this and then some.
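For comparison, here is a small sketch of the same report done with glibc's error(). This assumes a glibc system, since error.h is a GNU extension and not part of POSIX, and open_or_complain() is a made-up name for illustration.

```c
#include <errno.h>
#include <error.h>   /* glibc-specific; not available in every libc */
#include <stdio.h>

/* Hypothetical helper: try to open path for reading and report a
 * failure through error(). A status of 0 means error() prints the
 * message and returns instead of exiting. */
int open_or_complain(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* error() writes the program name, a colon and a space, the
         * formatted message, and then (because errnum is nonzero)
         * a colon, a space and the strerror() text, all on one line
         * on standard error. */
        error(0, errno, "cannot open '%s'", path);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

One thing to be aware of is that error() uses the name the program was invoked with, so a program run as ./mycat will report itself as './mycat' and not 'mycat'.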

(I have probably missed some rules for good Unix error messages, and there is probably a better web page on this somewhere.)


Comments on this page:

From 68.87.42.115 at 2010-04-19 10:30:46:

I agree with all of your points, but you should add timestamps; each line should have a timestamp on it.

By cks at 2010-04-19 12:12:42:

While timestamps may sometimes be convenient, they're not part of the standard Unix error format for things printed to standard error (as can easily be seen from how basically no programs today print them).

Last modified: Thu Apr 15 02:33:43 2010