The problem with filenames in IO exceptions and errors

July 6, 2014

These days a common pattern in many languages is for errors or exceptions to be basically strings. They may not literally be strings, but often the only thing people really do with them is print or otherwise report their string form. Python and Go are both examples of this pattern. In such languages it's relatively common for the standard library to helpfully embed the name of the file that you're operating on in the error message for operating system IO errors. For example, here is the literal text of the error and the exception that you get when you try to open a file that you don't have access to (Go first, then Python):

open /etc/shadow: permission denied
[Errno 13] Permission denied: '/etc/shadow'
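
As a minimal sketch of where the Python text comes from, the following just reports the string form of whatever exception open() raises. The helper name open_error_text is made up for illustration, and it uses a nonexistent path instead of /etc/shadow so that the result doesn't depend on who runs it.

```python
def open_error_text(path):
    """Return str() of the exception that open(path) raises, or None if it opens."""
    try:
        with open(path):
            return None
    except OSError as e:
        # The string form is all that most programs ever look at.
        return str(e)

print(open_error_text("/not/there"))
```

On a Unix system where /not/there doesn't exist, the message has the same shape as the permission denied one above, with the filename embedded at the end.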

This sounds like an attractive feature, but there is a problem with it. Unless the standard library does it all the time and documents that it does, people can't count on it, and when they can't count on it you wind up with ugly error messages in practice unless people go quite out of their way to avoid them.

This stems from one of the fundamental rules of good (Unix) error messages for programs, which is thou shalt always include the name of the file you had problems with. If you're writing a program and you need to produce an error message, it is ultimately your job to make sure that the filename is always there. If the standard library gives you errors that sometimes but not always include the filename, or that are not officially documented as including the filename, you have no real choice but to include the filename yourself. Then when the standard library's error or exception does include the filename, the whole error message emitted by your program winds up mentioning the filename twice:

sinksmtp: cannot open rules file /not/there: open /not/there: no such file or directory
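
The doubling, and one way of going out of your way to avoid it, can be sketched in Python like this. The function names are hypothetical. The careful version leans on the strerror attribute of OSError, which holds only the operating system's reason; whether it is set depends on how the exception was created, so the code falls back to the full string form when it is empty.

```python
def naive_message(prog, path, exc):
    # Always names the file ourselves, then appends the full string form
    # of the exception, which may name the file a second time.
    return "%s: cannot open rules file %s: %s" % (prog, path, exc)

def careful_message(prog, path, exc):
    # Use only the OS-level reason if the exception carries one.
    reason = exc.strerror if exc.strerror else str(exc)
    return "%s: cannot open rules file %s: %s" % (prog, path, reason)

try:
    open("/not/there")
except OSError as e:
    print(naive_message("sinksmtp", "/not/there", e))
    print(careful_message("sinksmtp", "/not/there", e))
```

This only works because the program knows it is dealing with an OSError and knows about its attributes, which is exactly the sort of extra effort that treating errors as plain strings was supposed to avoid.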

It's tempting to say that the standard library should always include the filename in error messages (and explicitly guarantee this). Unfortunately this is very hard to do in general, at least on Unix and with a truly capable standard library. The problem is that you can be handed file descriptors from the outside world and required to turn them into standard file objects that you can do ordinary file operations on, and of course there is no (portable) way to find out the file name (if any) of these file descriptors.

(Many Unixes provide non-portable ways of doing this, sometimes brute force ones; on Linux, for example, one approach is to look at /proc/self/fd/<N>.)
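
Here is a rough Python illustration of the problem, using a pipe as a stand-in for a file descriptor handed in from the outside world. The fd_name helper is hypothetical and Linux-specific; it simply gives up and returns None anywhere /proc/self/fd isn't available.

```python
import os

def fd_name(fd):
    """Best-effort name for a file descriptor; returns None if we can't tell."""
    try:
        return os.readlink("/proc/self/fd/%d" % fd)
    except OSError:
        return None

r, w = os.pipe()
with os.fdopen(r) as rf, os.fdopen(w, "w") as wf:
    # The file object has no path to report; its name is just the
    # descriptor number that it was built from.
    print(repr(rf.name))
    # On Linux this is something like 'pipe:[12345]', which is not a
    # filename that anyone could open.
    print(fd_name(rf.fileno()))
```

Even when the lookup works it isn't necessarily useful, since a pipe or a socket has no filename in the first place.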
