What is behind Unix's 'Text file is busy' error

April 6, 2016

Perhaps you have seen this somewhat odd Unix error before:

# cp prog /usr/local/bin/prog
cp: cannot create regular file '/usr/local/bin/prog': Text file is busy

This is not just an unusual error message; it's also a rare instance of Unix being friendly and not letting you blow your foot off with a perfectly valid operation that just happens to be (highly) unwise. To understand it, let's first work out exactly which operation is failing. I'll do this with strace on Linux, mostly because it's what I have handy:

$ cp /usr/bin/sleep /tmp/
$ /tmp/sleep 120 &
$ strace cp /usr/bin/sleep /tmp/
[...]
open("/usr/bin/sleep", O_RDONLY)        = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=32600, ...}) = 0
open("/tmp/sleep", O_WRONLY|O_TRUNC)    = -1 ETXTBSY (Text file busy)
[...]

There we go. cp is failing when it attempts to open /tmp/sleep for writing and truncate it, because /tmp/sleep is the executable of a program we currently have running. The specific Unix errno value here is ETXTBSY. If you experiment some more you'll discover that we're allowed to remove /tmp/sleep if we want to, just not write to it or truncate it (at least on Linux; the specifics of what's disallowed may vary slightly on other Unixes). This is an odd limitation for Unix, because normally there's nothing that prevents one process from modifying a file out from underneath another process (even in harmful ways). Unix leaves it up to the program(s) involved to coordinate things between themselves, rather than enforcing a policy like 'no writing if there are readers' in the kernel.
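If you'd rather poke at this from a program than through strace, here is a minimal sketch of the same experiment in Python. The function name probe_busy_executable is made up for illustration, and the sketch assumes a Linux-like system with a sleep binary on $PATH and a temporary directory that allows executing files. It makes the same open() call that cp made and then tries the unlink() that I said was allowed.

```python
import errno
import os
import shutil
import subprocess
import tempfile


def probe_busy_executable():
    """Run a private copy of sleep, then try on it what cp tried.

    Returns (name of the errno from open(), or None if open() worked,
             whether unlink() succeeded).
    """
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "sleep")
    shutil.copy(shutil.which("sleep"), target)  # copy keeps the execute bits
    proc = subprocess.Popen([target, "30"])     # returns once exec has happened
    try:
        try:
            # The same flags that strace showed cp using.
            fd = os.open(target, os.O_WRONLY | os.O_TRUNC)
        except OSError as e:
            open_error = errno.errorcode.get(e.errno, str(e.errno))
        else:
            os.close(fd)
            open_error = None
        # Removing the name is a different operation from writing the file.
        os.unlink(target)
        unlinked = not os.path.exists(target)
    finally:
        proc.kill()
        proc.wait()
        shutil.rmtree(workdir, ignore_errors=True)
    return open_error, unlinked


if __name__ == "__main__":
    print(probe_busy_executable())
```

On a Linux system that behaves like the one in the strace output above, I'd expect the first value to be 'ETXTBSY' and the second to be True. Other Unixes (and possibly other kernel versions) may differ on the first, which is why the sketch reports what happened instead of assuming it.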

But running processes are special, because really bad things usually happen if you modify the on-disk code of a running process. The problem is virtual memory, or more exactly paged virtual memory. On a system with paged virtual memory, programs aren't loaded into RAM all at once and then kept there; instead they're paged into RAM in bits and pieces as bits of code (and data) are needed. In fact, sometimes already-loaded bits and pieces are dropped from RAM in order to free up space, since they can always be loaded back in from disk.

Well, they can be loaded back in from disk if some joker hasn't gone and changed them on disk, at least. All of this paging programs into RAM in sections only works if the program's file on disk doesn't ever change while the program is running. If the kernel allowed running programs to change on disk, it could wind up loading in one page of code from version 1 of the program and another page from version 2. If you're lucky, the result would segfault. If you're unlucky, you might get silent malfunctions, data corruption, or other problems. So for once the Unix kernel does not let you blow your foot off if you really want to; instead it refuses to let you write to a program on disk if the program is running. You can truncate or overwrite any other sort of file even if programs are using it, just not things that are part of running programs. Those are special.
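You can see the underlying mechanism with an ordinary file, where the kernel doesn't protect you. The following is only a sketch: it uses a plain read-only shared mapping of a data file, not the way the kernel actually maps program text, and the function name mapping_follows_file is invented for the example. What it shows is that a file-backed mapping reflects whatever is in the file now, not whatever was in the file when you mapped it.

```python
import mmap
import os
import tempfile


def mapping_follows_file():
    """Map a file read-only, overwrite the file in place, look at the map again."""
    page = mmap.PAGESIZE
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"1" * page)              # "version 1" of the file
        with mmap.mmap(fd, page, access=mmap.ACCESS_READ) as view:
            before = view[:4]
            os.pwrite(fd, b"2" * page, 0)      # overwrite in place, same inode
            after = view[:4]                   # same mapping, read again
    finally:
        os.close(fd)
        os.unlink(path)
    return before, after


if __name__ == "__main__":
    print(mapping_follows_file())
```

On Linux the second read sees the new bytes even though the mapping itself was never touched. Now imagine that those bytes were machine code and that only some of the pages had changed.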

Given the story I've just told, you might expect ETXTBSY to have appeared in Unix in or around 3BSD, which is more or less the first version of Unix with paged virtual memory. However, this is not the case. ETXTBSY turns out to be much older than BSD Unix, going back to at least Research V5. Research Unix through V7 didn't have paged virtual memory (it only swapped entire programs in and out), but apparently the Research people decided to simplify their lives by basically locking the files for executing programs against modification.

(In fact Research Unix was stricter than modern Unixes, as it looks like you couldn't delete a program's file on disk if it was running. That section of the kernel code for unlink() gets specifically commented out no later than 3BSD.)

PS: the 'text' in 'text file' here actually means 'executable code', as in (say) the output of size. Of course it's not just the actual executable code that could be dangerous if it changed out from underneath a running program, but there you go.

Sidebar: the way around this if you're updating running programs

To get around this, all you have to do is remove the old file before writing the new file into place. This (normally) doesn't cause any problems; the kernel treats the 'removed but still being used by a running program' executable the same way it treats any 'removed but still open' file. As usual the file is only actually removed when the last reference goes away, in this case the last process using the old executable exits.
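The remove-then-write sequence can be sketched in Python as follows. As before this assumes a Linux-like system with sleep on $PATH and a temporary directory that allows execution; the function name replace_running_program and the tiny shell script used as the 'new version' are placeholders for illustration only.

```python
import os
import shutil
import subprocess
import tempfile


def replace_running_program(new_contents=b"#!/bin/sh\nexit 0\n"):
    """Replace the file of a running program by removing it first.

    Returns (old process still running, new contents are in place,
             new file is a different inode from the old one).
    """
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "sleep")
    shutil.copy(shutil.which("sleep"), target)
    proc = subprocess.Popen([target, "30"])
    try:
        old_inode = os.stat(target).st_ino
        # Step 1: remove the name. The running process keeps the old file.
        os.unlink(target)
        # Step 2: create a brand new file under the same name.
        # O_EXCL makes sure we really are creating it, not reopening anything.
        fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o755)
        with os.fdopen(fd, "wb") as f:
            f.write(new_contents)
        still_running = proc.poll() is None
        with open(target, "rb") as f:
            replaced = f.read() == new_contents
        distinct = os.stat(target).st_ino != old_inode
    finally:
        proc.kill()
        proc.wait()
        shutil.rmtree(workdir, ignore_errors=True)
    return still_running, replaced, distinct


if __name__ == "__main__":
    print(replace_running_program())
```

The important part is that the new version is a new file with a new inode. Nothing ever writes to the file that the running process is being paged in from, so the kernel has nothing to object to.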

(Of course NFS throws a small monkey wrench into things, sometimes in more than one way.)


Comments on this page:

By Anon at 2016-04-07 02:04:23:

Updating multi file programs while they're running is a massive pain on Linux - http://neugierig.org/software/chromium/notes/2011/08/zygote.html . Lack of reliable mandatory locking can have drawbacks...

On some level it makes sense to me that this would be an exception to the usual “enough rope” attitude of Unix. Namely, this is a case where one affected program is the kernel itself, and therefore it is appropriate for the kernel to have an opinion on policy – in a way that it would not be appropriate for the kernel to dictate policy to other consenting programs amongst themselves.

Was that the actual reasoning? I have no idea. Was it deliberate? Almost certainly not. It is highly likely just a post hoc justification. But maybe it makes sense to someone besides me?

Last modified: Wed Apr 6 23:05:38 2016