Wandering Thoughts archives


One complexity of buffered IO on Unix

It is surprisingly challenging to get buffered IO completely correct on Unix, and one area that trips people up is correct handling of EOF. You see, there's an important difference between EOF on files and EOF on terminals, as a consequence of how terminals generate and signal EOF: EOF on files is persistent, but EOF on terminals is not.

If you read repeatedly from a file that has hit EOF, you will almost always just get another EOF. But EOF on terminals is a transient thing, so if you read again from a terminal, your code will just sit there and the user will have to type another EOF to get you to pay attention.

(The exception for file EOF is if someone else adds more data to the end of the file that you're reading.)
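
To make the file side of this concrete, here is a minimal sketch in Python using raw `os.read()` calls on a scratch file. The file contents and variable names are made up for illustration; the point is only that a second read at EOF returns EOF again, until someone appends to the file.

```python
import os
import tempfile

# Create a small scratch file to read from (contents are illustrative).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello\n")
os.close(fd)

rfd = os.open(path, os.O_RDONLY)
first = os.read(rfd, 4096)       # the file's data
eof_once = os.read(rfd, 4096)    # b"": end of file
eof_again = os.read(rfd, 4096)   # still b"": EOF on a file persists

# The exception: another writer appends to the file, so the next read
# from the same descriptor returns the new data instead of EOF.
with open(path, "ab") as writer:
    writer.write(b"more\n")
after_append = os.read(rfd, 4096)

os.close(rfd)
os.unlink(path)
```

On a terminal, the second `os.read()` after the EOF would not return an empty result; it would wait for more input.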

This means that buffering code on Unix must be careful to remember that it has seen an EOF, and not re-read from the underlying file descriptor or IO stream. You cannot use a simpler, stateless implementation; if you do, people at a terminal will have to type EOF more than once before your program notices, which is irritating.

(You can provide an explicit operation to clear the EOF state if you want to. It probably won't be used very often.)
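
As a rough sketch of what remembering EOF can look like, here is a small line reader in Python on top of `os.read()`. The class name, the `at_eof` flag and the `clear_eof()` method are all hypothetical names picked for this example, not any real library's API, and a production version would want error handling that this one leaves out.

```python
import os


class EOFRememberingReader:
    """Hypothetical buffered reader that latches EOF once it has seen it."""

    def __init__(self, fd, bufsize=4096):
        self.fd = fd
        self.bufsize = bufsize
        self.buf = b""
        self.at_eof = False

    def _fill(self):
        # Never touch the descriptor again once EOF has been seen.
        if self.at_eof:
            return
        chunk = os.read(self.fd, self.bufsize)
        if chunk:
            self.buf += chunk
        else:
            self.at_eof = True

    def readline(self):
        """Return the next line, a final partial line, or b"" at EOF."""
        while b"\n" not in self.buf and not self.at_eof:
            self._fill()
        idx = self.buf.find(b"\n")
        end = idx + 1 if idx >= 0 else len(self.buf)
        line, self.buf = self.buf[:end], self.buf[end:]
        return line

    def clear_eof(self):
        """Explicitly forget EOF so the next read tries the descriptor again."""
        self.at_eof = False
```

With this, calling `readline()` again after EOF keeps returning `b""` without going back to the file descriptor, so on a terminal a single EOF is enough. Only `clear_eof()` makes the reader try the descriptor again.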

Unfortunately this is an easy (and common) mistake to make, because it's so hard to notice. Since extra reads on files, pipes, network connections and so on are harmless, everything works fine until your code or program is used to read from a terminal, and that may not happen for quite a while.
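
One possible way to notice the mistake earlier is to test your reading code against a fake source whose EOF is transient, the way a terminal's is. The sketch below is an assumption about how you might do that; `TransientEOFSource` and `read_all` are invented names. The fake raises an error where a real terminal would block, so an extra read after EOF fails loudly instead of passing silently.

```python
class TransientEOFSource:
    """Hypothetical test double that mimics a terminal's one-shot EOF."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.eof_delivered = False

    def read(self, n):
        # n is ignored in this sketch; each call hands back one whole chunk.
        if self.chunks:
            return self.chunks.pop(0)
        if not self.eof_delivered:
            self.eof_delivered = True
            return b""
        # A real terminal would block here waiting for more input.
        raise AssertionError("read again after EOF; a terminal would block here")


def read_all(source):
    """Collect everything up to the first EOF, then stop reading."""
    parts = []
    while True:
        chunk = source.read(4096)
        if not chunk:
            return b"".join(parts)
        parts.append(chunk)
```

Code that stops at the first EOF, like `read_all()` here, runs cleanly against the fake. Code that goes back for one more read trips the `AssertionError`.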

programming/UnixEOFDifference written at 02:35:29

