Wandering Thoughts archives

2012-09-28

fork() versus strict virtual memory overcommit handling

Unix's fork() is in many ways a wonderful API, but people who want strict virtual memory overcommit handling find it deeply problematic. This is because, as I've written before, fork() is probably the leading way on Unix systems to (theoretically) allocate a lot of memory that you will never use. The more allocated but unused memory you have, the stupider strict overcommit gets; it increasingly denies allocations purely for accounting reasons, not because there is any danger that the system will actually run out of RAM. The corollary is that it's hard to argue that strict overcommit should be the default if systems routinely have significant amounts of allocated but unused memory.
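
A rough sketch of the situation, in Python and assuming a Unix system where os.fork() is available. The function name and the 64 MB default are made up for illustration. The parent holds a large allocation, forks, and the child exits without ever touching that memory; the pages are shared copy-on-write, but a strict overcommit system generally has to account for the child as if it might write to all of them.

```python
import os

def fork_with_big_heap(nbytes=64 * 1024 * 1024):
    """Allocate nbytes in the parent, fork, and have the child exit at once.

    Hypothetical helper for illustration only.
    """
    big = bytearray(nbytes)  # memory the parent really has allocated
    pid = os.fork()          # may raise OSError if the kernel refuses the fork
    if pid == 0:
        # Child: shares `big` copy-on-write and never writes to it, so no
        # extra RAM is needed in practice. Strict accounting still has to
        # reserve for a possible private copy.
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    return len(big), os.WEXITSTATUS(status)
```

If the accounting limit is tight enough, the os.fork() call here is what fails, even though the child would never have used the memory.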

(Why fork() is a good API is another entry. The short version is that fork() is a kernel API, not necessarily a user one.)

It's possible to argue that many instances of unused memory are bad programming practices (or mistakes), and so can at least in theory be discounted when advocating for strict overcommit. This argument is much harder to make with fork(). Straightforward use of fork() followed by exec() can be replaced by APIs like the much more complicated posix_spawn(), but there are plenty of other uses of fork() that cannot be (including some uses of fork-then-exec, since posix_spawn() can't do things like change process permissions).
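
To make the difference concrete, here is a minimal sketch in Python of both approaches, assuming a Unix system and Python 3.8 or later (for os.posix_spawn()). The helper names, the ARGV command, and the child_setup hook are all hypothetical. The point is the gap between fork() and exec(): arbitrary code can run there, while posix_spawn() only offers a fixed menu of adjustments.

```python
import os
import sys

# Hypothetical command for illustration: run this same Python and exit 0.
ARGV = [sys.executable, "-c", "raise SystemExit(0)"]

def run_fork_exec(argv, child_setup=None):
    """Classic fork-then-exec, with an optional hook that runs in the child."""
    pid = os.fork()
    if pid == 0:
        try:
            if child_setup is not None:
                child_setup()      # arbitrary code, e.g. dropping privileges
            os.execv(argv[0], argv)
        finally:
            os._exit(127)          # only reached if setup or exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

def run_posix_spawn(argv):
    """The straightforward case, with no fork() visible to the caller."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

For example, run_fork_exec(ARGV, child_setup=lambda: os.umask(0o077)) changes the umask only in the child before the exec; any other callable would work the same way.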

(In the extreme, the arguments of the strict overcommit crowd then boil down to 'well, fork() complicates our life too much so you shouldn't be allowed to use it'. This may sound harsh but it's really what it means to say that historic and natural uses of fork() are now bad practice, at least without a really good reason why.)

PS: vfork() is a hack. Really.

ForkVsOvercommit written at 01:12:04

2012-09-07

What the standard(s) say about the order of readdir()'s results

A while back I wrote about the pragmatic answer to what order readdir() returns results in. Inspired by a high-scoring Stack Overflow answer to a question about readdir()'s order, today's topic is what the standards have to say about this. Or at least the readily accessible Single Unix Specification, since you can find that online.

Reading Unix standards, like reading any standard, requires just as careful attention to what they don't say as to what they do say. The SuS page on readdir() contains a great deal of verbiage about how readdir() behaves, and it does say that readdir() returns an ordered sequence of all of the directory entries. However, it does not say anything about what that order is. The conclusion is straightforward; since the standard doesn't specify an ordering, it doesn't require any particular one. A standards-conformant system is allowed to return directory entries in whatever order it likes and a standards-conformant program can't assume that directory entries are in any particular order.

(I am not up on standards wonkery enough to understand what is implied by requiring readdir() to return an ordered sequence. Besides, this may be seeing too much meaning in the wording of the SuS. Reading standards is an exercise in alternately caring about the tiniest speck and passing blithely over various large things, which is one reason I don't like to do it very often.)

As it happens, this is also the pragmatic answer. Different Unix versions, and different filesystems on a single Unix version, return filenames in different, unpredictable, and essentially random orders. Any program that wants to use readdir() needs to deal with this, unless it is running in very unusual circumstances.
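
One common way of dealing with this is to impose your own order on the results. A minimal sketch in Python, assuming os.listdir() as a stand-in for a raw readdir() loop (its documentation says the list is in arbitrary order); the function name is made up for illustration.

```python
import os

def sorted_entries(path):
    """Return the names in a directory in a stable, sorted order.

    os.listdir() makes no promise about ordering, so the sort is what
    gives callers something they can rely on.
    """
    return sorted(os.listdir(path))
```

Sorting by name is only one choice; a program that cares about, say, modification time would need to stat each entry and sort on that instead.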

ReaddirOrderII written at 01:51:11


