2016-04-13

How I'm trying to do durable disk writes here on Wandering Thoughts

A while back, Wandering Thoughts lost a just-posted comment when the host went down abruptly and unexpectedly. Fortunately it was my comment and I was able to recover it, but the whole thing made me realize that I should be doing better about durable writes in DWiki. Up until that point, my code for writing comments to disk had basically been ignoring the issue; I opened a new file, I wrote things, I closed the file, and I assumed it was all good. Well, no, you can't really make that assumption.

(Many normal blogs and wikis store things in a database, which worries about doing durable disk writes for you. I am crazy or perverse, so DWiki only uses the filesystem. Each comment is a little file in a particular directory.)

You can do quite a lot of worrying about disk durability on Unix if you try hard (database people have horror stories about how things go wrong). I decided not to worry that much about it, but I did want to do a reasonably competent job. The rules for this on Unix are very complicated and somewhat system dependent, so I am not going to try to cover them; I'm just going to cover my situation. I write new files infrequently and I'm willing to have those writes be a bit slower than they otherwise would be.

Omitting error checking and assuming that you already have an fp file object and a dirname variable (the directory that the new file is in), what you want in my situation looks like this:

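import os

# Push Python's buffered data to the OS, then have the OS write the file to disk.
fp.flush()
os.fsync(fp.fileno())

# fsync the containing directory so the new file's directory entry is on disk too.
dfd = os.open(dirname, os.O_RDONLY)
os.fsync(dfd)
os.close(dfd)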

The first chunk of code flushes the data I've written to the file out to disk. However, for newly created files this is not enough; a filesystem is allowed to not immediately write to disk the fact that the new file is in the directory, and some will do just that. So to be sure, we must poke the system so that it 'flushes' the directory to disk, i.e. makes sure that our new file is properly recorded as being present.

(My actual code is already using raw os module writes to write the file, so I can just do 'os.fsync(fd)'. But you probably have a file object, since that's the common case.)
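For illustration, here is a minimal sketch of what a raw os module version of this might look like, with some basic cleanup added. This is not DWiki's actual code; the fsync_directory() and write_new_file_durably() names, the 0o644 mode, and the use of O_EXCL are all assumptions made for the example.

```python
import os


def fsync_directory(dirname):
    """Ask the OS to flush a directory to disk (Unix only)."""
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)


def write_new_file_durably(path, data):
    """Create a new file at path containing data (bytes), then fsync
    both the file and its parent directory. Hypothetical helper."""
    # O_EXCL makes this fail if the file already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than it was given.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    finally:
        os.close(fd)
    # A bare filename has no directory part; use the current directory.
    fsync_directory(os.path.dirname(path) or ".")
```

If any step fails, the OSError propagates to the caller, which then knows not to treat the file as safely on disk.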

Some sample code you can find online will use os.O_DIRECTORY when opening the directory here. This isn't necessary but doesn't do any harm on most systems as far as I can tell; however, note that there are some uncommon Unix systems that don't even have an os.O_DIRECTORY.
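If you do want to use os.O_DIRECTORY where it exists, one way to cope with systems that lack it is to fall back to no extra flag at all. This is only a sketch; the open_directory() name is made up for the example.

```python
import os

# Use O_DIRECTORY where the os module has it; otherwise add no extra flag.
O_DIRECTORY = getattr(os, "O_DIRECTORY", 0)


def open_directory(dirname):
    """Open a directory read-only and return the file descriptor."""
    return os.open(dirname, os.O_RDONLY | O_DIRECTORY)
```

The returned file descriptor can then be passed to os.fsync() and os.close() as before.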

(Although I haven't read through the whole thing and thus can't vouch for it, you may find 'Reliable file updates with Python' to be useful for a much larger discussion of various file update patterns and how to make them reliable. The overall summary here is that reliable file durability on Unix is a minefield. I'm not even sure I have it right, and I try to stay on top of this stuff.)

Sidebar: Why you don't want to use os.sync()

If you're using Python 3.3+, you may be tempted to reach for the giant hammer of os.sync() here; after all, the sync() system call flushes everything to disk and you can't get more sure than that. There are two problems with doing this. First, a full sync() may take a significant length of time, up in the tens of seconds or longer. All you need is for something else to have written but not flushed a bunch of data and the disks to be busy (perhaps your system has backups running) and then boom, your sync() is delaying your program for a very long time.

Second, a full sync() may cause you to trip over unrelated disk problems. If the server you're running on has a completely different filesystem that is having disk IO problems, well, your sync() is going to be waiting for the overall system to flush out data to that filesystem even though you don't care about it in the least. I'm unusually sensitive to this issue because I work in an environment with a lot of NFS mounted filesystems that come from a number of different servers, and that means a lot of ways that just one or a few filesystems can be more than a little slow at the moment.

python/HowISyncDataDWiki written at 02:12:08
