Wandering Thoughts archives


Some thoughts on battery backup for RAID controller cards

In a comment on my entry on software RAID's advantages I was asked what I thought about software RAID's lack of battery backup units, as you can get on better RAID controller cards. To answer that, I'm going to start by asking my traditional question: how does having a BBU RAID card improve your system performance?

A RAID card with a battery backup unit effectively turns synchronous disk writes into asynchronous ones, by buffering such writes in its battery-backed RAM and immediately telling the host OS that the write has completed successfully. In order for this to improve performance, you have to be doing enough synchronous writes to stall your system significantly. It helps if they're relatively slow synchronous writes; the classical case is small synchronous writes to a RAID-5 array, where the small OS-level write actually turns into a couple of reads and a couple of writes.
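(To make the 'couple of reads and a couple of writes' concrete, here is a minimal Go sketch of the usual RAID-5 read-modify-write parity update, where the new parity is the old parity XORed with the old data and the new data. The function names `xorBlocks` and `smallWriteParity` and the two-byte blocks are made up for illustration; a real controller works on whole stripes and does actual disk IO.)

```go
package main

import (
	"bytes"
	"fmt"
)

// xorBlocks returns the byte-wise XOR of two equal-length blocks.
func xorBlocks(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// smallWriteParity computes the new parity block for a single-block update.
// Before calling it, the old data and old parity must be READ from disk
// (two reads); afterwards the new data and the returned parity must be
// WRITTEN (two writes). That is the four-IO cost of one small write.
func smallWriteParity(oldData, newData, oldParity []byte) []byte {
	return xorBlocks(xorBlocks(oldParity, oldData), newData)
}

func main() {
	// A hypothetical three-data-disk stripe with tiny blocks.
	d0 := []byte{0x11, 0x22}
	d1 := []byte{0x33, 0x44}
	d2 := []byte{0x55, 0x66}
	parity := xorBlocks(xorBlocks(d0, d1), d2)

	// Overwrite d1 without touching d0 or d2.
	newD1 := []byte{0xAA, 0xBB}
	newParity := smallWriteParity(d1, newD1, parity)

	// Cross-check against recomputing parity from the whole stripe.
	full := xorBlocks(xorBlocks(d0, newD1), d2)
	fmt.Println("parity consistent:", bytes.Equal(newParity, full))
}
```

With a BBU, the controller can acknowledge the write as soon as the new data is in its battery-backed RAM and do these four IOs later.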

(There is also a limit to how much a BBU can improve your performance, especially your sustained performance; at a certain write load you hit your disk's performance limits, either in write bandwidth for streaming writes or in IO operations a second for random writes. If you need to do a sustained 500 random writes per second to a single physical disk, no BBU can help you.)

In general, most write workloads are mostly asynchronous; however, there are certainly some that are highly synchronous (database operations, mail servers, anything that does a lot of file write operations over NFS, etc). These days, operating systems are very good at not forcing synchronous writes unless they feel that they really have to, because filesystem authors fully understand that synchronous writes are death to performance. (Sometimes they go overboard in this.)

In exchange for this synchronous write acceleration, you accept a number of potential drawbacks. Obviously BBU RAID cards cost more. Beyond that, you have to use hardware RAID to some degree, the battery backup only lasts so long (although I believe it commonly lasts for days), and the RAID controller has to lie to the host OS about the write being successful. That last one may especially be an issue if you want to use the hardware RAID controller purely for JBOD-with-BBU and do your actual RAID in software (such as with ZFS); there you would really like the OS level to find out about write errors.

These days, there are often higher-level options than BBU hardware RAID even if you have a lot of synchronous writes. For example, it's increasingly common for filesystems to let you put their logs on very fast disk storage (either SSDs or small fast conventional disks), and this can drastically accelerate synchronous filesystem writes.

(I was going to say that you could always put a UPS on the system as a whole, but that doesn't really solve the problem of synchronous writes unless you tell the operating system to lie about them.)

(Disclaimer: this is partly me thinking through this out loud. As I don't have actual experience with BBU hardware RAID, I could be completely off base on some of this.)

Sidebar: synchronous writes and us

On our workload of aggregate general fileservice, I don't think we have any particular density of system-stalling synchronous writes. While synchronous NFS activities can stall individual filesystems (and possibly individual ZFS pools, since I'm not entirely sure how the ZFS synchronous write process works), we have enough filesystems and ZFS pools and disks that the overall fileservers will continue going along without most people noticing anything.

As such, we don't worry about BBU issues, and in fact we deliberately configure our iSCSI backends to not have write caching, despite them being on UPSes and being considered reliable black boxes.

tech/BatteryBackedRaidThoughts written at 23:18:13

Turning synchronous channels asynchronous

(This is likely obvious, but since I keep working it out again in my head I'm going to write it down once and for all.)

Suppose that you have a CSP-like environment, with lightweight processes and synchronous communication channels with no buffering. Synchronous channels are simple but very inconvenient for many real-world things, where you need to have asynchronous channels. Fortunately, you can turn synchronous channels into acceptable asynchronous ones as follows.

To send an asynchronous message, you spawn a new process and write the message (along with the destination channel) to it through a new, dedicated channel. This will not block (barring scheduling issues), because you know the first thing that the new process does is read your message. The sub-process then writes the message to the real destination; while it may delay a very long time as it waits for the process on the other end of the channel to pick up your message to it, that's no longer a problem for your main process.
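As a rough sketch of this, here is what it might look like in Go, using only unbuffered (synchronous) channels. The names `envelope` and `asyncSend` are mine, and messages are plain strings to keep it short. In Go the goroutine's closure could capture the message directly; the explicit handoff channel is kept only to mirror the description above.

```go
package main

import "fmt"

// envelope carries a message together with the channel it is ultimately
// destined for.
type envelope struct {
	msg  string
	dest chan<- string
}

// asyncSend hands msg to a freshly spawned goroutine over a new, dedicated
// unbuffered channel. The handoff only waits for the helper to be scheduled;
// the helper then blocks on dest on the caller's behalf.
func asyncSend(dest chan<- string, msg string) {
	handoff := make(chan envelope) // unbuffered, so synchronous
	go func() {
		e := <-handoff  // the first thing the helper does is read
		e.dest <- e.msg // may block for a long time; the caller no longer cares
	}()
	handoff <- envelope{msg: msg, dest: dest}
}

func main() {
	dest := make(chan string) // unbuffered: a plain send here would block
	asyncSend(dest, "hello")
	fmt.Println("sender continued without waiting for a receiver")
	fmt.Println("received:", <-dest)
}
```

One consequence of this approach is that every undelivered message costs you a parked helper process, so the 'buffer' is really the scheduler's pool of blocked processes.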

(This assumes that you can't pass any data to the new process except via a channel. If you can, you may be able to skip writing anything to it.)

I don't think you can do a pure asynchronous receive, but if the system doesn't already provide a 'select' operation you can implement it in a similar way. You have N persistent processes, each of which reads from a specific outside channel and then writes the received message to a common channel to your main process. Your main process then just reads the common channel and gets all events as they come in (although in some random order).
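A minimal Go sketch of this home-made 'select', again restricted to unbuffered channels and pretending Go's own select doesn't exist. The helper names `forward` and `merge` are invented for the example.

```go
package main

import "fmt"

// forward is one of the N persistent helpers: it reads from one outside
// channel and relays everything it gets to the common channel.
func forward(src <-chan string, common chan<- string) {
	for m := range src {
		common <- m
	}
}

// merge starts one forwarder per source and returns the common channel
// that the main process reads from.
func merge(sources ...<-chan string) <-chan string {
	common := make(chan string) // unbuffered
	for _, src := range sources {
		go forward(src, common)
	}
	return common
}

func main() {
	a := make(chan string)
	b := make(chan string)
	common := merge(a, b)

	go func() { a <- "from a" }()
	go func() { b <- "from b" }()

	for i := 0; i < 2; i++ {
		fmt.Println(<-common) // arrival order is not deterministic
	}
}
```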

If the language is limited and type-safe enough that you can't write all of the messages to a single common channel, you need a slightly more elaborate scheme. Have a number of type-specific channels, and have the sub-processes first write a pointer message to the common channel and then retransmit the actual message to the type-specific channel. Your main process picks up the pointer from the common channel and then reads from the appropriate type-specific one; you may get a bit of blocking due to scheduling, but you should never stall waiting for a message.
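Here is a hedged Go sketch of the pointer-message scheme with two event types. The types `keyEvent` and `mouseEvent`, the `kind` tag, and the forwarder names are all hypothetical. (Go itself could get around the typing problem with interfaces, so this is purely to illustrate the technique.)

```go
package main

import "fmt"

// kind is the 'pointer' message: it only says which typed channel to read.
type kind int

const (
	keyKind kind = iota
	mouseKind
)

type keyEvent struct{ code int }
type mouseEvent struct{ x, y int }

// forwardKeys announces each event on the common channel, then retransmits
// the actual event on the key-specific channel.
func forwardKeys(src <-chan keyEvent, common chan<- kind, keys chan<- keyEvent) {
	for ev := range src {
		common <- keyKind
		keys <- ev
	}
}

// forwardMouse does the same for mouse events.
func forwardMouse(src <-chan mouseEvent, common chan<- kind, mice chan<- mouseEvent) {
	for ev := range src {
		common <- mouseKind
		mice <- ev
	}
}

func main() {
	keySrc, mouseSrc := make(chan keyEvent), make(chan mouseEvent)
	common := make(chan kind)
	keys, mice := make(chan keyEvent), make(chan mouseEvent)

	go forwardKeys(keySrc, common, keys)
	go forwardMouse(mouseSrc, common, mice)
	go func() { keySrc <- keyEvent{code: 65} }()
	go func() { mouseSrc <- mouseEvent{x: 3, y: 4} }()

	for i := 0; i < 2; i++ {
		// Read the pointer first, then the matching typed channel. The
		// forwarder is already committed to sending, so this read only
		// waits on scheduling.
		switch <-common {
		case keyKind:
			fmt.Println("key", (<-keys).code)
		case mouseKind:
			m := <-mice
			fmt.Println("mouse", m.x, m.y)
		}
	}
}
```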

(You might get into this situation if you are, say, a window system receiving keyboard events, mouse events, and messages from individual windows. The respective types may be so different that you can't smash them all into one structure and pass it over one channel.)

I will note in passing that while Go has channels, it allows them to be buffered, which makes sends asynchronous until the buffer fills up (and it has a powerful select operation).

programming/MakingChannelsAsynchronous written at 00:05:04

