What I want out of a Linux SSD disk cache layer

August 10, 2014

In response to my SSD dilemma, one suggestion was to use one of the Linux kernel systems designed to add a caching layer on top of regular disks; the leading candidates here seem to be dm-cache and bcache. I looked at both of them and unfortunately I don't like either one, because neither works the way I want.

Put simply, what I want is the ability to attach an SSD read accelerator to my filesystems or devices without changing how they are currently set up. What I had hoped for was some system where you told things 'start caching traffic from X, Y, and Z' and it would all transparently just happen; your cache would quietly attach itself to the rest of the system somehow and that would be that. Later you could say 'stop caching traffic from X', or 'stop entirely', and everything would go back to how it was before. Roughly speaking, this is the traditional approach taken by the few systems that used local disks to cache and accelerate NFS reads.
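To make that wish concrete, here is a toy sketch in Python of the kind of interface I have in mind. Everything in it is hypothetical: `Disk`, `ReadAccelerator`, `start_caching` and `stop_caching` are names made up for illustration and do not correspond to anything in dm-cache, bcache, or the kernel. It only models reads; writes and cache invalidation are ignored.

```python
class Disk:
    """Stand-in for a slow backing device; counts how often it is really read."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0

    def read(self, n):
        self.reads += 1
        return self.blocks[n]


class ReadAccelerator:
    """Hypothetical read cache that attaches to devices that are already in use."""

    def __init__(self):
        self._attached = set()  # disks currently being cached
        self._cache = {}        # (id(disk), block number) -> data

    def start_caching(self, disk):
        """'Start caching traffic from X': callers keep using the same disk object."""
        if disk in self._attached:
            return
        raw_read = disk.read  # the uncached read path
        self._attached.add(disk)

        def cached_read(n):
            key = (id(disk), n)
            if key not in self._cache:
                self._cache[key] = raw_read(n)
            return self._cache[key]

        # An instance attribute shadows the class's read method.
        disk.read = cached_read

    def stop_caching(self, disk):
        """'Stop caching traffic from X': the disk goes back to how it was."""
        if disk not in self._attached:
            return
        self._attached.discard(disk)
        del disk.read  # drop the shadowing attribute; Disk.read is visible again
        self._cache = {k: v for k, v in self._cache.items() if k[0] != id(disk)}


disk = Disk({0: b"superblock", 1: b"data"})
accel = ReadAccelerator()

accel.start_caching(disk)
disk.read(0)
disk.read(0)  # second read is served from the cache
print(disk.reads)

accel.stop_caching(disk)
disk.read(0)  # back to the plain device
print(disk.reads)
```

The point of the sketch is that the code using `disk` never changes and never has to know the cache exists; attaching and detaching are both things you do from the outside.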

Unfortunately this isn't what dm-cache and bcache do. Both of them function as an additional, explicit layer in the Linux storage stack. Because they are explicit layers, you don't mount, say, your filesystem from its real device; you mount it from the dm-cache or bcache version of it. Among other things, this makes moving between a cached version and a non-cached version of your objects a somewhat hair-raising exercise; for example, bcache explicitly needs to change an existing underlying filesystem. Want to totally back out from using bcache or dm-cache? You're probably going to have a headache.

(This is especially annoying because there are two cache options in Linux today and who knows which one will be better for me.)

Both dm-cache and bcache are probably okay for a large deployment where they are planned from the start. In a large deployment you will evaluate each in your scenario, determine which one you want and what sort of settings you want, and then install machines with the caching layer configured from the start. You expect to never remove your chosen caching layer; generally you'll have specifically configured your hardware fleet around the needs of the caching layer.

None of this describes the common scenario of 'I have an existing machine with a bunch of existing data, and I have enough money for an SSD. I'd like to speed up my stuff'. That is pretty much my scenario (at least to start with). I rather expect it's very much the scenario of any number of people with existing desktops.

(It's also effectively the scenario for new machines for people who do not buy their desktops in bulk. I'm not going to spec out and buy a machine configuration built around the assumption that some Linux caching layer will turn out to work great for me; among other things, it's too risky.)

PS: if I've misunderstood how dm-cache or bcache work, my apologies; I have only skimmed their documentation. Bcache at least has a kind of scary FAQ about using (or not using) it on existing filesystems.
