How big our fileserver environment is (part 1)
I've said before that I call us a midsized environment (for a number of reasons, including that we're clearly not small and also nowhere near the size of large environments). Now I feel like sharing some actual numbers on how big and not-big we are. Today's entry is about the hardware of our fileserver environment; a future one will cover data size.
(This is necessarily a snapshot in time and is likely to change in the future.)
Our fileserver environment is made up from fileservers and backends. We currently have seven fileserver machines: four production fileservers, one hot spare, and two test fileservers. One of the test fileservers is currently inactive because it hasn't been reinstalled after we used it to build a fast OS upgrade for the production machines.
(To the extent that we have roles for the test machines, one is intended to be an exact duplicate of the production machines so that we have a test environment for reproducing problems, and the other is for things like OS upgrades or patches.)
We currently have nine backends. Six are in production, one is a hot spare, and two are for testing; which one is the hot spare backend has changed over time as failures have caused us to bring the hot spare into production and turn the old production backend into the hot spare. Currently, all backends have a full set of disks; four backends (the first four we built) use 750 GB disks and the other five use 1 TB disks, including both test backends.
(We plan to raid the test backends for spare disks at some point when our spares pool is depleted but haven't needed to so far.)
As you could partly guess from the pattern of disk sizes, our initial production deployment was three fileservers and four backends; we expanded into test machines, hot spares, and then a fourth production fileserver and its two production backends from there.
Using so much hardware for hot spares and testing is a lot less extravagant than it seems. Because of university budget issues we pretty much bought all of this hardware in two large chunks, and once we had the hardware we felt that we might as well use it for something productive instead of having it sit in storage until we needed to expand the production environment. And we still have more hardware in storage, although we've more or less run out of unclaimed 1 TB drives.
(Some environments would be space, power, or cooling constrained; we haven't run into that yet.)
Sidebar: how much hardware we need for backend testing
We need two backends because that's how production machines are set up; all ZFS vdevs are made up from mirrored pairs, each side coming from a different backend. There's a fair amount of testing where we need to be able to duplicate this mirroring.
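As an illustrative sketch (the pool and device names here are made up, not our actual configuration), a pool laid out this way looks like:

```shell
# Hypothetical example: a pool where every vdev is a mirrored pair,
# with each side of each mirror drawn from a different backend.
# "backend1-d0" etc. are invented device names for illustration.
zpool create examplepool \
    mirror backend1-d0 backend2-d0 \
    mirror backend1-d1 backend2-d1
```

With this layout, testing mirror behaviour (resilvering, losing one side, and so on) requires two separate backends so that each half of a pair really does live on different hardware.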
We need enough disks to fill a single backend, because we want to be able to do a full scale load test on a single backend; this verifies things like that the software and hardware can really handle driving all of the disks at once.
We don't strictly need enough disks to fill up both test backends, although it's convenient to have that many because then you don't have to worry about shuffling disks around based on what testing you want to do.
Why visited links being visible is important for blog usability
One of the annoying things that some websites do is make it impossible to see the difference between visited links and unvisited links. This is not just irritating, I maintain that it is somewhere between a bad idea and a terrible one for real blog usability; how bad it is depends on how densely interlinked your writing is.
To summarize, for real blog usability you want to encourage people to explore your site; having landed on one page through a search or an inbound link from somewhere, you want them to keep reading more of your content. The problem with not being able to distinguish between visited and unvisited links is that it discourages exploration by causing visitors to waste their time.
If you interlink articles a lot, people are going to take an unpredictable branching path through your work. Almost inevitably there will be multiple paths that wind up at the same place (because you refer to the same article in multiple other articles); the denser your interlinks, the more such path overlaps will exist. When visited links can't be distinguished from unvisited links, visitors can't tell whether what they're thinking about clicking on is something new (and interesting) or something that they've already read, where clicking through the link will just waste their time. This inability to tell whether you're about to waste your time discourages clicking links at all, and it doesn't take much discouragement before your visitors learn better. Good for them but (presumably) bad for you.
The clear conclusion is that you want visitors to be able to tell if they've read something before they click on the link, not after; you want exploration to be as risk free as possible, not randomly risky and time-wasting. Browsers give you this almost for free, provided that you do not force visited links to look just the same as unvisited ones.
(If you feel ambitious about CSS stylings, the logic here suggests actively de-emphasizing visited links compared to unvisited ones instead of just having them in a different colour.)
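A minimal sketch of what this de-emphasis might look like (the colours here are made up for illustration; note that modern browsers restrict what properties `:visited` rules can set, essentially limiting you to colour-type properties for privacy reasons):

```css
/* Illustrative only: make unvisited links vivid and visited
   links subdued, rather than just a different hue. */
a:link    { color: #0645ad; }  /* unvisited: a strong blue */
a:visited { color: #888888; }  /* visited: a muted grey */
```

Because of the `:visited` styling restrictions, de-emphasis has to be done through colour choice alone; you can't, for example, drop the underline only on visited links.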
Sadly I have run into any number of websites with interesting, densely interlinked content that committed this mistake. The result is very frustrating; they had fascinating stuff, but they wasted too much of my time in getting to it.
(This is the same mistake that pushed me towards Reddit over Digg, way back when, and the reason was exactly this; Digg made it risky for me to click links and Reddit did not. I note that Hacker News has done Reddit one better by making visited links subdued, not just a different colour.)