Wandering Thoughts archives


You really want to put your switches in server racks

Once upon a time, not that long ago, when we were perhaps smaller and switches were certainly more expensive, we put our switches in network racks over on one side of the machine room, all of our servers in server racks, and ran cables under our machine room's raised floor from the servers to the switches. Please learn from our painful experience and don't do that; put almost all of your switches in server racks.

Yes, really. Even if this requires putting a stack of switches in a rack to get enough ports (or enough subnets, if you sensibly use one switch per subnet). Even if you need to put one switch in the front of the rack and another in the rear, just to get enough in (switches are shallow, you can usually pull this trick off).

Why you want to do this is simple. The more network cables you run under the floor, the more you discover the charms of machine room archaeology and the more time you will spend trying to trace and pull old cables when you remove old machines. (Unless you don't have the time to pull 'harmless' unused cables, or you're going to get around to it on some slow day, or you're leaving the old cable in place for now because you're pretty sure you're going to put a new machine in the rack in a bit and you'll just be re-running a cable so let's save some work. Then it gets worse.)

Putting as many switches as necessary in your racks means that you'll run roughly one network cable per switch back to your core switch interconnect points, instead of one or more cables per server. This is a lot fewer cables under the floor (or overhead if you use overhead cable trays, and they get messy too), and that is a very good thing. It also makes it a lot easier to remove and add cables as you remove and re-add servers, which usually drastically increases the chances that you'll actually do it.
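To put rough numbers on the difference, here is a minimal sketch in Python. The function name and every figure in it (ten racks, twenty servers per rack, two cables per server, three switches per rack) are made up for illustration; they are not our actual machine room.

```python
def cable_runs(racks, servers_per_rack, cables_per_server, switches_per_rack):
    """Return (old, new) counts of cables that have to leave the racks.

    old: every server cable runs back to the network racks.
    new: roughly one uplink per in-rack switch runs back to the core.
    All inputs are hypothetical illustration values.
    """
    old = racks * servers_per_rack * cables_per_server
    new = racks * switches_per_rack
    return old, new


old, new = cable_runs(racks=10, servers_per_rack=20,
                      cables_per_server=2, switches_per_rack=3)
print(old, new)  # prints "400 30"
```

Even if you stack several switches in every rack, the number of cables under the floor scales with the number of switches instead of the number of servers.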

Four years ago when I wrote RackNetworking, we had just begun to think about moving from our old way to putting a bunch of switches in server racks. Since then we've almost entirely moved to the server rack approach, but we still have a number of machines that were cabled up with the old under the floor approach; every time I have to clean up after one of those machines (as I had to today), I'm reminded of how much better the new approach is.

Sidebar: our answer for uplink bandwidth

One of my concerns back in RackNetworking was uplink bandwidth from the server rack switches to the core interconnect. In practice this has not been an issue for us, because most of our machines are not heavy bandwidth consumers. We continue to run direct connections to the core interconnect switches for the few machines where we think it may actually matter; I wrote the details up in my writeup on how our network is implemented.

sysadmin/RackSwitches written at 14:57:23

Things that could happen to your archives

In the spirit of my old entry on things that could happen to your backups and to reinforce yesterday's entry on not trying to archive things, here's an incomplete list of things that have been known to go wrong with archives. If you're thinking of doing archives, you should be thinking about how you're going to avoid these.

  • you aren't archiving everything you need to archive.
  • the archive program doesn't work right; it writes a corrupt or incomplete archive, fails to notice or complain enough about read errors, or its archive doesn't capture a consistent and usable state of whatever you want to archive.

    With archives you should definitely be doing a full read of the archive and verifying it against the data on disk before you remove anything from disk.

    (In general archives are subject to many of the woes of backups. Take them as read.)

  • the archive media degrades over time.

    This is what most everyone talks about, and for good reason; if your data isn't there any more, nothing else matters. But it's only the tip of the iceberg for what you need and what can go wrong.

  • one or more pieces of archive media were physically damaged or destroyed due to a mishap, accident, water leak, fire, etc.

    If you care about real archives, you need more than one copy of any piece of data (and they should not be in the same place). Accidents and mishaps happen, especially to things sitting in the corner.

  • you've lost track of one or more pieces of archive media; they're stored somewhere, but you don't know specifically where any more.
  • in general you've lost track of what media you have and/or what data you've archived.
  • you've lost track of what is on each piece of archive media, so while you know you have an archival copy of <X> you don't know which one of fifty tapes it's on (and no one is going to go search through all fifty tapes unless it is really, really important).

  • you don't have anything that can read the media any more.
  • the media reading hardware that you carefully saved has quietly stopped working sometime during the years that it was in storage.
  • you can't connect the media reading hardware to any of your current systems; it requires an obsolete interface that is no longer supported.
  • you have an interface card for the obsolete interface you need, but it uses a bus type that is no longer supported on your machines.

    (I have some PCI SCSI cards. The odds that I will be able to put them in machines drop by the day.)

  • you have all of the hardware you need and you even saved cables too, but the OS driver for the hardware was removed several years ago after it became unmaintained because no kernel hacker had a copy of the hardware to test with any more.

  • all of your hardware works for the first N tapes (or disks, or whatever), then something breaks due to the amount of wear you're putting on old hardware. Since it's all obsolete hardware, there are no longer any spare parts, maintenance and cleaning kits, or the expertise to use any of these even if you had them.

  • you didn't write down what format the archives are in because it was obvious at the time.
  • you don't have any software that can read the archive format.
  • the details of the archive format either were never documented or were only documented in ancient documentation that you got rid of years ago. You earn bonus irony points if you carefully included the documentation in your archives.

  • the software you have that can read the archive format doesn't run on any of your current machines.
  • the old OS you need to run the software to read the archive format doesn't work on any of your current machines.
  • you have source code for software to read the archive format, but it doesn't compile on the current version of the OS because the compiler has gotten stricter, the library interfaces have changed, and the OS has moved from 32-bit to 64-bit.

  • your commercial archiving system requires a license key, but the company that made it is out of business now and certainly not issuing any new ones. Your old license key expired five years ago.

    (Yes, there are people who do long term archiving with commercial software.)

  • you have forgotten all of the details about how to work with the media, the archive format, and any surviving software. In theory you could with sufficient effort re-master all of the pieces and reverse engineer the format and extract the data. In practice you don't have the time to do all of this (because it is not a high enough priority), and so the archives are unreadable and will never be extracted.

    It's common to discover this shortly before your last media reader is decommissioned, because this is when everyone decides that you should move the data from the old media (and format) on to some new media. This is often the first time anyone has thought about the archives for years.

    (Even if you can remember all of this, it not infrequently turns out that you simply don't have enough time to cycle all of your old media through to read all of the data off of it.)
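The advice above about doing a full read of the archive and verifying it against the data on disk can be sketched in Python. This is only an illustration under assumptions that are mine, not a description of any particular archiving tool: the archive is a tar file, its member names are relative to the directory that was archived, and only regular files are compared (hard links, symlinks and metadata are ignored). The names `sha256_of` and `verify_archive` are hypothetical.

```python
import hashlib
import os
import tarfile


def sha256_of(fileobj, chunk_size=1 << 20):
    """Hash a file-like object in chunks so large files need little memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_archive(archive_path, source_root):
    """Fully read a tar archive and compare it to the files on disk.

    Returns a list of problem descriptions; an empty list means every
    regular file matched in both directions.
    """
    problems = []
    seen = set()
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # assumption: only regular files are checked
            seen.add(os.path.normpath(member.name))
            disk_path = os.path.join(source_root, member.name)
            if not os.path.isfile(disk_path):
                problems.append(f"in archive but not on disk: {member.name}")
                continue
            with open(disk_path, "rb") as on_disk:
                disk_hash = sha256_of(on_disk)
            # Reading the member here is the "full read" of the archive.
            if sha256_of(tar.extractfile(member)) != disk_hash:
                problems.append(f"contents differ: {member.name}")
    # Second pass: anything on disk that never made it into the archive.
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.normpath(os.path.relpath(full, source_root))
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            if rel not in seen:
                problems.append(f"on disk but not in archive: {rel}")
    return problems
```

The idea is to run something like `verify_archive("data.tar", "/path/to/data")` (both paths are placeholders) and to remove nothing from disk unless the returned list is empty. A check like this only catches the first couple of items on the list; it does nothing for media, hardware, or format problems years later.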

There are probably many more, but I have less painful experience with archives than I do with backups.

(Although we had an interesting time when the last 9-track reel to reel tape drive was being taken out of service. I don't think we got all of the old historical 9-track tapes copied that we wanted to.)

sysadmin/PotentialArchiveProblems written at 00:28:45

