2014-06-30
My .screenrc
Recently, Wesley David put out a call for people to share their
.screenrc files,
even if they aren't particularly exciting. I am a regular but not
heavy user of screen and have been for years now, so I have fixed
opinions (and well-worn reflexes) but not very much cleverness.
So, here are bits of my .screenrc, annotated with comments.
The most important setting for me is the simplest:
escape ^_a
This makes the screen escape character into what I consider
something sane and sensible. Here 'sane and sensible' really means
'so uncommon that basically nothing else uses it'. Almost all control
characters are heavily used because there's only so many of them,
but ^_ is just uncommon enough that I only rarely run into
collisions.
(Back in the days of real terminals, figuring out how to generate a ^_ on a new terminal was often somewhat of an adventure.)
Then:
bell_msg "Bell is present in window % (go look at it!)" msgwait 30
One reason I have things set this way is xterm's ziconbeep
feature. If I run screen in an xterm, any ^G
in any screen will produce a message and thus trigger ziconbeep,
drawing my attention to that overall screen session.
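(For the record, ziconbeep is turned on with xterm's -ziconbeep option or its zIconBeep X resource; the volume percentage here is purely illustrative:

XTerm*zIconBeep: 50

With a non-zero value, an iconified xterm that produces output beeps at that volume and flags its icon title.)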
Then I have some boring settings:
defscrollback 1000
startup_message off
multiuser off
autodetach on
(The latter two settings are theoretically the default, so mentioning them explicitly is paranoia. But I'm a sysadmin. Paranoia runs deep.)
If you're running screen in an xterm, screen normally puts its 'hardstatus' line in the xterm's titlebar. Because my xterms don't have title bars, I need to disable this, which I do by basically turning off hardstatus in xterms:
termcapinfo xterm* LP:hs@
My taste is to not have individual screen 'windows' be present in 'who' output and so on, for reasons that are pretty much historical at this point. So I have:
deflogin off
bind U login off
bind L login on
To be honest I don't think I've used those keybindings for a very long time. I might as well leave them there; after all, someday I may need them.
There is one keybinding I use all the time:
bind - prev
One reason this binding is so useful and efficient for me is that it shares a physical keyboard key with ^_, making it very quick to invoke without shifting my fingers (I just lift my left fingers off the control and shift keys). I use 'space' to cycle forwards through screen windows, which is similarly easy to hit rapidly (the spacebar is a big target).
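(Cycling forward with space is screen's default behavior, so it needs no line in my .screenrc; written out it would be something like:

bind ' ' next

prev, by contrast, normally lives on C-a p and backspace, hence the explicit binding for '-'.)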
These days I run very boring things inside screen so I basically don't auto-start anything much. Most places I start out with one shell just sitting there:
screen 0
(My .screenrc also has comments about most of this, which has
been very helpful in remembering enough about why I have these
mysterious settings to be able to write this entry.)
As a side note on the great screen versus tmux debate: I don't
do anything sophisticated or fancy with screen and I already know
how to do things with it. Thus I currently see no reason to try to
switch. I don't do sophisticated stuff in screen in general; that's what multiple xterm windows are for. Screen is for unusual situations when I need something to persist, plus one instance for monitoring stuff on my office workstation when I'm at home (and even that is increasingly historical).
2014-06-29
The tradeoffs for us in a SAN versus disk servers
In yesterday's retrospective I didn't say much about whether our overall architecture was the right approach. I think it was for the time, but before discussing that I want to cover what I see as the broad tradeoffs between having fileservers using a SAN and having fileservers host the disks themselves.
Given that we don't do failover in practice and no fileserver uses disks from more than two backends, one possible design is to just get rid of the SAN by putting the fileservers in, say, SuperMicro's 24+2 cases and having them hold all of their data disks locally. This would make the fileservers cost slightly more but would save us the cost of the backends plus the SAN networking (now that we're doing 10G-T, the latter is not tiny). We'd get rid of a single point of failure or performance problems for the entire collection of fileservers and we'd probably get somewhat better performance from having the disks locally instead of having to go over iSCSI.
What would we lose by not having a SAN, at least for things that we care about in somewhat more than abstract theory?
- Fixing fileserver hardware failures would take somewhat more work.
Today we just have to swap the fileserver's system disks into a new
chassis; in this model we'd have to swap the data disks too, all 24
of them.
- A fileserver that suffered a hard failure couldn't be worked around
remotely. Today in an emergency we could remotely force the
fileserver to power down and then (slowly) bring up its ZFS pools,
virtual fileserver IP, and so on on another fileserver, but this
relies on shared (SAN) storage (see the sketch below this list).
- We'd have a hard limit on how many disks a given fileserver could
use. Today our limit is purely through choice and if we had a
strong need we could make a fileserver expand to use disks from
an additional pair of backends.
- We couldn't easily deal with something like a disk controller
failure any more. Today we can just shift away from the affected
backend without having to take the fileserver down. Without a
SAN this would mean fileserver downtime for hardware work.
(I'm considering things like power supply failure to be a wash. This may be wrong; if a dual power supply driving 24 disks is more likely to fail than one driving 12 or 16 disks and one driving just a basic chassis is least likely to fail, then a 24-disk fileserver is more likely to die completely than the SAN version.)
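(And here is the sketch promised above: with shared SAN storage, the emergency takeover on a surviving fileserver is conceptually just a forced pool import plus bringing up the dead machine's virtual IP. The pool, interface, and address names below are invented for illustration, and the real process has more steps:

zpool import -f fs2pool
ifconfig e1000g0 addif 128.100.3.30 netmask 255.255.255.0 up

Without a SAN the import step is simply impossible, because the disks are trapped inside the dead fileserver.)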
Finally, the big one:
- We'd have to rely on the fileserver's OS not just for ZFS and networking (iSCSI and NFS) but also for access to the actual disks. With a split between backends and fileservers, the backends are the only things that need to support whatever disk controllers and so on we want to use.
- Using special purpose storage hardware (such as the backends for a fileserver for all-SSD pools) requires testing and qualifying the fileserver's OS on it.
The reality of life is that Linux has pretty much the best hardware support and other OSes fare less well. In an iSCSI SAN environment like ours the fileservers don't need much hardware support; they only really need networking plus a couple of system disks (and we don't care if the system disks are a bit slow). Only the backends need to support whatever hardware we need to use to get a lot of disks into a single system and to do it well and reliably in the face of various wacky issues.
(In theory a SAN is also more flexible, expandable, and upgradeable than a disk server solution. In practice we've basically never taken advantage of that, so I'm not including it. I'm focusing on our SAN as we've actually used it, not as we could have.)
When we began planning our ZFS fileservers we strongly wanted to use Solaris (and ZFS) on the fileservers and we pretty much had to use eSATA for the disks for cost reasons. eSATA support in Linux was new and novel (we had to build our own recent kernels) and I believe eSATA support in Solaris for hardware we could afford was basically not there. The actual backend servers were also relatively cheap; the cost for a backend was mostly in the external disk enclosure, which would have been needed even with the disks directly attached to a fileserver. Even if we hadn't had other reasons at the time that made us focus on a SAN design, I think we'd have wound up with our current fileserver environment for the hardware support issue alone.
The situation is somewhat more even-handed today; my impression is that OmniOS has good support for the LSI SAS controllers that we're using for disk access. However I still trust Linux's support more, partly because I suspect that it's more widely used and tested. And going with a SAN today also keeps open the possibility of, say, figuring out how to do good, usably fast failover at some point in the future. OmniOS could always speed up ZFS pool import, for example.
(With our specific new hardware design we'd also wind up wanting fileserver cases with more than 24 data disks, since our new backend hardware has 16 disk bays plus we're planning to probably put some L2ARC SSDs on the fileservers. 24-bay cases are relatively easy to get and to drive; 36-bay cases for 3.5" drives are I believe less so.)
2014-06-27
A retrospective on our overall fileserver architecture et al
My look back at how our fileserver environment has done over the past six years has focused on the Solaris fileservers (both the stuff that worked nicely and the stuff that didn't quite work out). Today I'm widening my focus to how well the whole idea and some of our overall decisions have worked out.
I think that the safest thing to say about our overall architecture is that it's worked well for us; taking a proper retrospective look back at the constraints and so on involved in the design would require an entire entry by itself. Basing our SAN on iSCSI is still clearly the best option for low cost and wide availability of both frontends and backends. Having a SAN itself and decoupling the front end fileservers from their backend storage has been a significant advantage in practice; we've been able to do things like shuffle around entire backend machines in live production and bring up an entire second hot spare backend over Christmas vacation just in case. Effectively, having a SAN (or in our case two of them) means avoiding a single point of (long-term) failure for any particular fileserver. We like this overall architecture enough that we're replicating it in the second generation environment that we're building out now.
(To be fair, using a SAN did create a single point of performance problems.)
I don't think we've ever actually needed to have two iSCSI SAN networks instead of one, but it was fairly cheap insurance and I have no qualms about it. I believe we've taken advantage of it a few times (eg to deliberately swap one iSCSI switch around with things live, although I believe we did that during a formal downtime just in case).
Mirroring all storage between two separate iSCSI backends (at least) has been a great thing. Mirroring has given us better IO performance and has enabled us to ride out the failure of entire backend units without any significant user-visible issues. It's also made expanding people's ZFS pools much simpler, partly because it means we can do it in relatively small units. A lot of our pools are actually too small to be really viable RAID-5+ pools.
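(To illustrate the small expansion units: each chunk of a pool is a mirrored pair with one side on each backend, so growing a pool just means adding another pair. A hypothetical sketch with invented iSCSI disk names, where the c2 disks live on one backend and the c3 disks on the other:

zpool create fs2pool mirror c2t1d0 c3t1d0
zpool add fs2pool mirror c2t2d0 c3t2d0

A RAID-5+ layout would force expansion to come in much bigger lumps.)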
I'm convinced that using decent consumer-grade SATA disks has been a serious win. Even our 'bad' 1TB Seagate drives have given us decent service out to the end of their warranty lifetimes and beyond it, and the price savings over more costly disks is what made our entire design feasible in the first place. With unlimited budget, sure, fill up the racks with 15K RPM SAS enterprise drives (and now SSDs), but within our constraints I don't think we could have done better and our disks have undeniably worked and delivered decent performance.
Using inexpensive external disk enclosure hardware on the backends worked but has caused us moderate problems over the long run, because the disk enclosures just aren't as solid as the server hardware. They are basically PC cases, PC power supplies, and a bunch of PC fans and so on, plus some drive bays and wiring. We've had a number of power supply failures, by now a number of the fans have died (and can't really be replaced) with the accompanying increase in disk temperature, and so on. Having only single power supplies leaves the disk enclosures vulnerable to various power feed problems in practice. We're quite looking forward to moving to a better class of hardware in the next generation, with dual power supplies, easily replaced fans, and simply better engineering and construction.
(This means that to some extent the easy failover between backends created by using a SAN has only been necessary because our backends keep falling over due to this inexpensive hardware. We've never migrated backend storage around for other reasons.)
Using eSATA as the way to connect up all of the disks worked but, again, not without moderate problems. The largest of these is that disk resets (due to errors or just pulling a disk to replace it) are whole-channel events, stalling and interrupting IO for up to four other backend disks at once. I will be much happier in the new generation where we're avoiding that. I don't think total eSATA channel bandwidth limits have been an issue on our current hardware, but that's only because an iSCSI backend only has 200 Mbytes/sec of network bandwidth. On modern hardware with dual 10G Ethernet and SATA disks that can do 150+ Mbytes/sec of real disk IO this would probably be an issue.
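(Some rough arithmetic on that, assuming the 200 Mbytes/sec figure comes from dual 1G iSCSI links:

2 x 1 Gbit/s is roughly 200 Mbytes/sec usable
2 x 10 Gbit/s is roughly 2000+ Mbytes/sec usable
16 disks at 150+ Mbytes/sec each is up to 2400+ Mbytes/sec of disk IO

With 10G networking, the shared disk channels rather than the network would become the choke point.)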
(We are lucky that our single SSD based pool is not very big and is on SSDs for latency reasons instead of bandwidth ones.)
Our general design forced us into what I'll call 'multi-tenant' use of physical disks, where several ZFS pools can all wind up using the same physical disk. This has clearly had an impact on users, where high IO on one pool has leaked through to affect other people in other pools. At the same time we've also seen some degree of problems simply from shared fileservers and/or shared backends, even when physical disk usage doesn't overlap (and those are inevitable with our costs). I'm not sure we can really avoid multi-tenanting our disks but it is a drawback of our environment and I'll admit it.
Although I said this in my Solaris fileserver retrospective, it's worth repeating that ZFS has been a great thing for us (both for its flexible space management and for ZFS scrubs). We could have done something similar to our fileserver environment without ZFS but it wouldn't have been half as trustworthy (in my biased opinion) or half as easy to manage and deal with. I also remain convinced that we made the right choice for iSCSI backends and iSCSI target software, partly because our iSCSI target software both works and has been quite easy to manage (the latter is not something I can say about the other Linux iSCSI targets I've looked at).
As I mentioned in my entry on the Solaris stuff that didn't quite work out, effectively losing failover has been quietly painful in a low-level way. It's the one significant downside I can see in our current set of design choices; I think that ZFS is worth it, but it does ache. If we'd had it over the past six years, we probably would have made significant uses of the ability to quickly move a virtual fileserver from one physical server to another. Saying that we don't really miss it now is true only in a simple way; because we don't have it we undoubtedly haven't even been aware of situations where we'd have used it.
Having management processors and KVMs over IP for all of the fileservers and the backends has worked out well and has turned into something that I think is quite important. Our fileserver environment is crucial infrastructure; being able to look closely at its state remotely is a good and periodically important thing. We lucked into this on both our original generation hardware and on our new generation hardware (we didn't make it an explicit requirement), but as far as I'm concerned it's going to be a hard requirement for the next generation.
(Assuming that I remember this in four years or so, that is.)
PS: If you're interested in how some other aspect of our fileserver environment has worked out, please feel free to ask in comments. I'm probably missing covering interesting bits simply because I'm a fish in water when it comes to this stuff (for obvious reasons).
2014-06-18
Would I be comfortable documenting our systems in some sort of public view?
One of the questions that I wind up asking myself any time I think about using wikis for sysadmin purposes is how comfortable I am having our system documentation sitting out in some sort of public view. Around here there are effectively two or three levels of public view we could have; it's easy enough to make it so that our documentation could be visible to the world, to anyone at the university, or only to people inside the department.
There are two broad problems that I see with any degree of exposure for system documentation. The first broad problem is what to do about sensitive information, ranging from stuff that would be useful for attackers through stuff like license keys and vendor support site passwords all the way to potentially personally sensitive information about particular people (which comes in many forms).
In a way the second broad problem is worse: it is the potential effects of writing with outsiders looking over your shoulder. Not everything we do as sysadmins is beautiful and elegant, to say the least. To document things in public is to open all of the things you are not as proud of to public scrutiny. I think you're inevitably going to write with this on your mind, with at least some degree of urge to self-censor, to maybe not write down some things in the open or to bend your phrasing to take the rough edges off and put a good gloss on things. I rather suspect that this is going to do undesirable things to your documentation in the long run.
(Some of these issues probably don't apply as much in a company as they do in a large university (even inside a large department). To put it one way, even a department is a pretty public place, especially once you start thinking about graduate students, postdocs, visitors, and so on.)
The upshot is that while part of me would like to open up our documentation to at least everyone in the department, the larger part of me has wound up feeling that sysadmin documentation needs to happen in private (at least around here). Writeups for public consumption are best done completely separately.
So the answer to the title of this entry is 'no, not at all'. Even if we could reliably segregate all of the sensitive information away from the public portion of the documentation, I would prefer not to have the issues that come from writing internal documentation while knowing that some degree of 'the public' may be reading over my shoulder.
2014-06-16
My view: a wiki by itself will not solve your problems
Every so often a sysadmin is in an environment with a documentation problem and they think 'I know what will fix this! We'll set up a wiki'. It is my view that that sysadmin is often wrong, especially if they've framed their problem that way.
There are certainly problems that a wiki can help with. If the problem is that people write documentation but then never update it because the update process is such a pain, maybe a wiki will help. If the problem is that all of the documentation that people write is scattered all over and impossible to find, sure, a wiki will provide a flexible, hopefully simple to use central place to put it all. But if your problem is that people don't write documentation, not even a quick email message to explain what they did or how to work the knobs and levers on their new widget, then I don't think the wiki will do very much because your real problem is likely to be somewhere deeper.
(There are many possible explanations for why people are not writing documentation, by the way. Perhaps they simply have not been given any time to do it. Perhaps everyone is effectively siloed, so they'd really be writing documentation only for themselves. Perhaps the culture around them has taught them that while in theory there is time for documentation, in practice writing documentation is not rewarded and doing other things is.)
System administration is unfortunately prone to a kind of cargo cult approach to creating a good environment, where we deploy tools (software ones or procedural ones) in the hopes that waving these magic tokens around will produce a deep cultural change (whether we realize it or not). This approach is both naive and prone to failure. Sometimes new tools can change the culture, but equally new tools may do nothing to it and then you have the same cultural issues plus failed tools. The latter is probably much more common than the former.
This is not to say that new tools are pointless. New tools can certainly help to change culture. But 'help' is the important word here; the odds that new tools will change the culture all by themselves are much lower. And I suspect that it probably helps if you consciously plan to change the culture (using the tool as a lever), instead of not realizing that that's what is needed.
(This is partly theorizing since I have never been in a place where I drove a change of culture or consciously saw it happen, although in retrospect I think I did witness one change of culture as something between a bystander and a participant.)
I'm going to have to think about this in the future when I become enthused about the possibilities of new tools, or more specifically of introducing my co-workers to new tools. Am I solving a real problem we're having, or am I actually unconsciously hoping to drive some sort of magic cultural change into a world where my co-workers will like the idea of the kind of things the new tools do as much as I do?
(Of course if I do this right it will probably temper my enthusiasm for new tools, which is kind of a bummer.)