Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web.
2010-02-09 Why your program should have an actual configuration file

Every so often, someone says something like 'you know, our program has a configuration file but also supports runtime reconfiguration via some magic. Clearly this is wrong, so what we should do is get rid of our configuration file and just make sure the running state is persistent'. If they're feeling nice, they add that the running state will be saved as an XML file.

Every time people say this, sysadmins cry.

Here is a very important thing for real deployments of your program in real environments: configuration files are a good thing because they are really easy to manage. Running state that is updated by applying changes (often non-idempotent changes) is much harder.

First, let's get something out of the way: machine generated, automatically updated XML files are not configuration files in any conventional sense that is useful to sysadmins. They are an internal persistence mechanism that may, perhaps, have vaguely useful and inspectable contents (but generally not). So regardless of XML or not, if you go down this route you do not have a configuration file but instead a program with configuration state that persists over reboots and restarts.

Let's inventory some of the things that you lose when you merely have persistent configuration state without actual configuration files:
I could go on, but I think I'm going to stop now; I hope that you get the point. Configuration files don't exist merely because those other programmers are lazy people, they exist because they're actually a pretty good solution to a whole bunch of problems at once. Getting rid of them is almost never forward progress. (2 comments.)
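To make the point about non-idempotent changes concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration (the `listen_ports` setting and the `load_config` and `apply_change` names do not come from any real program). It only shows that state built from a file is the same no matter how often you load it, while state built by applying changes depends on the history of those changes.

```python
import configparser

# Hypothetical configuration file contents, for illustration only.
CONFIG_TEXT = """
[server]
listen_ports = 80, 443
"""

def load_config(text):
    """Build the complete state from the configuration file text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    ports = parser["server"]["listen_ports"]
    return {"listen_ports": [int(p) for p in ports.split(",")]}

def apply_change(state, port):
    """A non-idempotent runtime change: append a port to the running state."""
    state["listen_ports"].append(port)
    return state

# Loading the file twice gives the same state both times.
first = load_config(CONFIG_TEXT)
second = load_config(CONFIG_TEXT)

# Applying the same change twice does not; the state now depends on history.
running = load_config(CONFIG_TEXT)
apply_change(running, 8080)
apply_change(running, 8080)
```

With a file, 'what is the configuration?' is answered by reading the file; with accumulated changes, you have to know everything that was ever applied and in what order.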
programming/UseConfigurationFiles written at 00:34:14
2010-02-08 A thought on deliberately slow disaster recovery

Given my earlier entry, here is a thesis: some disasters are big enough that you should stop trying to recover rapidly.

The problem with attempting rapid disaster recovery is that significant disasters are high stress, high pressure situations. Unless you have very good checklists, this is exactly the sort of situation where it's easy to have something go catastrophically wrong in various ways: missed steps, miscommunication between people about who was doing what, failing to notice problem indicators under the pressure of driving full speed ahead, interruptions and distractions making people lose their place, and so on.

So in this sort of situation, maybe what you should do is slow down. Back off, reduce the stress level, be methodical. Take the time to be organized. Stop sometimes to take a breather. Yes, this requires accepting that the systems will come back up slower than you might have been able to achieve if you went all out and everything went well. But in return, you are much more likely to avoid making the situation (much) worse.

This is a new way of thinking about crisis handling for me, because I am very much a 'go, now now now!' type of person when trying to fix problems. (And yes, some of the time I have probably made the situation worse by rushing to slap apparent bandaids on things; my instinct is to get the system up now and sort out the situation later and, well, this is not always the right answer.)

There are two things that strike me about this. First, the most dangerous crises and disasters from this perspective are not necessarily the huge ones, but the ones that have the highest potential for further damage, the ones that involve your critical infrastructure but have not already done much damage to it. (To put it one way, if your machine room has burned down you have very little left to lose, no matter what you do.)

Second, this is not necessarily going to be easy. There are going to be a lot of people yelling at you to get things going faster, and a lot of pressure on you in general. I suspect that you're going to want management agreement on this, in advance (because you're unlikely to get it at the time, not with people yelling at your management too). (One comment.)
sysadmin/SlowDisasterRecovery written at 01:16:07
2010-02-07 The problem with blog footnotes

Here is something that has just occurred to me (courtesy of seeing an example of it): footnotes are hard to do well in blogs, and may need actual software support if you want them to be completely correct.

The conventional way of doing footnotes in HTML is to use fragment URLs and anchors, with the footnote text at the bottom of the entry and your choice of footnote markers in the main text. But, like anything involving anchors, this means that you need to come up with unique anchor names. On one level this is no problem; you can just use 'fn:1', 'fn:2', and so on. But on another level this is a problem for blogs, because blog entries are repeatedly aggregated together with each other on web pages. When you put multiple footnote-using entries on the same HTML page, you need all of their anchors to be unique; you are not likely to get this if you use 'fn:1' style anchors.

(This is especially pernicious once you start considering syndication feeds and 'planets', which put content from multiple blogs on the same HTML page.)

You can just punt on the issue and say 'well, it's up to the author to come up with unique anchor text (ideally globally unique text)', but in practice people won't always do this, which is equivalent to having non-functional footnote links under some circumstances.

Admittedly, I suspect that most people won't really care about all of this, and will be perfectly happy using 'fn:1' style links and having them not work. Regardless of whether the actual links work, your intent is likely to be pretty easy for users to follow.

(And who knows, maybe the proper implementation of footnotes in blog entries is pop-up alt text, like xkcd famously does on its comic images. Alternately, footnotes are a printed thing that are not appropriate in HTML.) (5 comments.)
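As an illustration of what 'actual software support' might look like, here is a rough Python sketch in which the blog software rewrites 'fn:N' style anchors to include a per-entry identifier when it renders an entry. The function name, the choice of the entry name as the identifier, and the use of a regular expression on HTML are all assumptions made for this example; real software would more likely do this while generating the HTML in the first place.

```python
import re

# Matches both the anchor definition (id="fn:1") and links to it (href="#fn:1").
_FOOTNOTE_REF = re.compile(r'(id="|href="#)fn:(\d+)"')

def uniquify_footnotes(entry_html, entry_id):
    """Rewrite 'fn:N' anchors to 'fn:<entry_id>:N' within a single entry."""
    return _FOOTNOTE_REF.sub(
        lambda m: f'{m.group(1)}fn:{entry_id}:{m.group(2)}"', entry_html
    )

# Two entries that both use 'fn:1', aggregated onto one page.
entry = '<p>Text<a href="#fn:1">[1]</a></p><p id="fn:1">The footnote.</p>'
page = (uniquify_footnotes(entry, "BlogFootnoteProblem")
        + uniquify_footnotes(entry, "WhyNoLaptop"))
```

This only makes anchors unique within one blog. For 'planets' that mix several blogs, the identifier would also have to be globally unique, which this sketch does not attempt.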
web/BlogFootnoteProblem written at 03:06:17
2010-02-06 Why a laptop is not likely to be my primary machine any time soon

I know and read a number of people who use laptops as their primary machines, but I'm one of the people who's not interested in the idea (even ignoring any issues of relative prices). I wound up actually thinking about the question recently, and as it turns out I think I have a fairly odd set of reasons for it. So, here they are so far:
In the past, my desire for Unix (ideally Linux) would also have been a significant obstacle, but my impression is that it's now relatively easy to find a nice modern laptop that has good Linux support. (Hopefully I'm not wrong.)

Another way of thinking about this is that I have two roles for computers: the computer I sit in front of all the time, and the computer that I take places for relatively moderate use. For the heavily used computer, I have strong and very particular opinions about the pieces of the computer that I interact with a lot (the keyboard, the displays), but I'm indifferent to the rest of it (provided that it's quiet). I don't care as much about the casual computer, but I want it to be small, light, and still nice for productive work.

(The late Dell Mini 12 is about my platonic ideal of the casual laptop in form factor, screen resolution, and keyboard.)

It's pretty clear to me that some of these desires clash even in the best of circumstances, particularly the displays; a laptop screen big enough to be one of my regular displays makes the laptop too big to be conveniently portable. Thus, if I tried to use a laptop for both roles the only use I'd get for it in the full time usage role would be as the system unit of a desktop system, as I wouldn't use either its display or its keyboard (and I'd still only have one system disk). If I absolutely had to have only one computer this could be workable, but if not, there's little advantage to it.

I suspect that other people are generally much less particular and picky about their keyboards, displays, software, and so on. (Or, alternately, they have found a laptop maker whose keyboards and screens they are as fond of as I am fond of my favorites.)

(This entry was sparked by the discussion here. Plus, I feel like not writing about documentation for days on end.) (2 comments.)
tech/WhyNoLaptop written at 00:44:34
2010-02-05 Emergency procedures checklists need check steps

Given my previous entry, here is a thesis about emergency procedure documentation: you shouldn't just have a checklist for what to do, your checklist should include actual check steps, points where you stop to explicitly confirm that you've done something and it actually works.

Checklists are a good idea, but the common form of a checklist is just a list of steps to be carried out. Under the stress of an emergency situation, I don't think that this is good enough. First, your checklist implicitly assumes that everything works right, and second, it's too easy to be rushed, distracted by some interruption, sleep-deprived, or whatever while you're going through the checklist and lose track of where exactly you are, do something wrong, or miss the potentially subtle signs that something is not working the way that your checklist assumes.

Thus, you need spots in your checklist where you do not do things but instead check things; you take positive steps to make sure that everything is as it should be and that the system is in the state that you and your checklist assume that it is. These checks ensure that if something goes wrong, either in the environment or in you carrying out the checklist, it gets noticed before things go horribly off the rails and explode.

In short: it's not good enough to have a checklist item that says 'throw switch 12'; you need something to confirm that you have in fact thrown switch 12 (and ideally just switch 12) and that the results of throwing switch 12 are what you expect. You need these checks to be explicit steps in your checklist for the same reason that you have a checklist in the first place; your memory is fallible, especially under stress, and having them written down explicitly maximizes the chances that you will always do this.
(I suspect that one of the lessons that the airline industry can teach system administration is that in this sort of situation it is best to have two people involved, one reading off the checklist and the other one performing the actions and verbally confirming that they've been done. This makes it harder to fool yourself that something has been done or that of course something looks right.)

The corollary to this is that checks should especially be inserted before you are about to do damaging operations such as formatting a disk, putting a replacement system online under its production IP address, or force-importing a SAN filesystem on a non-default fileserver.

(Sadly, testing checks is probably even harder than testing documentation normally is; how do you manufacture failures in checklist steps to make sure that your check steps actually do anything useful?) (4 comments.)
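If a checklist like this were ever partly automated, the 'action plus explicit check' structure might look something like the following Python sketch. All of the names here are hypothetical, and the dictionary of switch positions is just a stand-in for real equipment; the point is only that each step carries its own check and that the run stops at the first check that fails.

```python
class CheckFailed(Exception):
    """Raised when a check step does not confirm the expected state."""

def run_checklist(steps):
    """Run (description, action, check) triples in order.

    Stops at the first failed check instead of carrying on regardless.
    """
    completed = []
    for description, action, check in steps:
        action()
        if not check():
            raise CheckFailed(f"check failed after: {description}")
        completed.append(description)
    return completed

# Hypothetical stand-in for real equipment: a dict of switch positions.
switches = {11: False, 12: False, 13: False}

def throw_switch_12():
    switches[12] = True

def only_switch_12_is_thrown():
    # Confirms that switch 12 is thrown and that no other switch is.
    return switches == {11: False, 12: True, 13: False}

done = run_checklist(
    [("throw switch 12", throw_switch_12, only_switch_12_is_thrown)]
)
```

Note that the check looks at the actual state of things rather than trusting that the action worked, which is the whole reason for having it.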
sysadmin/ChecklistChecks written at 01:15:16
2010-02-03 Outdated documentation is especially risky for sysadmins

The obvious traditional risk of outdated documentation in all its forms is that you rely on it and go wrong somehow; you trust the comments in the source code and write your new code accordingly, and your changes don't work. I think that this risk is especially acute for sysadmins, for two strongly related reasons.

First, much of our documentation tends to be about procedures, not simple information. Following what is actually a wrong or incomplete procedure is a great way to create spectacular failures on the spot. Worse, sysadmins inevitably wind up dealing directly with live systems and live data.

(Yes, you can test procedures just as you test the code that you write, but at some point you have to use them on your live system and this is always somewhat different from the test environment, unless you have a spectacularly complete test environment.)

Second, some of our least used documentation (and thus some of our most risky) is our emergency procedures. When we need to use them, we're in one of the most tense situations possible, under a great deal of pressure to get things fixed now and thus least able to go slowly and carefully and stop if something, anything, seems off. This is the exact sort of situation where incorrect procedure documentation can do the most damage, because people don't stop before they compound a small problem into a huge one.

(Imagine, for example, an off by one error in documentation about how to map disk bay slots to device names. Now add a 'get things back up right away' crisis where you need to replace a disk.)
Link: Pollution in 1.0.0.0/8

IANA has recently allocated 1.0.0.0/8 to APNIC, which has caused a certain amount of concern that it is 'polluted' by people already using it for various reasons. Pollution in 1/8 is a report from RIPE Labs on what happened when they announced routing for some bits of it as part of their debogonising work. This is clearly going to be what they call 'interesting'. (via Hacker News.)
How to destroy people's interest in updating documentation

Here is one of the less obvious perils of outdated documentation:

Suppose that you have some documentation that is out of date, but not in an obvious way; for example, you have an out of date network layout diagram. Since it's not obvious, you don't realize this right away, so you keep on updating the network layout diagram when you make changes to your actual network. Except that faithfully updating an inaccurate network layout diagram is relatively pointless. When you realize that it is incorrect, you are going to have to re-check most of it anyways, or at least spend a bunch of effort to reconstruct what sections are trustworthy.

This peril of outdated documentation is that updating bad documentation is wasted effort. (Fixing bad documentation is not, but that's a different thing.)

Since updating documentation takes time that you could be using for other things, and it's generally not fun, it does not take too much time to be wasted this way before people stop updating documentation entirely. Why do annoying wasted effort, when you could be doing something that's actually productive and useful?

(Especially if you did the work thinking that it wasn't wasted effort, only to find out later that what you thought was productive work, well, wasn't. People really don't like that.)

At first, this effect will probably be limited to documentation that is highly suspect. But I don't think it takes much bad documentation before people more or less give up totally, because it is too heartbreaking to waste time this way and they can't stand the idea of it any more; you will lose the culture of documentation. At that point, you can stop talking about updating documentation and start talking about reconstructing it from scratch.
(This is where local wikis are perhaps less than ideal, because at this stage what you really need to do is pave everything so that there is a clear line between 'done recently, can be trusted' and 'is old, do not trust until it has been redone'.)
2010-02-02 What charging credit cards doesn't prove

Every so often, commonly in the context of SSL certificates, someone puts forward the theory that charging money for things makes the customers somehow more identifiable and reliable than giving it to people for free (with the same other authentication of customers). After all, so the theory goes, when you give people something just because they have a particular email address, that's not much, but when you've charged their credit card, you have a lot more confidence in their real identity.

This is wrong. To explain why it is wrong, let's talk specifically about SSL certificates.

The basic model of 'verifying' SSL certificates is that in order to get a certificate for a domain, you have to prove that you (theoretically) have power over that domain; you have one of a certain number of email addresses at that domain, you can put things on its web server, or something of the like. Most SSL certificate authorities also charge money on top of this; you submit credit card information along with your Certificate Signing Request, they charge your card, and if the charge goes through you get your signed certificate in email. By collecting money from you, they've gotten a stronger verification than before.

Except that they haven't, because I snuck a fast one into this description: charging a credit card is not the same as actually collecting money from it. No SSL CA waits on giving you your certificate until they actually have received your money from the credit card company; the delays involved in that would drive most customers away. Instead they issue SSL certificates very close to on the spot, which means that SSL CAs are not verifying that you can pay them money, they are verifying that they can charge a credit card. And there are a lot of ways to get a credit card number that can have some amount of money charged to it and not have that reversed, rejected, or detected as fraudulent for (say) six hours, if not days.
(Oh, sure, once the charge blows up the SSL CA will try to revoke the SSL certificate. Good luck with that.) (This is kind of a reaction to this, because I think this misapprehension is a general one.)
2010-01-31 More vim options it turns out that I want

Much to my displeasure, Ubuntu seems to have been steadily making the version of vim that they ship more and more superintelligent. I do not want a superintelligent vi; in fact, superintelligence is a net negative in

So far, I have wound up with:
(At some point I may look into the best way to fix the line ending issue, but I haven't been annoyed enough yet.) Some reading in the vim help files suggests that ' All in all, I really wish vim had a mode where it just settled for
being a better (5 comments.)
linux/VimOptionsII written at 23:02:28
Thinking about syndication feeds and spoilers

DWiki has always had the ability to do the common blog thing of 'click here to see the rest of the entry'; when I put it in, I expected to use it for things like the detailed stats at the end of this entry. Because I am crazy that way, I built the feature so that it could apply on the main page (pages, really), in syndication feed entries, or both, depending on what options I turned on in any particular entry.

In practice, it turned out that I really don't like using cuts in syndication feed entries, for at least two reasons. First, syndication feed readers already have good ways to skip parts of entries and even whole entries (or at least they should), which makes cutting for volume mostly unnecessary. Second, partial entries are annoying in general because they effectively force you out of your syndication feed reader and into your browser in order to read the full entry.

(In fact it turns out that I don't like cuts very much in general, so I barely use them even on the main pages.)

However, this does leave one case unhandled: spoilers. Places like the anime blogging community have come up with decent Javascript-based solutions for people who are reading your main site, but this is a complete non-starter in syndication feeds. In fact you can't even count on the old 'set the colour of the text to the background colour' trick, as modern syndication feed readers can strip styling as well.

My reluctant conclusion is that handling spoilers may well call for using a cut even in syndication feeds, with the annoyance of having to click off to read the entry being the lesser of two evils. The other approach is just to note that there will be spoilers at the start of an entry and count on people to use their feed reader's 'skip to next entry' feature.

(Spoilers are not generally relevant to WanderingThoughts, but they sometimes come up for me elsewhere.)
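For illustration, a per-entry cut that can be turned on for the main page, the syndication feed, or both could be sketched in Python roughly as follows. This is not DWiki's actual code or syntax; the cut marker, the `render_entry` function, and the `cut_in` option are all invented for this example.

```python
CUT_MARKER = "<!-- cut -->"  # hypothetical marker, not DWiki's actual syntax

def render_entry(text, context, cut_in=frozenset()):
    """Return the entry text for a context such as 'page' or 'feed'.

    The text is truncated at the cut marker only if the entry has
    enabled cuts for that context; otherwise the marker is dropped.
    """
    if context in cut_in and CUT_MARKER in text:
        head = text.split(CUT_MARKER, 1)[0].rstrip()
        return head + "\n(Read the rest of the entry.)"
    return text.replace(CUT_MARKER, "")

# An entry with a spoiler that is cut in the feed but not on the main page.
entry = "Warning: spoilers follow.\n<!-- cut -->\nThe butler did it."
feed_view = render_entry(entry, "feed", cut_in={"feed"})
page_view = render_entry(entry, "page", cut_in={"feed"})
```

Because the cut is applied when the feed entry is generated, it does not depend on the feed reader honouring Javascript or styling, which is what makes it workable for spoilers.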
These are my WanderingThoughts. This is part of CSpace, and is written by ChrisSiebenmann.