Wandering Thoughts archives

2015-04-20

I don't think I'm interested in containers

Containers are all the rage in system administration right now, and I can certainly see the appeal. So it feels more than a bit heretical to admit that I'm not interested in them, ultimately because I don't think they're an easy fit for our environment.

What it comes down to is two things. The first is that I think containers really work best in a situation where the 'cattle' model of servers is a good fit. By contrast, our important machines are not cattle. With a few exceptions we have only one of each machine today, so in a container world we would just be turning those singular machines into singular containers. While there are some wins for containers, I'm not convinced they're very big ones, and there are certainly added complexities.

The second is that we are pretty big on using different physical machines to get fault independence. As far as we're concerned it's a feature that if physical machine X dies for whatever reason, we only lose a single service. We co-locate services only infrequently and reluctantly. This obviously eliminates one of the advantages of containers, which is that you can run multiple containers on a single piece of hardware. A world where we run a base OS plus a single container on most servers is kind of a more complicated world than we have now and it's not clear what it gets us.

I can sort of imagine a world where we become a container-based environment (even with our present split of services) and I can see some advantages to it. But it's clear that it would take a lot of work to completely redo everything in our environment as a substrate of base OS servers and then a stratum of ready-to-go containers deployed on top of them, and while we'd get some things out of such a switch I'm not convinced we'd get a lot.

(Such a switch would be more like a green field rebuild from total scratch; we'd probably want to throw away everything that we do now. This is just not feasible for us for various reasons, budget included.)

So the upshot of all of this is that while I think containers are interesting as a technical thing and I vaguely keep track of the whole area, I'm not actually interested in them and I have no plans to explore them, try them out, and so on. I feel oddly embarrassed by this for reasons beyond the comfortable scope of this entry, but there it is whether I like it or not.

(I was much more optimistic a few years ago, but back then I was just theorizing. Ever since then I've failed to find a problem around here where I thought 'yes, containers will make my life simpler here and I should advocate for them'. Even my one temptation due to annoyance was only a brief flirtation before sense set in.)

sysadmin/ContainerDisinterest written at 23:57:51

An interesting trick for handling line numbers in little languages

One of the moderately annoying issues you have to deal with when writing a lexer for a language is handling line numbers. Being able to report line numbers is important for passably good error messages, but actually doing this can be a pain in the rear end.

The usual straightforward way is to have your lexer keep track of the current line number and make it available to higher levels on demand. One problem this runs into is that the lexer's current position is not necessarily where the error actually is. The simple case is languages that don't allow multi-line constructs, but even here you can wind up off by a line in some situations (for example, when the lexer has already consumed the newline that ends the offending line by the time the error is reported).

A more sophisticated approach is to include the line number (and perhaps the position in the line) as part of what you return for every token. Both the parser and the lexer can then use this to report accurate positions for everything without any problems, although the lexer still has to keep counting lines and so on.
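As a rough illustration of the token-carries-its-position approach, here is a minimal sketch in Go. Everything in it is made up for the example: the `Token` type, its `Line` and `Col` fields, and the `lexWords` function (which only splits on spaces, tabs, and newlines) are hypothetical names, not taken from any real lexer. Note that the lexer still counts lines as it goes; it simply stamps the count onto each token when the token starts.

```go
package main

import "fmt"

// Token is a hypothetical token type that carries its own position.
type Token struct {
	Text string
	Line int // 1-based line on which the token starts
	Col  int // 1-based byte column within that line
}

// lexWords splits src into whitespace-separated tokens, recording the
// line and column at which each token starts.
func lexWords(src string) []Token {
	var toks []Token
	line, col := 1, 1
	start, startLine, startCol := -1, 0, 0
	for i := 0; i < len(src); i++ {
		c := src[i]
		isSpace := c == ' ' || c == '\t' || c == '\n'
		if !isSpace && start < 0 {
			// A new token begins here; remember where.
			start, startLine, startCol = i, line, col
		}
		if isSpace && start >= 0 {
			toks = append(toks, Token{src[start:i], startLine, startCol})
			start = -1
		}
		// The lexer still has to keep counting lines and columns.
		if c == '\n' {
			line++
			col = 1
		} else {
			col++
		}
	}
	if start >= 0 {
		toks = append(toks, Token{src[start:], startLine, startCol})
	}
	return toks
}

func main() {
	for _, t := range lexWords("let x\n  = 1\n") {
		fmt.Printf("%d:%d %q\n", t.Line, t.Col, t.Text)
	}
}
```

With this shape, a parser that complains about a token can report `t.Line` directly, no matter how far ahead the lexer has read by then.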

Somewhat recently I wound up writing a lexer in Go as part of a project, and I modeled it after Rob Pike's presentation on lexing in Go. Pike's lexer uses an interesting low-rent trick for handling line numbers, although it's one that's only suitable for use with a little language. Pike's lexer is given the entire source code to lex at once, so rather than explicitly tracking line numbers it just tracks the absolute character position in the source code (which it needs anyways) and includes this absolute character position as part of the tokens. If you turn out to need the line number, you call back to the lexer with the character position and the lexer counts how many newlines there are between the start of the source and the position.

Ever since I saw it this has struck me as a really clever approach if you can get away with it. Not only is it really easy to implement, but it's optimized for the common case of not needing the line number at all because you're parsing something without errors. Now that I've run into it, I'll probably reuse it in all future little language lexers.

Note that this isn't a general approach for several reasons. First, serious lexers are generally stream lexers that don't first read all of the source code into memory. Second, many languages routinely want line number information for things like profiling, debugging, and exception traces (and all of these uses are well after lexing has finished). That's why I say Pike's approach here is best for little languages, where it's common to read all of the source in at once for simplicity and you generally don't have those issues.

(If I was dealing with a 'bigger' language, I think that today I would take the approach of returning the line number as part of every token. It bulks up the returned token a bit but having the line number information directly in the token makes your life simpler in the long run, as I found out from the Go parser I wrote.)

programming/LexerLineNumbersTrick written at 00:41:52


