Wandering Thoughts archives

2010-08-22

Another reason to hate $LANG and locales on Unix

Sometimes I'm slow; only recently did it occur to me how the $LANG sort misfeature and GNU comm's misfeature combine in an orgy of annoyance in a heterogeneous environment.

Suppose that you have systems that changed their default locale between operating system versions. As part of routine processing, you use comm to get the difference between something on the local system and a global list. Well, oops. Even if you carefully use sort on both versions, you are going to have problems.

As we saw earlier, the choice of locale may change the sort order. While GNU comm is locale aware in just the same way as sort, it is not aware of multiple locales; it assumes that all files are sorted in the current locale (and these days it actively requires it). So your global file, although sorted, may not be sorted in the current system's locale, which will cause comm both to complain and to fail.

(You get the same effect if you generate different global files on different machines and then try to process them together.)
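One way to find out whether an already-sorted global file is usable on a given machine is to ask sort itself, since `sort -c` checks the order under whatever locale is currently in effect. A minimal sketch; the function name `sorted_here` is made up for illustration:

```shell
# Hypothetical helper: succeeds only if the file is sorted according to
# the locale currently in effect (LC_ALL, LC_COLLATE or LANG).
sorted_here() {
    # sort -c only checks; it prints a diagnostic and exits non-zero
    # at the first out-of-order line.
    sort -c "$1"
}
```

A script could call `sorted_here` on the global file first and fall back to re-sorting a private copy when the check fails.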

Effectively this means that there is no such thing as a globally visible file that is properly sorted, because what counts as 'properly sorted' differs from machine to machine. Instead you probably want to sort all files on the local machine, which means making copies of the global ones. Ideally you want to do this right before using them, because the locale may differ between various environments even on a single machine; it is simply safer to sort files in the script immediately before feeding them to comm, so you know that sort and comm both ran in the same locale.

(Offhand, there are at least four plausible environments where system scripts might run with a different locale: from init.d scripts at boot time, from crontab entries, from an interactive login, and from an automated ssh command invocation that passes along the other machine's locale.)
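The "sort right before comm" approach might look roughly like the sketch below. It goes one step further than the text and pins `LC_ALL=C` for both commands, which is an assumption on my part about what you want; leaving the locale alone also works as long as sort and comm see the same one. The function name `only_local` is hypothetical.

```shell
# Hypothetical helper: print the lines that appear only in the local file.
# Both inputs are re-sorted here, under the same pinned locale that comm
# runs in, so it does not matter how (or where) they were sorted before.
only_local() {
    local_list=$1
    global_list=$2
    tmpdir=$(mktemp -d) || return 1

    # Assumption: byte-order (C locale) sorting is acceptable for this data.
    LC_ALL=C sort "$local_list"  > "$tmpdir/local.sorted"
    LC_ALL=C sort "$global_list" > "$tmpdir/global.sorted"

    # -2 suppresses lines only in the global file, -3 suppresses common lines.
    LC_ALL=C comm -23 "$tmpdir/local.sorted" "$tmpdir/global.sorted"
    status=$?

    rm -rf "$tmpdir"
    return $status
}
```

The cost is the temporary copies, which is exactly the price described above for not trusting any shared file's sort order.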

LANGHateII written at 00:17:56

2010-08-14

What I want in a caching nameserver

What the world needs is a good caching nameserver. What brought this on is that I am currently flirting with yet another caching nameserver, which is something that I do from time to time because every caching nameserver I've ever found sucks in its own way. This is actually somewhat surprising to me, because at one level the job is not all that difficult so you'd think that someone would have written a sane implementation by now.

(Possibly the DNS system actually is sufficiently difficult that it drives every implementer insane. Sadly I can believe it; DNS is both baroque and peculiar, and I'm sure there are lots of dark corners.)

What I want in a caching nameserver, beyond 'works', is:

  1. it can forward queries for some zone(s) off to other recursive (caching) nameservers, as recursive queries.
  2. it can send queries for some zone(s) directly to primary nameservers, as non-recursive queries.

  3. it has a sane and small configuration system. I am not interested in anything that requires a SQL server, for example.
  4. it has a small memory footprint.

The first and second give you different ways of splicing in local zones so that you can resolve private internal names and you can still resolve things in your own organization even when your Internet link is down. I need both; sometimes I want to do a recursive query to another caching nameserver that handles all the details, and sometimes I want to talk directly to a primary nameserver that will laugh at me if I send it DNS queries that are marked as 'recursion allowed'.

DJ Bernstein's dnscache is the usual recommendation but it falls down on the first issue (and arguably on the second one as well, depending on how you interpret what it should do if it gets NSes back); it's what I normally use (because years ago I got horribly offended at Bind's memory usage). My current flirtation is with unbound, which has both recursive and non-recursive forwarding, mostly has a sane configuration system, and unfortunately falls down on memory usage even more spectacularly than Bind did.
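For illustration, unbound expresses the two kinds of forwarding as `forward-zone` and `stub-zone` clauses. This is a rough sketch of an unbound.conf fragment, not a tested configuration; the zone names and addresses are placeholders I made up.

```
# Placeholder zone names and addresses, for illustration only.

# Recursive forwarding: hand queries for this zone to another
# recursive (caching) nameserver.
forward-zone:
    name: "internal.example.com."
    forward-addr: 192.0.2.53

# Non-recursive: ask a primary nameserver for this zone directly.
stub-zone:
    name: "corp.example.org."
    stub-addr: 192.0.2.10
```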

(Looking at the package list in Fedora 13 suggests that there are a lot more potential nameservers than I thought. This list covers a lot, but the only likely candidates are MaraDNS and PowerDNS's caching server.)

CachingNameserverDesire written at 02:15:44

2010-08-13

PPP over ssh: solving problems with indirection

There is an old aphorism in Computer Science that any problem can be solved by another level of indirection (Wikipedia credits it to David Wheeler). Today I have an illustration of this.

I mentioned that I was having problems with USB serial ports on the machine that I just upgraded to Fedora 13. Specifically, starting up PPP on such a port would hang the pppd process (and then any other process that touched the serial port). Since my DSL link is still down, this is a problem. It turns out that the specific issue is that on my machine, trying to switch to the PPP line discipline on a USB serial port hangs the process; I suspect locking issues between the kernel's TTY layer and USB layer. Reasonably, the PPP daemon switches its tty to the PPP line discipline pretty much the moment it starts, and there goes my connection attempt (and my dialup connection).

(I suspect that this happens on all SMP x86_64 Linux machines with a recent enough kernel, and possibly all SMP machines. It doesn't happen on a uniprocessor x86 machine. Interested parties can see the Fedora bug report.)

This is not a general bug with the TTY layer's handling of the PPP line discipline, or it would have been noticed well before now. In particular, you can switch to the PPP line discipline on a pseudo-tty without problems.

(I wound up testing this sort of by accident. My PPP account has a somewhat weird setup, and the most convenient way to test that it had survived the Fedora 13 upgrade was to just ssh in to it in a terminal window and see if it spewed a PPP connection initiation at me or printed errors. After the upgrade I tried this and had it work, and I thought nothing of it until later.)

So I solved my problem with indirection; I arranged to run pppd on a pty instead of on the serial port itself, transparently passing all IO back and forth between the serial port and the pty. Something has to shuttle that IO, and the simplest program for the job is ssh in its transparent mode. So now my connect script doesn't directly log in to my PPP account; instead it logs in to a regular account and immediately does an 'ssh -e none pppme@localhost'.

(I could have written a program to do this without the SSH overhead, but ssh has the great virtue of already existing and working and this is an expedient hack that I sincerely hope is not living on for too long.)

Sidebar: the logical extension of this hack

Suppose that you have two machines; one machine with a dialin modem and a regular account but no PPP setup, and another machine where you can actually run PPP but you don't have a (working) dialin modem. We can solve the problem of getting a PPP link up in this situation in exactly the same way; since we're using ssh, we can perfectly well ssh off to another machine entirely instead of localhost. It may not have great latency and performance, but as the wise man once observed, working at all is better performance than not working.
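Both the localhost version and this two-machine version come down to the same relay command with a different target. A small sketch of what the regular account might run after login; the function name and the `PPP_RELAY_HOST`, `PPP_RELAY_USER` and `DRY_RUN` variables are all hypothetical, and only the `ssh -e none pppme@localhost` form comes from the setup described above.

```shell
# Hypothetical relay step, run from the regular account right after login.
# PPP_RELAY_HOST, PPP_RELAY_USER and DRY_RUN are made-up knobs.
ppp_relay() {
    host=${PPP_RELAY_HOST:-localhost}
    user=${PPP_RELAY_USER:-pppme}

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Just show what would be run.
        echo "ssh -e none $user@$host"
    else
        # -e none disables ssh's escape character, so PPP's binary
        # traffic passes through untouched. exec replaces the shell, so
        # the login session ends when the ssh session does.
        exec ssh -e none "$user@$host"
    fi
}
```

Setting `PPP_RELAY_HOST` to the machine that can actually run PPP gives the two-machine variant.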

(Extensions to the situation where the first machine can't directly talk to the second machine are left as an exercise for the reader.)

SshPPP written at 01:09:53

