Wandering Thoughts archives

2006-01-27

Pointer: The Unix-Haters Handbook

In the spirit of yesterday's entry on a Unix annoyance:

If you're fond of Unix, and especially if you're uncritically fond of Unix, it's good to read the Unix-Haters Handbook (originally published in dead trees format, but now available online); another website on it is Don Hopkins' here, which has bonus bits and pieces.

I don't agree with the Handbook; my position is probably closest to Dennis Ritchie's marvelous anti-foreword, which really summarizes things quite nicely. But it's good for Unix enthusiasts to admit that Unix does have feet of clay.

(Besides, the Unix-Haters Handbook has lots of funny quotes, decent rants, and well done vituperation. How can you go wrong?)

I find the Handbook interesting partly because it's a lingering artifact of the clash of computing cultures in the 1980s, where Unix was ultimately victorious. Another artifact of this clash is Richard Gabriel's often-quoted The Rise of "Worse is Better", which is also well worth your time to read.

Note that "Worse is Better" was just part of Gabriel's Lisp: Good News, Bad News, How to Win Big, which is another of those seminal papers, although less interesting for non-Lisp people. Gabriel has a history of the spread of "Worse is Better" here for the interested, along with a lot of other writings on his web site.

UnixHatersHandbook written at 01:47:33

2006-01-15

How not to set up your DNS (part 8)

This is one of those amusingly creative mistakes to see in action:

  • microsoftglobal.com lists as nameservers ns1.one-dom4.com and ns2.one-dom4.com.
  • both respond with errors if they are sent queries that allow recursion.
  • sent queries marked non-recursive, both answer all DNS queries for the domain with no actual data, but with an 'additional authority' section that says they're the nameservers for the domain.

Nameservers normally answer a query for a domain they don't serve with a referral to a higher zone, such as 'com.' or '.', the root zone. That the one-dom4.com nameservers are answering queries with referrals to themselves means that in some sense they believe they handle the domain; it's just that they don't actually have any data for it.

Returning explicit errors for recursive queries is also unusual nameserver behavior; normally, a nameserver that disallows recursion on queries effectively strips the 'recursion allowed' bit off before it processes things, so you get referrals to higher level zones.
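To make the 'recursion allowed' distinction concrete, here is a minimal sketch of building a DNS query packet by hand with and without the RD ('recursion desired') bit set in the header flags. The function name build_query and the query ID are my own illustrative choices, and nothing here sends anything over the network; it only shows which bit differs between the two kinds of query.

```python
import struct

def build_query(name, qtype=1, recursion_desired=True, query_id=0x1234):
    """Build a raw DNS query packet (hypothetical helper, for illustration).

    qtype 1 is an A query. The RD flag is the 0x0100 bit of the 16-bit
    flags word in the DNS header.
    """
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", qtype, 1)

recursive = build_query("microsoftglobal.com", recursion_desired=True)
non_recursive = build_query("microsoftglobal.com", recursion_desired=False)

# The two packets differ only in the flags word (bytes 2 and 3).
print(recursive[2:4].hex(), non_recursive[2:4].hex())
```

With dig, the non-recursive form corresponds to adding the +norecurse option to the query.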

(Mind you, judging from their WHOIS information we may not be missing much by not being able to accept email from 'microsoftpromo@microsoftglobal.com'.)

HowNotToDoDNSVIII written at 11:43:43

2006-01-13

An unconventional reason for large RAID stripe sizes

Here's a surprising reason to have a really large RAID stripe size: to reduce per-spindle IO loads for random IO.

How and why this works is going to take some explanation. First, modern disks read and write very fast; what costs time is seeks. Random IO means lots of seeks. In random IO on a striped RAID, this means that any time you have to touch a new disk you pay an additional seek cost. And you switch disks every time you cross a stripe boundary.

It's easy to see how large IOs benefit from larger stripe sizes. But small IOs also get hit by stripe crossings, because their start offset within stripes is random (sometimes aligned to block boundaries), and if they start too close to the end of the stripe they spill over to the next stripe. The larger the stripe size, the lower the chance that the IOs hit the unlucky jackpot.

For example, if you average 8K per IO, always aligned on 4K boundaries, and have a stripe size of 64K, you have a one in sixteen chance of a random IO starting at 60K into the stripe and spilling into the next stripe. If you jump to a 256K stripe size, this drops to one in 64.

Worse, your filesystem may not start on an aligned block in a stripe (eg, your filesystem might start 30K into the above RAID), because of partitioning overhead and so on. This raises the spillover chances and means even the smallest filesystem IO can hit two stripes.
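The arithmetic here can be checked with a small sketch. It assumes IO start offsets are uniformly distributed over the aligned positions within a stripe, which is my simplification; the function name spill_probability and its parameters are invented for illustration.

```python
from fractions import Fraction

K = 1024

def spill_probability(io_size, alignment, stripe_size, fs_offset=0):
    """Fraction of aligned start positions where an IO crosses a stripe boundary.

    fs_offset is how far into a stripe the filesystem starts (0 = aligned).
    Assumes every aligned start position is equally likely.
    """
    starts = range(0, stripe_size, alignment)
    crossing = sum(
        1 for s in starts
        if (s + fs_offset) % stripe_size + io_size > stripe_size
    )
    return Fraction(crossing, len(starts))

# 8K IOs on 4K boundaries, filesystem aligned with the stripes
print(spill_probability(8 * K, 4 * K, 64 * K))    # 1/16
print(spill_probability(8 * K, 4 * K, 256 * K))   # 1/64

# Same IOs, but the filesystem starts 30K into a 64K stripe
print(spill_probability(8 * K, 4 * K, 64 * K, fs_offset=30 * K))  # 1/8

# A single 4K IO never spills when aligned, but can when offset by 30K
print(spill_probability(4 * K, 4 * K, 64 * K))                    # 0
print(spill_probability(4 * K, 4 * K, 64 * K, fs_offset=30 * K))  # 1/16
```

Under these assumptions the 30K misalignment doubles the spillover chance for 8K IOs, and makes even 4K IOs cross stripes one time in sixteen.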

(Per-spindle IO loads matter because a disk can only do so many random IO operations per second. Thus, the more IO operations that only hit a single disk, the more total IOPs you can do, either increasing the load you can handle or reducing how many disks you need for a given OS-level load.)

(Possibly this is well known in the industry, but it certainly surprised us when we hit it last year.)

WhyLargeStripeSizes written at 00:27:22

2006-01-12

How not to set up your DNS (part 7)

Presented in point form, because the illustrated form is too verbose:

  • The subdomain bos.netsolhost.com has nameservers NS1.bos.netsolhost.com and NS2.bos.netsolhost.com.
  • according to the nameservers for netsolhost.com, these have IP addresses 205.178.146.11 and 205.178.146.12 respectively.
  • according to 205.178.146.11, these actually have IP addresses 10.49.34.11 and 10.49.34.12 respectively.
  • 205.178.146.12 doesn't respond.

The 10.*.*.* IP addresses are RFC 1918 private addresses, so no one outside netsolhost.com can get to them. The net effect is that the first query for something in bos.netsolhost.com will return useful information but everything after that fails, because when 205.178.146.11 answers your first query it also feeds you the bad nameserver IP addresses and 'poisons' your nameserver cache.
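As a rough illustration of how one might spot this sort of leak, here is a sketch that flags nameserver addresses falling inside the RFC 1918 ranges. The helper names is_rfc1918 and unreachable_nameservers are made up for this example; the addresses are the ones listed above.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918
RFC1918_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_rfc1918(addr):
    """True if addr is inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETS)

def unreachable_nameservers(records):
    """Return the name -> address entries that outsiders cannot reach."""
    return {name: addr for name, addr in records.items() if is_rfc1918(addr)}

# What the netsolhost.com nameservers say
from_parent = {
    "NS1.bos.netsolhost.com": "205.178.146.11",
    "NS2.bos.netsolhost.com": "205.178.146.12",
}
# What 205.178.146.11 itself says
from_child = {
    "NS1.bos.netsolhost.com": "10.49.34.11",
    "NS2.bos.netsolhost.com": "10.49.34.12",
}

print(unreachable_nameservers(from_parent))  # {}
print(unreachable_nameservers(from_child))   # both entries
```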

I've seen all the elements of this one separately, but this is the first time I've seen glue record hell and leaking internal domains with internal-only IP addresses combined so creatively.

We noticed this because 205.178.145.65 (allegedly 'vux16.bos.netsolhost.com') kept trying to send us email with the MAIL FROM of '627834.640381@vux16.bos.netsolhost.com'. In the process of verifying incoming mail, we want to do A and MX queries; as the first query, the A query worked, but the MX query got timeouts. When I noticed the repeated '454 temporarily unresolvable address' replies for something that was at least partially resolvable (because we accepted it as a HELO name) I started digging.

HowNotToDoDNSVII written at 11:04:00

On not logging things

One of the machines I help look after is an open, read-only Usenet server (details here; the open access is a 'because we can' thing). You might be surprised to know that the server logs only minimal information about NNTP sessions, and this is a deliberate action that required hacking the software a bit.

When I was setting up the machine and planning open access for everyone on campus, I found myself thinking about how I'd feel if, for example, some day a department head came to us to say 'I want to know what newsgroups people in my department are reading'. Much like library borrowing records, what newsgroups people read can be embarrassing or damaging; did I really want to be in a position of turning over that information?

We could have drawn up a strong privacy policy. But privacy policies are subject to being overridden by higher powers, such as department heads who can get the ear of the Dean. What would I do then? In the end the best way to protect this information was to not collect it, because what you don't log you can't be required to produce later. (We could always be required to introduce logging, but this would take more than semi-casual curiosity backed by political power.)

As system administrators, we often default to logging everything we can that doesn't expose obvious security risks. But this opens up more subtle abuses and risks (especially as people seem much more willing to go snooping through computer logs than other records, perhaps because computer logs are seen as 'less private').

So I'd like to urge sysadmins to consider the merits of not logging things. Consider if you really need to know that piece of information, or whether you're hoovering it up just because.

(The issue is not new with computers; librarians have been dealing with this for a long time and take it quite seriously.)

NotLoggingThings written at 02:17:05

2006-01-06

The old nameserver glue record hell

A recent commentator on HowNotToDoDNSI has prompted me to write about the hell that glue records could be in the Internet's old days. To really cover the horror, I'll start with the ordinary glue record horror.

Glue records are additional A records that get returned by higher level nameservers when people ask for NS records to avoid recursion problems that would otherwise ensue when, eg, foo.com has www.foo.com as its nameserver. This can be exploited to optimize DNS lookups a bit, as the commentator mentioned (it's a neat hack). But this can also lead to 'glue record hell', where you renumber a machine but don't tell your parent nameservers about it, and people keep getting the old IP address from them. (The commentator bumped into this, and I hit it in HowNotToDoDNSVI.)

Most hard glue records these days are 'in-zone glue records', glue records for names inside the zone in question; these are the only ones where you really need the glue records. (Barring circular loops of nameservers, which are wrong anyways.)
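The in-zone test is simple enough to sketch. This is only an illustration of the idea, comparing names label by label; the function name is_in_zone is my own, and real nameserver software has its own rules for when it requires or accepts glue.

```python
def is_in_zone(nameserver, zone):
    """True if the nameserver's name is at or below the zone's name.

    Names are compared by whole labels, case-insensitively, so that
    'notfoo.com' is not mistaken for something inside 'foo.com'.
    """
    ns_labels = nameserver.rstrip(".").lower().split(".")
    zone_labels = zone.rstrip(".").lower().split(".")
    return (
        len(ns_labels) >= len(zone_labels)
        and ns_labels[-len(zone_labels):] == zone_labels
    )

# In-zone: glue is genuinely needed to find the nameserver at all
print(is_in_zone("www.foo.com", "foo.com"))                      # True
print(is_in_zone("NS1.bos.netsolhost.com", "bos.netsolhost.com"))  # True

# Out of zone: the nameserver can be looked up independently
print(is_in_zone("ns1.one-dom4.com", "microsoftglobal.com"))     # False
print(is_in_zone("ns.notfoo.com", "foo.com"))                    # False
```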

However, back in the old days, Network Solutions was promiscuous: all nameservers were included as glue records, whether in-zone or out of zone. NetSol also had no controls on what you could list as nameservers for your domain, so anyone could list one of your machines as one of their nameservers and force it to be included as glue data in the root zones. Of course, because it was their domain, only they could change this information.

Cue interesting explosions any time you tried to renumber, rename, or even remove from service a hostname (or IP address) that was listed as someone's nameserver. Anyone. Anywhere. I don't believe NetSol's whois service had a command to show what domains were using a particular 'host record' as one of their nameservers, just to make it more challenging.

And once you had figured out what domain was holding a reference to one of your hosts, you had to go hunt down the contacts for the domain, wake them up, and persuade them to change the data. Assuming that the contact for the host record wasn't itself borked.

(Thanks go to Clay (I think) for prompting me to write this.)

Sidebar: additional glue records

Many nameservers will give you additional glue records if they happen to have appropriate reputable information lying around. Eg, look at what the .net root servers return for 'dig ns example.net'; while those nameservers are out of example.net, they are in .net, so the .net root servers have good data for their IP addresses.

Also, glue records are returned for more than just NS queries. For this entry I'm ignoring all the other cases.

TheOldGlueRecordHell written at 02:45:26



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.