Wandering Thoughts archives

2009-08-30

Some more thinking about requirements in specifications

Aristotle Pagaltzis's comment on my previous entry prodded me into doing some more thinking about this, and his entry on RFC 2119 usage has persuaded me that there are actually four degrees of requirements that it's useful to put into specifications.

Put simply, I'd say that they are:

  • allowed: you can do this, and some people will (so be prepared to cope with it).
  • recommended: we think that you should do this.
  • should: not doing this causes problems for people and other programs, but we have to admit that they won't actually break if you don't do this.
  • must: your program won't work if you don't do this.
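One way to see the practical difference between the four levels is how a conformance checker might report a violation of each one. This is just a sketch; the level names, rules, and messages below are my own invention, not taken from any real tool:

```python
# A sketch of how a hypothetical conformance checker might treat the
# four requirement levels differently. Everything here is invented
# for illustration.
from enum import Enum

class Level(Enum):
    ALLOWED = "allowed"          # legal; just be prepared to see it
    RECOMMENDED = "recommended"  # we think you should do this
    SHOULD = "should"            # skipping it causes problems for others
    MUST = "must"                # the protocol breaks without it

def report(level, rule):
    """Classify a violation of 'rule' according to its level."""
    if level is Level.MUST:
        return f"ERROR: violates MUST: {rule}"
    if level is Level.SHOULD:
        return f"WARNING: violates should: {rule}"
    if level is Level.RECOMMENDED:
        return f"NOTE: against recommendation: {rule}"
    return f"INFO: unusual but allowed: {rule}"

print(report(Level.MUST, "terminate lines with CRLF"))
print(report(Level.ALLOWED, "send optional extension headers"))
```

The point of the split is visible in the output: only the MUST is an outright error, while the others degrade gracefully into warnings and notes.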

The IETF MAY is the first, the IETF MUST is the fourth, and I think that the IETF SHOULD sort of covers both the second and the third. I've now been convinced that there is a real difference between the second and the third cases, and I think it's important to distinguish between them in the name of honesty (if we're honest, we have a better chance of getting people to follow shoulds because we aren't devaluing them; a should is a mark of something fairly important).

(Whether one should use the IETF SHOULD for merely recommended things is one of those debatable issues. The language of RFC 2119 appears to allow it, but there's an argument that a mere recommended practice should just be a MAY if you have to use only the three IETF levels.)

One can argue about recommended versus allowed, but pragmatically I think that it's important to differentiate between what's allowed and what's preferred. I want specifications to be able to steer implementors towards making the right choice as well as warn them about legal things that other implementations will do.

(Possibly I am out in left field on all of this, though. I don't have much experience in trying to write specifications for anyone but myself.)

RequirementLevels written at 01:13:18

2009-08-29

MUST versus SHOULD in your specifications

Something I've been mulling over lately is the effects of A rule for Internet software on specifications, and how strong you make the requirements in them.

To put it one way: a 'MUST' in a specification makes a handy club to beat people with, but it is perhaps best reserved for things that are absolutely required to make the protocol work, because in practice that is what is going to happen anyways.

(Note that a MUST often does not make an effective club, because if it doesn't break things the people who got it wrong rarely care about it. This is the world of 'technical' violations of standards, as in 'technically we don't comply with ...'.)

Given that people screw things up, sooner or later someone is going to write software that disobeys every MUST in your specification (maybe not all at once, though). Some of these violations will cause their software to not work, and they will get fixed. Some of them will not, and generally those violations will then live on in the software. At this point, the only thing that those MUSTs in your specification are doing is giving people an excuse to lecture the author (and sometimes the users) of the software, and perhaps to feel smug when they do things that make it not work.

This gets stronger when you are talking about user-visible behavior, because sooner or later people will disagree with you about the right thing to do (sometimes they will even be right in practice). At this point, all of these MUSTs are pointless, and should be SHOULDs all the way through.

(If you have a 'how things are presented to users' MUST that really is a MUST, for example because it's necessary to preserve security properties, you have a serious problem, because sooner or later that MUST will be violated.)

I don't particularly like this conclusion; in some ways I'm a fairly strong protocol purist (which makes me an asshole). But I think that it's inescapable in the real world, and certainly I think this is how it's played out repeatedly in practice with things like SMTP.

Note that I'm talking about, essentially, protocol specifications here, ones that involve multiple computer parties that actively interact with each other. ESMTP is a protocol specification, but HTML is not. I'm not sure what MUSTs in HTML-like specifications do in practice; possibly their only real use is making it so that people can't trivially claim compliance with your specification.

(All of this is similar to the IETF usage of MUST and SHOULD, but not quite the same. I think that the IETF would have people use MUST in places that I would wince, hold my nose, and suggest SHOULD.)

SpecMustVsShould written at 00:05:04

2009-08-15

SSDs and the RAID resync problem

By now, most people know that SSDs have a problem with long-term use: their write speed degrades (sometimes dramatically) once enough data in total has been written to the SSD, even if the filesystem has lots of free space left (because you've written files and then deleted them). The way around this is for OSes to use the TRIM command to tell SSDs which parts of the device aren't used by the filesystem.

It recently struck me that this has interesting implications for part of the RAID resync problem. Support for TRIM isn't restricted to SSDs, so as OS support for TRIM gets more and more widespread (and better), block-level RAID devices can get better and better at knowing what blocks are actually in use and what blocks are free. This has all sorts of uses, the most obvious being that you don't have to copy free blocks when resyncing a RAID array.

(This would also help the really smart enterprise RAID systems, which do things like create sparse arrays and then only allocate space as the array gets written to. TRIM support would let them deallocate unused space from these sparse arrays and thus probably make a bunch of storage admins rather happy.)
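A toy model makes the resync win concrete. Assume the RAID layer keeps track of which blocks the filesystem has TRIMmed (the block counts and the set-based bitmap here are invented for illustration; a real implementation would track this at the device level):

```python
# Toy model of a RAID resync that skips blocks the filesystem has
# reported as free via TRIM. All numbers here are made up.

TOTAL_BLOCKS = 1000

def resync(trimmed):
    """Copy every block still in use; skip trimmed (free) ones.
    Returns how many blocks actually had to be copied."""
    copied = 0
    for block in range(TOTAL_BLOCKS):
        if block in trimmed:
            continue  # free block: nothing to reconstruct
        copied += 1   # stand-in for copying the block to the new disk
    return copied

# If the OS has told the array that 80% of the device is free,
# a resync only has to touch the remaining 20% of blocks.
trimmed = set(range(200, 1000))
print(resync(trimmed))  # 200 blocks copied instead of 1000
```

Without TRIM information, the array has no choice but to treat all 1000 blocks as live and copy every one of them.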

This doesn't get 'dumb' block-level RAID quite up to the level of ZFS, but it gets them much closer.

(However, all of this probably won't happen very fast. Hardware RAID changes only slowly and hardware RAID vendors may wait to support TRIM until it's fairly widely supported in popular server OSes, so we could be looking at several years of delay.)

TrimAndRaidResync written at 01:26:04

2009-08-06

A rule for Internet software

Here is one of the general rules for Internet software:

You cannot count on other people not screwing up.

(You can barely count on you not screwing up.)

Any time you talk to the Internet, you have to assume that someday, the other end will do something that it isn't supposed to by the specification. People are endlessly creative, which means that sooner or later someone will get pretty much anything wrong that it's possible to get wrong; for example, they will use permanent HTTP redirects for temporary situations.

(And of course you can misread the specification too, or otherwise do something wrong. We are all morons sometimes.)

Internet software that counts on the other end always getting it right is not robust software. Conversely, part of writing robust Internet software is asking yourself how the other end could get something wrong, what would happen to your software, and how you can make it still do what the user would like and expect. Features should be designed with this in mind, and you may find that they require more complicated implementations than you expected; the classical example of this is parsing syndication feeds, which in theory are perfectly formed XML.
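In code, this mindset looks like trying the strict interpretation first and then degrading gracefully instead of failing outright. A minimal sketch, using HTTP's Last-Modified header as the example (the header values below are invented):

```python
# Defensive handling of input from the other end of the network:
# attempt the strict parse, but don't let the other side's mistake
# crash your program.
from email.utils import parsedate_to_datetime

def parse_last_modified(value):
    """Parse an HTTP Last-Modified header, tolerating garbage.
    Returns a datetime, or None if the other end sent nonsense."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # don't blow up just because they got it wrong

good = parse_last_modified("Sun, 06 Nov 1994 08:49:37 GMT")
bad = parse_last_modified("last week sometime")
print(good.year, bad)
```

The fallback here is deliberately boring: returning None lets the caller decide what "the user would like and expect" means in context (perhaps treating the resource as always modified), rather than aborting the whole operation.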

(As a direct corollary, this applies to specifications too; any 'you MUST do alarming thing X if the other end says Y' language is at least potentially dangerous. Consider the merits of using SHOULD instead, especially since smart implementors are going to ignore your MUST anyways.)

CountOnScrewups written at 01:23:48

