FastCGI's encoding mistakes and how not to design a wire protocol
March 5, 2011
One of the things that FastCGI needs to transfer between the web server and the FastCGI server is a set of name/value pairs; these are used, for example, to pass the traditional CGI environment variables for a request to the FastCGI server. FastCGI being FastCGI, it has defined a binary record for this; each name/value pair is encoded as the length of the name, the length of the value, and then the name and the value as bytes (with no terminating zero).
In order to reduce the overhead for short names and values, FastCGI allows each length to be encoded in either a short one-byte form or a four-byte form. To tell the two apart, the protocol uses the high bit of the first byte of each length (which, in the four-byte form, is the high byte). If the bit is set, it is a four-byte length, and the high bit is masked out when the length is computed; if the bit is clear, it is a one-byte length. This encoding saves up to six bytes per name/value pair, reducing the protocol overhead from eight bytes to two, and, as the specification is careful to note, it allows a FastCGI application to immediately allocate a correctly sized buffer for the pair. (All of this is in the FastCGI specification, section 3.4.)
It is genuinely hard to count all of the terrible mistakes in this wire encoding.
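To make the encoding described above concrete, here is a minimal Python sketch of it. This is an illustration written from the description in this entry, not code from any FastCGI implementation; the function names (`encode_length`, `decode_length`, `encode_pair`, `decode_pair`) are made up for the example.

```python
# A sketch of the FastCGI name/value pair encoding as described above.
# All function names here are hypothetical, chosen for illustration.

def encode_length(n):
    """Encode a length in the one-byte or the four-byte form."""
    if n < 0 or n > 0x7FFFFFFF:
        raise ValueError("length does not fit in 31 bits")
    if n < 0x80:
        # One-byte form: the high bit is clear.
        return bytes([n])
    # Four-byte form, high byte first, with the high bit set as a marker.
    return (n | 0x80000000).to_bytes(4, "big")


def decode_length(buf, pos):
    """Decode one length at buf[pos]; return (length, new position)."""
    if pos >= len(buf):
        raise ValueError("truncated length")
    first = buf[pos]
    if first & 0x80 == 0:
        # High bit clear: this single byte is the whole length.
        return first, pos + 1
    if pos + 4 > len(buf):
        raise ValueError("truncated four-byte length")
    # High bit set: read four bytes and mask the marker bit back out.
    n = int.from_bytes(buf[pos:pos + 4], "big") & 0x7FFFFFFF
    return n, pos + 4


def encode_pair(name, value):
    """Encode one pair: name length, value length, name, value."""
    return encode_length(len(name)) + encode_length(len(value)) + name + value


def decode_pair(buf, pos=0):
    """Decode one pair at buf[pos]; return (name, value, new position)."""
    name_len, pos = decode_length(buf, pos)
    value_len, pos = decode_length(buf, pos)
    end = pos + name_len + value_len
    if end > len(buf):
        raise ValueError("truncated name/value data")
    name = buf[pos:pos + name_len]
    value = buf[pos + name_len:end]
    return name, value, end
```

Note that the decoder cannot know how many bytes a length occupies until it has examined the first byte of that length, so it has to branch twice before it even knows where the name starts.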
Essentially, everything that FastCGI has done here is a micro-optimization that misses the forest for the trees. Not only is it optimizing things that don't matter, it is the protocol equivalent of carefully polishing your bubble sort routine when you ought to be using quicksort. If you want to optimize a wire protocol in this way, you must take a whole-stack view; you must look at the flow of data through the whole decoding process (from the point where you get some bytes from the operating system on up), not just at one small portion of it, and you must look at the protocol overhead of an entire transaction (taking all layers into account). Of course, all of that is a lot less showy than coming up with a complex encoding scheme for lengths that saves six bytes.
This is part of CSpace, and is written by Chris Siebenmann.