Wandering Thoughts archives

2009-03-26

An important difference

As a note to myself, there is an important difference between

; timeio 10G </dev/zero >somefile
; timeio <somefile

and what I actually intended to do for the second command,

; timeio <somefile >/dev/null

Forgetting this difference has several times left me wondering why disk IO read rates on test systems seemed to have dropped by several orders of magnitude from what they should be for no apparent reason. (Since writing null bytes to your screen is a great way to burn CPU and possibly network bandwidth while doing nothing visible at all.)

(Why I have my own command to measure bulk IO performance is a topic for another entry.)

unix/ImportantNullsDifference written at 16:56:33; Add Comment

The git version control system as a creation of the modern age

It has recently struck me that git (the distributed version control system that Linus Torvalds created) is very much a creation of the modern age. By that I don't mean that it was created recently or by modern, Internet-enabled methods; I mean that it only makes sense in the modern era of really cheap, really large disk space.

Earlier version control systems were created when disk space was much more expensive and not all that large relative to your source code. In that era, wasting disk space was a sin and a serious issue, so the version control systems spent a lot of effort (and gave up a lot of speed) in order to have efficient storage of closely related versions of the same file, using interleaved fragments (SCCS) or reverse deltas (RCS) or the like. (Then people agonized about how storing binary files made these schemes blow up, and invented binary delta formats, and so on.)

In the modern era, disks are huge relative to your source base and all of those complex compression schemes are premature optimizations and more or less pointless. So git started out by throwing the whole issue out the window and just taking the brute force approach of storing full files. Change one line? You stored another full copy of the file. In the modern age, the extra disk space didn't really matter, not compared to the simplifications that you gained from this approach. In the old era that would have been unthinkable, and the lack of 'efficient storage' would have gotten git laughed out of the room.

(I suspect that chasing efficient storage had other effects on the overall design of old version control systems. For example, is the obsession with explicitly tracking file renames partly driven by the storage inefficiency of not doing so?)

tech/ModernAgeGit written at 01:05:47; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.