links: Chronological entries

These are my WanderingThoughts

This is part of CSpace, and is written by ChrisSiebenmann.
Twitter: @thatcks

Link: A Short History Of Removable Media Behind The Iron Curtain

Pete Zaitcev's A Short History Of Removable Media Behind The Iron Curtain is a fascinating look into the history of (re)movable hard drives in the USSR. Apparently these were far more common there than they were in the west, to the point where it was routine to do this with what the west thought of as fixed hard drives. As a bonus it also includes some information on the early history of byte order independence in Linux filesystems, which Pete Zaitcev was there for.

(I know just enough about the history of computing behind the Iron Curtain to know that it was fairly different from the history in the west. My impression is that there was a fair amount of fascinating hacks and improvisations, and presumably some amount of really impressive original work.)

Link: Against DNSSEC by Thomas Ptacek

Against DNSSEC by Thomas Ptacek (@tqbf) is what it says in the title: lucid and, to my mind, strong reasons against using or supporting DNSSEC. I've heard some of these from @tqbf before in Tweets (and others are ambient knowledge in the right communities), but now that he's written this I don't have to try to dig those tweets out and make a coherent entry out of them.

For what it's worth, from my less informed perspective I agree with all of this. It would be nice if DNSSEC could bootstrap a system to get us out of the TLS CA racket but I've become persuaded (partly by @tqbf) that this is not viable and the cure is at least as bad as the disease. See eg this Twitter conversation.

(You may know of Thomas Ptacek from the days when he was at Matasano Security, where he was the author of such classics as If You're Typing the Letters A-E-S Into Your Code You're Doing It Wrong. See also eg his Hacker News profile.)

Update: there's a Hacker News discussion of this with additional arguments and more commentary from Thomas Ptacek here.

Link: My current dmenu changes

As I've mentioned before, I have a set of changes I've made to dmenu to make it work better for me. I have now put my current patch online as dmenu-4.5-tip.patch in case anyone is interested. I happen to like all of my changes, but then I would. See the start of the patch for a description of what it includes (and then the documentation for the new switches in the revised dmenu.1 manpage).

I expect that I'll update this patch periodically as the main dmenu source itself gets updated, but so far the latter doesn't seem to change very often.

PS: to save the energy of anyone asking: while my patch set contains a bugfix for dmenu's handling of -m (and the manpage), I don't currently feel like breaking it out as a separate patch and then trying to send it upstream. It's too much work for too little chance of success.

Update, August 4th 2015: The patchset linked above is now out of date, per here. My dmenu changes are now in my github repo for my version, split up into multiple commits that you can cherry-pick as desired.

Link: Go at Google: Language Design in the Service of Software Engineering

Go at Google: Language Design in the Service of Software Engineering is an article version of a Rob Pike keynote on, well, let me just quote:

The Go programming language was conceived in late 2007 as an answer to some of the problems we were seeing developing software infrastructure at Google. [...]

Go was designed and developed to make working in this environment more productive. [...]

The article then discusses what this means and how various aspects of Go's design were consciously shaped by a number of pragmatic software engineering issues in building large software across large(r) teams. I find it really interesting reading (and I keep referring to it and having to re-find it, so it's clearly time to put this somewhere more obvious).

Link: Getting Real About Distributed System Reliability

Jay Kreps' Getting Real About Distributed System Reliability is a very interesting discussion of the reliability of distributed systems in the real world. He patiently explains that a number of assumptions normally made to reason about this are in fact wrong in practice, especially the assumption that failures are independent. I'm not going to try to summarize his entry beyond that; go read it instead.
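To make the independence point concrete, here is a small worked sketch. The numbers and the single 'common cause' term are made-up assumptions for illustration, not figures or a model taken from Kreps' entry: each replica is assumed to be down 1% of the time, and some shared event (a bad config push, a common bug) is assumed to take out every replica at once 0.1% of the time.

```python
def all_fail_independent(p, replicas):
    """Chance that every replica is down if failures are independent."""
    return p ** replicas


def all_fail_with_common_cause(p, replicas, c):
    """Same, but a shared event with probability c downs all replicas at once.

    Otherwise (probability 1 - c) the replicas fail independently.
    """
    return c + (1 - c) * p ** replicas


# Hypothetical inputs, chosen only to show the size of the effect.
P_REPLICA_DOWN = 0.01
P_COMMON_CAUSE = 0.001
REPLICAS = 3

independent = all_fail_independent(P_REPLICA_DOWN, REPLICAS)
correlated = all_fail_with_common_cause(P_REPLICA_DOWN, REPLICAS, P_COMMON_CAUSE)
print(f"independent: {independent:.3e}")
print(f"with common cause: {correlated:.3e}")
print(f"ratio: {correlated / independent:.0f}x")
```

Under these assumptions the shared failure mode dominates the result, so adding more replicas barely helps; the independent-failure arithmetic is off by orders of magnitude.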

(I suspect that his logic extends to all real systems, not just distributed ones, and in any case he has given me a lot to think about.)

By the way, several of the links in his entry are themselves worth following and reading carefully.

(I believe I got this from my Twitter stream but I cannot find the original source now.)

Link: Filenames.WTF

In Filenames.WTF, Daniel Rutter runs down the reasons first why paying attention to file extensions is ridiculous, and then the reasons why it's still the best solution to the problem that we have. Spoiler: it's because people have spent decades creating file formats that suck.

(Via philliph on Twitter.)

Another Russ Cox regexp article: How Google Code Search Worked

Russ Cox has just added another article in his series on regular expressions; this one is titled Regular Expression Matching with a Trigram Index, or How Google Code Search Worked. It's as worthwhile as all of the previous three.
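For a rough feel of what a trigram index buys you, here is a toy sketch of the general idea. It is not Russ Cox's code or Google's implementation: the function names and the sample documents are hypothetical, and the caller hands in a literal string that any match must contain, where a real system would derive the trigram query from the regexp itself.

```python
from collections import defaultdict
import re


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for tri in trigrams(text):
            index[tri].add(name)
    return index


def candidates(index, docs, literal):
    """Documents that contain every trigram of a required literal string."""
    tris = trigrams(literal)
    if not tris:
        # Literal shorter than 3 characters: the index cannot narrow anything.
        return set(docs)
    return set.intersection(*(index.get(tri, set()) for tri in tris))


def search(index, docs, pattern, literal):
    """Run the real regexp only over the candidate documents."""
    regex = re.compile(pattern)
    return sorted(name for name in candidates(index, docs, literal)
                  if regex.search(docs[name]))


# Hypothetical sample corpus, purely for illustration.
DOCS = {
    "a.txt": "Google Code Search",
    "b.txt": "regular expression search",
    "c.txt": "nothing relevant here",
}
INDEX = build_index(DOCS)
print(search(INDEX, DOCS, r"[Ss]earch", "earch"))
```

The index only ever rules documents out; the regexp still has to be run over whatever survives, since sharing all the trigrams does not guarantee a match.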

Link: Russ Cox's articles on regular expressions

If you have any interest in regular expression matching, especially efficient regexps and understanding why Perl, Python, and so on have sometimes oddly slow implementations, you really want to read Russ Cox's series of articles on regular expressions.

The core things to read are his three part series, Regular Expression Matching Can Be Simple And Fast, Regular Expression Matching: the Virtual Machine Approach, and Regular Expression Matching in the Wild.

(I know, this is late, since Hacker News discussed this a couple of years ago (plus the comment here). The gears of my link-pointing machinery evidently grind very slowly, but better late than never.)

Link: Pollution in 1.0.0.0/8

IANA has recently allocated 1.0.0.0/8 to APNIC, which has caused a certain amount of concern that it is 'polluted' by people already using it for various reasons. Pollution in 1/8 is a report from RIPE Labs on what happened when they announced routing for some bits of it as part of their debogonising work.

This is clearly going to be what they call 'interesting'.

(via Hacker News.)

Link: Using colour well in data visualization

Why Should Engineers and Scientists Be Worried About Color? is about how straightforward use of colour in data visualizations can mislead you and hide information (and how to do better). Some of their examples are eye-opening and alarming.

(Via Hacker News.)

(Since I took up photography I've had a much increased interest in how we perceive things, including colour.)

The quote of the time interval, on XML

From an article by Henri Sivonen:

Draconian error handling creates an unstable equilibrium in Game Theory terms—it only lasts until one player breaks the rule. One non-Draconian XML5 implementation in a key client product and the Draconian XML ranks would break.

(Discovered through Mark Pilgrim, specifically his firehose.)

Applications to XHTML are left as an exercise for the reader.

Link: XML on the web summarized

The link of the time interval comes from the online comic strip Bonobo Conspiracy, which neatly summarizes the reality of XML on the web in today's strip.

Link: you are what you code

From Robert Brewer comes You are what you code, which has given me something to think about. I'll quote the opening:

Hey, you. Do you realize what you're writing? The long-standing IT joke is that you always end up coding your own job out of existence. But what are you coding yourself into?

(From Planet Python, where his blog is aggregated.)

Update: I apologize to my readers for putting a link here that doesn't work without an extra, annoying step (see the comment).

Update2: the situation has now been fixed.

QOTD: There are three types of authentication

There are three types of authentication:

They are:

  1. Something you've lost,
  2. Something you've forgotten, and
  3. Something you used to be.

The full entry includes an illustrative story and bonus comments (and, unfortunately, a certain amount of comment spam, at least right now).

(From Richard Johnson of river.com.)

Link: Why the ease of installing Java matters

In Java in The Land of Make Believe, Ryan Tomayko unloads a righteous rant about why Java's license matters and what effects it has in the Linux and *BSD worlds, with great bits like:

If you want to get on the bad side of software developers and system admins, the fastest route is to waste their time.

Amen. What he said.

(The good news is that Sun GPL'ing Java may finally be changing all of this mess, which Tomayko happily acknowledges.)

(From many places, but I saw it originally on Planet Python, as Tomayko's blog is syndicated there.)

Link: Peter Gutmann on PKI

Everything you never wanted to know about PKI but were forced to find out [PDF] by Peter Gutmann is a set of slides about just that: a pile of the warts and issues with PKI in general and the SSL model in specific. If you're interested in the whole field, his home page has links to enough additional papers to keep you reading for some time.

(From Chris Samuel, and that in turn from Russell Coker.)

Link: Threads Cannot be Implemented as a Library

I've already linked to this in passing, but I'm going to rerun it as an explicit link. Threads Cannot be Implemented as a Library by Hans Boehm makes the argument in its title:

We provide specific arguments that a pure library approach, in which the compiler is designed independently of threading issues, cannot guarantee correctness of the resulting code.

There is also a discussion of this paper at Lambda the Ultimate that may be interesting reading. On a quick skim of the LtU discussion thread, this Usenet article jumps out as a useful summary of the entire volatile and multiprocessor programming issue, ending up with the conclusion that using volatile is both unnecessary and harmful in shared-state concurrent programming.

Link: OpenBSD spamd

OpenBSD spamd - greylisting and beyond is a presentation by Bob Beck of the University of Alberta about OpenBSD's spamd system, how spammers react to it and similar systems, how you can exploit this, and the University of Alberta's experiences with spamd, complete with interesting numbers. I'm sadly jealous, as local feelings ensure that I'm not going to get to deploy this sort of technology any time soon.
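For anyone who hasn't run into greylisting, a minimal sketch of the basic idea follows. This is the generic technique, not how spamd itself is implemented: the class name, the two timing constants, and the in-memory storage are all assumptions made up for the example.

```python
import time

# Hypothetical tuning values, not spamd defaults.
MIN_RETRY_DELAY = 300        # seconds a sender must wait before retrying
ENTRY_LIFETIME = 4 * 3600    # seconds before an unconfirmed entry expires


class Greylist:
    """Temporarily reject mail from unknown (ip, sender, recipient) tuples."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.first_seen = {}   # tuple -> time of its first delivery attempt
        self.passed = set()    # tuples that have retried properly

    def check(self, ip, sender, recipient):
        """Return 'accept' or 'tempfail' for one delivery attempt."""
        key = (ip, sender, recipient)
        now = self.clock()
        if key in self.passed:
            return "accept"
        seen = self.first_seen.get(key)
        if seen is None or now - seen > ENTRY_LIFETIME:
            # First attempt (or the old entry expired): remember it and defer.
            self.first_seen[key] = now
            return "tempfail"
        if now - seen < MIN_RETRY_DELAY:
            # Retried too quickly: keep deferring.
            return "tempfail"
        # Retried after a sensible delay, as a real mail server would.
        del self.first_seen[key]
        self.passed.add(key)
        return "accept"
```

The bet is that legitimate mail servers queue and retry after a temporary failure, while much spam software never comes back.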

(Also from Richard Johnson of river.com.)

Link: varnish, an HTTP accelerator

Varnish - the http accelerator [PDF] is a set of slides for a presentation about an HTTP accelerator for dynamic websites/CMS systems (its website is here). Slide three made me laugh out loud, and I have to say 'what he said'.

(From Richard Johnson of river.com.)

Link: A lovely summary of the XHTML issue

On the WHATWG mailing list, Henri Sivonen put together a marvelous and concise summary of the whole problem with XHTML in today's world, and why XHTML advocates usually irritate me. I'm not going to quote anything; just read the whole thing here.

(From Sam Ruby, who linked to an Ian Hickson WHATWG mailing list message that quoted Henri Sivonen's message.)

Link: Serif vs. Sans Serif Legibility

Which Are More Legible: Serif or Sans Serif Typefaces? by Alex Poole is a review and summary of a whole pile of writing about this. I'll cut to the conclusion (don't worry, it's not a spoiler since Alex Poole puts it in the first paragraph of the introduction):

To date, no one has managed to provide a conclusive answer to this issue.

Alas, none of this helps me configure liferea to improve its legibility on a 17" CRT monitor. (I would be happy if I could just clone a Firefox setup, but this is stupidly difficult and/or impossible for no good reason.)

(From someone I chat with online, when I was asking around about this.)

Link: Golden Rules for Bad User Interfaces

Golden Rules for Bad User Interfaces is more or less what it sounds like. I could wish that the sarcasm was more biting, but that would probably be ungracious and besides, it's from SAP.

(From Greg Wilson.)

Link: On Bots

On Bots is a fascinating report on a large scale experiment to see how web search bots would explore an almost limitless set of linked pages. Whether or not the results generalize (or are still applicable), it's got a bunch of pretty pictures.

(From Tim Bray, rather belatedly.)

Link: Pumas on Hoverbikes

Pumas on Hoverbikes is about managing system administrators, and it's funny. Here's a somewhat out of context quote:

This is because managers are usually people who proved that they were handy with a chaingun and were thus rewarded by having their thumbs cut off and their weapons handed to some punk college hire.

From this you should be able to figure out if you want to read the rest. There are other articles too, like Suck Factor, which is where I first dropped into the site.

(From comments on this.)

Link: Unicode Spaces

Presented without comment, the 18 spaces of Unicode.
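If you want to see what your own system thinks, here is a small sketch that lists every character in Unicode's 'space separator' (Zs) general category using Python's standard unicodedata module. Treating Zs as the definition of a 'space' is an assumption of this example; the count it produces depends on the Unicode version your Python ships with and need not agree with the linked list.

```python
import sys
import unicodedata


def space_separators():
    """Return every code point whose general category is Zs."""
    return [chr(cp) for cp in range(sys.maxunicode + 1)
            if unicodedata.category(chr(cp)) == "Zs"]


spaces = space_separators()
for ch in spaces:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print(len(spaces), "space separators in Unicode", unicodedata.unidata_version)
```

Note that characters such as tab and newline are control characters rather than Zs, so they do not show up here even though most programs treat them as whitespace.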

(From Sam Ruby.)

Link: HTML Doctype declarations inventoried

Activating the Right Layout Mode Using the Doctype Declaration is pretty much summarized by its title. As a bonus, it includes a handy chart of how various browsers react to various specific DOCTYPE declarations (or did, as of when the chart was made; browser DOCTYPE behavior is an ever-changing target, and the chart is a good illustration of how much variety there can be).

(From Anne van Kesteren.)

Link: Warning Signs for Tomorrow

Warning Signs for Tomorrow is a collection of warning signs for tomorrow. Amusing and potentially useful ones (for me; your mileage may vary) include 'lack of internet connectivity', 'ubiquitous surveillance', and 'motivation hazard'.

(From James Nicoll.)

Link: IRON File Systems

IRON File Systems [PDF] is a paper from the 2005 ACM Symposium on Operating Systems Principles. To quote from the abstract:

Commodity file systems trust disks to either work or fail completely, yet modern disks exhibit more complex failure modes. We suggest a new fail-partial failure model for disks, which incorporates realistic localized faults such as latent sector errors and block corruption. We then develop and apply a novel failure-policy fingerprinting framework, to investigate how commodity file systems react to a range of more realistic disk failures. [...]

They did their primary analysis on Linux ext3, ReiserFS 3, and (Linux) JFS; the results are comprehensive, interesting, and sometimes scary.

Link: The Single Unix Specification et al

The Open Group Base Specification Issue 6 is, well, to quote it:

This standard is the single common revision to IEEE Std 1003.1-1996, IEEE Std 1003.2-1992, and the Base Specifications of The Open Group Single UNIX Specification, Version 2.

(Those IEEE standards are better known as 'POSIX'.)

The Single Unix Specification (SUS) is a very useful authoritative reference for how various things should behave in theory. (How they behave in fact is a different issue; not everything is correctly implemented, and not everything is SUS/POSIX compliant to start with.)

You'd think that the Open Group would make this stuff openly available, but instead they want you to register and provide them with various personal details. As we can see here, that is not strictly speaking necessary; you just need the right magic URL.

(From Andree Leidenfrost via Debian Planet.)

Link: Csh Programming Considered Harmful

Tom Christiansen's Csh Programming Considered Harmful used to be posted to various Usenet comp.unix groups, nigh on ten years ago or so. I consider it sort of a pity that it isn't still being posted, because every so often someone still decides that writing shell scripts in csh would be a good idea.

Some of the problems Christiansen identified have since been fixed by modern versions of tcsh, but not all of them.

(The article can also serve as an interesting catalog of sh programming tricks.)

Link: When the "best tool for the job"... isn't.

When the "best tool for the job" isn't argues that what we might think is the best tool for the job isn't. (I'm not going to mangle its ideas by trying to summarize it more than that.)

This is an issue that I sometimes feel moderately acutely, since I use X Windows in preference to something like OS X, while the general view is that Apples are the machines for people who want both Unix and a decent user experience (and there's a certain population that questions the sanity and wisdom of Unix people who aren't interested in that migration).

(From the Voidspace Techie Blog, via Planet Python.)

Link: Ten Risks of PKI

Ten Risks of PKI: What You're Not Being Told About Public Key Infrastructure is a paper by Carl Ellison and Bruce Schneier. These aren't technical risks, at least not directly, and the paper makes for interesting reading. (And after you're done reading your printed copy of the PDF you can leave it out in a strategic spot for other people to run across.)

(From this comp.lang.python article by Edward Elliot, which I ran across through the Daily Python URL.)

Link: Linguistic blindness illustrated

If you can answer this, you are not paying attention is a nice illustration and discussion of the importance of thinking about what you're writing in prompts and questions in software. Also interesting is a followup entry on why everyone is so against using 'yes' and 'no' for answers in dialog boxes.

(From Daring Fireball.)

Link: non-errors in English

Non-Errors is a nice catalog of things that aren't English usage errors, even though a lot of people tend to think that they are. Since I do any number of them I find this reassuring, and it's amusing to see just how old some of these perfectly proper usages turn out to be.

(From Daring Fireball.)

Link: 'Document Centric'

Document Centric is about the disconnect between relational databases and regular users, and why people keep stuffing data into spreadsheets and the like instead of 'real' databases.

(They do. Joel Spolsky has written about the Excel team's surprise at finding this out about how real users used Excel.)

(From Carlos de la Guardia.)

Link: The virtual furniture police

The Virtual Furniture Police is ultimately an unflattering view of how IT departments too often attempt to have a great deal of control over user desktops. The opening paragraph summarizes things nicely:

This is a review, of sorts, of the book Peopleware: Productive Projects and Teams by Tom DeMarco and Timothy Lister. Then a segue to explain how typical corporate IT policies contravene some of the excellent advice in this book.

And the title is lovely; I think I have a new catchphrase.

(From a comment here.)

Link: The Unix Heritage Society

The Unix Heritage Society has a nice statement of its aims on its front page, but let me skip straight to the neat bits: complete source code for early Unix versions, such as V7 and V6. You can browse things online, or get your own personal mirror. For a long time, having this sort of thing was a Unix geek dream, and now I have my own (legal!) copy of it all.

One of the neat things I like doing with TUHS is browsing to see the original full versions of such famous Unix bits as the 'you are not expected to understand this' kernel source comment. Here it is in full, from the swtch() routine in /usr/sys/ken/slp.c in the Sixth Edition:

 /*
  * If the new process paused because it was
  * swapped out, set the stack level to the last call
  * to savu(u_ssav).  This means that the return
  * which is executed immediately after the call to aretu
  * actually returns from the last routine which did
  * the savu.
  *
  * You are not expected to understand this.
  */

While I'm in the area, I'd be remiss if I didn't link to the Wikipedia entry on Lions' Commentary on UNIX 6th Edition, with Source Code. This is a famous work for old Unix geeks, and the Wikipedia entry even has links to a PDF version.

(TUHS also has links to PDP-11 simulators and disk images, so you can actually run V7 et al. Maybe even faster than it ran on a real PDP-11/70, back in the day.)

Link: a Unix sysadmin rosetta stone

Rosetta Stone for Unix is a very handy cross-index of various commands and tasks across various Unix variants. The index of tasks is especially handy as a quick 'how do I do this on X' pointer. It's available in several formats, and as a bonus you get some helpful links as well.

(Since I was just using this today to figure out how to do various things on Solaris, I figured I should finally get around to mentioning it.)

(From a Slashdot comment.)

Link: Classic Mistakes Enumerated

Classic Mistakes Enumerated is an excerpt from the book Rapid Development by Steve McConnell; it runs through 36 familiar classic development mistakes that people make over and over again. Brooks's Law makes an appearance, of course.

(From Bill de hÓra.)

Link: Why overtime is bad for everyone

The really interesting bit of Why Crunch Mode Doesn't Work: 6 Lessons for me can be summed up in the lead-in:

There's a bottom-line reason most industries gave up crunch mode over 75 years ago: It's the single most expensive way there is to get the work done.

The article elaborates on this, and makes for interesting reading. In the same area is Hours of Work in U.S. History, if one wants another set of data.

(Unfortunately I have lost where I got the first link from.)

Link: an engineering management hack

Engineering Management Hacks: The BigBook Technique is an amusing story of how a group of engineers got their management to pay attention to Brooks's Law ("Adding manpower to a late software project makes it later"). I won't spoil the punchline; read it yourself.

Around here we don't have problems with Brooks's Law, perhaps because we don't have the extra manpower to add to late projects to start with.

(From Daring Fireball.)

Link: an excerpt from On Writing Well

Here are chapters 2 through 4 of William Zinsser's On Writing Well, a classic book on, well, writing well. Just start with the opening of chapter 2 and keep going:

Clutter is the disease of American writing. We are a society strangling in unnecessary words, circular constructions, pompous frills and meaningless jargon.

Remind you of any computer manuals you've read recently? (Hopefully it does not remind you too much of WanderingThoughts. I try, but I know I have a long way to go.)

On Writing Well itself can be gotten from the online bookseller of your choice. (My choices are Canadian.)

(From a comment on a Slashdot article about writing.)

Link: Search engine page size limits for indexing

Search Engine Indexing Limits: Where Do the Bots Stop? takes an experimental approach to seeing how big a page various search engine bots will fetch, and how much of large pages they index. I find this an interesting question because it affects how you organize your content and generate indexes to it, especially for dynamic websites with auto-generated aggregate pages.

One area not investigated in the article is how far down the pages the search engine bots will go looking for links to follow. I smell a followup project for someone.

(From Ned Batchelder, who has interesting information on the size of his own blog pages as a result of this.)

Link: Readable colour text combinations

Color Test Results is a summary of the results of surveying a bunch of Internet users about what colour text on what background they found the easiest to read. The summary quote:

As you can see, the most readable color combination is black text on white background; overall, there is a stronger preference for any combination containing black.

Not coincidentally, most of my text is black on white (or black on slightly off-white). I was interested to see that white on black actually rates fairly highly, because for me bright colours on black (like white, jwz's green, etc) rapidly give me eye-searing, headache inducing afterimages.

(The survey is pretty old, but human perception is unlikely to have changed that much in ten years or so.)

(From Bill de hÓra.)

Link: Scaling Apache at ftp.heanet.ie

ftp.heanet.ie serves a huge volume of mostly static file downloads with Apache on a single machine. This results in interesting Apache tuning issues, among other things.

If this sort of thing is your cup of tea, much of Colm MacCárthaigh's blog may be interesting.

(From Tim Bray.)

