In practice, programmers mostly understand complexity by superstition

April 17, 2015

A while back I said that a significant amount of programming is done by superstition. A corollary of that is that a great deal about programming is also understood primarily through superstition and mythology, as opposed to going to the actual deep technical detail and academic definitions of things. Today I want to point specifically to complexity.

You are a good, well educated programmer, so you know what 'constant time' or O(1) in the context of hash tables really means (for example). You probably know many of the effects that can distort hash table operations from being fast; there's that 'constant time' really only refers to 'relative to the number of entries', there's pathological cases that violate this (like extensive hash collisions), there's the practical importance of constant factors, there's the time taken by other necessary operations (which may not be constant time themselves), and then there's the low-level effects of real hardware (RAM fetches, TLB fills, L2 cache misses, and so on). This is an incomplete list, because everything is complex when you dig in.

Most programmers either don't know about this or don't think about it very much. If something is called 'constant time', they will generally expect it to be fast and to be consistently so under most conditions (certainly normal conditions). Similar things hold for other broad complexity classes, like linear or n log n. In fact you're probably doing well if a typical programmer can even remember what complexity class things are in. Generally this doesn't matter; as before, the important thing is the end result. If you're writing code with a significantly wrong understanding of something's practical behavior and thus time profile, you're probably going to notice.

(Not always, though; see Accidentally Quadratic, especially the rationale. You can choose to see this as a broad problem with the mismatch between superstition and actual code behavior here, in that people are throwing away performance without realizing it.)

Written on 17 April 2015.
« Are Python dictionaries necessarily constant-time data structures?
A core problem of IPv6 adoption is the lack of user benefits »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Apr 17 02:02:54 2015
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.