The problem with indentation in programming languages
You may have heard that Apple has just released a really important security update to fix SSL (aka TLS) certificate verification. The best description of the bug I've seen comes from Adam Langley in Apple's SSL/TLS bug. The core of the bug, extracted from the affected function, is:
    ...
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
    ...
This points to the core problem with indentation. As we should all know, computer languages must communicate with two separate audiences; they must be meaningful both to computers (which must execute them) and to programmers (who must read and modify them). The issue here is simple:
Indentation is semantically meaningful to people but is semantically meaningless to (most) compilers et al.
When I say that indentation is semantically meaningful to people I mean that people read things into indentation. When we're reading code we generally assume that the indentation matches the program's block structure and thus that we can infer block structure from indentation level. This is a shorthand and like all shorthands it can be either clearly wrong (and thus discarded, slowing us down) or overridden if we read closely. But I think that history has shown that people can and will miss mismatches between indentation and actual program control flow and will miss bugs as a result.
(See for example Dave Jones' replies in this twitter conversation.)
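To make that mismatch concrete, here is a minimal C sketch with the same shape as the code above. The names `verify_like_apple`, `update_ok`, and `final_always_fails` are hypothetical stand-ins, not Apple's functions; the only point is to show what the compiler sees as opposed to what the indentation suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the hash steps; each returns 0 on success. */
int update_ok(void) { return 0; }
int final_always_fails(void) { return -1; }

/*
 * Same shape as the buggy code. Sets *final_ran to true only if control
 * reaches the final check, and returns the value of err at the fail label.
 */
int verify_like_apple(int (*final_check)(void), bool *final_ran)
{
    int err;

    assert(final_ran != NULL);
    *final_ran = false;

    if ((err = update_ok()) != 0)
        goto fail;
    if ((err = update_ok()) != 0)
        goto fail;
        goto fail;  /* indented like the if's body, but always executed */

    /* Nothing from here down to the label can ever run. */
    *final_ran = true;
    if ((err = final_check()) != 0)
        goto fail;

fail:
    return err;
}
```

Called with `final_always_fails`, the function still returns 0 and leaves `*final_ran` false: the second `goto fail` is a separate statement that runs unconditionally, and `err` is still 0 from the successful update just before it.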
I can think of two reasonably workable solutions to this; call them the Python and the Go solutions. The Python solution is to make indentation semantically meaningful even in the language, thereby making the programmer view and the computer view of the code match up. The problem is that this seems to be widely unpopular with programmers (for no good reason). The Go solution is to create a canonical formatting style for the language and then strongly encourage everyone to routinely convert their code to it. If you run code like this through gofmt it will correct the mis-indentation and thus give you a higher chance of spotting the problem. Make it a rule that all code must be canonicalized through gofmt before a change is committed and there you go.
(I don't think that 'produce warnings about mismatching indentation' is a winning solution, although I might be wrong. At the least I don't think any of the major C compilers have added such a warning. My gut's view is that warnings are going to be less effective than other measures.)
My personal opinion is that some variant of the Python solution is the right one. I don't object if you want to still have explicit block delimiters for various reasons, but I think it should be a fatal error if the indentation does not match the block structure. Add a tool like gofmt if you want to allow people to write sloppy code and then have it fixed automatically before they feed it to the real language environment.
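As a rough illustration of what such a check could look like, here is a deliberately naive C sketch. It is not how any real compiler or formatter works; the function names are hypothetical, and it assumes space-only indentation, no blank lines, and that the body of a braceless `if` sits alone on the following line. It handles only that one pattern.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Number of leading spaces (this sketch assumes space-only indentation). */
static size_t indent_of(const char *line)
{
    size_t n = 0;
    while (line[n] == ' ')
        n++;
    return n;
}

/* True if the line looks like "if (...)" with no brace or body after it. */
static int is_braceless_if(const char *line)
{
    const char *p = line + indent_of(line);
    size_t len = strlen(p);

    if (strncmp(p, "if", 2) != 0 || (p[2] != ' ' && p[2] != '('))
        return 0;
    while (len > 0 && isspace((unsigned char)p[len - 1]))
        len--;
    return len > 0 && p[len - 1] == ')';
}

/*
 * Returns the index of the first line that is indented as if it belonged
 * to a braceless if but is not its body, or -1 if there is none.
 */
int find_misleading_indent(const char *const lines[], size_t count)
{
    assert(lines != NULL);
    for (size_t i = 0; i + 2 < count; i++) {
        /* Skip non-if lines, and ifs whose body is itself a braceless if. */
        if (!is_braceless_if(lines[i]) || is_braceless_if(lines[i + 1]))
            continue;
        /* lines[i + 1] is the one-statement body; lines[i + 2] is outside it. */
        if (indent_of(lines[i + 2]) > indent_of(lines[i]))
            return (int)i + 2;
    }
    return -1;
}
```

Fed lines shaped like the Apple code, this reports the second `goto fail`; once that line is moved back to the indentation of the `if`, it reports nothing. A real implementation would work from the parsed block structure instead of guessing from the text of each line.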
PS: Yes, mismatching indentation is not the only problem with Apple's code here; two gotos in a row ought to look at least a little bit odd regardless of their indentation (and even if they were in the same block context, eg if they were both inside an 'if (..) { ... }').