Wandering Thoughts archives

2015-03-25

A significant amount of programming is done by superstition

Ben Cotton wrote in a comment here:

[...] Failure to adhere to a standard while on the surface making use of it is a bug. It's not a SySV init bug, but a bug in the particular init script. Why write the information at all if it's not going to be used, and especially if it could cause unexpected behavior? [...]

The uncomfortable answer to why this happens is that a significant amount of programming in the real world is done partly through what I'll call superstition and mythology.

In practice, very few people study the primary sources (or even authoritative secondary sources) when they're programming and then work forward from first principles; instead they find convenient references, copy and adapt code that they find lying around in various places (including the Internet), and repeat things that they've done before with whatever variations are necessary this time around. If it works, ship it. If it doesn't work, fiddle things until it does. What this creates is a body of superstition and imitation. You don't necessarily write things because they're what's necessary and minimal, or because you fully understand them; instead you write things because they're what people before you have done (including your past self) and the result works when you try it.

(Even if you learned your programming language from primary or high quality secondary sources, this deep knowledge fades over time in most people. It's easy for bits of it to get overwritten by things that are basically folk wisdom, especially because there can be little nuggets of important truth in programming folk wisdom.)

All of this is of course magnified when you're working on secondary artifacts for your program like Makefiles, install scripts, and yes, init scripts. These aren't the important focus of your work (that's the program code itself), they're just a necessary overhead to get everything to go, something you usually bang out more or less at the end of the project and probably without spending the time to do deep research on how to do them exactly right. You grab a starting point from somewhere, cut out the bits that you know don't apply to you, modify the bits you need, test it to see if it works, and then you ship it.

(If you say that you don't take this relatively fast road for Linux init scripts, I'll raise my eyebrows a lot. You've really read the LSB specification for init scripts and your distribution's distro-specific documentation? If so, you're almost certainly a Debian Developer or the equivalent specialist for other distributions.)

So in this case the answer to Ben Cotton's question is that people didn't deliberately write incorrect LSB dependency information. Instead they either copied an existing init script or thought (through superstition aka folk wisdom) that init scripts needed LSB headers that looked like this. When the results worked on a System V init system, people shipped them.
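
For illustration, LSB dependency information lives in a stylized comment block at the top of an init script. The details below are invented rather than taken from any real script, but the shape is the standard one:

### BEGIN INIT INFO
# Provides:          mydaemon
# Required-Start:    $network $remote_fs
# Required-Stop:     $network $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start and stop mydaemon
### END INIT INFO

On a pure SysV init system nothing ever reads this block, so a Required-Start line copied from some other script costs you nothing even if it's wrong; it only starts to matter when something like systemd comes along and takes the dependencies seriously.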

This isn't something that we like to think about as programmers, because we'd really rather believe that we're always working from scratch and only writing the completely correct stuff that really has to be there; 'cut and paste programming' is a pejorative most of the time. But the reality is that almost no one has the time to check authoritative sources every time; inevitably we wind up depending on our memory, and it's all too easy for our fallible memories to get 'contaminated' with code we've seen, folk wisdom we've heard, and so on.

(And that's the best case, without any looking around for examples that we can crib from when we're dealing with a somewhat complex area that we don't have the time to learn in depth. I don't always take code itself from examples, but I've certainly taken lots of 'this is how to do <X> with this package' structural advice from them. After all, that's what they're there for; good examples are explicitly there so you can see how things are supposed to be done. But that means bad examples or imperfectly understood ones add things that don't actually have to be there or that are subtly wrong (consider, for example, omitted error checks).)

ProgrammingViaSuperstition written at 02:16:41

2015-03-24

What is and isn't a bug in software

In response to my entry on how systemd is not fully SysV init compatible because it pays attention to LSB dependency comments when SysV init does not, Ben Cotton wrote in a comment:

I'd argue that "But I was depending on that bug!" is generally a poor justification for not fixing a bug.

I strongly disagree with this view at two levels.

The first level is simple: this is not a bug in the first place. Specifically, it's not an omission or a bug that System V init doesn't pay attention to LSB comments; it's how SysV init behaves and has behaved from the start. SysV init runs things in the order they are in the rcN.d directory and that is it. In a SysV init world you are perfectly entitled to put whatever you want to into your script comments, make symlinks by hand, and expect SysV init to run them in the order of your symlinks. Anything that does not do this is not fully SysV init compatible. As a direct consequence of this, people who put incorrect information into the comments of their init scripts were not 'relying on a bug' (and their init scripts did not have a bug; at most they had a mistake in the form of an inaccurate comment).

(People make lots of mistakes and inaccuracies in comments, because the comments do not matter in SysV init (very little matters in SysV init).)
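
For concreteness, what SysV init actually looks at is the names of the rcN.d symlinks themselves. A runlevel directory might contain something like this (the script names are made up for illustration):

/etc/rc3.d/S10network  -> ../init.d/network
/etc/rc3.d/S20rsyslog  -> ../init.d/rsyslog
/etc/rc3.d/S80mydaemon -> ../init.d/mydaemon

At boot SysV init simply runs these S* scripts in lexical order, here network, then rsyslog, then mydaemon. It never opens the scripts to read the comments inside them, which is exactly why inaccurate LSB headers could sit there unnoticed for years.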

The second level is both more philosophical and more pragmatic and is about backwards compatibility. In practice, what is and is not a bug is defined by what your program accepts. The more that people do something and your program accepts it, the more that thing is not a bug. It is instead 'how your program works'. This is the imperative of actually using a program, because to use a program people must conform to what the program does and does not do. It does not matter whether or not you ever intended your program to behave that way; that it behaves the way it does creates a hard reality on the ground. That you left it alone over time increases the strength of that reality.

If you go back later and say 'well, this is a bug so I'm fixing it', you must face up to a fundamental fact: you are changing the behavior of your program in a way that will hurt people. It does not matter to people why you are doing this; you can say that you are doing it because the old behavior was a mistake, because the old behavior was a bug, because the new behavior is better, because the new behavior is needed for future improvements, or whatever. People do not care. You have broken backwards compatibility and you are making people do work, possibly pointless work (for them).

To say 'well, the old behavior was a bug and you should not have counted on it and it serves you right' is robot logic, not human logic.

This robot logic is of course extremely attractive to programmers, because we like fixing what are to us bugs. But regardless of how we feel about them, these are not necessarily bugs to the people who use our programs; they are instead how the program works today. When we change that, well, we change how our programs work. We should own up to that and we should make sure that the gain from that change is worth the pain it will cause people, not hide behind the excuse of 'well, we're fixing a bug here'.

(This shows up all over. See, for example, the increasingly aggressive optimizations of C compilers that periodically break code, sometimes in very dangerous ways, and how users of those compilers react to this. 'The standard allows us to do this, your code is a bug' is an excuse loved by compiler writers and basically no one else.)

ProgramBehaviorAndBugs written at 01:46:45

2015-03-15

The importance of user interface, illustrated by the Go flag package

I have previously railed about Go's flag package (and why it's wrong). Recently a small but in practice really important change to how it works landed in the Go tip (what will become Go 1.5) and in the process illustrated how important small UI things are, even in command line programs, and how a modest change can cause a sea change in how I, at least, feel about them.

One of the quiet irritations with Go's flag package was the usage help information it gives. Here's the usage of my call program, as the current Go 1.4 shows it:

; call -h
Usage of call:
  -B=131072: the buffer size for (network) IO
  -C=0: if non-zero, only listen for this many connections then exit
  -H=false: print received datagrams as hex bytes
  -P=false: just print our known protocols
  -R=false: only receive datagrams, do not try to send stdin
  -T=false: report TLS connection information
  -b="": make the call from this local address
  -l=false: listen for connections instead of make them
  -q=false: be quieter in some situations
  -v=false: be more verbose in some situations

This is accurate but it is both ugly and not particularly like how the rest of Unix does it. All of those '-H=false' switches are boolean flags, for example, but you have to know the Go flag package to know that you can use them as plain flags and just write '-H' instead of '-H=true' or the like.

The recent change moves the flag package's help output to a much more Unixy version of this and gives you a simple way to tweak the output a bit to be more useful. Now your program, with a tiny bit of message tweaking, prints something like this:

; call -h
Usage of call:
  -B bytes
        the buffer size for (network) IO, in bytes (default 131072)
  -C connections
        if non-zero, only listen for this many connections then exit
  -H    print received datagrams as hex bytes
  -P    just print our known protocols
  -R    only receive datagrams, do not try to send stdin
  -T    report TLS connection information
  -b address
        make the call from this local address
  -l    listen for connections instead of make them
  -q    be quieter in some situations
  -v    be more verbose in some situations

This actually looks like a real Unix program. The help output is not immediately ugly (and does not immediately betray that it was written in Go using the flag package). And I could make it read better for a Unix program with some more message tweaks (all of those flag descriptions should now really start with a capitalized word, for example). And it doesn't require special Go knowledge to decode how to use it; for example, it's now clear that you can use flags like -H as plain option flags without an argument.
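
To be concrete about the 'message tweaking' involved: in the new flag package, a back-quoted word in a flag's usage string is pulled out and shown as the argument name in the help output. What follows is a minimal sketch of the idea, not call's actual flag definitions:

package main

import (
    "flag"
    "fmt"
)

var (
    // A back-quoted word such as `bytes` or `address` in the usage string
    // becomes the displayed argument name, and the quotes are stripped from
    // the help text.
    bufSize = flag.Int("B", 128*1024, "the buffer size for (network) IO, in `bytes`")
    locAddr = flag.String("b", "", "make the call from this local `address`")
    // Boolean flags get no argument name and are printed on a single line.
    hexMode = flag.Bool("H", false, "print received datagrams as hex bytes")
)

func main() {
    flag.Parse()
    fmt.Println(*bufSize, *locAddr, *hexMode)
}

Run with -h, this produces usage lines in the new '-B bytes' style shown above; under Go 1.4 the back-quotes would simply have appeared literally in the help text.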

It really amazes me how much this little change has improved my opinion of the flag package. I still don't like its differences from Unix getopt, but simply having decent help output makes it much more tolerable (and makes me much happier). Perhaps it's that I no longer feel embarrassed by the help output of my Go programs.

(I know, most people won't get this improvement until Go 1.5 comes out, and then only if they recompile programs with it. But I use the Go git tip so I get it right away, which makes me happy enough. And you'd better believe that I rebuilt everything in sight when I saw this in the change logs.)

GoFlagUIImportance written at 02:33:57

2015-03-04

What creates inheritance?

It all started with an @eevee tweet:

people who love inheritance: please tell me what problems it has solved for you that you don't think could be solved otherwise

My answer was code reuse, but that started me down the road of wondering what the boundaries are of what people will call 'inheritance', at least as far as code reuse goes.

These days, clearly the center of the 'inheritance' circle is straight up traditional subclassing in languages like Python. But let's take a whole series of other approaches to code and object reuse and ask if they provide inheritance.

First up, proxying. In Python it's relatively easy to build explicit proxy objects, where there is no subclass relationship but everything except some selected operations is handed off to another object's methods and you thus get to use them. I suspect that the existence of two objects makes this not inheritance in most people's eyes.

What about Go's structure embedding and interfaces? In Go you can get code reuse by embedding an anonymous instance of something inside your own struct. Any methods defined on the anonymous instance can now be (syntactically) called on your own struct, and you can define your own methods that override some of them. With use of interfaces you can relatively transparently mix instances of your struct with instances of the original. This gets you something that's at least very like inheritance without a subclass or a proxy object in sight.

(This is almost like what I did here, but I didn't make the *bufio.Writer anonymous because there was no point.)
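
As a concrete sketch of what Go embedding gets you (this is illustrative code, not what I actually did in the linked entry):

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
)

// countingWriter embeds an anonymous *bufio.Writer, so all of bufio.Writer's
// methods are promoted onto it; we override only Write to add behavior.
type countingWriter struct {
    *bufio.Writer
    n int
}

func (c *countingWriter) Write(p []byte) (int, error) {
    c.n += len(p)
    return c.Writer.Write(p)
}

func main() {
    cw := &countingWriter{Writer: bufio.NewWriter(os.Stdout)}
    // Thanks to the promoted methods, cw satisfies io.Writer and can be used
    // anywhere a plain *bufio.Writer (or any other writer) could be.
    var w io.Writer = cw
    fmt.Fprintln(w, "hello")
    cw.Flush() // Flush comes straight from the embedded *bufio.Writer
    fmt.Println(cw.n, "bytes written")
}

There's no subclass here, but countingWriter reuses all of bufio.Writer's methods, overrides one of them, and can stand in for any other writer; squint a bit and that covers most of what people reach for inheritance to get.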

How about traits, especially traits that allow you to override methods in the base object? You certainly don't have a subclass relationship here, but you do have code reuse with modifications and some languages may be dynamic enough to allow the base object's methods to use methods from a trait.

So, as I wound up theorizing, maybe what creates inheritance is simply having a method resolution order, or more exactly having a need for it; this happens in a language where you can have multiple sources of methods for a single object. On the other hand this feels somewhat like a contorted definition and I don't know where people draw the line in practice. I don't even know exactly where I draw the line.

WhatCreatesInheritance written at 01:08:17

