2014-03-31
Why I sometimes reject patches for my own software
I recently read Drew Crawford's Conduct unbecoming of a hacker (via), which argues that you should basically always accept other people's patches for your software unless they are clearly broken. Lest we bristle at this, he gives the example of Firefox and illustrates how many patches it accepts. On the whole I sympathize with this view, and I've even had some pragmatic experience with it; a patch to mxiostat that I wasn't very enthusiastic about initially has actually become something I use routinely. But despite this there are certain sorts of patches I will reject basically out of hand. Put simply, they're patches that I think will make the program worse for me, no matter how much they might help the author of the patch (or other people).
This is selfish behavior on my part, but so far all of my public software consists of things that I'm ultimately developing for myself first. It's nice if other people use my programs too, but I don't expect any of them to get popular enough that other people's usage is going to be my major motivation for maintaining and developing them. So my priorities come first, and the furthest I'm willing to go is that I'll accept patches that don't get in the way of my usage.
(Drew Crawford's article has sort of convinced me that I should be more liberal about accepting patches in general; he makes a convincing case for 'accept now, bikeshed later'. So far this is mostly a theoretical issue for my stuff.)
By the way, this would obviously be different if I were developing things with the explicit goal of having them used by other people. In that case I should (and hopefully would) suck it up and put the patch in unless I had strong indications that it would make the program worse for a bunch of people instead of just me. Maybe someday I'll write something like that, but so far it's not the case.
2014-03-05
A bit more about the various levels of IPC: whether or not they're necessary
A question you could ask about the levels of IPC that I outlined is whether anything beyond basic process to process IPC is actually necessary (this is the level of TCP connections or Unix domain sockets). One answer to this is basically 'of course not'. All you really need is for programs to be able to talk to each other and then you can build whatever each particular system needs from there, possibly using common patterns like simple ASCII request/response protocols. A lot of software has gotten very far on this basis.
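(As a toy illustration of the sort of ad-hoc line-oriented ASCII request/response protocol I mean; the command and reply formats here are entirely made up, and the two 'processes' are simulated with a socketpair.)

```python
import socket

# A made-up one-line ASCII request/response exchange between a client
# and a server, simulated in one process with a Unix domain socketpair.
client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

client.sendall(b"GET uptime\n")          # client sends a simple request
request = server.recv(1024).decode("ascii").strip()
verb, arg = request.split(" ", 1)
if verb == "GET":
    # The server replies with a canned answer in an equally ad-hoc format.
    server.sendall(("OK %s=42\n" % arg).encode("ascii"))
reply = client.recv(1024).decode("ascii").strip()

client.close()
server.close()
```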
The other answer is that yes, in the end you do need all of those additional levels. To put it one way, very few people design introspectable message bus systems with standardized protocol encodings for fun. These systems get designed and built and adopted because they solve real problems; in the jargon, they are a design pattern. If you don't create them what you really get is a whole collection of ad-hoc and often partial versions that all of the various systems have reinvented on their own. For example, not having a standard protocol encoding does not free programs from needing to define a wire protocol; it just means that every program does it separately and differently and some number of them will do it badly.
(And in the modern world some number of them will make security mistakes or have buffer overruns and other flaws. A single standard system that everyone uses has the potential advantage of being carefully designed and built based on a lot of research and maybe even experience. Of course it can also be badly done, in which case everyone gets a badly done version.)
In this sense the additional levels of IPC really do wind up being necessary. It's just not the mathematical minimization sense of 'necessary' that people sometimes like to judge systems on.
2014-03-03
The multiple levels of interprocess communication
In some quarters it's popular to say that things in computer programming go in cycles and history repeats itself. I'm a bit more optimistic, so I like to view it as people repeatedly facing the same problems and solving them with today's tools. One of those things that keeps coming around is IPC, aka interprocess communication. One way to look at IPC is to split it up into multiple levels that recur again and again.
The first layer of IPC is simply connecting processes to each other. Since you can't get very far without this it tends to be well solved, although people keep inventing new ways of doing it with slightly different properties. Good local IPC mechanisms provide reliable authentication information about who is connecting to you and provide some way to transport credentials, access rights, or the equivalent.
(On modern Unix systems Unix domain sockets provide at least UID information and can be used to pass file descriptors around.)
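(A minimal sketch of reading that UID information on Linux, where SO_PEERCRED on a connected Unix domain socket returns the peer's pid, uid, and gid. This uses a socketpair for self-containment, so the 'peer' whose credentials we read back is ourselves; this option is Linux-specific.)

```python
import os
import socket
import struct

# Linux-specific: SO_PEERCRED returns a struct ucred, which is three
# native ints: the connecting peer's pid, uid, and gid.
def peer_credentials(conn):
    data = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                           struct.calcsize("3i"))
    return struct.unpack("3i", data)

# With a socketpair both ends belong to this process, so the
# credentials we read back are our own.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(a)
a.close()
b.close()
```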
The next IPC level is a common protocol format, which gets created when people get tired of making up data formats and (re)implementing encoders and decoders. Generally the high-level protocol format will specify some sort of destination (in this sense, a method) plus a structured payload (which may or may not be self-documenting to some degree). Sometimes you see things like sequencing and multiplexing added at this level.
(Modern examples include Google protobufs and JSON, but there are a very large number of protocol formats that people have used over the years. One early Unix example is the Sun RPC wire format they used for NFS and a number of related things.)
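(A sketch of what such a protocol format can look like, here using JSON as the encoding with made-up field names: a destination method, a sequence number for matching replies to requests, and a structured payload.)

```python
import json

# A hypothetical message envelope: a destination method, a sequence
# number (the 'sequencing' the entry mentions), and a structured payload.
def encode(method, payload, seq=0):
    msg = {"method": method, "seq": seq, "payload": payload}
    return (json.dumps(msg) + "\n").encode("utf-8")

def decode(wire):
    msg = json.loads(wire)
    return msg["method"], msg["seq"], msg["payload"]

# One round trip through the wire format.
wire = encode("stats.get", {"verbose": True}, seq=1)
method, seq, payload = decode(wire)
```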
The next IPC level is some kind of a 'message bus', which generally exists to solve two related registration problems. In the addressing problem, programs want to talk to whatever is implementing service X (and to have a way of registering as handling X, and stopping handling X, and so on). In the broadcast problem, a program wants to broadcast information about X to any interested parties, whoever they may be. Once you have an active message bus it can also do additional things like starting a program when a request for service X comes in.
(Message bus services on Unix date at least as far back as the venerable rpcbind, which was originally invented by Sun for NFS-related addressing issues. There is a theme here.)
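(A toy, in-process model of the two registration problems, with entirely made-up names; a real message bus does all of this across process boundaries, but the shape is the same.)

```python
# The addressing problem: callers name a service, not a process.
# The broadcast problem: senders don't know who is listening.
class Bus:
    def __init__(self):
        self.services = {}      # service name -> handler
        self.subscribers = {}   # topic -> list of callbacks

    def register(self, name, handler):
        self.services[name] = handler

    def call(self, name, request):
        return self.services[name](request)

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def broadcast(self, topic, event):
        for cb in self.subscribers.get(topic, []):
            cb(event)

    def list_services(self):
        # A sliver of introspection: what is registered right now.
        return sorted(self.services)

bus = Bus()
bus.register("time.now", lambda req: "12:34")
seen = []
bus.subscribe("time.tick", seen.append)
bus.broadcast("time.tick", "tick-1")
answer = bus.call("time.now", None)
```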
The final IPC level is introspection, where you can use the IPC mechanism to find out what is registered, what operations they support, and hopefully monitor traffic and other activity. Technically IPC introspection can appear without a message bus but I think this is uncommon. The power of the introspection varies tremendously from IPC system to IPC system.
You can invent ad-hoc mechanisms to solve all of these levels of problems, and people do. What drives people to 'standardize' them is a desire to avoid reinventing the wheel for each new program and each new level of the problem. Common protocol formats almost invariably come with libraries or codec generators (or both), message busses come with standard programs and libraries (instead of ad-hoc solutions like 'Unix domain sockets with specific names in directory Z'), and so on. Introspection and tools related to it solve a bunch of common problems and can often be used to enable ad-hoc interaction with services, eg through shell scripts.
(Once you have a reasonably sophisticated message bus with a lot of things going on through it, you really want some way of finding out what messages are flowing where so you can debug the total system.)