2012-01-24
Why I use exec in my shell scripts
As with the little example yesterday, a fair number of my shell
scripts end with running a program, and when they do I almost
invariably go the little extra distance and do it with exec. In the
old days the reason to do this was that it used slightly fewer
resources, since it got rid of the shell process and left only the
process for the real program you wound up running. But although I was
around back then, that's not the reason I use exec today; it's that it
lets you freely edit the script while that final program is running.
At this point some of you may be going 'wait, what?' That's because most Bourne shell implementations are a little bit peculiar.
In most interpreted languages on Unix (like Python, Ruby, and Perl), the interpreter completely loads and parses the script file before it starts running it. This means that once your script has actually started running, once that initial load and parse has finished, you can freely change the script's file without the interpreter caring; it will only look at the actual file and its contents again if and when you re-run your script.
Bourne shell implementations have historically not worked this way (and it's possible that it's actually impossible to fully preparse Bourne shell scripts for some reason). Instead, they not only parse the script on the fly as it executes, they also read the script file itself on the fly as it runs. This means that if you edit a shell script while it's running, you can literally shuffle the code around underneath the running shell. When the shell resumes reading and parsing the script after the current command finishes, it can find itself reading from partway through a line, re-reading something it has already executed, or (if you deleted text) skipping over something it should have run. This often causes the shell script to fail with weird errors or, worse, to malfunction spectacularly. It can happen even if the shell is on the last line of the script.
But if you end a shell script with exec, you avoid all of this. The
shell interpreter effectively exits (by replacing itself with the real
program), so there's nothing left to read any more of the script file
and get confused by your edits.
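As an illustration, here's a minimal sketch of the kind of wrapper script I mean (the PATH fiddling and the /opt/firefox location are made up for the example):

#!/bin/sh
# Hypothetical setup work; the path here is invented for illustration.
PATH=/opt/firefox/bin:$PATH
export PATH

# Replace the shell process with firefox itself. Past this point there
# is no shell left to read the rest of the file, so editing this script
# while firefox is running can't confuse anything.
exec firefox "$@"

Without the exec, the shell would hang around waiting for firefox to exit and would then go back to reading the (possibly now edited) script file.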
(Of course nothing helps if you can't use exec; then you just have to
remember to never edit the script while it's running, at least not with
an editor that overwrites the file in place. An editor that writes out
a new file and renames it into place is harmless here, since the
running shell keeps reading the old version of the file through its
already open file descriptor.)
Sidebar: a detailed example of what happens
Let's start with a little script:
#!/bin/sh
echo "a"
firefox
Run this script. While Firefox is running, edit the script so that the
echo string is five characters longer, say by changing the "a" to
"abcdef" (using vi or some other editor that overwrites files in
place). When you exit Firefox, the script will complain with something
like 'script: line 4: efox: command not found'.
When the shell was running Firefox, its read position in the file was
just after the newline at the end of the firefox line, 27 bytes into
the original script. When you edited the script and made the echo
string five characters longer, that same byte position wound up
pointing into the middle of 'firefox', at its 'e'. When Firefox exited
and the shell resumed reading from that byte position, it read
'efox<newline>', saw a perfectly valid command execution, and tried to
run 'efox' (and failed).
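If you want to check the byte arithmetic, here's a quick sketch (assuming the echo string grew by exactly five characters, as above):

$ printf '#!/bin/sh\necho "a"\nfirefox\n' | wc -c
27
$ printf '#!/bin/sh\necho "abcdef"\nfirefox\n' | wc -c
32

The shell had read 27 bytes when it started Firefox, so it resumes at the 28th byte. In the 32-byte edited file, bytes 25 through 32 are 'firefox' plus its newline, which puts the 28th byte on the 'e' and hands the shell 'efox<newline>' as its next line.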
(It reports that this happened on line 4 because it knew it had already read three lines, so clearly this is line 4. As a corollary, you can't trust the line numbers that are printed when something like this happens.)
2012-01-22
My view of the purpose of object orientation
A while back I read Rise and Fall of Classic OOP. This caused me to realize that I am kind of a heathen as far as object oriented programming is concerned, probably because I came to explicit OO late and never actually learned how to do it the 'right way'. You see, to me object orientation is a technique for code organization and nothing more.
This gives me a very pragmatic view of when to write OO code and when not to; I use objects and classes where they make my code simpler, and I don't use them where they don't. I don't consider them something that has to be followed at all costs or the only way to model the real world (or any arbitrary artificial world). If the real world entities that you're working with aren't amenable to being wedged into an OO hierarchy, then don't wedge them in. Given the wide variety of code structures and of ways of organizing code so that it makes sense, it would be fairly absurd to say that OO is always the right answer; it is just one technique among many. Sometimes it's the right answer, sometimes not.
(Of course, some languages are so in love with OO that they don't give you a choice about it; you can't really have freestanding functions and data containers.)
I won't say that all of those OO examples that modeled the real world always struck me as a bit hokey and artificial, because honestly I never really thought that much about it (and any small example is hokey and artificial if you really look at it). But if people are switching towards my view of the purpose of OO, I'm all for it.
(I would be shocked if this was new and novel. I sure hope that lots of people have had this thought before me, because it just feels so obvious.)
2012-01-21
The C juggernaut illustrated
Perhaps it is tempting, looking back at history from the vantage point of today, to say that C succeeded so much because it was at the right place at the right time. The way this story goes, all sorts of people in the 1980s wanted a low-level programming language, C was around, and so they seized on it. Any similar language would have done; it's just that C was lucky enough to be the one that came out on top, partly because of network effects.
(This story is especially tempting to people who don't like C and Unix.)
This story significantly understates the real appeal of C at the time, even and especially to people who had alternative languages. A great illustration of this is C on the early Macintosh. You see, unlike environments such as MS-DOS (which had no language associated with them, just assembler), the early Macintosh systems already had a programming language; they were designed to be programmed in Pascal (and the Mac ROMs were originally written in Pascal before being converted to assembler).
This was more than just an issue of Apple's suggested language being Pascal instead of C. The entire Mac API was designed around Pascal calling conventions and various Pascal data structures; it really was a Pascal API. Programming a Mac in C basically involved swimming upstream against this API, constantly dealing with things like non-C strings (if I remember right, Mac ROM strings were a one-byte length followed by the string data, instead of being null-terminated). I believe that Mac C compilers had to introduce a special way of declaring that a C function used the Pascal calling convention so that it could be passed to the API as a callback function.
Despite all of this, C crushed Pascal to become by far the dominant programming language on the Macintosh. I don't think it even took all that long. Programmers didn't care that dealing with the API issues was a pain; working in C was worth it to them. It didn't matter that Pascal was the natural language to write Mac programs in or that it was a perfectly good language in its own right. C was enough better to displace Pascal in a hostile environment.
C did not win just because it was at the right place at the right time. C won in significant part because it was (and is) a genuinely good language for the job it does. As a result it was the language that a lot of pragmatic people picked if you gave them anything like a choice.