Why people are attracted to minimal language cores

October 11, 2011

In the last entry I described how some languages rewrite loops by turning the loop body into an anonymous function that the loop invokes repeatedly. I then mentioned that some languages go all the way to turning loops into tail-recursive function calls. You might wonder why languages do this and why these sorts of crazy transformations are considered attractive.
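(As a reminder of what these transformations look like, here is a rough sketch in Python. All of the function names are made up for illustration, and Python itself does not eliminate tail calls, so the last version only shows the shape of the result rather than something you would want to run on large inputs.)

```python
# Hypothetical illustration of the two rewrites; none of these names come
# from a real language implementation.

def sum_squares_loop(items):
    # The ordinary version, written with a built-in loop construct.
    total = 0
    for x in items:
        total += x * x
    return total


def for_each(items, body, state):
    # Step one: the loop body has become a function that the loop
    # invokes repeatedly, threading the loop state through each call.
    for x in items:
        state = body(state, x)
    return state


def sum_squares_closure(items):
    # The body is now an anonymous function.
    return for_each(items, lambda total, x: total + x * x, 0)


def sum_squares_tail(items, index=0, total=0):
    # Step two: no loop construct at all. Each iteration is a function
    # call in tail position, with the loop variables as arguments.
    if index == len(items):
        return total
    return sum_squares_tail(items, index + 1, total + items[index] * items[index])
```

All three versions compute the same thing; what changes is how much the core of the language has to know about loops.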

There are at least two reasons that these things are popular, for some definition of popular. First, the intellectual purity of a minimal core language appeals to a certain sort of language wonk; they tend to call the result simpler. These people are often drawn towards Lisp (especially Scheme, perhaps the canonical illustration of this philosophy in action).

(Lisp is not the only language family that has this sort of minimalism. For example, I think that Forth is just as minimal in its own way, although it gets far less language design attention.)

Second, it means that your code generator or interpreter core only needs to handle a minimal set of things because higher levels have transformed code written in the general version of the language down into this minimal form (often automatically). This has traditionally simplified optimizers; rather than implementing very similar analysis for each of a whole bunch of control flow constructs, they only have to analyze one really well. Which control flow construct you pick as your base depends on what language you're compiling; some languages pick goto, for example.
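To make this concrete, here is a minimal sketch of the idea in Python, assuming a toy language that I have invented for the purpose. The surface language has two loop forms (`while` and `do_while`); a lowering pass rewrites both into a core that only knows labels and jumps, and the interpreter only ever sees that core. The tuple shapes and the names `lower`, `run` and `jump_unless` are all hypothetical.

```python
import itertools

# Hypothetical toy language. Surface statements are tuples:
#   ("exec", fn), ("while", cond, body), ("do_while", body, cond)
# Core instructions are tuples:
#   ("exec", fn), ("label", name), ("jump", name), ("jump_unless", cond, name)

_labels = itertools.count()


def fresh(prefix):
    # Generate a unique label name for each lowered loop.
    return f"{prefix}_{next(_labels)}"


def lower(stmts):
    """Rewrite every surface loop form into labels and jumps."""
    out = []
    for stmt in stmts:
        kind = stmt[0]
        if kind == "exec":
            out.append(stmt)
        elif kind == "while":
            _, cond, body = stmt
            top, end = fresh("top"), fresh("end")
            out += [("label", top), ("jump_unless", cond, end)]
            out += lower(body)
            out += [("jump", top), ("label", end)]
        elif kind == "do_while":
            _, body, cond = stmt
            top, end = fresh("top"), fresh("end")
            out += [("label", top)]
            out += lower(body)
            out += [("jump_unless", cond, end), ("jump", top), ("label", end)]
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return out


def run(core, env):
    """Interpret the core, which has no loop constructs of its own."""
    targets = {ins[1]: i for i, ins in enumerate(core) if ins[0] == "label"}
    pc = 0
    while pc < len(core):
        ins = core[pc]
        if ins[0] == "exec":
            ins[1](env)
        elif ins[0] == "jump":
            pc = targets[ins[1]]
        elif ins[0] == "jump_unless" and not ins[1](env):
            pc = targets[ins[2]]
        pc += 1  # step past the current instruction (or the label we jumped to)
    return env


def bump(env):
    env["i"] += 1
    env["total"] += env["i"]


program = [
    ("while", lambda env: env["i"] < 3, [("exec", bump)]),
    ("do_while", [("exec", bump)], lambda env: env["i"] < 5),
]
core = lower(program)
result = run(core, {"i": 0, "total": 0})
```

Adding a third loop form to this toy language would only mean adding another case to `lower`; `run` (and any analysis written against the core) would not change at all, which is the whole attraction.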

(Then you can get a PhD thesis or two out of how to do this analysis.)

My understanding is that the pragmatic evidence is mixed on whether this is a good idea or not. There have certainly been some significant successes, but I have also heard stories of compilers where the frontend carefully reduced all control flow constructs down to the single fundamental control flow atom, passed the whole thing to the optimizer, and had the optimizer reverse engineer the high-level control flow again from the low-level minimized control flow information.

(The argument for still doing this even when it's inefficient is that this lets the optimizer (re)discover the true high-level control flow in your program, regardless of how you actually wrote the code. In a sense it's discovering what you actually meant (or at least, what you created) instead of just what you wrote.)
