Why there is a gulf between shells and scripting languages

January 2, 2011

Recently I saw a Stack Overflow question on why scripting languages aren't suitable as Unix shell languages. My answer is that shells are strongly optimized for a different use case than programming languages, and this has significant effects on the design and semantics of the languages that they use. Above all, shells are optimized for invoking external programs; a successful shell has ruthlessly pruned away everything that makes this awkward. Scripting languages, like other languages, are instead generally optimized for writing expressions, statements, and other internal language features.

The most visible result of this gulf is how shells and scripting languages treat unquoted words in their input. In a scripting language, unquoted words are generally identifiers (variables) and literal text must be quoted (I'm aware that Perl is a little different here); in a shell, unquoted words are literal text and identifiers must be called out explicitly. This makes perfect sense for both sides. In shells, most input is going to be command names and arguments for them (both literal text), and in scripting languages, most input is expressions and other statements using variables, functions, and so on. Each side has optimized its syntax to make its common case easy.
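
As a rough illustration of the quoting difference, here is a minimal sketch that assumes Python as the scripting language and a Unix system with an 'echo' command on the path. The variable names are made up for the example. What a shell writes as the bare line 'echo hello world' needs quotes, a list, and a function call on the scripting language side:

```python
import subprocess

# Shell version, where unquoted words are literal text:
#     echo hello world

# In Python a bare word is an identifier, so it must be defined first ...
greeting = "hello"

# ... and every piece of literal text (including the command name) is quoted.
result = subprocess.run(
    ["echo", greeting, "world"],  # argv as an explicit list of strings
    capture_output=True,          # collect stdout instead of printing it
    text=True,                    # decode bytes to str
    check=True,                   # raise if the command exits non-zero
)
output = result.stdout
print(output, end="")
```

The reverse holds too: in the shell, using the variable would need the explicit '$greeting' marker, which is the shell's way of calling out an identifier.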

Because they are focused on running commands, shells directly expose operators to manipulate the results of running commands (including a wide variety of dataflow operators, as noted in a response to the Stack Overflow question). The equivalent in programming languages is their rich vocabulary for writing expressions and accessing data. Since there are only so many special characters to go around, it's quite difficult to support both sorts of operators at once with convenient syntax.
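
To make the cost of missing dataflow operators concrete, here is a sketch of a two-command pipeline, again assuming Python and a Unix system that has 'printf' and 'sort' as external commands. The sample words are arbitrary. The single '|' character in the shell turns into explicit process and pipe management:

```python
import subprocess

# Shell version, using the pipe operator:
#     printf 'pear\napple\nfig\n' | sort

# Start the producer with its stdout connected to a pipe.
producer = subprocess.Popen(
    ["printf", r"pear\napple\nfig\n"],  # printf itself expands the \n escapes
    stdout=subprocess.PIPE,
)

# Start the consumer reading from the producer's pipe.
consumer = subprocess.Popen(
    ["sort"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
    text=True,
)

# Close our copy of the pipe so the producer sees SIGPIPE if sort exits early.
producer.stdout.close()

sorted_text, _ = consumer.communicate()
producer.wait()
print(sorted_text, end="")
```

None of this is hard, but it is a dozen lines of plumbing for what the shell expresses with one special character.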

(Bash tries, but notice that it has to use a special escape sequence to get into expression writing mode. Now, imagine writing a substantial program where every expression or assignment had to be written inside a '$(( ... ))' stanza; you'd be very angry with the designer of that language in short order.)

You can do better than current shells for shell scripts; I outlined some ideas for this back in "What makes a good Unix glue language". But I think that it is intrinsic in the gulf that a shell is going to be excessively verbose for writing programs and a scripting language is going to be excessively verbose for running commands. You can't get both at once.

Sidebar: more differences in practice

You also write different sorts of programs in the two sorts of languages. Regardless of the syntax involved (and some languages have nearly syntax-free function invocation), there is also a deep semantic difference between calling a function and running an external command; functions are far more integrated into the rest of the program than an external command can be. Even in a purely functional language, functions can take and return much richer data structures than an external command can. The result is that shell scripting is built around dataflow between external programs, and scripting languages are built around data structure manipulations in functions.
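
A small sketch of that semantic difference, assuming Python and using a made-up 'word_lengths' function: the in-process call hands back a dictionary directly, while the external command version can only receive strings as arguments and return text, so the structure has to be serialized and parsed at the boundary. JSON is just one possible encoding, picked here for convenience.

```python
import json
import subprocess
import sys

def word_lengths(words):
    """Hypothetical helper: map each word to its length."""
    return {word: len(word) for word in words}

# Calling a function: a list goes in and a dict comes straight back.
in_process = word_lengths(["shell", "script"])

# Running an external command: arguments are flattened to strings and the
# result comes back as text that we have to parse ourselves.
child_code = (
    "import json, sys; "
    "print(json.dumps({w: len(w) for w in sys.argv[1:]}))"
)
completed = subprocess.run(
    [sys.executable, "-c", child_code, "shell", "script"],
    capture_output=True,
    text=True,
    check=True,
)
via_command = json.loads(completed.stdout)

print(in_process)
print(via_command)
```

Both routes produce the same data, but only the function call keeps it as a data structure the whole way through.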

You can do both at once if you try hard, but you have to build bridges back and forth and my opinion is that it is not really a natural way to work.


Comments on this page:

From 178.24.2.251 at 2011-01-02 12:25:22:

Explain where TCLSH fits in.

TCL's primary data format is the string, mostly works unquoted IIRC, and in TCLSH you can easily just execute "ls<ENTER>" and similar shell commands. So it seems to combine executing external commands and internal programming constructs.

By cks at 2011-01-03 11:32:49:

My gut reaction to tclsh is that it is more of a shell scripting environment than it is a scripting language because TCL requires special syntax for writing expressions and using variables. Since people have written quite large programs in TCL anyways (one of which I use practically every day), I'm not sure what this says about my arguments.

Written on 02 January 2011.

Last modified: Sun Jan 2 02:12:25 2011