2012-02-20
My view of where TCL went wrong
Recently there was both an article and the resulting Hacker News discussion on the topic of where Tcl and Tk went wrong. As it happens, I have some vaguely heretical opinions on this.
I have two and a half backgrounds with TCL and TK. First, I've used and minorly hacked some decent sized TCL and TK applications for years, primarily exmh; for the half point, I've also written some trivial TCL/TK programs. Second, a very long time ago I embedded an early version of TCL into a program I was writing and tried to push as much functionality as possible into the TCL code. The latter experience especially left me with some strong opinions about TCL.
To put it bluntly, where TCL went wrong was in core language design, especially in two areas. To start with, TCL is simply not a very nice language. Serious TCL code is littered with noise and extra verbosity, where fundamental programming language operations all need additional work. Let's take a representative example:
set x [expr $i/8 + 8]
This requires two extra keywords and an extra set of []'s for what most
other languages would write as 'x = i/8 + 8'. The variable names
aren't even used consistently; we set x but use $i. Even Lisp
manages to be more compact and regular. TCL's semantics may be simple
and pure and easily written down in a short amount of text, but that
doesn't mean that it's a good language; simplicity by itself is not
magic.
(I half understand why TCL syntax is the way it is, but I don't think it's a good decision. One way to put it is that TCL is what you get if you half-heartedly try to create a Lisp, or compromise a Lisp in order to theoretically get greater acceptance from ordinary programmers.)
But that's the smaller problem. The larger one is that TCL adopted a terrible data model; it decided that everything should be a string. Anything that is not actually a string, such as lists or hashes, is at least nominally encoded in a string with special quoting conventions. In early versions of TCL this was quite literal and caused endless problems with quoting; my understanding is that current versions are somewhat less literal and thus more convenient. But even with the interpreter faking things to help you out this is a terribly limited data model, even worse than Perl's.
This quoting issue was the largest single problem when I tried to use the early version of TCL. It caused me serious heartburn and in the end significantly limited what I could easily do in the language itself; my final program did a lot more in C than I really had wanted it to because of this. The experience was sufficiently bad that I swore off TCL afterwards, despite its attraction as an embeddable language.
(At the time, in the early 1990s, it was your only real option for this.)
Sidebar: my view of TK
I don't really have enough experience with TK to have firm opinions on why it didn't catch on. However, my suspicion is that it was too strongly tied to TCL to work anywhere near as gracefully from other languages. Using TK from TCL makes great use of a number of TCL features in order to make code compact and convenient; my vague experience is that writing TK code in Python is not anywhere near as fluid and easy.
I also believe that TK was mostly or entirely documented only from the TCL API for a long time, which can't have helped. Even today I don't know if ordinary TK installations come with C-level API documentation.
2012-02-16
Handling modern IPv6 in programming environments
On modern Unix systems, there are four different IPv6 environments:
- Some systems do not have IPv6 enabled at all. These systems will
reject attempts to create IPv6 sockets and then your code can stop
worrying about the rest of this.
- Some systems have dual binding
permanently disabled (the example I know of is OpenBSD). These
systems will reject attempts to turn the
IPV6_V6ONLY socket option off.
- Some systems support dual binding but have it turned off by default.
- Some systems support dual binding but have it turned on by default. I believe that this is the default state of the Linux kernel.
(I believe that both Microsoft Windows and Mac OS X also fall into one of these four categories, but I don't know for sure. Also, notice the corollary of this split; if you intend to be portable to OpenBSD, it is stupid to claim that Linux systems with dual binding off are crazy and do not need to be supported.)
You can detect whether a system allows dual binding at all by seeing if
you can turn the IPV6_V6ONLY socket option off on an IPv6 socket. You
can detect the default state of dual binding by creating an IPv6 socket
and then trying to connect it to an IPv4 address (or vice versa). Note
that the inability to connect a default IPv6 socket to an IPv4 address
does not mean that dual binding is impossible; it just means that it
is not the default.
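(As a rough illustration, here is a minimal Python sketch of this detection. The function name `probe_dual_binding` and the strings it returns are my own invention for the example. It also swaps in a different technique for the second check: instead of trying to connect to an IPv4 address, it reads the initial value of IPV6_V6ONLY with `getsockopt`, which assumes the system reports its default honestly. It further assumes that Python exposes `socket.IPV6_V6ONLY` on your platform.)

```python
import socket

def probe_dual_binding():
    """Classify this host into one of the four IPv6 environments.

    Returns "no-ipv6", "dual-binding-unavailable",
    "off-by-default" or "on-by-default" (hypothetical labels).
    """
    if not socket.has_ipv6:
        return "no-ipv6"
    try:
        s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        # IPv6 sockets cannot be created at all.
        return "no-ipv6"
    with s:
        try:
            # Non-zero means a fresh socket is IPv6-only (dual binding off).
            starts_v6only = s.getsockopt(socket.IPPROTO_IPV6,
                                         socket.IPV6_V6ONLY)
            # Systems with dual binding permanently disabled refuse this.
            s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        except OSError:
            # Any failure to read or clear the option is treated as
            # "dual binding is not available here".
            return "dual-binding-unavailable"
    return "off-by-default" if starts_v6only else "on-by-default"
```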
(I believe that any system that defaults to dual binding on will let you
turn it off with IPV6_V6ONLY, i.e. that there are no systems that force
it to always be on. The cautious will want to write code that explicitly
checks this.)
The most reliable and portable way to use IPv6 is thus to always explicitly turn dual binding off on all of your IPv6 sockets. As covered in ModernSocketsListening, if you really do need to listen for both IPv4 and IPv6 connections you should use two separate sockets. Among other reasons, doing otherwise can uncover subtle bad assumptions in your code.
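(A minimal Python sketch of the two-socket approach might look like the following. The helper name `listen_both` is hypothetical, and quietly skipping an address family that is unavailable is my own choice for the example; real code may prefer to report the failure.)

```python
import socket

def listen_both(port, backlog=16):
    """Return listening TCP sockets: one IPv4 and one IPv6-only.

    An address family that cannot be used on this host is skipped.
    """
    socks = []
    for family, addr in ((socket.AF_INET, "0.0.0.0"),
                         (socket.AF_INET6, "::")):
        try:
            s = socket.socket(family, socket.SOCK_STREAM)
        except OSError:
            continue  # this address family is not available
        try:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            if family == socket.AF_INET6:
                # Explicitly turn dual binding off, whatever the default is.
                s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            s.bind((addr, port))
            s.listen(backlog)
        except OSError:
            s.close()
            continue
        socks.append(s)
    return socks
```

The caller then waits on all of the returned sockets at once (for example with the `selectors` module) and accepts from whichever one becomes ready. Because the IPv6 socket never sees IPv4 traffic, the code behaves the same way regardless of the system's dual binding default.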
(If your code never touches IPV6_V6ONLY itself and runs on a system
with dual binding off, it will behave equivalently to this. However, it
will not be portable; the same code running on a system with dual binding
on will behave quite differently.)
If you're writing some sort of general networking environment or support library where people can ask you 'create a generic connection to <X>', it is my strong opinion that you want to avoid dual binding as much as possible. If people specify the address family of either the source (including a general wildcard listening address) or the destination, you should use that address family, and this covers almost all situations. I will reluctantly concede that you can rationally create a dual bound IPv6 listening socket if you are asked to listen completely generically, because such a socket is the only truly generic 'listen to any TCP (or UDP)' option. I still think it's unwise and it's clearly not completely portable; the program will either fail or only listen for IPv6 on OpenBSD.
(Thus, if portability is a strong concern you should reject such attempts as impossible, since there is no single portable socket that will accept both IPv4 and IPv6 connections. You may wish to instead default to IPv4 as a matter of pragmatics.)
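(To make the 'use the address family people specified' rule concrete, here is a small Python sketch. The name `pick_family` and the convention that `None`, an empty string or `"*"` mean 'listen completely generically' are assumptions for the example, not anything an existing library does. Hostnames are deliberately left out; they would go through `getaddrinfo`, which reports a family for each result.)

```python
import ipaddress
import socket

def pick_family(host):
    """Return the address family implied by a literal IP address.

    Returns None when the caller gave no literal address, which is the
    only case where the generic-listening question comes up at all.
    """
    if host in (None, "", "*"):
        return None
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal address (e.g. a hostname); not decided here.
        return None
    return socket.AF_INET6 if ip.version == 6 else socket.AF_INET
```

With this, a wildcard such as `0.0.0.0` or `::` still selects a single family, so only a truly unspecified address leaves the choice to policy.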
If you write generic networking support code that works with IPv6
sockets and does not explicitly control IPV6_V6ONLY as much as
possible, you are giving your users enough rope to hang themselves.
They will get different results on different machines for both listening
sockets and directly connected sockets, depending on whether or not dual
binding is enabled on each machine. Specifically, their code will behave
one way on probably the great majority of Linux systems and another way
on all OpenBSD systems and a minority of Linux systems. Since one case
is rare, it is easy to miss the unportability.
(I have circled around this issue before in GoIpv6MyDesire but not laid things out this clearly, even in my own head. I wound up thinking about this again because the Go people have been overhauling their net package a lot recently, including making it compatible with OpenBSD, and their changes had me confused about whether I still needed to patch it myself and if so, how I should be patching it.)