2007-09-14
A thought on untyped languages
Recently, while I was thinking about the whole subject of how languages deal with types, it struck me that calling something an untyped language is usually something of a misnomer.
Most things that get called untyped languages, such as assembly code, actually have types; it's just that they are attached to operations instead of to values (and variables, and storage locations). Arguably assembly language has stronger typing for arithmetic operations than most high level languages, since there isn't any silent conversion between different types of numbers.
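As a rough illustration of types living in operations rather than values, here is a small Python sketch that imitates the assembly situation with the standard struct module. The four bytes have no type of their own; what you get depends entirely on which 'load' you apply to them. The particular bit pattern and the variable names are my own choices for the example, not anything from a real instruction set.

```python
import struct

# Four raw bytes with no type of their own, like a machine word in memory.
# The bit pattern is an arbitrary choice for this example.
raw = struct.pack("<I", 0xC0490FDB)

# The "type" lives in the operation used to read the bytes, much as an
# assembly programmer picks a signed, unsigned, or floating point
# instruction for the same register or memory location.
as_unsigned = struct.unpack("<I", raw)[0]  # unsigned 32-bit integer load
as_signed = struct.unpack("<i", raw)[0]    # signed 32-bit integer load
as_float = struct.unpack("<f", raw)[0]     # single-precision float load

print(as_unsigned, as_signed, as_float)
```

All three results come from the same storage; nothing stops you from applying the 'wrong' operation, but each operation has exactly one idea of what the bytes mean.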
Thinking about it, I suspect that languages moved to attaching type information to variables because you have to tell the system each variable's type so that it can automatically give you enough storage space for each of them. Once you do that, repeating the information by using a typed operation is just redundant, and programmers are good at getting rid of redundancy.
I believe that there are a few genuinely untyped languages, where all values have a single underlying representation and you can perform any operation on them, although some operations may not make much sense for some data values. I think that APL is close to this, and you could do an APL-like language where all values were really numeric arrays of some number of dimensions.
Another sort of untyped language is one where there are types but
everything gets wildly converted back and forth and if something doesn't
work out you just get a null value. I thought that awk might be
untyped this way, but it is possible to get it to complain about a type
mismatch under some circumstances.
2007-09-07
My view of what 'strongly typed' means
I was recently reading this slide from a presentation, which set me to thinking about the whole issue of what makes me consider something to be 'strongly typed'.
In high-level languages, I have a pragmatic definition: if 2 + "3"
succeeds, especially if it is 5, your language is weakly typed. Thus,
awk and Perl are weakly typed but Python and Ruby are strongly typed.
(I have to restrict this to high-level languages because C accepts 2 + "3", treating it as pointer arithmetic on the string, although it does not give you 5. If you are lucky, using the result of this particular case gives you a core dump.)
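Here is what the 2 + "3" test looks like in Python, as a minimal sketch. The add_two helper is just something I made up to capture the outcome instead of letting the exception end the program.

```python
def add_two(value):
    """Try 2 + value with no explicit conversion and report what happened."""
    try:
        return 2 + value
    except TypeError as exc:
        # Python refuses to guess how an int and a str should be combined.
        return f"TypeError: {exc}"


print(add_two(3))      # plain integer addition
print(add_two("3"))    # the weak typing test: Python reports a TypeError
print(2 + int("3"))    # an explicit conversion makes the intent clear
```

In awk or Perl the equivalent of the second call would quietly produce 5.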
More generally, I tend to think of strong typing as having two
attributes: type conversions among core types must be explicit, and they
are not guaranteed to be possible (they can fail for some types or some
values). Things like 2 + 3.5 are excused because I tend to lump all
forms of numbers together in my mind as one big abstract Numbers type;
it's acceptable for the language to implicitly convert back and forth
among the subtypes.
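Both attributes can be seen in Python, which I will use as a sketch here. Mixing int and float just works, but getting a number out of a string requires an explicit int() call, and that call can fail depending on the value. The try_convert helper is a hypothetical name used only to make the success and failure cases easy to show.

```python
def try_convert(text):
    """Attempt an explicit str -> int conversion and report the outcome."""
    try:
        return ("ok", int(text))
    except ValueError as exc:
        # The conversion exists, but it is not guaranteed to succeed.
        return ("failed", str(exc))


# Implicit conversion inside the big abstract Numbers type is allowed.
mixed = 2 + 3.5
print(mixed, type(mixed).__name__)

# Explicit conversion from a core non-number type: works for some values...
print(try_convert("3"))
# ...and fails for others.
print(try_convert("three"))
```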
(I have to restrict this to 'core types' because languages generally give user-written types some hooks to do automatic conversions, so in a nominally strongly typed language you can create what are effectively weakly typed types, things that will automatically 'convert' themselves to other types without any warnings.)
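To show how little it takes to do this, here is a sketch of such a type in Python. The Loose class is entirely hypothetical; it uses the __add__ and __radd__ hooks so that it silently turns itself into a number whenever it takes part in an addition.

```python
class Loose:
    """Hypothetical wrapper that quietly converts itself to a number."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        # Handles Loose("3") + 2.
        return int(self.text) + other

    def __radd__(self, other):
        # Handles 2 + Loose("3"): Python falls back to this hook after
        # int's own addition declines to handle a Loose operand.
        return other + int(self.text)


print(2 + Loose("3"))
print(Loose("3") + 2)
```

With this in place, 2 + Loose("3") succeeds without any warning, even though 2 + "3" is still an error.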
By contrast, weakly typed languages either perform implicit conversions or allow you to ultimately force a conversion to happen, and sometimes both. Perl does the former; C does the latter.
I suspect that this implies that strongly typed languages need either static typing or an exception system, since they must either prevent invalid conversions from ever being possible or have some way to signal a runtime error when one is attempted.