Is it time to not have platform-dependent integer types in languages?
I recently linked to
an article on (security) vulnerabilities that got introduced into
C code just by moving it from 32-bit to 64-bit platforms. There are
a number of factors that contributed to the security issues that
the article covers very well, but one of them certainly is that the
size of several C types varies from one platform to another. Code
that was perfectly okay on a platform with 32-bit int could blow
up on a platform with a 64-bit int. As an aside at the end of
my entry, I wondered
aloud if it was time to stop having platform-dependent integer types
in future languages, which sparked a discussion in the comments.
So, let me talk about what I mean here, using Go as an example. Go
has defined a set of integer types of specific sizes; they have
int8, int16, int32, and int64 (and unsigned variants), all
of which mean what you think they mean. Go doesn't explicitly specify
a number of platform-dependent issues around overflow and over-shifting
variables and so on, but at least if you use a uint16 you know
that you're getting exactly 16 bits of range, no more and no less,
and this is the same on every platform that Go supports.
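
As a small illustration of that guarantee, here is a sketch that reports the storage size of a few of Go's sized integer types. The helper name fixedSizes is made up for this example; unsafe.Sizeof and math.MaxUint16 are from the standard library. The numbers it reports should not change from one Go platform to another.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// fixedSizes is a hypothetical helper for this example. It returns the
// size in bytes of several of Go's explicitly sized integer types.
func fixedSizes() map[string]uintptr {
	return map[string]uintptr{
		"int8":   unsafe.Sizeof(int8(0)),
		"int16":  unsafe.Sizeof(int16(0)),
		"int32":  unsafe.Sizeof(int32(0)),
		"int64":  unsafe.Sizeof(int64(0)),
		"uint16": unsafe.Sizeof(uint16(0)),
	}
}

func main() {
	sizes := fixedSizes()
	// Iterate over a slice so the output order is deterministic.
	for _, name := range []string{"int8", "int16", "int32", "int64", "uint16"} {
		fmt.Printf("%-6s %d bytes\n", name, sizes[name])
	}
	// A uint16 covers exactly 0 through 65535.
	fmt.Println("uint16 max:", math.MaxUint16)
}
```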
A future hypothetical language without platform-dependent integer
types would have only types of this nature, where the bit size was
specified from the start and was the same on all supported platforms.
This doesn't mean that the language can't add more types over time;
for example, we might someday want to add an int128 type to the
current set. Such a language would not have a generic int type;
if it had something called int, it would be specified from the
start as, for example, a 32-bit integer that was functionally
equivalent to int32 (although not necessarily type-equivalent).
(As such a language evolves it might also want to deprecate some types because they're increasingly hard to support on current platforms. Even before such types are formally deprecated, they're likely to be informally avoided because of their bad performance implications; important code will be rewritten or translated to avoid them and so on. However, this may not be a good answer in practice, and even source-level rewrites can certainly open up security issues.)
The counterpoint is that this is going too far. There are a lot of practical uses for just 'a fast integer type', however large that happens to be on any particular platform, and on top of that most new languages should be memory-safe with things like bounds-checked arrays and automatic memory handling. Explicit integer sizes don't save you from assumptions like 'no one can ever allocate more than 4 GB of memory', either.
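
Here is a sketch of how a baked-in '4 GB is enough' assumption can bite even when every type has an explicit size. The scenario (a length stored in a 32-bit field) and the names truncateLength and storeLength are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// truncateLength models code that assumes a length always fits in 32
// bits. The conversion silently drops the high bits of larger values.
func truncateLength(n uint64) uint32 {
	return uint32(n)
}

// storeLength is the checked variant: it reports false instead of
// truncating when n does not fit in a uint32.
func storeLength(n uint64) (uint32, bool) {
	if n > math.MaxUint32 {
		return 0, false
	}
	return uint32(n), true
}

func main() {
	size := uint64(5) << 30 // 5 GiB

	fmt.Println(truncateLength(size)) // prints 1073741824 (1 GiB)

	_, ok := storeLength(size)
	fmt.Println(ok) // prints false
}
```

The sized type makes the limit visible in the source, but someone still has to notice it and write the check.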
(You might also make the case that the C feature that really enables these problems, and so the one to get rid of, is the complex tangle of implicit integer type conversion rules. Forcing explicit conversions all of the time helps make people more aware of the issues and also pushes people towards harmonizing types so they don't have to keep writing the code for those explicit conversions.)
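
Go is one language that already takes the explicit-conversion approach, so here is a minimal sketch of what it looks like in practice. The function total and its parameter names are made up for this example.

```go
package main

import "fmt"

// total mixes a 32-bit count with a 64-bit offset. Go rejects
// "count + offset" at compile time because the operand types differ,
// so the widening conversion must be written out.
func total(count int32, offset int64) int64 {
	return int64(count) + offset
}

func main() {
	var count int32 = 3
	var offset int64 = 1 << 40
	fmt.Println(total(count, offset)) // prints 1099511627779
}
```

If count had been declared as an int64 in the first place, no conversion would be needed at all, which is the sort of type harmonizing that explicit conversions tend to encourage.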