Is it time to not have platform-dependent integer types in languages?

November 20, 2016

I recently linked to an article on (security) vulnerabilities that got introduced into C code just by moving it from 32-bit to 64-bit platforms. There are a number of factors that contributed to the security issues that the article covers very well, but one of them certainly is that the size of several C types varies from one platform to another. Code that was perfectly okay on a platform with 32-bit int could blow up on a platform with a 64-bit int. As an aside at the end of my entry, I wondered aloud if it was time to stop having platform-dependent integer types in future languages, which sparked a discussion in the comments.

So, let me talk about what I mean here, using Go as an example. Go has defined a set of integer types of specific sizes; they have int8, int16, int32, and int64 (and unsigned variants), all of which mean what you think they mean. Go's specification also pins down several things that C leaves platform-dependent or undefined, such as what happens on integer overflow and on over-sized shifts. Beyond that, if you use a uint16 you know that you're getting exactly 16 bits of range, no more and no less, and this is the same on every platform that Go supports.
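To make the uint16 point concrete, here is a minimal sketch in Go. The helper names (incrementUint16, fixedWidths, plainIntBits) are made up for illustration. It also prints the width of Go's own plain uint, which is 32 or 64 bits depending on the platform and so is exactly the kind of type this entry is about.

```go
package main

import (
	"fmt"
	"math/bits"
	"unsafe"
)

// incrementUint16 adds one to a uint16. Since the type holds exactly
// 16 bits, 65535 + 1 comes out as 0 regardless of the platform.
func incrementUint16(v uint16) uint16 {
	return v + 1
}

// fixedWidths reports the width in bits of the four sized signed types.
// These numbers are the same everywhere Go runs.
func fixedWidths() [4]int {
	return [4]int{
		int(unsafe.Sizeof(int8(0))) * 8,
		int(unsafe.Sizeof(int16(0))) * 8,
		int(unsafe.Sizeof(int32(0))) * 8,
		int(unsafe.Sizeof(int64(0))) * 8,
	}
}

// plainIntBits reports the width of Go's platform-dependent uint
// (and int), which is either 32 or 64.
func plainIntBits() int {
	return bits.UintSize
}

func main() {
	fmt.Println(incrementUint16(65535)) // prints 0
	fmt.Println(fixedWidths())          // prints [8 16 32 64]
	fmt.Println(plainIntBits())         // 32 or 64, depending on the platform
}
```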

A future hypothetical language without platform-dependent integer types would have only types of this nature, where the bit size was specified from the start and was the same on all supported platforms. This doesn't mean that the language can't add more types over time; for example, we might someday want to add an int128 type to the current set. Such a language would not have a generic int type; if it had something called int, it would be specified from the start as, for example, a 32-bit integer that was functionally equivalent to int32 (although not necessarily type-equivalent).
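Go can approximate the 'functionally equivalent but not type-equivalent' idea with a defined type. In the sketch below, Int is a hypothetical name standing in for such a language's int, fixed at 32 bits; it is not something Go itself provides, and addMixed is likewise just an illustrative helper.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Int is a hypothetical fixed-size "int": it has the same 32-bit
// representation as int32 but is a distinct type to the compiler.
type Int int32

// addMixed adds an Int and an int32. Writing "a + b" directly is a
// compile-time error (mismatched types), so the conversion is explicit.
func addMixed(a Int, b int32) Int {
	return a + Int(b)
}

// sameSize reports whether Int and int32 occupy the same number of bytes.
func sameSize() bool {
	return unsafe.Sizeof(Int(0)) == unsafe.Sizeof(int32(0))
}

func main() {
	fmt.Println(addMixed(5, 7)) // prints 12
	fmt.Println(sameSize())     // prints true
}
```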

(As such a language evolves it might also want to deprecate some types because they're increasingly hard to support on current platforms. Even before such types are formally deprecated, they're likely to be informally avoided because of their bad performance implications; important code will be rewritten or translated to avoid them and so on. However, rewriting code this way may not be a good answer in practice, and certainly even source-level rewrites can open up security issues.)

The counterpoint is that this is going too far. There are a lot of practical uses for just 'a fast integer type', however large that happens to be on any particular platform, and on top of that most new languages should be memory-safe, with things like bounds-checked arrays and automatic memory handling. Explicit integer sizes don't save you from assumptions like 'no one can ever allocate more than 4 GB of memory', either.
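As an illustration of that last point, here is a small Go sketch where every size is explicit and the '4 GB' assumption still bites. The functions totalSize32 and totalSizeChecked are invented for this example; the first bakes the assumption into a uint32 total, the second accumulates in a uint64 and reports whether the result would still have fit.

```go
package main

import (
	"fmt"
	"math"
)

// totalSize32 sums chunk lengths into a uint32. Nothing here is
// platform-dependent, but the total silently wraps once it passes 4 GiB.
func totalSize32(chunks []uint32) uint32 {
	var total uint32
	for _, c := range chunks {
		total += c
	}
	return total
}

// totalSizeChecked sums into a uint64 and also reports whether the
// total still fits in 32 bits.
func totalSizeChecked(chunks []uint32) (uint64, bool) {
	var total uint64
	for _, c := range chunks {
		total += uint64(c)
	}
	return total, total <= math.MaxUint32
}

func main() {
	// A 3 GiB chunk plus a 2 GiB chunk: 5 GiB in total.
	chunks := []uint32{3 << 30, 2 << 30}

	fmt.Println(totalSize32(chunks)) // prints 1073741824 (1 GiB, after wrapping)

	total, fits := totalSizeChecked(chunks)
	fmt.Println(total, fits) // prints 5368709120 false
}
```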

(You might also make the case that the real enabler of these problems in C, and so the thing to get rid of, is its complex tangle of implicit integer type conversion rules. Forcing explicit conversions all of the time helps make people more aware of the issues and also pushes people towards harmonizing types so they don't have to keep writing the code for those explicit conversions.)
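Go already requires explicit conversions between its integer types, so it can serve as a rough sketch of what this looks like. An explicit conversion by itself still truncates silently, which is why the example adds a checked helper; toInt32 and truncate are hypothetical names for this illustration, not standard library functions.

```go
package main

import (
	"fmt"
	"math"
)

// truncate performs a plain explicit conversion. It is visible in the
// source, but it still quietly keeps only the low 32 bits.
func truncate(v int64) int32 {
	return int32(v)
}

// toInt32 converts an int64 to an int32 and returns an error instead
// of truncating when the value is out of range.
func toInt32(v int64) (int32, error) {
	if v < math.MinInt32 || v > math.MaxInt32 {
		return 0, fmt.Errorf("value %d does not fit in int32", v)
	}
	return int32(v), nil
}

func main() {
	var big int64 = 1<<32 + 5

	fmt.Println(truncate(big)) // prints 5

	if _, err := toInt32(big); err != nil {
		fmt.Println("error:", err)
	}

	v, _ := toInt32(42)
	fmt.Println(v) // prints 42
}
```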


Comments on this page:

You wondered about Rust in the earlier post. My understanding is there is a "usize" (and isize) type, but the idea is you're only supposed to use it for sizes. (There's a debate you can read about settling on these names instead of "int".) If you want a "fast integer type", you're supposed to think a bit and then probably write i32.

I don't know how well this works in practice. Just that you can still find some complaints.

Written on 20 November 2016.

Last modified: Sun Nov 20 01:25:03 2016