One impact of the dropping of Python 2 from Linux distributions

March 3, 2020

Due to uncertainty over the future of the Python 2 interpreter in future Linux distributions, I've been looking at some of our Python 2 code, especially the larger programs. This caused me to express some views over on Twitter, which came out long enough that I'm recycling them here with additional commentary:

Everyone's insistence on getting rid of Python 2 is magically transforming all of this perfectly functional and useful Python 2 code we have from an asset to a liability. You can imagine how I feel about that.

Functioning code that you don't have to maintain and that just works is an asset; it sits there, doing a valuable job, and requires no work. Code that you have to do significant work on just so that it doesn't break (not to add any features) is a liability; you have to do work and inject risk and you get nothing for it.

Some code is straightforward to lift to Python 3 because it doesn't do anything complicated. Some code is not like that:

Today's 'what am I going to do about this' Python 2 code is my client implementation of the Sendmail milter protocol, which is all about manipulating strings as binary over a network connection. I guess I shotgun b"..." and then start guessing.

My milter implementation has been completely stable since written in Python 2 in 2011. Now I have to destabilize it because people are taking Python 2 away.

(I do not have tests. Tests would require another milter implementation that was known to be correct.)

(What I meant by the end of the first tweet is making various strings into bytestrings, especially protocol literals, and trying to push that through the protocol handling.)

As a side note, testing protocol implementations is hard when you don't have some sort of reference version that you can embed in your tests, even if you implement both the client and the server side. Talking to yourself doesn't insure that you haven't made some mistake, either in the initial implementation or in a translation into the Python 3 world of bytestrings and Unicode strings and trying to handle network IO in that world and so on.

(For instance, since UTF-8 can encode every codepoint you can put into a Unicode string, including control characters and so on, you could write an encoder and decoder that actually operated on Unicode strings without you realizing, then have Python 3's magic string handling convert them to UTF-8 over the wire as you sent them back and forth between yourself during tests. Your implementation would talk to itself, but not to any outside version that did not UTF-8 encode what were supposed to be raw bytes. You could even pass tests against golden pre-encoded protocol messages if they were embedded in your Python test code and you forgot that you needed to turn them into bytestrings.)

I also had an opinion on the idea that we've known this for a while and it's just a cost of using Python:

Python 2 is only legacy through fiat (multiple fiats, both the main CPython developers and then OS distributions). Otherwise it is perfectly functional and almost certainly completely secure, and would keep running fine for a great deal longer.

Just because software is not being updated doesn't mean that it stops working. If people would leave Python 2 alone (and keep it available in Linux distributions as a low-support or unsupported package, like so many others), it would likely keep going on fine for years, but because they won't, our Python 2 code is steadily being converted from an asset to a liability. Of course, part of the fun is that we don't even know for sure if people will be getting rid of the Python 2 interpreter itself, much less a timetable for it.

(Maybe the current statements from Debian and Ubuntu are supposed to answer that question, but if so they're not clear to me and they certainly don't give a timeline for when the Python 2 interpreter itself will be gone.)

PS: All of this is completely separate from the virtues of Python 3 for new code, where I default to it over some other options in our environment.

Written on 03 March 2020.
« More or less what versions of Go support what OpenBSD releases (as of March 2020)
Unix's iowait% is a narrow and limited measure that can be misleading »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Tue Mar 3 21:36:15 2020
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.