Being realistic about what we're going to do with our Django app
One of our biggest problem points for moving away from Python 2 is our Django app, which handles all of the workflow when people request new accounts. Back last August I wrote about how it needed tests, and then in February I wrote about that again, and now it is almost July and guess what, our app still has no tests. There is a pattern here, and given that pattern I think it's time for me to get realistic about what we're going to do with our app in the next few years and how that's going to work. Being realistic doesn't leave me with pleasant answers, but at least I can try to be honest with myself for once instead of pretending.
(The problem with pretending is that I wind up not preparing for what actually happens.)
Our app is currently running on an Ubuntu 18.04 machine under Python 2 and mod_wsgi. This combination can keep running until early 2023 and we're going to do that unless there is a critical reason not to do so. By mid 2022 we should know whether or not Ubuntu 22.04 LTS will allow us to keep on running the Python 2 version with mod_wsgi; if it can, we will quite likely continue on with that until early 2027 makes this issue something we can't ignore any more. At this point, keeping the app on Python 2 until Ubuntu 18.04 support runs out is basic realism; it seems pretty unlikely that I will get around to porting the app to Python 3 in the remaining six months or so of 2019.
(We could probably switch to CentOS 8 for even longer support of Python 2, but this particular app is not worth going to that much effort and annoyance.)
At this point everyone notes that the last version of Django that supports Python 2 is 1.11, and support for 1.11 runs out at the end of this year. This is a good argument in theory, but in practice we are already running on an unsupported Django version, as we are back at Django 1.10.7 at the moment (as we have been since 2017, because Django updates are a pain at the best of times). Running an unsupported version of Django is nothing new for us; instead, it's unfortunately become the default state of affairs. For various hand-waving reasons I want to try to update the application to Django 1.11 at some point, which hopefully won't be too much work. Possibly this means that we should switch to using the Ubuntu 18.04 packaged version of Django 1.11, even though I didn't think that was a good idea last November. If we're going to run an unsupported Django, it might as well be a version that someone might be keeping an eye on.
Does this present a security risk? Somewhat, but my view is that it's a relatively low one. Almost all of the web app is locked away behind Apache's HTTP basic authentication and restricted to a small number of trusted users only (and the Django admin interface is even more restricted). The exposed app surface is relatively small and relatively simple; we have a couple of basic forms and that's it (and one endpoint for AJAX that gives a yes/no answer to whether or not something is an available Unix login). Also, nothing permanent is done automatically by the app; a human is always in the loop before an account is actually created.
(It's possible that a Django vulnerability could be leveraged to attack other web things through our app, through CSRF or the like. But that would be a pretty targeted attack against the department by someone who would have to know a fair bit about how the app works, who uses it, and what else they interact with that can be attacked. Obviously the catastrophic scenario would be a remote code execution flaw that could be exploited through a basic URL view or form submission, but that seems unlikely.)
Wanting to write Django tests doesn't seem to have done much good, so my alternate plan for a Python 3 port is simply to try running our web app under Python 3, probably with Django 1.11 to keep things simple. If and when I find code that should be modernized anyway or changes that still keep things compatible with Python 2, I can make them in the production codebase to make it more and more ready for Python 3. My hope is that a great deal of this can be done with clean changes that do not have to be conditional on Python 2 versus Python 3 but are simply good ideas in general. I also hope that the simplicity of our application, combined with Django handling a lot of stuff for us behind the scenes, will mean that running it under Python 3 mostly just works. We won't have the assurance that tests would give us, but in practice I can manually exercise things and declare the result good enough.
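To illustrate the kind of 'clean change' I have in mind, here is a small sketch of code written so that it behaves the same under Python 2 and Python 3 without any version checks. The function names are made up for illustration and are not from our actual app.

```python
def summarize(counts):
    """Format a dict of counts as 'name=count' pairs in sorted order."""
    # dict.items() exists on both versions; dict.iteritems() is Python 2 only.
    return ", ".join("{0}={1}".format(k, v) for k, v in sorted(counts.items()))


def parse_count(text, default=0):
    """Parse an integer, falling back to a default on bad input."""
    try:
        return int(text)
    except (TypeError, ValueError):
        # The parenthesized tuple form is valid on both versions, unlike
        # the old Python 2 only 'except ValueError, e' spelling.
        return default


def pages(total, per_page):
    """Number of pages needed to show 'total' items."""
    # '//' is explicit floor division on both versions; plain '/' changed
    # meaning for integers in Python 3.
    return (total + per_page - 1) // per_page
```

None of these changes need to know which Python they are running under, which is the property I care about.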
One big issue for Python 3 code is character set conversion and especially points where Python 3's automatic conversions can fail on you. For this, we're going to punt. I'm not going to try to harden the application to deal with character set decoding problems with the few data files that it reads; in our environment we can guarantee that they're always ASCII and so will always decode correctly. Similarly, we're always going to encode to the system default of UTF-8 when writing out files, which means that it too always works. Hopefully this means that I can ignore almost all of those issues in the Python 3 version of the app, which is what the Python 2 version is already doing.
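As a minimal sketch of this punt (with hypothetical helper names, not our real code), reading could insist on strict ASCII and writing could always produce UTF-8. One difference from what I described above is that this sketch names the encodings explicitly instead of relying on the system default; that is my assumption about what would be slightly safer, not something the current app does.

```python
import io


def read_ascii_lines(path):
    """Read a data file that we assume is pure ASCII."""
    # With strict ASCII decoding, any non-ASCII byte raises
    # UnicodeDecodeError instead of being silently mangled.
    with io.open(path, "r", encoding="ascii") as f:
        return [line.rstrip("\n") for line in f]


def write_utf8(path, text):
    """Write text out as UTF-8; encoding to UTF-8 cannot fail for normal text."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

If a data file ever does contain non-ASCII bytes, the read fails loudly with an exception, which in our environment is an acceptable outcome.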
(There are some places where I will want to require ASCII, but they're already points where I should be doing that, like the Unix login name that people choose, and so I should add these checks to the current version of the application.)
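A check like this could be as simple as the following sketch. The specific rule here (a lowercase ASCII letter followed by up to 31 lowercase letters, digits, underscores, or dashes) is an assumed policy for illustration, not necessarily the one our app will use.

```python
import re

# Assumed policy: ASCII only, lowercase letter first, at most 32 characters.
# \Z anchors at the very end so a trailing newline is rejected too.
LOGIN_RE = re.compile(r"[a-z][a-z0-9_-]{0,31}\Z")


def is_valid_login(name):
    """Return True if name is an acceptable (ASCII-only) Unix login."""
    return LOGIN_RE.match(name) is not None
```

Because the character classes list only ASCII ranges, a name such as 'josé' is rejected in both Python 2 and Python 3.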
This will probably leave the Python 3 version of the application vulnerable to throwing exceptions if people put weird characters in forms or do other things, but if that happens we actually don't care too much. The app is not used much (people don't request accounts all that often), and it's not too critical an issue if the app's not working for a few days while we fix the code to be more defensive or de-mangle things from its tiny little database.
(The app's database is so small that if we have to, we can dump it to plain text, edit the plain text, and recreate a new db from that. It is, naturally, a SQLite database.)
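The dump, edit, and recreate cycle can be sketched with the standard library's sqlite3 module. The function names and file paths are hypothetical; the sqlite3 command line tool's .dump command would do the same job.

```python
import io
import sqlite3


def dump_to_text(db_path, sql_path):
    """Write the whole database out as editable SQL statements."""
    conn = sqlite3.connect(db_path)
    try:
        with io.open(sql_path, "w", encoding="utf-8") as out:
            for statement in conn.iterdump():
                out.write(statement + u"\n")
    finally:
        conn.close()


def rebuild_from_text(sql_path, new_db_path):
    """Create a fresh database from a (possibly hand-edited) SQL dump.

    new_db_path should not already exist, because the dump contains
    CREATE TABLE statements that would otherwise fail.
    """
    with io.open(sql_path, encoding="utf-8") as src:
        script = src.read()
    conn = sqlite3.connect(new_db_path)
    try:
        conn.executescript(script)
        conn.commit()
    finally:
        conn.close()
```

The editing step in between is just a text editor on the dump file.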
All of this is setting a relatively low quality standard for the eventual Python 3 version, but at this point that's realism. The app is neither a high enough priority nor interesting enough for us to do it any better, not unless I suddenly get a vast gulf of free time with nothing else to work on.
PS: Facing up to reality here has also made me realize some things about Django and us, but that's for another entry.
The convenience (for me) of people writing commands in Python
The other day I was exploring Certbot, which is more or less the standard and 'as official as it ever gets' client for Let's Encrypt, and it did something that I objected to. Certbot is a very big program with a great many commands, modes, options, settings, and so on, and this was the kind of thing where I wasn't completely confident there even was a way to disable it. However, sometimes I'm a system programmer and the particular thing had printed a distinctive message. So, off to the source code I went with grep (okay, ripgrep), to find the message string and work backward from there.
Conveniently, Certbot is written in Python, which has two advantages here. The first advantage is that I actually know Python, which makes it easier to follow any logic I need to follow. The second is that Python programs intrinsically come with their source code, just as the standard library does. Certbot is open source and I was installing Ubuntu's official package for it, which gave me at least two ways of getting the source code, but there's nothing like not even having to go to the effort.
(And then there's WebAssembly.)
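Because the source is normally sitting on the filesystem, the 'find where it lives and search for the message' step can even be done from Python itself. This is only a sketch with made-up function names; in practice grep or ripgrep on the package directory is simpler.

```python
import importlib.util
import os


def package_source_dir(name):
    """Return the directory holding a module's source file, or None.

    Built-in and frozen modules have no real file behind spec.origin,
    so they yield None here, as do modules that cannot be found.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.origin or not os.path.isfile(spec.origin):
        return None
    return os.path.dirname(spec.origin)


def find_message(root, needle):
    """Yield (path, line number, line) for each .py line containing needle."""
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if not fname.endswith(".py"):
                continue
            path = os.path.join(dirpath, fname)
            # errors="replace" keeps the search going past odd bytes.
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, 1):
                    if needle in line:
                        yield path, lineno, line.rstrip()
```

For example, `list(find_message(package_source_dir("certbot"), "some distinctive message"))` would list every place the string appears, assuming the certbot package is importable from the Python you run this under.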
Another cultural aspect of this is that a lot of commands written in Python are written in relatively straightforward ways that are easy to follow; you can usually grep through the code for what function something is in, then what calls that function, and so on and so forth. This is not a given and it's quite possible to create hard to follow tangles of magic (I've sort of done this in the past) or a tower of classes inside classes that are called through hard to follow patterns of delegation, object instantiation, and so on. But it's at least unusual, especially in relatively straightforward commands and in code bases that aren't large.
PS: Certbot is on the edge of 'large' here, but for what I was looking for it was still functions calling functions.
PPS: That installing a Python thing gives you a bunch of .py files on your filesystem is not a completely sure thing. I believe that there are Python package and module distribution formats that don't unpack the .py files but leave them all bundled up, although the current Wheel format is apparently purely for distribution, not running in place.
I am out of touch with the state of Python package distribution, so I don't know how this goes if you install things yourself.