2021-11-28
The problem I have with Pip's dependency version handling
Python's Pip package manager has
a system where main programs and packages can specify the general
versions of dependencies that they want. When you install a program
through pip (either directly into a virtual environment or with a
convenient tool like pipx), pip resolves the
general version specifications to specific versions of the packages
and installs them too. Like many language package managers, pip
follows what I'll call a maximal version selection algorithm; it
chooses the highest currently available version of each dependency
that satisfies all constraints. Unfortunately I have come to feel
that this is a bad choice, at least for programs, for two reasons.
One of the reasons is general and one of them is specific to pip's
current capabilities and tooling.
The general reason is that it makes the installed set of dependencies not automatically reproducible. If I install the Python LSP server today and you install it a week from now, we may well not wind up with the same set of versions of everything even if the Python LSP server project hasn't released a new version. All it takes is for a direct or indirect dependency to release, in the intervening week, a new version that's compatible with the version restrictions. Your pip install will pick up that new version, following pip's maximal version selection.
This is theoretically great, since you're getting the latest and thus best versions of everything. It is not necessarily practically great, since as we've all experienced, sometimes the very latest versions of things are not in fact the best versions, or at least the best versions in the context you're using them. If nothing else, you're getting a different setup than I am, which may wind up with confusing differences in behavior.
(For instance, your Python LSP server environment might have a new useful capability that mine doesn't. You'll tell me 'just do <X>', and I'll say 'what?'.)
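As a rough illustration of how this happens, here is a minimal sketch of maximal version selection. It is a toy model and not pip's actual resolver; the release lists and the pick_maximal helper are made up for the example, and versions are simplified to tuples of integers with a single allowed range.

```python
# Toy model of maximal version selection; not pip's real resolver.
# Versions are simplified to tuples of ints, and a constraint is a
# half-open range [minimum, below), roughly like '>=1.0,<2.0'.

def pick_maximal(available, minimum, below):
    """Return the highest available version within [minimum, below)."""
    candidates = [v for v in available if minimum <= v < below]
    if not candidates:
        raise LookupError("no version satisfies the constraint")
    return max(candidates)

# Hypothetical releases of one dependency on the day I install.
releases_today = [(1, 0, 0), (1, 1, 0), (1, 2, 0)]
# A week later, a new compatible release and an incompatible 2.0 exist.
releases_next_week = releases_today + [(1, 3, 0), (2, 0, 0)]

# Same constraint, different days, different results.
mine = pick_maximal(releases_today, (1, 0, 0), (2, 0, 0))
yours = pick_maximal(releases_next_week, (1, 0, 0), (2, 0, 0))
print(mine, yours)
```

Nothing about the program or its stated constraints changed between the two calls; only the set of available releases did.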
The specific reason is that once I have pip install my version of
something, pip doesn't really seem to provide a good way to update
it to the versions of everything I'd get if I reinstalled today.
If it did, it would at least be easy for me and you to get the same
versions of everything in our installs of the Python LSP server,
which would let us get rid of problems (or at least let me see your
problems, if more recent package versions have new problems). Pip
has some features to try to do this, but in
practice they don't seem to work very well for me. I'm left to do
manual inspection with 'pip list --outdated', manual upgrades of
things with 'pip install --upgrade', and then use of 'pip check'
afterward to make sure that I haven't screwed up and upgraded
something too far.
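The inspection step can be scripted a little. What follows is a minimal sketch that assumes pip's JSON output format for 'pip list --outdated' (entries with "name", "version", and "latest_version" fields); the function names and the sample data are made up for illustration.

```python
import json
import subprocess
import sys

def parse_outdated(json_text):
    """Turn 'pip list --outdated --format=json' output into a sorted
    list of (name, installed_version, latest_version) tuples."""
    entries = json.loads(json_text)
    return sorted(
        (e["name"], e["version"], e["latest_version"]) for e in entries
    )

def outdated_in_this_environment():
    """Ask the pip that belongs to the running interpreter."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        check=True, capture_output=True, text=True,
    )
    return parse_outdated(result.stdout)

# Made-up sample output, for illustration only.
sample = (
    '[{"name": "examplepkg", "version": "1.2.0",'
    ' "latest_version": "1.3.0", "latest_filetype": "wheel"}]'
)
for name, installed, latest in parse_outdated(sample):
    print(f"{name}: {installed} -> {latest}")
```

This only reports what is behind the newest release. It says nothing about whether the newest release still fits the version restrictions of everything else in the environment, which is why 'pip check' is still needed after upgrading.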
Pip is not going to change its general approach of maximal version selection (I think only Go has been willing to go that far). But I hope that someday either pip or additional tools have a good way to bring existing installs up to what they would be if reinstalled today.
(Pipx has its 'reinstall' option, but that's a blunt hammer and I'm not sure it works in all cases. I suppose I should try it someday on my Python LSP server installation, which has various additional optional packages installed too.)
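One low-tech way to see how far an existing install has drifted from what a reinstall would give is to compare 'pip freeze' output from the existing environment against that of a throwaway fresh install. This is a minimal sketch under that assumption; the helper names and the package names in the sample data are hypothetical.

```python
def parse_freeze(text):
    """Map package name -> pinned version from 'name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, options such as '-e ...', and anything
        # that is not a plain 'name==version' pin.
        if not line or line.startswith(("#", "-")) or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins

def freeze_differences(old_text, new_text):
    """Return {name: (old_version, new_version)} for every package
    whose pin differs; None means the package is absent on that side."""
    old, new = parse_freeze(old_text), parse_freeze(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }

# Made-up 'pip freeze' output from an old install and a fresh one.
existing = "examplelib==1.2.0\nhelperlib==0.9.1\n"
fresh = "examplelib==1.3.0\nhelperlib==0.9.1\nnewdep==0.1.0\n"
differences = freeze_differences(existing, fresh)
print(differences)
```

The result only shows what differs. Actually moving the existing install to the new versions is still a manual step.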
2021-11-05
If we use PyPy, we'll likely use our own install of it
In the past, I've said that one of our options for continuing to run Python 2 programs after Linux distributions stop packaging Python 2 at all is PyPy. As part of thinking about this, I've found that PyPy starts fast enough for our Python 2 commands and I've surveyed Linux distributions to see what versions of PyPy they packaged (in late 2020). An implicit subtext of the latter exercise, as with my periodic surveys of CPython versions on our Linux distribution, is that I was assuming we would use the distribution's packaged version of PyPy. I've now changed my mind about that.
Using a packaged version of CPython 2 is clearly the right answer.
Building your own version has various hassles and problems, and you
generally want at least /usr/bin/python2 to work and your
distribution owns that part of the namespace. If the distribution
doesn't have its own version any more, you're at least going to
want to make your own package for it and install it through your
distribution's package management system. Well, you don't have to,
especially if you build it to /usr/local/python2.7 or /opt/python2.7
or the like, but it's probably better.
As I've found out, PyPy doesn't have to be handled like that. PyPy binary distributions are easy to install by hand because they unpack to simple, self contained directory trees that can be put anywhere and just work. Given that PyPy distributes a single package for all Linux distribution versions, a single binary version can even work across all of our various Ubuntu versions. Since Python 2.7 is frozen, this is probably not deeply important, but it's still nice to know that all of our machines would be using the same, up to date PyPy version. Since most of our Python programs already run from our central administrative filesystem, the logical place to put PyPy itself is on that filesystem.
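With a hand-installed PyPy sitting alongside the distribution's CPython, it can be useful for a program to report which interpreter it is actually running under. A minimal sketch using only the standard library follows; the interpreter_summary name is made up, and the code is written to work on both Python 2.7 and Python 3.

```python
import platform
import sys

def interpreter_summary():
    """Describe the running interpreter, e.g. 'PyPy' or 'CPython',
    the Python language version it implements, and its binary's path."""
    return "%s %s at %s" % (
        platform.python_implementation(),
        platform.python_version(),
        sys.executable,
    )

print(interpreter_summary())
```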
Installing our own version of PyPy is a change from how we usually do things, but it's not very difficult and it avoids having to wrestle with both outdated versions of PyPy and different versions of PyPy (potentially with different behavior) on different versions of Ubuntu. Since using one, coherent, up to date version of PyPy is so simple, we might as well.
Despite how easy it would be to install, we will probably not switch to PyPy unless we're forced to by a lack of distribution CPython 2 packages, and probably also by problems building our own. Continuing on with CPython 2 is currently the easy and safe way, and although some of our administrative programs run faster in PyPy, this isn't a particularly compelling advantage for programs that only run for a second or two anyway.
PS: Our remaining Python 2 administrative programs don't use OpenSSL, so we wouldn't be affected if there was an OpenSSL issue that required PyPy to update their bundled OpenSSL (either a security issue or, say, a TLS certificate chain handling issue). You may not be so lucky.