Wandering Thoughts archives

2021-11-28

The problem I have with Pip's dependency version handling

Python's Pip package manager has a system where main programs and packages can specify the general versions of dependencies that they want. When you install a program through pip (either directly into a virtual environment or with a convenient tool like pipx), pip resolves the general version specifications to specific versions of the packages and installs them too. Like many language package managers, pip follows what I'll call a maximal version selection algorithm; for each dependency, it chooses the highest currently available version that satisfies all of the constraints. Unfortunately I have come to feel that this is a bad choice, at least for programs, for two reasons. One of the reasons is general and one of them is specific to pip's current capabilities and tooling.

The general reason is that it makes the installed set of dependencies not automatically reproducible. If I install the Python LSP server today and you install it a week from now, we may well not wind up with the same versions of everything, even if the Python LSP server project hasn't released a new version. All it takes is for a direct or indirect dependency to release, in the intervening week, a new version that's compatible with the version restrictions. Your pip install will pick up that new version, following pip's maximal version selection.
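To make this concrete, here is a minimal sketch of maximal version selection as a toy model. It is not pip's actual resolver; the `pick_highest` helper, the version tuples, and the '>=2.0,<3.0' constraint are all made up for illustration.

```python
# Toy model of maximal version selection; NOT pip's real resolver.
# Versions are plain tuples so that they compare in the obvious way.

def pick_highest(available, minimum, below):
    """Return the highest version v with minimum <= v < below, or None."""
    candidates = [v for v in available if minimum <= v < below]
    return max(candidates, default=None)

# A hypothetical dependency that some program constrains to '>=2.0,<3.0'.
index_today = [(1, 9, 0), (2, 0, 0), (2, 1, 0)]
# A week later the dependency has made two new releases.
index_next_week = index_today + [(2, 2, 0), (3, 0, 0)]

mine = pick_highest(index_today, (2, 0, 0), (3, 0, 0))
yours = pick_highest(index_next_week, (2, 0, 0), (3, 0, 0))
print("my install:  ", mine)
print("your install:", yours)
```

The program's own constraints are identical in both runs; the only thing that changed is what was available when each of us ran the install.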

This is theoretically great, since you're getting the latest and thus best versions of everything. It is not necessarily practically great, since as we've all experienced, sometimes the very latest versions of things are not in fact the best versions, or at least the best versions in the context you're using them. If nothing else, you're getting a different setup than I am, which may wind up with confusing differences in behavior.

(For instance, your Python LSP server environment might have a new useful capability that mine doesn't. You'll tell me 'just do <X>', and I'll say 'what?'.)
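If we each saved the output of 'pip freeze' from our installs, a short script could at least show where the two environments have drifted apart. This is only a sketch; the helper names and the sample package names are hypothetical, and it only understands plain 'name==version' lines, skipping anything else (such as comments or URL-based entries).

```python
def parse_freeze(text):
    """Map lower-cased package name -> version from 'name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that isn't a simple pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins

def diff_installs(mine_text, yours_text):
    """Return {name: (my_version, your_version)} for packages that differ."""
    mine, yours = parse_freeze(mine_text), parse_freeze(yours_text)
    return {
        name: (mine.get(name), yours.get(name))
        for name in sorted(set(mine) | set(yours))
        if mine.get(name) != yours.get(name)
    }

# Made-up sample data standing in for two saved 'pip freeze' outputs.
my_freeze = "examplepkg==1.0\nhelperlib==2.0\n"
your_freeze = "examplepkg==1.0\nhelperlib==2.1\nextralib==0.3\n"

for name, (old, new) in diff_installs(my_freeze, your_freeze).items():
    print(f"{name}: mine={old} yours={new}")
```

A version of None on one side means that the package is only present in the other install.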

The specific reason is that once I have had pip install my version of something, pip doesn't really seem to provide a good way to update it to the versions of everything I'd get if I reinstalled today. If it did, it would at least be easy for me and you to get the same versions of everything in our installs of the Python LSP server, which would let us get rid of these problems (or at least let me see your problems, if more recent package versions have new problems). Pip has some features to try to do this, but in practice they don't seem to work very well for me. I'm left to do manual inspection with 'pip list --outdated', manual upgrades of things with 'pip install --upgrade', and then use of 'pip check' afterward to make sure that I haven't screwed up and upgraded something too far.
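The manual process could be roughly scripted. The following is a sketch under some assumptions: it takes the output of 'pip list --outdated --format=json' (a JSON list of objects with a 'name' field, among others) and turns it into a list of commands to run, ending with 'pip check'. The `plan_upgrades` helper and the sample package are hypothetical, and it only prints the plan instead of running it, because blindly upgrading everything that's outdated is exactly how you upgrade something too far.

```python
import json
import sys

def plan_upgrades(outdated_json, python=sys.executable):
    """Turn 'pip list --outdated --format=json' output into commands.

    Each command is an argv list; the last one is always 'pip check',
    so that conflicts caused by the upgrades get reported.
    """
    outdated = json.loads(outdated_json)
    plan = [
        [python, "-m", "pip", "install", "--upgrade", pkg["name"]]
        for pkg in outdated
    ]
    plan.append([python, "-m", "pip", "check"])
    return plan

# Made-up sample standing in for real 'pip list --outdated' output.
sample = json.dumps([
    {"name": "examplepkg", "version": "1.0",
     "latest_version": "1.1", "latest_filetype": "wheel"},
])

for argv in plan_upgrades(sample):
    print(" ".join(argv))
```

To use this on a real install, 'python' would have to be the interpreter inside the virtual environment in question (for example the one that pipx created), and each command would be run with something like subprocess.run(). Even then this is not the same as a fresh reinstall; it only automates the steps I do by hand.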

Pip is not going to change its general approach of maximal version selection (I think only Go has been willing to go that far). But I hope that someday either pip or additional tools have a good way to bring existing installs up to what they would be if reinstalled today.

(Pipx has its 'reinstall' option, but that's a blunt hammer and I'm not sure it works in all cases. I suppose I should try it someday on my Python LSP server installation, which has various additional optional packages installed too.)

PipDependencyVersionProblem written at 23:34:38;

2021-11-05

If we use PyPy, we'll likely use our own install of it

In the past, I've said that one of our options for continuing to run Python 2 programs after Linux distributions stop packaging Python 2 at all is PyPy. As part of thinking about this, I've found that PyPy starts fast enough for our Python 2 commands and I've surveyed Linux distributions to see what versions of PyPy they packaged (in late 2020). An implicit subtext of the latter exercise, as with my periodic surveys of CPython versions on our Linux distribution, is that I was assuming we would use the distribution's packaged version of PyPy. I've now changed my mind about that.

Using a packaged version of CPython 2 is clearly the right answer. Building your own version has various hassles and problems, and you generally want at least /usr/bin/python2 to work and your distribution owns that part of the namespace. If the distribution doesn't have its own version any more, you're at least going to want to make your own package for it and install it through your distribution's package management system. Well, you don't have to, especially if you build it to /usr/local/python2.7 or /opt/python2.7 or the like, but it's probably better.

As I've found out, PyPy doesn't have to be handled like that. PyPy binary distributions are easy to install by hand because they unpack to simple, self-contained directory trees that can be put anywhere and just work. Given that PyPy distributes a single package for all Linux distribution versions, a single binary version can even work across all of our various Ubuntu versions. Since Python 2.7 is frozen, this is probably not deeply important, but it's still nice to know that all of our machines would be using the same, up-to-date PyPy version. Since most of our Python programs already run from our central administrative filesystem, the logical place to put PyPy itself is on that filesystem.

Installing our own version of PyPy is a change from how we usually do things, but it's not very difficult and it avoids having to wrestle with both outdated versions of PyPy and different versions of PyPy (potentially with different behavior) on different versions of Ubuntu. Since using one coherent, up-to-date version of PyPy is so simple, we might as well.

Despite how easy it would be to install, we will probably not switch to PyPy unless we're forced to by a lack of distribution CPython 2 packages, and probably also by problems building our own. Continuing on with CPython 2 is currently the easy and safe way, and although some of our administrative programs run faster in PyPy, this isn't a particularly compelling advantage for programs that only run for a second or two anyway.

PS: Our remaining Python 2 administrative programs don't use OpenSSL, so we wouldn't be affected if there was an OpenSSL issue that required PyPy to update their bundled OpenSSL (either a security issue or, say, a TLS certificate chain handling issue). You may not be so lucky.

PyPyInstallOurOwn written at 22:42:32;


