Wandering Thoughts

2021-10-22

Python 2's status in various Linux distributions (October 2021 edition)

One of the slow things happening with Python 2 in its afterlife is that various Linux distributions are sort of, maybe, trying to remove the Python 2 interpreter itself (which we could call 'CPython 2'). Their official position is generally that they'll remove it sometime, but in practice no one seems to be making many gestures in that direction.

Debian's Python page says that "[Python 2] is being removed as of Debian 11 (Bullseye)". This information is visibly out of date. Debian 11 was released in the middle of August and not only does it still have a Python 2 package, so do the current testing and unstable distributions (you can see the state of the 'python2' package here). Possibly the Debian page means something different, such as all packages depending on Python 2 will be gone in Debian 11, but if so it's oddly written.

Similarly, Ubuntu 21.10 was just released last week and, unsurprisingly, it also has a Python 2 package (since Ubuntu draws packages from Debian, Debian packages are normally in Ubuntu as well). The continued presence of Python 2 in Debian makes it pretty likely that Python 2 will also be in Ubuntu 22.04 LTS when it comes out next April. Since Python 2 is already in the 'universe' repository with no promises of support, the lack of official support for Python 2 is probably not going to make a difference in this.

(It turns out that Python 2 has been in Ubuntu's 'universe' repository for some time, even back to Ubuntu 18.04. Some version of Python 3 is generally in their officially more supported 'main' repository.)

Fedora 34 is the current version of Fedora, and it has Python 2. Fedora 35 is not yet out but it's very close and it still has Python 2 in its package sets. On a casual Internet search, I can't spot a Fedora web page about its future plans here but I wouldn't be surprised if Fedora's current Python 2 package lingers on for years to come.

All of Debian, Ubuntu, and Fedora have opted to package PyPy's version of Python 2 as 'pypy' and their version of Python 3 as 'pypy3'. At one point PyPy considered the Python 2 implementation their primary version, although I'm not sure they do any more (their features page doesn't say anything about primary versions, for example). PyPy will probably always support Python 2, but they probably won't consider it the primary version forever. Linux distributions may change the naming before then, of course, depending on user expectations.

Python2LinuxStatus-2021-10 written at 22:40:32

2021-10-05

Some early notes on using pipx for managing third-party Python programs

Recently I wrote about how I should use virtual environments for third party programs, and said that I would likely wind up trying pipx. Today, I ran 'pip3 list --user --outdated', stared at the output of that and 'pip3 list --user', and decided that my situation in my ~/.local had reached enough critical mass that I was willing to burn it down and start over from scratch. So here are some early notes on pipx, after experimenting with it on my laptop and then switching to it on my home desktop.

In general, pipx is pleasantly straightforward to use in a basic way. Running 'pipx install mypy' (for example) creates a .local/pipx/venvs/mypy virtual environment, installs things into it, and then creates symlinks in .local/bin to the various programs mypy normally installs. Pipx adds a 'pipx_metadata.json' file to the venv with various pieces of information about what the virtual environment contains. The venvs appear to be named after the main package you installed with 'pipx install'.
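As a rough sketch of poking at this layout, the following Python lists each pipx venv and the top-level keys of its pipx_metadata.json. The default ~/.local/pipx/venvs location is taken from the paths described above (other pipx versions or settings may put things elsewhere), the function name is made up for illustration, and nothing is assumed about the metadata's schema beyond it being JSON.

```python
import json
from pathlib import Path


def pipx_venv_metadata_keys(venvs_root=None):
    """Map each pipx venv name to the sorted top-level keys of its metadata."""
    # Assumption: the default location matches the paths described above.
    root = Path(venvs_root) if venvs_root else Path.home() / ".local/pipx/venvs"
    result = {}
    if not root.is_dir():
        return result
    for venv_dir in sorted(root.iterdir()):
        meta_file = venv_dir / "pipx_metadata.json"
        if not meta_file.is_file():
            continue
        with meta_file.open() as f:
            data = json.load(f)
        # Only report key names; no particular schema is assumed.
        result[venv_dir.name] = sorted(data) if isinstance(data, dict) else []
    return result
```

Calling pipx_venv_metadata_keys() with no argument looks at the default location and returns an empty dictionary if there is nothing there.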

One of my major uses of pipx is to install the Python LSP server, which has two additional things. The first is that you normally install it with some additional pieces specified, not just as a plain package name, for example:

pipx install 'python-lsp-server[rope,yapf]'

Pipx coped with this package specification fine, and called the venv 'python-lsp-server'. The second thing is that there are additional third party plugins you may want to add, like mypy-ls; these need to be installed into the venv somehow. In pipx, this is done with 'pipx inject <package>', eg 'pipx inject mypy-ls'.

All of this appears to be recorded in pipx_metadata.json, so I can hope that 'pipx reinstall <...>' will correctly reinstall all of my Python LSP setup, without me having to remember exactly what I added (I haven't tested this yet, and the manual page is ambiguous).

I'm not sure how you're supposed to look for upgrades to the things you have installed. I suppose one option is 'pipx upgrade' or 'pipx upgrade-all' and just letting it do things, and another is to directly ask pip, using 'pipx runpip <package> list --outdated'. The last is probably the only way to find out about dependencies with new versions you may want, which pip doesn't normally upgrade anyway. One brute force way of dealing with the general issues of upgrading programs with pip is to just periodically run 'pipx reinstall <...>', which should give you a venv that is as up to date as possible at the expense of a certain amount of overhead.

Overall, even if pipx isn't perfect it's a lot better than the mess that was my old ~/.local and actually using it is sufficiently easy and non-annoying that I'm willing to put up with minor flaws if it has them.

PipxEarlyNotes written at 23:59:15

2021-09-16

Use virtual environments to install third-party Python programs from PyPI

I've historically not been a fan of Python's virtual environments. While they're easy to set up these days, they're relatively heavyweight things (taking up tens of megabytes), they contain embedded references to the Python version they were built with, and it felt like vast overkill for simply installing a program (such as the Python LSP server and the third party packages it required from PyPI). So I've historically used Pip's "user" install mode ('pip install --user'), which puts everything into your $HOME/.local directory tree. However, I now feel that this is a mistake (although an attractive one). Instead, you should create virtual environments for any third party commands you install from PyPI or elsewhere.

The problem is that Pip's "user" mode involves pretending that Pip is basically a Unix distribution's package manager that just happens to be operating on your $HOME/.local. This is an attractive illusion and it sort of works, but in practice you run into issues over time when you upgrade things, especially if you have more than one program installed. You'll experience some of these issues with virtual environments as well, but with single purpose virtual environments (one venv per program) and keeping track of what you installed, the ultimate brute force solution is to delete and recreate the particular virtual environment. The dependency versions are getting tangled? Delete and recreate. You've moved to a new distribution version of Python (perhaps you've upgraded from one Ubuntu LTS to another)? It sounds like a good time to delete and recreate, rather than dealing with version issues.
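The "delete and recreate" approach can be sketched with the standard library's venv module. This is only an illustration under some assumptions: the function name and its arguments are hypothetical, the bin/pip path assumes a Unix-style venv layout, and it relies on you supplying the list of packages you originally installed.

```python
import subprocess
import venv
from pathlib import Path


def recreate_venv(env_dir, packages=(), with_pip=True):
    """Throw away env_dir's contents (if any) and build a fresh venv there."""
    env_dir = Path(env_dir)
    # clear=True empties an existing directory before creating the venv.
    venv.create(env_dir, clear=True, with_pip=with_pip)
    if packages:
        # Unix layout assumed; on Windows the script lives under Scripts instead.
        pip = env_dir / "bin" / "pip"
        subprocess.run([str(pip), "install", *packages], check=True)
    return env_dir
```

Something like recreate_venv(Path.home() / "venvs/pylsp", ["python-lsp-server[rope,yapf]"]) would then rebuild a single purpose venv from scratch (the venvs/pylsp location is just an example).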

More broadly, it feels to me that the Python packaging world is moving strongly toward using virtual environments as the solution to everything. As a result, I don't expect fundamental tools like Pip to spend much development effort on improving management of "user" mode installs. If anything, I expect Pip's user install mode to either quietly decay over time or to get deprecated at some point.

Since I've only recently come around to this view (after actively investigating the situation around upgrading programs with pip), I have no opinions on any of the programs and systems that are designed to make this easier. Pipx was mentioned in a comment by Tom on yesterday's entry, so I'll probably look at that first.

(I do think there are some uses for a Pip "user" install of PyPI packages, but that's for another entry.)

VenvsForPrograms written at 23:19:48

2021-09-15

Some notes on upgrading programs with Python's pip

My primary use of Python's pip package manager is to install programs like the Python LSP server; I may install these into either a contained environment (a virtual environment or a PyPy one) or as a user package with 'pip install --user'. In either case, the day will come when there's a new version of the Python LSP server (or whatever) and I want to update to it. As I noted down back in my pip cheatsheet, the basic command I want here is 'pip install --upgrade <package>', possibly with '--user' as well. However, it turns out that there are some complexities and issues here, which ultimately come about because pip is not the same sort of package manager as Fedora's DNF or Debian's apt.

The conventional way a Unix package manager such as DNF operates is that when you ask it to upgrade things, it upgrades everything that has new versions available. Pip doesn't behave this way. By default, it only upgrades the package that you asked it to, and doesn't upgrade any dependencies unless what you currently have doesn't satisfy the requirements of the new version of the upgraded package. Over enough time, this will often give you significantly out of date dependencies where you're missing out on improvements and bug fixes they've made even if your main package is theoretically fine with the old versions you have installed.

(This generally also means that the versions you've wound up with don't match what you'd get if you deleted everything and reinstalled from scratch, assuming you kept track of what top level packages you installed.)

This can be controlled (to some degree) by the --upgrade-strategy option to 'pip install'. This can be used to switch pip to an "eager" upgrade strategy, where it also upgrades dependencies to the latest available version that satisfies the requirements. You can set this as a default through pip's configuration files. However, this eager upgrading process is not flawless (as pip error messages may occasionally point out); you may wind up with versions of dependencies that satisfy the program, but not other things you have sitting around (including as globally installed modules from your operating system). Despite this, I've set up a ~/.config/pip/pip.conf to make the eager mode my default, because it's more like how I want package management for programs to work; normally I want to be using the latest and best version of everything.

(Usefully, you can tell if your pip.conf is working by what 'pip install --help' reports as the default for --upgrade-strategy.)
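For illustration, here is a small Python sketch that writes such a pip.conf. It assumes pip's usual convention that per-command defaults go in a section named after the command ("install") with the option's long name minus the leading dashes; the function name is made up, and rewriting an existing file this way will drop any comments in it.

```python
import configparser
from pathlib import Path


def write_eager_pip_conf(path):
    """Write a pip.conf that makes 'pip install --upgrade' eager by default."""
    path = Path(path)
    config = configparser.RawConfigParser()
    if path.exists():
        config.read(path)  # keep any settings that are already there
    if not config.has_section("install"):
        config.add_section("install")
    # Equivalent to passing --upgrade-strategy eager to 'pip install'.
    config.set("install", "upgrade-strategy", "eager")
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        config.write(f)
    return path
```

For the location mentioned above this would be called as write_eager_pip_conf(Path.home() / ".config/pip/pip.conf"), after which 'pip install --help' should show the new default.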

A Unix package manager like apt (almost) always removes the older version of a package when you upgrade to a newer one, and in fact many Unix packages can't have multiple versions installed at once. Pip somewhat apes this, but under some circumstances you can apparently wind up with older versions still present on disk. This especially matters if you decide to uninstall a package, because 'pip uninstall' seems to only remove the most recent version (what 'pip list' will show you as the version). If pip has left multiple versions sitting around, you may need a number of 'pip uninstall' invocations to get the package gone from 'pip list'. Alternatively, you can go to the appropriate location (such as '~/.local/lib/python3.X/site-packages') and manually remove all of the directories.

PipUpgradingPrograms written at 22:42:33

2021-08-22

What we'll likely do when Linux distributions drop Python 2 entirely

Currently, Linux distributions claim that they want to stop including even a minimal Python 2 at some point, although when that will be isn't clear (the latest in-development versions of Debian and Ubuntu both seem to still include it). Since we have any number of perfectly functional small Python 2 programs used in managing our systems (and no particular time or enthusiasm for porting them all to Python 3), this presents a potential future problem, never mind the disruption for our users (who may also have such programs). Thus, it seems likely that we will roll our own version of Python 2 even after Linux distributions stop doing so.

Our traditional way of dealing with dropped packages is to simply save a copy of the binary Ubuntu package from the last Ubuntu version (or LTS version) that supported it, then install it ourselves by hand; one can do a similar thing with binary RPMs for Fedora. Unfortunately this isn't really sustainable for Python in specific, because Python uses various shared libraries that keep changing their shared library version and which distributions often don't provide compatibility packages for.

(One big offender is libreadline, which is normally used by interactive python2 through the readline module, but there probably are others.)

This leaves either hand-building and manually installing Python 2 through the old fashioned 'make install' approach, or rebuilding the last distribution source packages ourselves on new distribution versions. The 'make install' approach is brute force and wouldn't naturally give us a /usr/bin/python2 symlink, but we could add that by hand, and we could probably create an install (say into /opt) that only required a tarball to be unpacked on each machine. Rebuilding Ubuntu source packages is somewhat more annoying than I'd like, but it's generally feasible unless they specify narrow ranges of build-time dependencies. I took a look at the current Ubuntu source package for Python 2, and while it looks a little tangled I think it will probably rebuild without problems (at least for a while). So most likely we'll start out by rebuilding the source package and only switch to a fully by hand approach if package rebuilding falls apart.

Various standard Python modules written in C make calls to various third party C APIs; I've already mentioned readline, but there are also (for example) the various database access modules. These APIs are generally quite stable, but this isn't the same thing as "frozen", so at some point in the more distant future some standard Python 2 modules might require changes to keep building and working. If the changes are easy, we might patch our Python 2; if they're more difficult, we might have to start dropping standard modules.

For various reasons, we might not want to keep a /usr/bin/python2 symlink at all, at least not over the long term, even if we continue to have Python 2 itself. It's one thing to support Python 2 for our own programs, with their limited needs for standard modules and so on; it's another thing entirely to have to support Python 2 for our general user population. As a result, we might someday want to drop support for the user population (ie, /usr/bin/python2) without dropping our own support. This might push us to switching from rebuilt source packages to a hand install.

PS: Of course another option is migration to PyPy, which will probably always support Python 2 and is certainly feasible for our own Python 2 programs. But our users will almost certainly want us to provide Python 2 for longer than Ubuntu does, so some degree of do it ourselves Python 2 is likely in our future. And while PyPy is a good implementation of Python 2, it's not really a substitute for a CPython /usr/bin/python2.

Python2WithoutDistros written at 23:57:10

Setting up Python LSP support in GNU Emacs is reasonably worth it

When I initially set up GNU Emacs LSP support for Python, I wasn't sure if the effort was going to be worth it for my Python programming (which is currently mostly not using type hints). Although I don't edit a lot of Python these days, I've come to believe that using the current Python LSP server is worth the effort to set it up, although it's not clearly a win the way it is for more static languages like Go.

(I'm not sure I'd set up an Emacs LSP environment if I was only editing Python, but if you have an LSP environment already, adding Python to it isn't much work.)

My most common LSP operations while editing Python code are 'go to definition' and 'see references/uses'. The former is available in some form in the basic Emacs Python mode, although I think it's not as fast (and perhaps not as good). The LSP mode can pop up help about a particular function or method under many circumstances, but often I have to wait a while for it to figure things out. Nevertheless it does a better job of things like finding the right methods than I would have expected.

(For me and my type hint free Python, LSP doesn't seem to do much useful completion, which is normally a big win in other LSP languages.)

GNU Emacs with the LSP mode does a number of helpful passive things, such as flagging "lint" level problems (sometimes too many of them, depending on what you have enabled) and more serious issues like mismatched types of indentation. The basic Python mode probably does some of this, but not all (at least without extra integrations). The LSP mode offers more features and options than this, but I haven't needed to dip into them yet.

(One of the things that writing this entry is showing me is that I didn't pay much attention to what the basic Python mode could do for me. I doubt I ever tried 'go to definition' before I had LSP mode set up.)

More broadly, I think that the Python LSP server and LSP mode are the most likely future of smart Python editing in GNU Emacs, so I might as well get on board now. I suspect that more people will be putting more effort into both the Python LSP server and LSP support in GNU Emacs than are putting effort into making the GNU Emacs Python mode smarter.

If all of this sounds sort of lukewarm, that's a relatively good reflection of my feelings. I think that GNU Emacs LSP support is a clear win for some languages (my example is Go), but doesn't get you as much for Python and probably other dynamic languages like it. Python with type hints might unlock more LSP-based power.

One drawback of using the LSP mode is that I want to be using Emacs under X, not in a terminal. This is doable even in these days of working from home (I can start a SSH session that forwards X), but it adds an extra step and a bit of friction. Editing in text mode is still workable, but it reduces the LSP mode's power.

PS: Possibly there are tweaks to the Python LSP server configuration, my Emacs LSP configuration, or my Python setup that would give me a better Python LSP experience. Since I don't spend much time editing Python code these days, as mentioned, I haven't tried very hard to improve my environment. Possibly this is a mistake.

PythonEmacsLSPWorthIt written at 01:07:03

2021-07-22

Apache's mod_wsgi and the Python 2 issue it creates

If you use Apache (as we do) and have relatively casual WSGI-based applications (again, as we do), then Apache's mod_wsgi is often the easiest way to deploy your WSGI application. Speaking as a system administrator, it's quite appealing to not have to manage a separate configuration and a separate daemon (and I still get process separation and different UIDs). But at the moment there is a little problem, at least for people (like us) who use their Unix distribution's provided version of Apache and mod_wsgi rather than build their own. The problem is that any given build of mod_wsgi only supports one version of (C)Python.

(Mod_wsgi contains an embedded CPython interpreter, although generally it's not literally embedded; instead mod_wsgi is linked to the appropriate libpython shared library.)

In the glorious future there will only be (some version of) Python 3, and this will not be an issue. All of your WSGI programs will be Python 3, mod_wsgi will use some version of Python 3, and everything will be relatively harmonious. In the current world, there is still a mixture of Python 2 and Python 3, and if you want to run a WSGI based program written in a different version of Python than your mod_wsgi supports, you will be sad. As a corollary of this, you just can't run both Python 2 and Python 3 WSGI applications under mod_wsgi in a single Apache.
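If you're not sure which Python a given mod_wsgi was built against, one way to find out is a tiny WSGI application that reports the interpreter it's running under. The following is a minimal sketch, written to be plain enough that it should run under either Python 2 or Python 3; it uses the name 'application' because that's the callable mod_wsgi looks for by default, and how you hook it into Apache is left to your own configuration.

```python
import sys


def application(environ, start_response):
    """Minimal WSGI app that reports the interpreter running it."""
    text = "Python %s\nsys.prefix: %s\n" % (sys.version, sys.prefix)
    body = text.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # WSGI wants an iterable of byte strings as the response body.
    return [body]
```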

Some distributions have both Python 2 and Python 3 versions of mod_wsgi available; this is the case for Ubuntu 20.04 (which answers something I wondered about last January). This at least lets you pick whether you're going to run Python 2 or Python 3 WSGI applications on any given system. Hopefully no current Unix restricts itself to only a Python 2 mod_wsgi, since there's an increasing number of WSGI frameworks that only run under Python 3.

(For example, Django last supported Python 2 in 1.11 LTS, which is no longer supported; support stopped some time last year.)

PS: Since I just looked it up, CentOS 7 has a Python 3 version of mod_wsgi in EPEL, and Ubuntu 18.04 has a Python 3 version in the standard repositories.

Python2ApacheWsgiIssue written at 23:56:04

2021-07-08

A semi-surprise with Python's urllib.parse and partial URLs

One of the nice things about urllib.parse (and its Python 2 equivalent) is that it will deal with partial URLs as well as full URLs. This is convenient because there are various situations in a web server context where you may get either partial URLs or full URLs, and you'd like to decode both of them in order to extract various pieces of information (primarily the path, since that's all you can reliably count on being present in a partial URL). However, URLs are tricky things once you peek under the hood; see, for example, URLs: It's complicated.... A proper URL parser needs to deal with that full complexity, and that means that it hides a surprise about how relative URLs will be interpreted.

Suppose, for example, that you're parsing an Apache REQUEST_URI to extract the request's path. You have to actually parse the request's URI to get this, because funny people can send you full URLs in HTTP GET requests, which Apache will pass through to you. Now suppose someone accidentally creates a URL for a web page of yours that looks like 'https://example.org//your/page/url' (with two slashes after the host instead of one) and visits it, and you attempt to decode the result of what Apache will hand you:

>>> import urllib.parse
>>> urllib.parse.urlparse("//your/page/url")
ParseResult(scheme='', netloc='your', path='/page/url', params='', query='', fragment='')

The problem here is that '//ahost.org/some/path' is a perfectly legal protocol-relative URL, so that's what urllib.parse will produce when you give it something that looks like one, which is to say something that starts with '//'. Because we know where it came from, you and I know that this is a relative URL with an extra / at the front, but urlparse() can't make that assumption and there's no way to limit its standard-compliant generality.

If this is an issue for you (as it was for me recently), probably the best thing you can do is check for a leading '//' before you call urlparse() and turn it into just '/' (the simple way is to just strip off the first character in the string). Doing anything more complicated feels like it's too close to trying to actually understand URLs, which is the very job we want to delegate to urlparse() because it's complicated.
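A minimal sketch of that pre-check might look like the following. The function name is made up for illustration, and it goes slightly further than stripping a single character: it collapses the whole run of leading slashes down to one, which is a choice of this sketch rather than anything urlparse() requires.

```python
from urllib.parse import urlparse


def request_path(request_uri):
    """Return the path of a REQUEST_URI, treating a leading '//' as '/'."""
    if request_uri.startswith("//"):
        # Collapse the leading slashes so urlparse() sees a path, not a host.
        request_uri = "/" + request_uri.lstrip("/")
    return urlparse(request_uri).path
```

With this, request_path("//your/page/url") gives '/your/page/url', while full URLs such as 'https://example.org/your/page' are still handed to urlparse() untouched.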

PS: Because I tested it just now, the result of giving urlparse() a relative URL that starts with three or more slashes is that it's interpreted as a relative URL, not a protocol-relative URL. The path of the result will have the extra leading slashes stripped off.

UrllibParsePartialURLs written at 00:10:39

2021-06-28

I should keep track of what Python packages I install through pip

These days I'm increasingly making use of installing Python packages through pip, whether this is into a PyPy environment or with 'pip install --user' for things like python-lsp-server. Having done this for a while, complete with trying to keep up with potential package upgrades, I've come to the conclusion that I should explicitly keep track of what packages I install, recording this in some place I can find it again.

There are two problems (or issues) that push me to this. The first is that as far as I know, Pip doesn't keep track of a distinction between packages that you've asked it to install and the dependencies of those packages. All of the packages show up in 'pip list', and any can show up in 'pip list --outdated'. My understanding is that in the normal, expected use of Pip you'll keep track of this in your project in a requirements file, then use that to build the project's virtualenv. This is not really the model of installing commands, especially commands like python-lsp-server that have install time options.
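You can approximate the missing distinction after the fact by looking for installed packages that nothing else declares as a requirement. The sketch below does this with importlib.metadata (Python 3.8+); the helper names are hypothetical and it's only a heuristic. It ignores version constraints and environment markers, counts optional "extra" requirements as if they were real ones, and will hide a package you installed on purpose if something else also depends on it.

```python
import re
from importlib import metadata


def _normalize(name):
    # PEP 503 style normalization, so 'Foo_Bar' and 'foo-bar' compare equal.
    return re.sub(r"[-_.]+", "-", name).lower()


def _requirement_name(requirement):
    # Keep only the leading project name of a requirement string such as
    # "rope (>=0.10.5) ; extra == 'rope'"; versions and markers are ignored.
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return _normalize(match.group(1)) if match else None


def top_level_packages(requires_by_package):
    """Return the packages that no other package lists as a requirement."""
    names = {_normalize(name) for name in requires_by_package}
    needed = set()
    for name, requirements in requires_by_package.items():
        for requirement in requirements or ():
            dep = _requirement_name(requirement)
            if dep and dep != _normalize(name):
                needed.add(dep)
    return sorted(names - needed)


def installed_requirements():
    """Collect {package name: requirement strings} for the running Python."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            found[name] = dist.requires or []
    return found
```

Running top_level_packages(installed_requirements()) gives a starting point for a list, but because of the limitations above it's not a substitute for writing down what you actually asked for.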

The second issue is that Pip installed packages are implicitly for a specific version of Python. If you rely on the system Python (instead of your own version) and that version gets upgraded, suddenly 'pip list' will report nothing (and you will in fact have no packages available). At this point you need to somehow recover the list of installed packages and re-install all of them (unless you resort to unclean hacks). Explicitly keeping track of this list in advance is easier than having to dig it out at the time.
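To see the version dependence for 'pip install --user' packages, you can ask Python where its per-user site-packages directory is. This is a small sketch using the standard site module; the function name is made up, and the exact directory layout varies by platform.

```python
import site
import sys


def user_site_info():
    """Return (running Python's 'X.Y' version, per-user site-packages dir)."""
    version = "%d.%d" % sys.version_info[:2]
    return version, site.getusersitepackages()
```

On a typical Linux system the directory is something like ~/.local/lib/pythonX.Y/site-packages, so a system Python with a different X.Y looks in a different (and initially empty) directory.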

Having an explicit list helps in other situations. Perhaps you started out installing all of your tools under CPython, but now you want to see how well they'll work under PyPy. Perhaps you're building a new PyPy based environment with a new version of PyPy and want to start over from scratch. Perhaps you think package versions and dependencies have gotten snarled and you're carrying surplus packages, so you want to delete everything and start over from scratch.

(Starting over from scratch can also be the easiest way to get the best version of dependencies, since the packages you're directly installing may have maximum version constraints that will trip you up if you just directly 'pip install --upgrade ...' dependencies.)

PS: Possibly there are ways to do all of this with Pip today, especially things like 'upgrade this and all of its dependencies to the most recent versions that are acceptable'. I'm not well versed in Pip, since mostly I use it as a program installer.

TrackingPipInstalls written at 00:05:23

2021-06-10

Early notes on using the new python-lsp-server (pylsp) in GNU Emacs

When I started with LSP-based Python editing in GNU Emacs, the Python LSP server was pyls. However, pyls is apparently now unmaintained and the new replacement is python-lsp-server, also known as 'pylsp'. I noticed this recently when I looked into type hints a bit, and then when I was editing some Python today, lsp-mode or some sub-component nagged me about it:

Warning (emacs): The palantir python-language-server (pyls) is unmaintained; a maintained fork is the python-lsp-server (pylsp) project; you can install it with pip via: pip install python-lsp-server

The first thing to note about pylsp is that it supports Python 3 only (it says Python 3.6+). It works to some degree if you edit Python 2 code, but I don't fully trust it, so I'm keeping around my current Python 2 version of the older pyls. Pyls may be unmaintained, but at the moment it appears to work okay.

Because of the Python 3 versus 2 issue, I already had a front end 'pyls' script to try to figure out which Python's version of pyls I needed to run. Fortunately pyls and pylsp are currently invoked with the same (lack of) arguments, so I cheated by renaming this script to 'pylsp' and having it run the real pylsp for Python 3 and fall back to pyls for Python 2.

(I opted to uninstall the Python 3 version of pyls before I installed pylsp, because I didn't want to find out if they had conflicting version requirements for some shared dependencies.)

Lately I've been running pyls under PyPy, so I started out by installing pylsp this way too. Pylsp (like pyls before it) has some useful looking third party plugins and since I was installing from scratch now I decided to install some of them, including mypy-ls. This is when I found out that unfortunately mypy doesn't run under PyPy. So I switched to installing pylsp using CPython with my usual 'pip3 install --user'. This worked for pylsp itself and mypy-ls, but pyls-memestra had issues due to memestra basically having to be installed in a virtualenv or other personal install of CPython or PyPy. I dealt with this by removing pyls-memestra; it might be nice but it's not essential.

(Memestra attempts to make a directory under sys.prefix, which is owned by root if you're running the system CPython or PyPy.)
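A quick way to check whether a given Python would have this problem is to look at whether sys.prefix is writable by you and whether you're inside a virtual environment. The following is a rough sketch with a made-up function name; the venv test compares sys.prefix to sys.base_prefix, which works for venv-style environments but may not detect very old virtualenv versions.

```python
import os
import sys


def prefix_status():
    """Report whether this Python is in a venv and if sys.prefix is writable."""
    # In a venv, sys.prefix points at the venv and sys.base_prefix at the
    # Python it was created from; outside of one they're the same.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    writable = os.access(sys.prefix, os.W_OK)
    return {"prefix": sys.prefix, "in_venv": in_venv, "prefix_writable": writable}
```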

The result appears to work fine and has no more warning messages in various internal Emacs LSP buffers than I expect, but I haven't used it extensively yet. I'm not sure I'll keep mypy-ls yet, because it does add some extra warnings in some situations. The warnings are valid ones if you're using type annotations, but a potential problem if you're not. Probably it's good for me to get the warnings and maybe start fixing them.

PythonPylspNotes written at 01:25:10


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.