2022-03-29
Fixing Pipx when you upgrade your system Python version
If you use your system's Python for pipx and then upgrade your system and its version of Python, pipx can have a bad problem that renders your pipx managed virtual environments more or less unrecoverable if you do the wrong thing. Fortunately there turns out to be a way around it, which I tested as part of upgrading my office desktop to Fedora 35 today.
Pipx's problem is that it stashes a bunch of stuff in a ~/.local/pipx/shared virtual environment that depends on the Python version. If this virtual environment exists but doesn't work in the new version of Python that pipx is now running with, pipx fails badly. However, pipx will rebuild this virtual environment any time it needs it, and once rebuilt, the new virtual environment works.
So the workaround is to delete the virtual environment, run a pipx command to get pipx to rebuild it, and then tell pipx to reinstall all your pipx environments. You need to do this after you've upgraded your system (or your Python version). What you do is more or less:
    # get rid of the shared venv
    rm -rf ~/.local/pipx/shared
    # get pipx to re-create it
    pipx list
    # have pipx fix all of your venvs
    pipx reinstall-all
Perhaps there is an easier way to fix up all of your pipx managed virtual environments other than 'pipx reinstall-all', but that's what I went with after my Fedora 35 upgrade and it worked. In any case, I feel that it's not a bad idea to recreate pipx managed virtual environments from scratch every so often just to clean out any lingering cruft.
(It also seems unlikely that there is any better way in general. In one way or another, all of the Python packages have to get reinstalled under the new version of Python. Sometimes you can do this by just renaming files, but any package with a compiled component may need (much) more work. Actually doing the pip installation all over again ensures that all of this gets done right, with no hacks that might fail.)
2022-03-19
Some problems that Python's cgi.FieldStorage has
In my entry on our limited use of the cgi module, I praised cgi.FieldStorage as a nice simple way to write Python CGIs that deal with parameters, especially for POST forms. Unfortunately there are some dark sides to cgi.FieldStorage (apart from any bugs it may have), and in fairness I should discuss them. Overall, cgi.FieldStorage is probably safe for internal usage, but I would be a bit wary of exposing it to the Internet in hostile circumstances. The ultimate problem is that in the name of convenience and just working, cgi.FieldStorage is pretty trusting of its input, and on the general web one of the big rules of security is that your input is entirely under the control of an attacker.
So here are some of the problems that cgi.FieldStorage has if you expose it to hostile parties. The first broad issue is that FieldStorage doesn't have any limits:
- it allows people to upload files to you, whether or not you expected this; the files are written to the local filesystem. Modern versions of FieldStorage do at least delete the files when the Python garbage collector destroys the FieldStorage object.
- it has no limits on how large a POST body it will accept or how long it will wait to read a POST body in (or how long it will wait to upload files). Some web server CGI environments may impose their own limits on these, especially time, but an attacker can probably at least flood your memory.
(The FieldStorage init function does have some parameters that could be used to engineer some limits, with additional work like wrapping standard input in a file-like thing that imposes size and time limits. For size limits you can also pre-check the Content-Length.)
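As a rough illustration of the size half of that idea, here is a minimal sketch of a Content-Length pre-check plus a file-like wrapper that stops handing out data once a byte budget is spent. The names `check_content_length`, `LimitedReader` and `BadRequestBody` are made up for this example, and it does nothing about time limits, which would need something like an alarm or a socket timeout on top.

```python
import io


class BadRequestBody(Exception):
    """Raised when the declared request body is missing, malformed, or too big."""


def check_content_length(environ, max_bytes):
    """Return the declared body size, refusing anything over max_bytes."""
    raw = environ.get("CONTENT_LENGTH", "")
    try:
        length = int(raw)
    except ValueError:
        raise BadRequestBody("missing or malformed Content-Length: %r" % raw)
    if length < 0 or length > max_bytes:
        raise BadRequestBody("declared body of %d bytes is over the limit" % length)
    return length


class LimitedReader(io.RawIOBase):
    """Wrap a binary stream and never return more than `limit` bytes in total."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def readable(self):
        return True

    def readinto(self, buf):
        if self._remaining <= 0:
            return 0  # look like end-of-file once the budget is used up
        data = self._stream.read(min(len(buf), self._remaining))
        buf[: len(data)] = data
        self._remaining -= len(data)
        return len(data)
```

In a CGI you would presumably wrap `sys.stdin.buffer` this way before anything reads the body, so that a client which lies about its Content-Length still can't feed you more than the budget.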
Then there is the general problem that GET and POST parameters are not actually really like a Python dict (or any language's form of it). All dictionary like things require unique keys, but attackers are free to feed you duplicate ones in their requests. FieldStorage's behavior here is not well defined, but it probably takes the last version of any given parameter as the true one. If something else in your software stack has a different interpretation of duplicate parameters, your CGI and that other component are actually seeing two different requests. This is a classic way to get security vulnerabilities.
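One way to sidestep the ambiguity is to refuse requests with duplicate parameters outright. The sketch below does that on top of `urllib.parse.parse_qsl` from the standard library; it is not how FieldStorage behaves, and `parse_unique` and `DuplicateParameter` are names invented for the example.

```python
from urllib.parse import parse_qsl


class DuplicateParameter(ValueError):
    """Raised when a parameter name appears more than once."""


def parse_unique(query, max_fields=100):
    """Parse a urlencoded string into a dict, rejecting duplicate names."""
    pairs = parse_qsl(
        query,
        keep_blank_values=True,
        strict_parsing=True,      # malformed fields raise ValueError
        max_num_fields=max_fields,
    )
    params = {}
    for name, value in pairs:
        if name in params:
            raise DuplicateParameter(name)
        params[name] = value
    return params
```

For comparison, two equally plausible readings of the same input disagree: `parse_qs("user=alice&user=admin")` keeps both values as a list, while `dict(parse_qsl("user=alice&user=admin"))` silently keeps only the last one.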
(FieldStorage also has liberal parsing by default, although you can change this with an init function parameter. Incidentally, none of the init function parameters are covered in the cgi documentation; you have to read help() or the cgi.py source.)
Broadly speaking, cgi.FieldStorage feels like a product of an earlier age of web programming, one where CGIs were very much a thing and the web was a smaller and ostensibly friendlier place. For a more or less intranet application that only has to deal with friendly input sent from properly programmed browsers, it's still perfectly good and is unlikely to blow up. For general modern Internet usage, well, not so much, even if you're still using CGIs.
(Wandering Thoughts is still a CGI, although with a lot of work involved. So it can be done.)
2022-03-18
Our limited use of Python's cgi module
The news of the time interval is that Python is going to remove some standard library modules (via). This news caught my eye because two of the modules to be removed are cgi and its closely related kin cgitb. We have a number of little CGIs in our environment for internal use, and many of them are written in Python, so I expected to find us using cgi all over the place. When I actually looked, our usage was much lower than I expected, except for one thing.
Some of our CGIs are purely informational; they present some dynamic information on a web page, and don't take any parameters or otherwise particularly interact with people. These CGIs tend to use cgitb so that if they have bugs, we have some hope of catching things. When these CGIs were written, cgitb was the easy way to do something, but these days I would log tracebacks to syslog using my good way to format them.
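For a sense of what a cgitb-free version might look like, here is a minimal sketch that sends a traceback to syslog one line at a time and gives the visitor a bare error page. It uses only the standard `traceback` and `syslog` modules; the formatting is a plain default, not the format from that earlier entry, and `traceback_lines` and `run_cgi` are hypothetical names.

```python
import syslog
import traceback


def traceback_lines(exc):
    """Format an exception and its traceback as a list of non-blank lines."""
    text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return [line for line in text.splitlines() if line.strip()]


def run_cgi(main, emit=None):
    """Run main(); on failure log the traceback and print a bare 500 response."""
    if emit is None:
        def emit(line):
            syslog.syslog(syslog.LOG_ERR, line)
    try:
        main()
    except Exception as exc:
        for line in traceback_lines(exc):
            emit(line)
        # This assumes main() had not already started its own output.
        print("Status: 500 Internal Server Error")
        print("Content-Type: text/plain")
        print()
        print("Something went wrong; the details have been logged.")
```

The `emit` parameter is there only so the logging destination can be swapped out, for example when testing without a syslog daemon.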
(It will probably surprise no one that in the twelve years since I wrote that entry, none of our internal CGIs were changed away from using cgitb. Inertia is an extremely powerful force.)
Others of our CGIs are interactive, such as the CGIs we use for our self-serve network access registration systems. These CGIs need to extract information from submitted forms, so of course they use the ever-popular cgi.FieldStorage class. As far as I know there is and will be no standard library replacement for this, so in theory we will have to do something here. Since we don't want file uploads, it actually isn't that much work to read and parse a standard POST body, or we could just keep our own copy of cgi.py and use it in perpetuity.
(The real answer is that all of these CGIs are still Python 2 and are probably going to stay that way, with them running under PyPy if it becomes necessary because Ubuntu removes Python 2 entirely someday.)
PS: DWiki, the pile of Python that is rendering Wandering Thoughts for you to read, has its own code to handle GET parameters and POST forms, which is why I know that doing that isn't too much work. A very long time ago DWiki did use cgi.FieldStorage and I had some problems as a result, but that got entirely rewritten when I moved DWiki to being based on WSGI.
2022-03-02
A Python program can be outside of a virtual environment it uses
A while ago I wrote about installing modules to a custom location, and in that entry one reason I gave for not doing this with a virtual environment was that I didn't want to put the program involved into a virtual environment just to use some Python modules. Recently I realized that you don't have to, because of how virtual environments add themselves to sys.path. As long as you run your program using the virtual environment's Python, it gets to use all the modules you installed in the venv. It doesn't matter where the program is and you don't have to move it from its current location; you just have to change what 'python' it uses.
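A quick way to see this for yourself is a small script kept anywhere on disk that reports which interpreter it is running under. This is only a sketch; `in_virtualenv` and `describe_interpreter` are illustrative names, and the `sys.prefix` versus `sys.base_prefix` comparison applies to environments made by the standard venv module.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Collect the details that show which Python and which modules are in use."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "in_venv": in_virtualenv(),
        # site-packages directories on sys.path; a venv contributes its own
        "site_packages": [p for p in sys.path if p.endswith("site-packages")],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("%s: %s" % (key, value))
```

Running the same file first with the system python3 and then with the venv's bin/python should show the prefix and site-packages entries change while the script itself stays where it was.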
The full extended version of this is that if you have your program set up to run using '#!/usr/bin/env python3', you can change what Python and thus what virtual environment you use simply by changing the $PATH that it uses. The downside of this is that you can accidentally use a different Python than you intended because your $PATH isn't set up the way you thought it was, although in many cases this will result in immediate and visible problems because some modules you expected aren't there.
(One way this might happen is if you run the program using the system Python because you're starting it with a default $PATH. One classical way this can happen is running things from crontab entries.)
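If you want the failure to be loud and early rather than a stray ImportError somewhere in the middle of a run, the program can check for its modules at startup. The following is a minimal sketch using `importlib.util.find_spec`; the module names in `REQUIRED` are placeholders and the helper names are made up, and it only handles top-level module names.

```python
import importlib.util
import sys

REQUIRED = ["requests", "yaml"]  # placeholder names for this example


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_modules(names):
    """Exit with a pointed message if any needed module is unavailable."""
    missing = missing_modules(names)
    if missing:
        sys.exit(
            "missing modules %s under %s; is $PATH picking the right Python?"
            % (", ".join(missing), sys.executable)
        )
```

Calling `require_modules(REQUIRED)` as the first thing in the program means a cron job started with the wrong $PATH fails immediately with a message that names the interpreter it actually got.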
Another possible use for this, especially in the $PATH based version, is assembling a new virtual environment with new, updated versions of the modules you use in order to test your existing program with them. You can also use this to switch module versions back and forth in live usage just by changing the $PATH your program runs with (or by repeatedly editing its #! line, but that's more work).
Realizing this makes me much more likely in the future to just use virtual environments for third party modules. The one remaining irritation is that the virtual environment is specific to the Python version, but there are various ways of dealing with that. This is one of the cases where I think we're going to want to use 'pip freeze' (in advance) and then exactly reproduce our previous install in a new virtual environment. Or maybe we can get 'python3 -m venv --upgrade <venv-dir>' to work, although I'm not going to hold my breath on that one.
(A quick test suggests that upgrading the virtual environment doesn't work, at least for going from the Ubuntu 18.04 LTS Python 3 to the Ubuntu 20.04 LTS Python 3. This is more or less what I expected, given what would be involved, so building a new virtual environment from scratch it is. I can't say I'm particularly happy with this limitation of virtual environments, especially given that we always have at least two versions of Python 3 around because we always have two versions of Ubuntu LTS in service.)
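The freeze-then-rebuild approach could be scripted roughly as follows. This is a sketch under assumptions: it expects a POSIX layout with the venv's interpreter in bin/python, it assumes the old environment still works when you freeze it (hence doing that in advance of the upgrade), and the function names and the requirements.txt file name are made up for illustration.

```python
import os
import subprocess


def freeze_command(old_venv):
    """The command that lists exact versions installed in the old venv."""
    return [os.path.join(old_venv, "bin", "python"), "-m", "pip", "freeze"]


def rebuild_commands(new_python, new_venv, requirements):
    """The commands that create a fresh venv and reinstall the frozen versions."""
    venv_python = os.path.join(new_venv, "bin", "python")
    return [
        [new_python, "-m", "venv", new_venv],
        [venv_python, "-m", "pip", "install", "-r", requirements],
    ]


def rebuild(old_venv, new_python, new_venv, requirements="requirements.txt"):
    """Freeze the old venv to a file, then build the new venv from that file."""
    frozen = subprocess.run(
        freeze_command(old_venv), check=True, capture_output=True, text=True
    ).stdout
    with open(requirements, "w") as f:
        f.write(frozen)
    for argv in rebuild_commands(new_python, new_venv, requirements):
        subprocess.run(argv, check=True)
```

Splitting the command construction from the running makes it easy to print the commands and run them by hand the first time, which seems prudent for something that touches a production environment.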