Python virtual environments transparently add themselves to sys.path

September 19, 2020

I have recently been exploring some aspects of Python's virtual environments, so I thought I had a reasonable understanding of how they worked. In the HN discussion of my entry on installing modules to a custom location, I saw someone say that you could just run the virtual environment's python binary (ie, <directory>/bin/python, which is normally a symlink to the system Python) without 'activating' the virtual environment and it would still find the modules that you'd installed in the venv. This surprised me, because I had expected that you'd have to set some sort of environment variable to get Python to add your arbitrary non-standard location to sys.path. However, testing it showed that it really works, which is both convenient and magical.

(For me as a sysadmin, the important thing is that we can run programs that use a virtual environment without having to set any environment variables beforehand. As long as we use the venv's Python, everything works, presumably even for things run from Unix cron or started as daemons. And we can arrange to always do that by setting '#!...' paths appropriately.)
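A minimal sketch of how one might check this behaviour, assuming a Python 3 with the standard venv module available. The helper names (venv_python, sys_path_of, demo) are made up for illustration. It creates a throwaway venv without pip, runs the venv's python directly with VIRTUAL_ENV, PYTHONPATH and PYTHONHOME removed from the environment, and reports the sys.path that interpreter sees.

```python
import json
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the interpreter inside a venv (bin/ on Unix, Scripts/ on Windows)."""
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def sys_path_of(python):
    """Run the given interpreter without any 'activation' and return its sys.path."""
    # Drop the variables that activation or the caller might have set,
    # so only the interpreter's own location can influence the result.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("VIRTUAL_ENV", "PYTHONPATH", "PYTHONHOME")
    }
    result = subprocess.run(
        [str(python), "-c", "import json, sys; print(json.dumps(sys.path))"],
        check=True,
        capture_output=True,
        text=True,
        env=env,
    )
    return json.loads(result.stdout)


def demo():
    """Create a throwaway venv (no pip) and return (venv directory, its sys.path)."""
    env_dir = Path(tempfile.mkdtemp(prefix="venv-demo-")).resolve()
    venv.create(env_dir, with_pip=False)
    return env_dir, sys_path_of(venv_python(env_dir))


if __name__ == "__main__":
    env_dir, paths = demo()
    print("venv:", env_dir)
    for entry in paths:
        print("  ", entry)
```

If things work as described, one of the printed entries should be the site-packages directory inside the temporary venv, even though nothing was activated.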

At first I guessed that this was something that Python did in general if it found an appropriate directory tree around where you were running the python executable from. However, it turns out not to be quite this general. Although Python has a quite intricate process on Unix for finding its standard library and site-packages (see the comment at the start of Modules/getpath.c), it doesn't go so far as to add random trees just because they sort of look right. Instead, there is a special feature for Python virtual environments that looks for pyvenv.cfg and uses it to trigger some additional things. To quote from the source code comment:

Search for an "pyvenv.cfg" environment configuration file, first in the executable's directory and then in the parent directory. If found, open it for use when searching for prefixes.

(I haven't attempted to trace through the tangled code to determine exactly how this results in your venv's site-packages getting added to sys.path.)
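The search order in that comment can be sketched in a few lines of Python. This is only an illustration of the order described (the executable's directory first, then its parent), not a transcription of what getpath.c does; find_pyvenv_cfg and demo are hypothetical names, and the 'home = /usr/bin' line is a placeholder value.

```python
import tempfile
from pathlib import Path


def find_pyvenv_cfg(executable):
    """Look for pyvenv.cfg next to the executable, then one directory up.

    Symlinks are deliberately not resolved here: the venv's python is
    normally a symlink to the system Python, and the search has to start
    from where the symlink lives, not from where it points.
    """
    exe_dir = Path(executable).absolute().parent
    for candidate in (exe_dir / "pyvenv.cfg", exe_dir.parent / "pyvenv.cfg"):
        if candidate.is_file():
            return candidate
    return None


def demo():
    """Build a tiny venv-shaped tree and search it."""
    root = Path(tempfile.mkdtemp(prefix="cfg-search-"))
    (root / "bin").mkdir()
    # Placeholder contents; only the file's presence matters for the search.
    (root / "pyvenv.cfg").write_text("home = /usr/bin\n")
    # The interpreter path does not have to exist for the search itself.
    return root, find_pyvenv_cfg(root / "bin" / "python")


if __name__ == "__main__":
    root, found = demo()
    print("searched from:", root / "bin" / "python")
    print("found:", found)
```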

Venv normally writes pyvenv.cfg to the root of your virtual environment directory tree (ie, in the parent of the executable's directory). For me the contents appear to be pretty generic; there's no mention of the virtual environment's location, and copying the pyvenv.cfg to the root of an artificially created minimal tree does cause its site-packages directory to get added to sys.path.
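To look at what venv writes on your own system, something like the following sketch may help. It assumes pyvenv.cfg consists of simple 'key = value' lines, and read_pyvenv_cfg and demo are hypothetical helper names. The exact set of keys can differ between Python versions, so the output may not match what is described above.

```python
import tempfile
import venv
from pathlib import Path


def read_pyvenv_cfg(path):
    """Parse the simple 'key = value' lines of a pyvenv.cfg into a dict."""
    settings = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip anything that is not a key = value line
            settings[key.strip().lower()] = value.strip()
    return settings


def demo():
    """Create a throwaway venv (no pip) and return its parsed pyvenv.cfg."""
    env_dir = Path(tempfile.mkdtemp(prefix="venv-cfg-"))
    venv.create(env_dir, with_pip=False)
    return read_pyvenv_cfg(env_dir / "pyvenv.cfg")


if __name__ == "__main__":
    for key, value in demo().items():
        print(f"{key} = {value}")
```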

Since all of this is at best lightly documented (the site module documentation and PEP 405 mention the pyvenv.cfg check, but only as part of how venvs work), it's probably best to consider it special private stuff that's only for the use of the standard venv system. If you want something like a virtual environment, complete with its own site-packages that will be automatically picked up when you run the Python from that directory hierarchy, just create a real virtual environment. They're pretty lightweight.

(Well, for a value of 'pretty lightweight' that amounts to 947 files, directories, and symbolic links and 12 Mbytes of disk space. Almost all of these are from installing pip into your new virtual environment; it drops to 13 files if you use --without-pip.)
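If you want to measure this for your own Python, a rough sketch follows. It creates a venv without pip and counts what ends up in the tree; tree_stats and demo are hypothetical names. The counts and sizes will vary by platform and Python version, and the byte total only sums regular files, so it will not exactly match what du reports.

```python
import os
import stat
import tempfile
import venv


def tree_stats(root):
    """Count files, directories and symlinks under root, plus regular-file bytes."""
    entries = 0
    total_bytes = 0
    # os.walk does not follow symlinks by default; symlinks to directories
    # show up in dirnames and are counted once without being descended into.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entries += 1
            info = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(info.st_mode):
                total_bytes += info.st_size
    return entries, total_bytes


def demo():
    """Create a throwaway venv without pip and return its (entries, bytes)."""
    env_dir = tempfile.mkdtemp(prefix="venv-size-")
    venv.create(env_dir, with_pip=False)
    return tree_stats(env_dir)


if __name__ == "__main__":
    entries, total_bytes = demo()
    print(f"{entries} entries, {total_bytes} bytes in regular files")
```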

Written on 19 September 2020.

Last modified: Sat Sep 19 00:28:57 2020