My sysadmin view of Python virtualenvs

April 28, 2013

It all started with a tweet from Matt Simmons:

Dear #Python devs: I'm reading this: (link) - How are virtualenvs not a security nightmare?

There are certainly many things that can go wrong with virtualenvs, but there are also many things that can go wrong with servers and OS packages (as I tweeted, you can have an obscure one-off server just as easily as you can have an obscure one-off virtualenv). My view is that there are both drawbacks and advantages to virtualenvs and to lesser solutions (like installing your own copies of packages outside of the system Python area).

There are three drawbacks of virtualenvs and similar setups. First and foremost, you (the person building the virtualenv) have just become not a sysadmin but an OS distribution vendor, in that it is now your job to track security issues and bugs in everything in use in the virtualenv, from the version of Python on up. If you are not plugged into all of the places where these get announced, Matt Simmons is correct and your virtualenv may be a ticking time bomb of security issues.

The second drawback is common to anything that installs packages outside of the standard packaging system: there is no system-wide visibility into what packages (and what versions of them) are installed and in use on the system. If someone hears that there is an important issue with version X of package Y, having a horde of virtualenvs means that there is no simple way to answer the question of 'are we running that?' A related issue is that you can't just update everything at once by installing a system package update.
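As a rough illustration of what answering 'are we running that?' can involve, here is a minimal sketch that looks for one package across a directory full of virtualenvs. It assumes that all of the virtualenvs live directly under a single parent directory, that they use the usual Unix layout of `lib/pythonX.Y/site-packages`, and that the packages were installed with ordinary `.dist-info` or `.egg-info` metadata. The function names and the `/srv/venvs` path are made up for the example.

```python
import re
from importlib import metadata
from pathlib import Path


def _normalize(name):
    # Treat 'Django', 'django' and 'some_pkg' vs 'some-pkg' as the same name.
    return re.sub(r"[-_.]+", "-", name).lower()


def site_packages_dirs(root):
    # Assumption: each virtualenv is a direct child of root and uses the
    # Unix layout <venv>/lib/pythonX.Y/site-packages.
    pattern = "*/lib/python*/site-packages"
    return sorted(p for p in Path(root).glob(pattern) if p.is_dir())


def find_installed(root, package):
    # Return (site-packages path, version) for every virtualenv under root
    # that has the given package installed.
    wanted = _normalize(package)
    hits = []
    for sp in site_packages_dirs(root):
        for dist in metadata.distributions(path=[str(sp)]):
            name = dist.metadata["Name"]
            if name and _normalize(name) == wanted:
                hits.append((str(sp), dist.version))
    return hits


if __name__ == "__main__":
    # Hypothetical location; substitute wherever your virtualenvs live.
    for where, version in find_installed("/srv/venvs", "Django"):
        print(version, where)
```

This only covers virtualenvs that you know about and that sit where you expect them to, which is exactly the visibility problem; it does nothing for private copies of packages that have been installed somewhere else.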

(It follows from these two issues that developers absolutely cannot just bundle up a virtualenv, throw it over the wall to operations, and then forget about it. If you do that you're begging for problems down the line.)

The final issue is that if you depend on virtualenvs you may run into problems integrating your software into environments that basically must use the system version of Python. One example is if you develop in a virtualenv and then decide that you want to deploy with Apache's mod_wsgi (perhaps because it is unexpectedly good); mod_wsgi is built against one specific version of Python, which is normally the system one if you get mod_wsgi from your OS packages. Presumably if you start down the virtualenv path you've already thought about this.

Set against this are two significant advantages. The first advantage is that you get the version of everything that you want without having to fight against the system package management system (which leads to serious problems). This is especially useful if you're using one of the OS distributions with long term support, which in practice means that they have obsolete versions of pretty much everything.

The second advantage is that you are not at risk of a package update from your OS distribution blowing up your applications. How much of a real risk you consider this depends on how much trust you place in your OS distribution vendor and what sort of changes they tend to make. Some OSes will happily do major package version changes as the 'simplest' way to fix security issues (or just because a new major version came out and should be compatible); some are much more conservative.

With virtualenvs you're isolated from this and you can also take a selective, per-application approach to updates, where some applications are okay with the new version (or are sufficiently unimportant that you'll take the risk) and other applications need to be handled very carefully with a lot of testing.

(I haven't used a full-blown virtualenv, but our single Django app uses a private version of Django because the system version of Django was too old on the version of Ubuntu LTS we originally deployed it on. And yes, tracking Django security updates and so on is kind of a pain.)

