Python can execute zip files

October 6, 2012

One of my long-running little bits of unhappiness is that Python strongly encourages modular programming but makes it awkward to write little programs in a modular way. Modules have to be separate files, and once you have multiple files you have two problems: the main program has to be able to find those modules to load them, and you have to distribute multiple files and install them somehow instead of just giving people a self-contained file and telling them 'run this'. I recently found that there is a (hacky) way around this, although it's probably not news to people who are more plugged into Python distribution issues than I am.

The first trick is that Python can 'run' directories. If you have a directory with a file called __main__.py and you do 'python <directory>', Python will run __main__.py. Note that it does so directly, without importing the module; this has various awkward consequences. Python will also do something similar with 'python -m <module>', but there the module must be on your Python search path and it will be imported before <module>/__main__.py is executed.
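To make the first trick concrete, here is a minimal sketch that builds a throwaway directory and runs it. The names demoprog, helper.py and greet() are made up for illustration; the only name Python itself cares about is __main__.py. It relies on the directory being put at the front of the module search path when you run it, which is why the plain 'import helper' works.

```python
import os
import subprocess
import sys
import tempfile


def run_directory_demo():
    """Build a tiny program directory and run it as 'python <directory>'."""
    with tempfile.TemporaryDirectory() as tmp:
        progdir = os.path.join(tmp, "demoprog")  # hypothetical program name
        os.mkdir(progdir)

        # A support module that lives next to __main__.py.
        with open(os.path.join(progdir, "helper.py"), "w") as f:
            f.write("def greet():\n    return 'hello from helper'\n")

        # The entry point; it is run directly, so __name__ is '__main__'.
        with open(os.path.join(progdir, "__main__.py"), "w") as f:
            f.write("import helper\nprint(__name__)\nprint(helper.greet())\n")

        # Equivalent to typing: python demoprog
        result = subprocess.run(
            [sys.executable, progdir],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.splitlines()
```

Calling run_directory_demo() returns the two lines the program printed: the value of __name__ and the greeting from the helper module.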

The second trick is that Python will import things (ie load code) from zipfiles, basically treating them as (encoded) directories; the exact specifics of this are beyond the scope of this entry. As an extension of the first trick, Python will 'run' zipfiles as if they were directories; if you do 'python foo.zip' and foo.zip contains __main__.py, it gets run.
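The zipfile version looks much the same. This sketch uses the standard zipfile module to create foo.zip with a __main__.py and one support module at the top level of the archive, then runs it. As before, helper.py and greet() are illustrative names, not anything Python requires.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def run_zipfile_demo():
    """Build foo.zip with a __main__.py inside and run it as 'python foo.zip'."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "foo.zip")

        with zipfile.ZipFile(archive, "w") as zf:
            # Both files sit at the top level of the archive.
            zf.writestr(
                "helper.py",
                "def greet():\n    return 'hello from inside the zip'\n",
            )
            zf.writestr("__main__.py", "import helper\nprint(helper.greet())\n")

        # Equivalent to typing: python foo.zip
        result = subprocess.run(
            [sys.executable, archive],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
```

The import of helper is satisfied from inside the zipfile itself, so nothing needs to be unpacked or installed first.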

The third trick is that Python is smart enough to do this even when the 'zipfile' has a '#! ....' line at the start. In fact Python is willing to accept quite a lot of things before the actual zipfile; experimentally, it will skip lines that start with '#', blank lines, and lines that only have whitespace. In other words, you can take a zipfile that's got your __main__.py plus associated support modules and put a #!... line on the front to make it a standalone script (at least on Unix).

Since Python supports it, I strongly suggest also adding a second line with a '#' comment explaining what this peculiar thing is. That way people who try to look at your Python program won't get completely confused. Additional information is optional but possibly useful.
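Here is one way to build such a file, as a sketch rather than a recommended tool. It writes the '#!' line and an explanatory comment line first and then lets the zipfile module append the archive to the same file. The interpreter path, the comment wording, and the names build_script, demoprog and helper.py are all my assumptions for the example.

```python
import os
import stat
import subprocess
import sys
import tempfile
import zipfile

# Two '#' lines in front of the zip data: the interpreter line, then a note
# for anyone who opens the file expecting readable Python source.
HEADER = (
    b"#!/usr/bin/env python\n"
    b"# This file is a zip archive of Python code with a short text header.\n"
)


def build_script(path):
    """Write HEADER followed by a zip archive that contains __main__.py."""
    with open(path, "wb") as f:
        f.write(HEADER)
        # ZipFile starts writing at the current file position, after the header.
        with zipfile.ZipFile(f, "w") as zf:
            zf.writestr(
                "helper.py",
                "def greet():\n    return 'hello from a self-contained script'\n",
            )
            zf.writestr("__main__.py", "import helper\nprint(helper.greet())\n")
    # Mark it executable so that it can be run as ./demoprog on Unix.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)


def run_script_demo():
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "demoprog")  # hypothetical script name
        build_script(script)
        with open(script, "rb") as f:
            first_line = f.readline()
        # Run it through the current interpreter so the demo does not depend
        # on what 'python' happens to be on $PATH.
        result = subprocess.run(
            [sys.executable, script],
            capture_output=True, text=True, check=True,
        )
        return first_line, result.stdout.strip()
```

On Unix you could also run the resulting file directly, provided the interpreter named on the first line exists on the machine.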

(I believe that all of this has been in Python for some time. I've just been slow to discover it, although I vaguely knew that Python could import code from zipfiles.)

Sidebar: zipfiles and byte-compilation

First off, as always, (C)Python will only load precompiled .pyc bytecode files when (and if) you import modules. Your __main__.py will not have any bytecode version loaded, so you want to make it as small as possible. Second, Python doesn't modify a zipfile when it imports code from it, which means that if you don't include .pyc files in your zipfile, CPython will compile all your code to bytecode every time your program is run.

The solution is straightforward: run your program from its directory once (with some do-nothing arguments) so that Python writes out the .pyc files, then pack everything, .pyc files included, into a zipfile.
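If you would rather not depend on a trial run, the bytecode can also be generated explicitly. The sketch below assumes a flat source directory and a Python 3 interpreter, where an ordinary run writes bytecode into a __pycache__ subdirectory; as far as I know, imports from a zipfile only look for a .pyc sitting right next to the .py, so the sketch writes helper.pyc beside helper.py itself. The function and file names are invented for the example, and it skips __main__.py in keeping with the advice above.

```python
import os
import py_compile
import subprocess
import sys
import tempfile
import zipfile


def build_precompiled_zip(srcdir, archive):
    """Zip every .py file in srcdir, plus a .pyc beside each support module."""
    with zipfile.ZipFile(archive, "w") as zf:
        for name in sorted(os.listdir(srcdir)):
            if not name.endswith(".py"):
                continue
            src = os.path.join(srcdir, name)
            zf.write(src, arcname=name)
            if name != "__main__.py":
                # Write e.g. helper.pyc next to helper.py, not in __pycache__.
                pyc = py_compile.compile(src, cfile=src + "c", doraise=True)
                zf.write(pyc, arcname=name + "c")


def run_precompiled_demo():
    with tempfile.TemporaryDirectory() as tmp:
        srcdir = os.path.join(tmp, "src")
        os.mkdir(srcdir)
        with open(os.path.join(srcdir, "helper.py"), "w") as f:
            f.write("def greet():\n    return 'hello, precompiled'\n")
        with open(os.path.join(srcdir, "__main__.py"), "w") as f:
            f.write("import helper\nprint(helper.greet())\n")

        archive = os.path.join(tmp, "prog.zip")
        build_precompiled_zip(srcdir, archive)

        with zipfile.ZipFile(archive) as zf:
            names = sorted(zf.namelist())
        result = subprocess.run(
            [sys.executable, archive],
            capture_output=True, text=True, check=True,
        )
        return names, result.stdout.strip()
```

The program runs whether or not the bundled bytecode ends up being used; if Python decides a .pyc is unsuitable, the .py source is still in the archive to fall back on.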

Note that this makes zipfiles somewhat less generic than you might like. CPython bytecode is specific to (roughly) the Python version, so eg Python 2.7 will not load bytecode generated by Python 2.6 and vice versa. Your zipfile program may run unchanged on both, but one may have a startup delay.
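One way to see the version dependence is to look at the first few bytes of a .pyc file, which hold a 'magic number' identifying the bytecode format. This sketch assumes Python 3, where the running interpreter's value is available as importlib.util.MAGIC_NUMBER; the function names and the fake file are made up for the demonstration.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_this_python(pyc_path):
    """Return True if the .pyc starts with this interpreter's magic number."""
    magic = importlib.util.MAGIC_NUMBER
    with open(pyc_path, "rb") as f:
        return f.read(len(magic)) == magic


def magic_number_demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "helper.py")
        with open(src, "w") as f:
            f.write("VALUE = 1\n")

        # Bytecode written by this interpreter carries its magic number.
        good = py_compile.compile(src, cfile=src + "c", doraise=True)

        # Stand-in for bytecode from some other Python version: a file
        # whose leading bytes are simply different.
        bad = os.path.join(tmp, "other.pyc")
        with open(bad, "wb") as f:
            f.write(b"\x00\x00\x00\x00" + b"not real bytecode")

        return pyc_matches_this_python(good), pyc_matches_this_python(bad)
```

A check like this could be run over a zipfile's .pyc members to tell whether a given Python will actually use them or will quietly recompile from source.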

Written on 06 October 2012.

Last modified: Sat Oct 6 23:44:08 2012