How to get as much of your program byte-compiled as possible

September 8, 2008

The CPython interpreter doesn't run Python code directly, but first compiles it to bytecode. Because these parsing and compiling steps are somewhat time-consuming, CPython optimizes things by saving the compiled bytecode into .pyc files (or .pyo files if you used the -O switch to python), and trying to use them when possible. This speed increase doesn't necessarily matter all that much for a program that runs a long time, but it does matter for typical utility programs, which only run for a short time and may get executed a lot. (This is especially so if they have a lot of code that is only used occasionally, so that Python has a lot of code to compile that it will never actually run.)

One of the less well known things about CPython (at least on Unix) is that it only loads bytecode files when you import modules. If you just run a file of Python code with 'python file.py' (which is equivalent to a script that starts with '#!/usr/bin/python'), Python does not load file.pyc even if it exists, and it certainly won't create it. So if you have a big program full of Python code, you're paying the cost to byte-compile it each time you run it.
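
To see this behaviour for yourself, here is a minimal sketch. It assumes a modern Python 3, where cached bytecode lands in a __pycache__ directory instead of next to the source as file.pyc; the file name demo.py and the helper function names are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile


def _clean_env():
    """Copy the environment, minus settings that change bytecode caching."""
    env = dict(os.environ)
    for name in ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX"):
        env.pop(name, None)
    return env


def cached_files(directory):
    """Return the names of cached bytecode files found for `directory`."""
    # Old-style caches (Python 2) sit next to the source file.
    names = [n for n in os.listdir(directory) if n.endswith((".pyc", ".pyo"))]
    # New-style caches (Python 3.2 and later) live in __pycache__.
    cache_dir = os.path.join(directory, "__pycache__")
    if os.path.isdir(cache_dir):
        names += os.listdir(cache_dir)
    return sorted(names)


def run_versus_import():
    """Run demo.py directly, then import it; list cache files after each."""
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "demo.py")  # hypothetical file name
        with open(script, "w") as f:
            f.write("x = 1\n")
        env = _clean_env()

        # 1. Run the file directly, as 'python demo.py' would.
        subprocess.run([sys.executable, script], env=env, check=True)
        after_run = cached_files(workdir)

        # 2. Import the same file as a module from a second interpreter.
        importer = "import sys; sys.path.insert(0, sys.argv[1]); import demo"
        subprocess.run([sys.executable, "-c", importer, workdir],
                       env=env, check=True)
        after_import = cached_files(workdir)
        return after_run, after_import


if __name__ == "__main__":
    after_run, after_import = run_versus_import()
    print("cache files after running directly:", after_run)
    print("cache files after importing:", after_import)
```

If things work as described, the first list should be empty and the second should contain a single compiled file for demo.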

The way around this is obvious: put all of your program's code in a module, and then have your 'program' just be a tiny bit of Python that imports the module and then calls your main entry point. (Of course, this may then expose you to search path issues.)
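
As a rough sketch of that layout, the following writes a hypothetical module (bigprog_impl.py) and a tiny launcher (bigprog) into a temporary directory, runs the launcher, and reports what got cached. The names are invented for illustration, and the sketch assumes a Python 3 that puts cache files in __pycache__. It sidesteps the search path question by keeping both files in the same directory, which a real installation may not do.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical module that holds all of the program's real code.
MODULE_SOURCE = textwrap.dedent('''\
    def main(argv):
        print("hello from bigprog_impl, args:", argv)
        return 0
''')

# The tiny launcher: the only code that is recompiled on every run.
LAUNCHER_SOURCE = textwrap.dedent('''\
    import sys
    import bigprog_impl
    sys.exit(bigprog_impl.main(sys.argv[1:]))
''')


def build_and_run(args):
    """Write both files, run the launcher, return (stdout, cache files)."""
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "bigprog_impl.py"), "w") as f:
            f.write(MODULE_SOURCE)
        launcher = os.path.join(workdir, "bigprog")
        with open(launcher, "w") as f:
            f.write(LAUNCHER_SOURCE)

        # Drop settings that would disable or relocate bytecode caching,
        # or stop the launcher's own directory from being searched.
        env = dict(os.environ)
        for name in ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX",
                     "PYTHONSAFEPATH"):
            env.pop(name, None)

        result = subprocess.run([sys.executable, launcher] + list(args),
                                env=env, capture_output=True, text=True,
                                check=True)
        cache_dir = os.path.join(workdir, "__pycache__")
        cached = sorted(os.listdir(cache_dir)) if os.path.isdir(cache_dir) else []
        return result.stdout, cached


if __name__ == "__main__":
    output, cached = build_and_run(["a", "b"])
    print(output, end="")
    print("cached bytecode:", cached)
```

Only the module should show up in the cache. The launcher itself is still compiled on every run, but at three lines that cost is negligible.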

The one other thing to watch out for is that the user running the Python program may not have enough permissions on the directory with the Python modules to write .pyc files there. If this is the case, you may need to 'pre-compile' the modules by running the program once as a user with enough permissions (or just manually importing them in an interpreter).
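
One way to do the pre-compilation without running the program at all is the standard library's compileall module. This is a minimal sketch, not the only approach: the precompile() and demo() helpers and the module name bigprog_impl.py are made up for illustration, and the check for the cache file assumes Python 3's importlib.

```python
import compileall
import importlib.util
import os
import tempfile


def precompile(directory):
    """Byte-compile every .py file under `directory`; True if all compiled."""
    # quiet=1 suppresses the per-file listing but still reports errors.
    return bool(compileall.compile_dir(directory, quiet=1))


def demo():
    """Pre-compile a throwaway module and check that its cache file exists."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "bigprog_impl.py")  # hypothetical name
        with open(source, "w") as f:
            f.write("def main():\n    return 0\n")
        ok = precompile(workdir)
        # Ask Python where the cached bytecode for this source file belongs.
        cache_path = importlib.util.cache_from_source(source)
        return ok, os.path.exists(cache_path)


if __name__ == "__main__":
    print(demo())
```

In practice you would run something like this as a user who can write to the module directory, pointing precompile() at wherever the program's modules are installed. The same thing is available from the command line as 'python -m compileall' followed by the directory.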


Comments on this page:

From 220.245.180.138 at 2008-09-08 02:31:18:

Check out compileall.py (in Python's lib dir, e.g. /usr/lib/python2.5)

usage: python compileall.py [-l] [-f] [-q] [-d destdir] [-x regexp] [directory ...]
-l: don't recurse down
-f: force rebuild even if timestamps are up-to-date
-q: quiet operation
-d destdir: purported directory name for error messages
   if no directory arguments, -l sys.path is assumed
-x regexp: skip files matching the regular expression regexp
   the regexp is search for in the full path of the file

Written on 08 September 2008.

Last modified: Mon Sep 8 00:18:07 2008