2005-10-28
Inside building RPMs with Python distutils
For reasons outlined in CPANProblem, I don't install
Python packages using their setup tools; I build them into RPMs and
install the RPMs. Python's distutils (the setup system used by most
Python packages) have a command to do this, called bdist_rpm, but
it has some issues that have caused me to become quite familiar with
its inner workings.
bdist_rpm works in the following steps:
- use the distutils sdist command to create a source tarball.
- write an RPM specfile.
- create a local RPM build tree and put the specfile and tarball into it.
- run rpmbuild in the local RPM build tree to create the source or binary RPM.
My experience is that the simpler the package and the package's
setup.py, the more likely that everything will go fine. Otherwise,
you can run into a variety of problems.
Because of step #1, bdist_rpm is only really suitable for building
binary RPMs for internal usage. If you want to build public RPMs you
should use bdist_rpm only to generate the specfile, and put this
together with the package's real source distribution yourself. Using
bdist_rpm just to get the specfile is also the 'least moving
parts' option if it's giving you a lot of trouble.
Distutils has three ways of figuring out what to put in the step #1 source tarball, documented in more detail in the creating a source distribution section of the Distributing Python Modules documentation. They are, in order of preference:
- a MANIFEST file that lists all files that should be included, or
- a MANIFEST.in file that gives general directions (often also using stuff from setup.py), or
- reverse-engineering the list of sources from setup.py.
If this step is omitting files you want, the big hammer is to just
write a MANIFEST file (or modify the one sdist leaves sitting
around).
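If you go the MANIFEST.in route, the directives are straightforward; a small sketch (the file and directory names here are made up for illustration):

```
include README CHANGES
recursive-include examples *.py
recursive-include doc *.txt
prune build
```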
The RPM specfile created in step #2 automatically picks up and
packages everything that a normal installation of the package
creates. At most, you may find that it's omitting extra documentation
or examples or the like that you want packaged up. (It's supposed to
automatically package up README files as %doc files, but this
doesn't always work.)
Steps #3 and #4 assume a standard RPM subdirectory
layout. Unfortunately this explodes if you have used a
$HOME/.rpmmacros file to rearrange things more cleanly. The easiest
workaround is to use something like
'HOME=/ python setup.py bdist_rpm', so that it doesn't consult
your .rpmmacros file.
2005-10-22
A gotcha with Python and Unix signals
Python likes to handle things through exceptions. As part of this, on
Unix it does two important signal changes; it ignores SIGPIPE and
catches SIGINT. Each of these can make a Python program report
apparent errors where a normal Unix command-line program would just
silently exit.
This matters if you want to write a Python program that will play well
in an ordinary command-line environment, alongside things like cat,
sed, and awk.
First, people expect that they can ^C a Unix command-line program and
have it just quietly stop. Python's default behavior turns this into a
KeyboardInterrupt exception, which your program is probably not
catching; the user will get a multi-line traceback.
Second and more important, Python ignoring SIGPIPE means that
your program will get an OSError exception if it writes to a pipe
that has closed. Pipes close all the time in Unix command pipelines
when you write things like:
generate | mangle.py | head -10
Since head exits after it's read and printed ten lines, further
output from mangle.py is probably going to get an OSError. If you
didn't handle it (do you guard print statements with trys?), the
person running this will see a traceback on standard error. People
tend to get irritated when their clean output is messed up with
'error' messages.
(head is not the only program that will do this, and it doesn't
necessarily happen all the time. Consider what happens when you feed
the output to a pager and quit after seeing the first screen.)
The technique I use for this is:
from signal import signal, \
     SIGPIPE, SIGINT, SIG_DFL, \
     default_int_handler
signal(SIGPIPE, SIG_DFL)
s = signal(SIGINT, SIG_DFL)
if s != default_int_handler:
    signal(SIGINT, s)
Checking what SIGINT is set to is necessary because when your Python
program is being run via nohup and similar things, SIGINT will be
set to SIG_IGN. If we always set SIGINT to SIG_DFL, we would
defeat nohup and irritate the user even more.
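To see why the check matters, you can simulate a nohup-style environment where SIGINT is already ignored; the guarded version leaves that setting alone. A sketch, assuming a Unix platform:

```python
from signal import (signal, getsignal, SIGINT,
                    SIG_IGN, SIG_DFL, default_int_handler)

# Simulate running under nohup: SIGINT is already being ignored.
signal(SIGINT, SIG_IGN)

# The guarded reset: only keep SIG_DFL if the old handler was
# Python's default KeyboardInterrupt handler.
s = signal(SIGINT, SIG_DFL)
if s != default_int_handler:
    signal(SIGINT, s)

assert getsignal(SIGINT) is SIG_IGN   # nohup's setting survived
```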
(This little thing with SIGINT is not unique to Python; it's something
you should watch out for in any program where you're setting a SIGINT
handler explicitly. Python itself does it the right way on startup,
leaving a SIG_IGN setting alone.)
2005-10-20
A Python surprise: exiting is an exception
Once upon a time I wrote a program to scan incoming mail messages during the SMTP conversation for signs of spam. Because this was running as part of our mailer, reliability was very important; unhandled errors could cause us to lose mail.
As part of reliability, I decided I wanted to catch any unhandled exception (which would normally abort the program with a backtrace), log the backtrace, save a copy of the message that caused the bug, and so on. So I wrote code that went like this:
def catchall(M, routine):
    try:
        routine(M)
    except:
        # log backtrace, save the message, etc
        pass
The mail checking routine told the mailer whether to accept or reject
the message through the program's exit status code, so the routine
calls sys.exit() when it's done.
Then I actually ran this code, and the first time through it dutifully
spewed a backtrace into the logs about the code raising an unhandled
exception called SystemExit.
Naively (without paying attention to the documentation for the
sys module) I had
expected sys.exit() to mostly just call the C library exit()
function. As the sys module documentation makes clear, this is not
what happens: sys.exit() raises SystemExit, the whole chain of
exception handling happens (in part so that finally clauses get
executed), and at the top of the interpreter you finally exit.
Unless, of course, you accidentally catch SystemExit. Then very odd
things can happen.
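A minimal demonstration of the trap:

```python
import sys

def routine():
    sys.exit(2)       # raises SystemExit(2); no immediate exit

status = None
try:
    routine()
except:               # a broad except swallows SystemExit too
    status = sys.exc_info()[1].code

assert status == 2    # the program is still running
```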
(This is another example of why
broad excepts are dangerous.)
Accidentally shooting yourself in the foot in Python
Recently, I stumbled over a small issue in Python's cgi module that is a good illustration of how unintended consequences in Python can wind up shooting you in the foot.
The cgi module's main purpose is to create a dictionary-like
object that contains all of the parameters passed to your CGI program
in the GET or POST HTTP command. DWiki uses it roughly like this:
form = cgi.FieldStorage()
for k in form.keys():
    ... stuff ...
Then one day a cracker tried an XML-RPC based exploit against DWiki and
this code blew up, getting a TypeError from the form.keys() call.
This is at least reasonable, because an XML-RPC POST is completely
different than a form POST and doesn't actually have any form
parameters. (TypeError is a bit strong, but it did ensure that DWiki
paid attention.)
No problem; I could just guard the form.keys() call with an 'if not
form: return'. Except that the 'not form' got the same TypeError.
Which is startling, because you don't normally expect 'not obj' to
throw an error.
This surprising behavior of the cgi module happens through three steps. First, Python decides whether objects are True or False like this:
- if there is a __nonzero__ method, call that.
- if there is a __len__ method, a zero length is False and otherwise you're True (because Python usefully makes collections false if they're empty and true if they contain something).
- if there is neither, you're always True.
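A quick sketch of the truth-testing protocol (in Python 3, __nonzero__ was renamed __bool__; this example works in both because it defines only __len__):

```python
class Box:
    # No __nonzero__/__bool__, so truth testing falls back on __len__.
    def __init__(self, items):
        self.items = items
    def __len__(self):
        return len(self.items)

class Plain:
    # Neither method defined: instances are always true.
    pass

assert not Box([])        # empty -> false
assert Box([1, 2])        # non-empty -> true
assert Plain()            # always true
```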
As a dictionary-like thing, FieldStorage defines a __len__ method
in the obvious way:
def __len__(self):
    return len(self.keys())
Finally, FieldStorage decided to let instances represent several
different things and that calling .keys() on an instance that wasn't
dealing with form parameters should throw TypeError. (This is more
sensible than this description may make it sound.)
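To watch the whole chain fire, here is a toy stand-in (not the real cgi.FieldStorage) that mimics just the relevant behavior:

```python
class FakeStorage:
    # Toy imitation of cgi.FieldStorage: .list is None when the
    # POST body wasn't form parameters (e.g. an XML-RPC request).
    def __init__(self, value_list):
        self.list = value_list
    def keys(self):
        if self.list is None:
            raise TypeError("not indexable")
        return [k for k, v in self.list]
    def __len__(self):
        return len(self.keys())

form = FakeStorage(None)      # simulate the XML-RPC POST
threw = False
try:
    bool(form)                # 'not form' calls __len__ -> keys() -> boom
except TypeError:
    threw = True
assert threw
```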
Apart from a practical illustration of unintended consequences and complex interactions, what I've taken away from this is to remember that __len__ on objects is used for more than just the len() function. (Other special methods also have multiple uses.)
Sidebar: so how did I solve this?
My solution was lame:
try:
    form.keys()
except TypeError:
    return
I suspect that the correct solution is to check form.type to make
sure that the POST came in with a Content-Type header of
'application/x-www-form-urlencoded'. (Except I don't know enough to
know if all POSTs to DWiki will always arrive like that. Ah, HTTP,
we love you so.)
2005-10-03
Some important notes on getting all objects in Python
It turns out that I'm wrong about several things I mentioned in GetAllObjects, although the code there is still useful and as correct as you can reasonably get. However, it does have a few limitations and may miss objects under some circumstances.
First, gc.get_objects actually returns all container objects.
In specific, it returns all objects that can participate in reference
cycles; this necessarily includes all container objects (dicts, tuples,
and lists), but also other types. (My code that seemed
to say otherwise was in error; I didn't do a proper breadth-first
traversal of the list.)
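A quick check of this on a modern CPython (gc.is_tracked postdates this entry; I'm using it here purely for illustration):

```python
import gc

# Containers like dicts are tracked by the cycle collector and show
# up in gc.get_objects(); atomic objects like small ints do not.
marker = {"tag": ["get-all-objects-demo"]}
tracked = gc.get_objects()
assert any(o is marker for o in tracked)

assert gc.is_tracked(marker)
assert not gc.is_tracked(42)
```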
Second, it's possible that expanding gc.get_objects may not get
all objects. The main way this can happen is that gc.get_objects
can't see objects that are only referred to from C code, for example if
a compiled extension module is holding on to an object for later use
without creating a visible name binding. (One example of this is the
signal module, which
holds an internal reference to any function set as a signal handler.)
If you need a completely accurate count, you need to use a debug build
of Python. This keeps an internal list of all live dynamically allocated
Python objects and makes it available via some additional functions in
the sys module. (Naturally this slows the interpreter down and makes
it use more memory.)
Even this has an omission: it lists only 'heap' objects, those that
have been dynamically allocated. Python has a certain number of 'static'
objects, such as type objects in the C code (instead of being created,
their names just get registered with the Python interpreter). There
are also static plain objects, for example True, False, and None.
However, many of these static objects will appear on the expanded
gc.get_objects list. This is because they are referred to by live
objects and gc.get_referents is happy to include them in its
results. (This may not be too useful for object usage counting, since
you can't get rid of static objects anyways.)
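For instance, the None and True singletons are static objects, but they turn up as soon as you ask gc.get_referents about a live container that refers to them:

```python
import gc

# Static singletons are not heap objects, but they still appear
# among the referents of live containers that hold them.
d = {"x": None, "y": True}
refs = gc.get_referents(d)
assert any(r is None for r in refs)
assert any(r is True for r in refs)
```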
I owe a debt of thanks to Martin v. Löwis, who graciously took the time to correct my misconceptions and errors, and explain things to me. (Any remaining errors are of course my fault.)
(The charm of blogging is that I get to make mistakes like this in public. On the upside, I now know a bunch more about the insides of the CPython implementation than I used to.)