Why your main program should be importable

September 7, 2008

When I first started coding in Python, I didn't know what I was doing. So I structured my Python programs the way I would write Bourne shell scripts or Perl programs, writing functions as necessary and useful but otherwise putting all of the logic and code in the program's file outside of functions (in what I now call 'module scope').

This is a perfectly rational structure for Python programs, and it even works; my programs ran fine and were perfectly functional. But it was also a bad mistake, as I slowly discovered later; what you really want to do is put all of your code in functions (and then start one of them with a bit of magic).

The problem that makes it a mistake is that a program written this way cannot be imported as if it were just another Python module; if you try, the program's code immediately starts running and explosive things start happening. There are at least two reasons why this is unfortunate:

  • various useful tools like pychecker rely on importing your code in order to pick through it. This is arguably a mistake on pychecker's part and they should be using a more robust mechanism, but it's how they work right now, so if you want to use them (and pychecker is usually quite useful) you have to live with it.

    (Discovering pychecker and trying to use it on my programs was how I began to realize the mistake I'd made.)

  • being able to import your main program gives you a handy method of testing bits of it from an interactive interpreter.

    To make this really work you need to code your program so that it calls sys.exit() as little as possible. If a function runs into a fatal error it should not do the usual 'call die() with an error message' thing; instead, it should raise an exception. Only the very top of the program should catch those exceptions and wind up calling sys.exit().

    (And if you don't like phase tracking, catching and wrapping exceptions can give you a nice method to add context to the error message that you'll wind up reporting.)
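
As a rough sketch of the 'raise exceptions, exit only at the top' approach from the second point (written for current Python 3; ProgramError, read_limit, load_limits and run are made-up names for illustration, not anything from my actual programs):

```python
import sys


class ProgramError(Exception):
    """Hypothetical exception for errors the program knows how to report."""


def read_limit(text):
    # Raise instead of calling die(); the caller decides what happens next.
    try:
        return int(text)
    except ValueError as e:
        raise ProgramError("bad limit %r" % text) from e


def load_limits(lines):
    # Catch and re-raise to add context about where the problem was.
    limits = []
    for lineno, line in enumerate(lines, 1):
        try:
            limits.append(read_limit(line))
        except ProgramError as e:
            raise ProgramError("line %d: %s" % (lineno, e)) from e
    return limits


def run(lines):
    # Only this top-level function turns an error into an exit status.
    try:
        print(sum(load_limits(lines)))
    except ProgramError as e:
        sys.stderr.write("prog: %s\n" % e)
        sys.exit(1)
```

From an interactive interpreter, a bad call to load_limits() just raises a ProgramError that you can look at; it does not end your session the way a buried sys.exit() would.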

I'm sure that this is strongly suggested somewhere in the Python documentation and the smart people were aware of it from the start, but I missed it (to my regret with those early programs).

Oh yes, the magic you need to make your top level function start running when your program is actually run (instead of being imported) is:

if __name__ == "__main__":
    ... run code here ...

At the module scope, __name__ is normally the name of your module (well, the name it is being imported under). When Python is running your code because it has been directly handed to the interpreter, Python sets __name__ to "__main__" instead.
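
One way to see the two values of __name__ for yourself is the following sketch, which assumes Python 3 and uses a throwaway file called demo_prog.py (a name invented here). It loads the same file once as a module and once the way the interpreter would run it as a program:

```python
import importlib.util
import os
import runpy
import tempfile

# A hypothetical one-line program that just records the name Python gave it.
SOURCE = "seen_name = __name__\n"


def names_seen():
    """Return (__name__ when imported, __name__ when run as a program)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_prog.py")
        with open(path, "w") as f:
            f.write(SOURCE)

        # Load the file as an ordinary module called 'demo_prog'.
        spec = importlib.util.spec_from_file_location("demo_prog", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Execute the same file with __name__ set the way a direct run sets it.
        run_globals = runpy.run_path(path, run_name="__main__")

    return module.seen_name, run_globals["seen_name"]
```

Calling names_seen() should give back the module's own name for the import and "__main__" for the direct run, which is the difference that the if test relies on.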

Sidebar: my current program structure

The program structure that I have wound up adopting for my own programs looks something like this:

import sys

def process(...):
    .....

def main(args):
    .....
    try:
        process(...)
    # 'except X as e' needs Python 2.6 or later
    except EnvironmentError as e:
        die("OS problem: "+str(e))
    except MyError as e:
        die(str(e))
    ....

if __name__ == "__main__":
    main(sys.argv[1:])

The main() function parses the arguments, loads configuration files, and so on, and then calls process() with whatever arguments are appropriate for the program; process() actually starts to do work. To put it one way, main() does all the stuff that only has to be done when the program is being run as an actual program.
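
To make the split concrete, here is one filled-in version of that skeleton, as a sketch only. It assumes Python 3, treats the arguments as plain file names, and has process() count lines; the line counting, the body of die() and the definition of MyError are all invented for the example.

```python
import sys


class MyError(Exception):
    """Errors that process() reports on purpose."""


def die(msg):
    # Assumed helper: print a message and exit with a failure status.
    sys.stderr.write("prog: %s\n" % msg)
    sys.exit(1)


def process(filenames):
    # Hypothetical work: count the lines in each named file.
    if not filenames:
        raise MyError("no files given")
    counts = {}
    for name in filenames:
        with open(name) as f:
            counts[name] = sum(1 for _ in f)
    return counts


def main(args):
    # Only main() talks to the outside world and decides to exit.
    try:
        counts = process(args)
    except EnvironmentError as e:
        die("OS problem: " + str(e))
    except MyError as e:
        die(str(e))
    else:
        for name, count in sorted(counts.items()):
            print("%s: %d" % (name, count))
```

With the if __name__ == "__main__": lines from the skeleton above added at the bottom, this runs as a program. Without running it, you can import the file and call process() on a list of file names from the interpreter, getting back either a dictionary or an exception.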

Written on 07 September 2008.