Why your main program should be importable
When I first started coding in Python, I didn't know what I was doing.
So I structured my Python programs the way I would write Bourne shell
scripts or Perl programs, writing functions as necessary and useful
but otherwise putting all of the logic and code in the program's file
outside of functions (in what I now call 'module scope').
This is a perfectly rational structure for Python programs, and even
works; my programs ran fine and were perfectly functional. But it was
also a bad mistake, as I slowly discovered later; what you really want
to do is put all of your code in functions (and then start one of them
with a bit of magic, covered below).
The problem that makes it a mistake is that a program written this way
cannot be imported as if it were just another Python module; if you
try, the program's code immediately starts running and explosive things
start happening. There are at least two reasons why this is unfortunate:
- various useful tools like pychecker rely on importing your code in
order to pick through it. This is
arguably a mistake on pychecker's part and they should be using a
more robust mechanism, but it's how they work right now, so if you want
to use them (and pychecker is usually quite useful) you have to live
with it.
(Discovering pychecker and trying to use it on my programs was how
I began to realize the mistake I'd made.)
- being able to import your main program gives you a handy method of
testing bits of it from an interactive interpreter.
To make this really work you need to code your program so that it
calls sys.exit() as little as possible. If a function runs into
a fatal error it should not do the usual 'call die() with an
error message' thing; instead, it should raise an exception. Only
the very top of the program should catch those exceptions and
wind up calling sys.exit().
(And if you don't like phase tracking, catching and
wrapping exceptions can give you a nice method to add context to the
error message that you'll wind up reporting.)
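To make the 'raise, don't exit' idea concrete, here is a minimal sketch.
Everything in it (ConfigError, parse_port(), the "demo" prefix) is a
made-up name for illustration, not code from any of my real programs.
The low-level function wraps the exception it catches to add context,
and only main() turns a fatal error into an exit status:

```python
import sys


class ConfigError(Exception):
    """Hypothetical fatal error type for this sketch."""


def parse_port(text):
    # Raise instead of exiting, so an importer or an interactive
    # session stays in control of what happens next.
    try:
        port = int(text)
    except ValueError as e:
        # Wrap the low-level error to add context to the message.
        raise ConfigError("bad port %r: %s" % (text, e))
    if not 0 < port < 65536:
        raise ConfigError("port %d is out of range" % port)
    return port


def main(args):
    # Only the top level turns a fatal error into an exit status.
    try:
        port = parse_port(args[0] if args else "")
    except ConfigError as e:
        sys.stderr.write("demo: %s\n" % e)
        return 1
    sys.stdout.write("using port %d\n" % port)
    return 0
```

In a real program the last step would be handing main()'s return value
to sys.exit() from the very top level. From an interactive interpreter
you can call parse_port("http") and just get a ConfigError back instead
of having your session killed.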
I'm sure that this is strongly suggested somewhere in the Python
documentation and the smart people were aware of it from the
start, but I missed it (to my regret with those early programs).
Oh yes, the magic you need to make your top level function start
running when your program is actually run (instead of being imported)
is:
    if __name__ == "__main__":
        ... run code here ...
At the module scope, __name__ is normally the name of your module
(well, the name it is being imported under). When Python is running your
code because it has been directly handed to the interpreter, Python sets
the name to "__main__" instead.
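If you want to see the difference for yourself, the following sketch
writes a tiny throwaway program to a temporary directory, imports it,
and then runs it as a program. The module name demo_prog and its
contents are invented for the demonstration; runpy.run_path() is the
standard library's way of running a file as if it had been handed to
the interpreter.

```python
import importlib
import os
import runpy
import sys
import tempfile

# A tiny hypothetical program that records whether its guard fired.
SOURCE = '''
def main():
    return "main() ran"

result = None
if __name__ == "__main__":
    result = main()
'''

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo_prog.py")
    with open(path, "w") as f:
        f.write(SOURCE)

    # Importing: __name__ is "demo_prog", so the guard is skipped.
    sys.path.insert(0, tmpdir)
    try:
        imported = importlib.import_module("demo_prog")
    finally:
        sys.path.remove(tmpdir)

    # Running as a program: __name__ is "__main__", so main() runs.
    ran = runpy.run_path(path, run_name="__main__")

print(imported.__name__, imported.result)  # demo_prog None
print(ran["__name__"], ran["result"])      # __main__ main() ran
```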
Sidebar: my current program structure
The program structure that I have wound up adopting for my own
programs looks something like this:
    import sys

    def process(...):
        .....

    def main(args):
        .....
        # die() reports the message and exits; MyError is the
        # program's own exception class.
        try:
            process(...)
        except EnvironmentError as e:
            die("OS problem: " + str(e))
        except MyError as e:
            die(str(e))
        ....

    if __name__ == "__main__":
        main(sys.argv[1:])
The main() function parses the arguments, loads configuration files,
and so on, and then calls process() with whatever arguments are
appropriate for the program; process() actually starts to do work. To
put it one way, main() does all the stuff that only has to be done
when the program is being run as an actual program.
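As a filled-in version of that outline, here is a small sketch of a
program that counts lines in the files named on its command line. The
task itself, the "linecount" prefix, and the bodies of die(), MyError
and process() are all assumptions made up for the example; only the
overall shape comes from the outline above.

```python
import sys


class MyError(Exception):
    """Program-specific fatal error (a stand-in for the outline's MyError)."""


def die(msg):
    # Hypothetical helper: report the problem and exit non-zero.
    sys.stderr.write("linecount: %s\n" % msg)
    sys.exit(1)


def process(filenames):
    # The real work: return a list of (name, line count) pairs.
    results = []
    for name in filenames:
        if not name:
            raise MyError("empty filename")
        with open(name) as f:
            results.append((name, sum(1 for _ in f)))
    return results


def main(args):
    # Program-only concerns: arguments, output, error reporting.
    try:
        for name, count in process(args):
            print("%s: %d" % (name, count))
    except EnvironmentError as e:
        die("OS problem: " + str(e))
    except MyError as e:
        die(str(e))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Because process() never exits on its own, you can import this file and
call process(["/etc/passwd"]) from an interactive interpreter, and a bad
filename just gives you an exception to look at.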