2020-07-20
An exploration of why Python doesn't require a 'main' function
Many languages start running your program by calling a function of
yours that must have a specific name. In C (and many C-derived
languages), this is just called main(); in Go, it's main.main()
(the main() function in the main package). Python famously
doesn't require any such function, and won't automatically call a
function called main() even if you create it. Recently I read
Why doesn’t Python have a main function? (via),
which puts forward one explanation for why this is so. However, I have
a somewhat different way of explaining this situation.
The core reason that Python doesn't require a main() function is a
combination of its execution model (specifically, what happens when
you import something) and the fact that under normal circumstances you
start Python programs by (implicitly) importing a single file of Python
code. So let's look at each of these parts.
In many languages things like functions, classes, and so on are
created (defined) by the interpreter or compiler as it parses the
source file. In Python, this is not quite the case; instead, def
and class are executable statements,
and they define classes and functions when they execute (among other
things, this is part of why metaclasses work).
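As a minimal sketch of what "executable statement" means in practice (the names greet and Greeter are made up for illustration), a function simply doesn't exist until its def has run, a def can sit inside ordinary control flow, and running a second def for the same name just rebinds it:

```python
import sys

# Until a 'def' statement has executed, the name is simply unbound.
try:
    greet
except NameError:
    defined_before = False
else:
    defined_before = True

# Because 'def' is executed, it can sit inside ordinary control flow.
if sys.version_info >= (3, 0):
    def greet(name):
        return "hello, " + name
else:
    def greet(name):
        return "hi, " + name

# 'class' works the same way: the class body runs, top to bottom,
# at the moment the class statement itself is executed.
class Greeter:
    greeting = greet("world")   # evaluated while the class is being created

# Executing another 'def' rebinds the name to a new function object.
first = greet
def greet(name):
    return "goodbye, " + name
```

Here Greeter.greeting keeps the value computed by the first greet(), because the class body had already run by the time the name was rebound.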
When Python imports something, it simply executes everything in the
file (or the import more generally). When what's executed is def
and class statements, you get functions and classes. When what's
executed is regular code, you get more complicated things happening,
including conditional imports or calling functions on the fly under
the right conditions. Or you can write an entire program that just
runs inline, as the file is imported.
(This has some interesting consequences, including what reloading a Python module really does.)
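To illustrate, here is a small sketch that writes a throwaway module to a temporary directory and imports it. The module name demo_inline_mod and its contents are hypothetical; the point is only that the def and the plain code around it all run, in order, during the import:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module source: a mix of 'def' and plain top-level code.
SOURCE = '''
events = ["module body started"]

def helper():
    return "helper called"

events.append(helper())          # regular code, runs during the import
events.append("module body finished")
'''

with tempfile.TemporaryDirectory() as tmpdir:
    with open(os.path.join(tmpdir, "demo_inline_mod.py"), "w") as f:
        f.write(SOURCE)
    sys.path.insert(0, tmpdir)
    try:
        importlib.invalidate_caches()
        # Importing executes the whole file from top to bottom.
        demo = importlib.import_module("demo_inline_mod")
    finally:
        sys.path.remove(tmpdir)

print(demo.events)
# ['module body started', 'helper called', 'module body finished']
```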
However, Python is not quite as unique here as it might look. Many
languages have some facility to run arbitrary code early on as the
program is 'loading', before the program starts normal execution
(Go has init() functions, for example). Where Python is different
from these languages is that Python normally starts a program by
loading and executing a specific single file. Because Python is
only executing a single file, it's unambiguous what code is run in
what order and it's straightforward for the code in that file to
control what happens. In a sense, rather than picking an arbitrarily
named function for where execution (nominally) starts, Python is
able to sneakily pick an arbitrarily named file by having you provide
it.
(Compiled languages traditionally have a model where code from a bunch
of separate files is all sort of piled up together. In Python, you can't
really aggregate multiple files together into a shared namespace this
way; one way or another, you have to import them and everything starts
from some initial file.)
Where this nice model breaks down and needs a workaround is if you
run a package with 'python -m ...', where Python doesn't really
have a single file that you're executing (or it'd have to make
__init__.py serve double duty). As covered in the official
documentation's __main__ — Top-level script environment (via),
Python adopts the arbitrary convention of loading a __main__.py
file from your package and declaring it more or less the point where
execution starts.
(In at least some situations, your package's __init__.py may
also be executed.)
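A rough way to see both files run is the sketch below. It builds a throwaway package (the name demo_pkg_main is made up) and runs it with the standard library's runpy.run_module(), which I'm using here as an in-process stand-in for 'python -m', since the runpy module is what implements that switch:

```python
import contextlib
import importlib
import io
import os
import runpy
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Build a hypothetical package containing only the two special files.
    pkg_dir = os.path.join(tmpdir, "demo_pkg_main")
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write('print("__init__.py executed")\n')
    with open(os.path.join(pkg_dir, "__main__.py"), "w") as f:
        f.write('print("__main__.py executed, __name__ =", __name__)\n')

    sys.path.insert(0, tmpdir)
    captured = io.StringIO()
    try:
        importlib.invalidate_caches()
        with contextlib.redirect_stdout(captured):
            # Roughly what 'python -m demo_pkg_main' does, but in-process.
            runpy.run_module("demo_pkg_main", run_name="__main__")
    finally:
        sys.path.remove(tmpdir)
        sys.modules.pop("demo_pkg_main", None)

output_lines = captured.getvalue().splitlines()
print(output_lines)
# ['__init__.py executed', '__main__.py executed, __name__ = __main__']
```

In this sketch the package is imported first, so __init__.py runs before __main__.py does.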
PS: contrary to the original article's views,
I strongly suggest that you have a main() function, because
there are significant benefits to keeping your program importable.
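One common shape for this is sketched below. The function names and the greeting logic are placeholders of mine, not anything from the original article; what matters is that importing the file only defines things, and the work happens only when the file is the one Python was started with:

```python
import sys

def greetings(names):
    """Hypothetical program logic, kept separate so it is easy to test."""
    return ["hello, " + name for name in names]

def main(argv=None):
    """Entry point by convention only; Python will not call this for us."""
    if argv is None:
        argv = sys.argv[1:]
    for line in greetings(argv or ["world"]):
        print(line)
    return 0

# __name__ is "__main__" only when this file is the one handed to Python,
# so a plain 'import' of the file defines the functions and stops there.
if __name__ == "__main__":
    main()
```

A real script would probably pass main()'s return value to sys.exit() so that it becomes the process's exit status.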