2013-05-11
The consequences of importing a module twice
Back when I wrote about Python's relative import problem, I mentioned that it can be important to actually import a module only once, because of Python's semantics. Today I feel like discussing what these semantics are and how much they can matter.
The straightforward thing that goes wrong if you manage to import a module twice (under two different names) is that any code in the module gets run twice, not once. Modules that run active code on import assume that this code is only going to be run once; running it again may result in various sorts of malfunctions.
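To make this concrete, here is a minimal sketch of a double import. The package layout (a throwaway 'mymod' directory holding 'fred.py') and the 'runs.log' file are made up for illustration; the module body simply appends its own name to the log every time it is executed.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: <tmp>/mymod/__init__.py and <tmp>/mymod/fred.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mymod")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "fred.py"), "w") as f:
    f.write(
        "import os\n"
        "# Module-level code: record each time this body is executed.\n"
        "with open(os.path.join(os.path.dirname(__file__), 'runs.log'), 'a') as log:\n"
        "    log.write(__name__ + '\\n')\n"
    )

# Put both the directory and its subdirectory on the path, so the same
# file is reachable under two different module names.
sys.path[:0] = [root, pkg_dir]
importlib.invalidate_caches()

import mymod.fred   # runs the module body under the name 'mymod.fred'
import fred         # runs the same file again under the name 'fred'
import mymod.fred   # already imported; the body is not run a third time

with open(os.path.join(pkg_dir, "runs.log")) as log:
    run_names = log.read().split()
print(run_names)    # ['mymod.fred', 'fred']
```

The log ends up with two entries, one per name, even though there is only one file on disk.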
At one level, modules that run code on import are relatively rare
because people understand it's bad form for a simple import to have
big side effects. At another level, various frameworks like Django
effectively run code on module import in order to handle things like
setting up models and view forms and so on; it's just that this
code isn't directly visible in your module because it's hiding in
framework metaclasses. But this issue is a signpost to the really big
thing: function and class definitions are executable statements that are run at import time. The net effect
is that when you import a module a second time the new import has a
completely distinct set of functions, classes, exceptions, sentinel
objects, and so on. They look identical to the versions from the first
import but as far as Python is concerned they are completely distinct;
fred.MyCls is not the same thing as mymod.fred.MyCls.
(This is the same effect that you get when you use reload() on a
module; in Python 3 that is importlib.reload().)
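Here is a small sketch of the distinct-objects effect. Rather than set up a real double import, it runs the same source text as the body of two separate module objects, which is my stand-in for what the two imports would do. The MISSING sentinel and the load_as() helper are invented for the example.

```python
import types

# Hypothetical module source; MyCls and MISSING are illustrative names.
SOURCE = """
class MyCls:
    pass

MISSING = object()   # a sentinel object
"""

def load_as(name):
    """Run SOURCE as the body of a fresh module object called `name`."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

fred = load_as("fred")               # stands in for 'import fred'
mymod_fred = load_as("mymod.fred")   # stands in for 'import mymod.fred'

obj = mymod_fred.MyCls()
print(fred.MyCls is mymod_fred.MyCls)      # False: two separate class objects
print(isinstance(obj, fred.MyCls))         # False: obj belongs to the other copy
print(fred.MISSING is mymod_fred.MISSING)  # False: two separate sentinels
print(fred.MyCls.__module__, mymod_fred.MyCls.__module__)  # fred mymod.fred
```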
However, my guess is that this generally won't matter. Most Python code uses duck typing and the two distinct classes are identical as far as that goes. Use of things like specific exceptions, sentinel values, and imported classes is probably going to be confined to the modules that directly imported the dual-imported module and thus mostly hidden from the outside world (for example, it's usually considered bad manners to leak exceptions from a module that you imported into the outside world). In many cases even the objects from the imported module are going to be significantly confined to the importing module.
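A rough sketch of both halves of this, again using two executions of the same source as a stand-in for a double import (Greeter, FredError, fail() and load_as() are all made-up names). Duck-typed use doesn't care which copy an object came from; the trouble only shows up if something from one copy is checked against the other copy's classes, as with an except clause.

```python
import types

# Hypothetical module source, invented for this example.
SOURCE = """
class FredError(Exception):
    pass

class Greeter:
    def greet(self):
        return "hello"

def fail():
    raise FredError("boom")
"""

def load_as(name):
    """Run SOURCE as the body of a fresh module object called `name`."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

fred = load_as("fred")               # stands in for 'import fred'
mymod_fred = load_as("mymod.fred")   # stands in for 'import mymod.fred'

# Duck typing: objects from either copy behave identically.
greetings = [g.greet() for g in (fred.Greeter(), mymod_fred.Greeter())]
print(greetings)   # ['hello', 'hello']

# Class-based checks: a handler for one copy's exception misses the other's.
try:
    try:
        mymod_fred.fail()
    except fred.FredError:
        caught_by = "fred.FredError"
except mymod_fred.FredError:
    caught_by = "mymod.fred.FredError"
print(caught_by)   # mymod.fred.FredError
```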
(One potentially bad thing is that if the module has an internal cache of some sort, you will get two copies of the cache and thus perhaps twice the memory use.)
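As an illustration of the cache issue, here is a sketch with a toy memoizing function. The square() function and its _cache dictionary are hypothetical, and the two module objects are created by hand to mimic a double import.

```python
import types

# Hypothetical module source with a module-level cache.
SOURCE = """
_cache = {}

def square(n):
    if n not in _cache:
        _cache[n] = n * n
    return _cache[n]
"""

def load_as(name):
    """Run SOURCE as the body of a fresh module object called `name`."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

fred = load_as("fred")               # stands in for 'import fred'
mymod_fred = load_as("mymod.fred")   # stands in for 'import mymod.fred'

fred.square(3)
mymod_fred.square(3)   # recomputed: this copy has its own, empty cache

# Count how many distinct cache objects now hold the same entry.
copies = len({id(fred._cache), id(mymod_fred._cache)})
print(copies)                          # 2
print(fred._cache, mymod_fred._cache)  # {3: 9} {3: 9}
```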
2013-05-07
Python's relative import problem
Back in this entry I bemoaned the fact that
Python's syntax for relative imports ('from . import fred') is only
valid inside modules. The reason to have it valid outside modules is
fairly straightforward; it would allow you to import and run the same
Python code whether you were doing 'import module.thing' from
outside the module's directory or sitting inside the module's directory
doing 'import thing'. The way things are in Python today, once you
start using relative imports in your code it can only be used as a
module (which has implications for it being somehow on your Python path
and so on even while you're coding).
Unfortunately for me, I suspect that this restriction is not arbitrary. The problem that Python is probably worrying about is importing the same submodule twice under different names. The official Python semantics are that there is only one copy of a particular (sub)module and its module-level code is run only once, even if the module is imported multiple times; imports after the first one simply return a cached reference.
(These semantics are important in a number of situations that may not be obvious, due to Python's execution model.)
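The cached reference is easy to see from inside Python, since the cache is exposed as sys.modules. A minimal sketch, using the standard json module only because it's always available:

```python
import importlib
import sys

import json

first = json
second = importlib.import_module("json")   # a repeat import of the same name

print(first is second)                # True: the second import ran no code
print(sys.modules["json"] is first)   # True: the cache is keyed by module name
```

Note that the key is the name 'json', not the file the module was loaded from.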
However, Python has opted to do this based on the apparent (full) module name, not based on (say) remembering the file that a particular module was loaded from and not reloading the file. When you do a relative import inside a module, Python knows the full name of the new submodule you're importing (because it knows the full, module-included name of the code doing the relative import). When you do a relative import outside a module, Python has no such knowledge but it knows that in theory this code is part of a module. This opens up the possibility of double-importing a submodule (once under its full name and once under whatever magic name you make up for a non-module relative import). Python opts to be safe and block this by refusing to do a relative import unless it can (reliably) work out the absolute name.
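The name resolution involved can be sketched with importlib.util.resolve_name(), which turns a relative name plus a package name into an absolute one. I'm using it here as an approximation of what the import system does internally, with 'mymod' and 'fred' as placeholder names. Which exception you get for the no-package case depends on the Python version, so the sketch catches both.

```python
import importlib.util

# Inside a package, the importing code's package name anchors the relative name.
full_name = importlib.util.resolve_name(".fred", "mymod")
print(full_name)   # mymod.fred

# Outside any package there is nothing to anchor it to, so resolution fails.
try:
    importlib.util.resolve_name(".fred", "")
    resolved = True
except (ImportError, ValueError):
    resolved = False
print(resolved)    # False
```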
(There are still plenty of ways to import a module twice but they all require you to actively do something bad, like add both a directory and one of its subdirectories to your Python path. Sadly this is quite easy because Python will automatically add things to the Python path for you under some common circumstances.)
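One such circumstance, as a sketch: when you run a script, Python puts the script's directory at the front of the path. If the script lives inside a package directory and the package's parent is also on the path (here via PYTHONPATH), the same file is importable under two names. The layout and names below are invented, and the sketch assumes Python's default behaviour of adding the script directory.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical layout: <tmp>/mymod/{__init__.py, fred.py, script.py}
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mymod")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "fred.py"), "w") as f:
    f.write("class MyCls:\n    pass\n")
with open(os.path.join(pkg_dir, "script.py"), "w") as f:
    f.write(
        "import fred          # found via the script's own directory\n"
        "import mymod.fred    # found via PYTHONPATH\n"
        "print(fred.MyCls is mymod.fred.MyCls)\n"
    )

# The parent directory goes on the path explicitly; the script's own
# directory is added by Python itself when the script is run.
env = dict(os.environ, PYTHONPATH=root)
env.pop("PYTHONSAFEPATH", None)   # make sure the default behaviour applies

result = subprocess.run(
    [sys.executable, os.path.join(pkg_dir, "script.py")],
    env=env, capture_output=True, text=True, check=True,
)
output = result.stdout.strip()
print(output)   # False
```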