Wandering Thoughts archives

2016-03-16

How 'from module import ...' is not doing what you may expect

There are a number of reasons to avoid things like 'from module import *'; for instance, it can be confusing later on and you can import more than you expect. But if you're doing this in the context of, say, just splitting a big source file apart it's tempting to say that these are not really problems. You're not going to be confused about where things come from because you're only importing everything from your own source files (and you're not even thinking of them as modules), and it's perfectly okay for there to be namespace contamination because that's kind of the point. But even then there are traps, because 'from module import ...' is not really doing what you might think it's doing.

There are two possible misconceptions here. If you're doing 'from module import *' within your own code, often what you want is for there to be one conjoined namespace where everything lives, both stuff from the other 'module' (really just a file) and stuff from your 'module' (the current file). If you're doing 'from module import A', it's easy (and tempting) to think that when you write plain A in your code, Python is basically automatically rewriting it to really be 'module.A' for you. Neither is what is actually going on in Python, although things can often look like it.

What a 'from module' import really does is copy things from one module namespace into another. More specifically, it copies the current bindings of names. You can think of 'from module import *' as doing something roughly like this:

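import module
_this = globals()
# Copy each current binding from module's namespace into ours.
for n in dir(module):
    _this[n] = getattr(module, n)

# Drop the helper names so only the copied bindings are left behind.
del _this
del module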

(This code does not avoid internal names, doesn't respect __all__, and so on. It's a conceptual illustration.)
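For comparison, here is a slightly more careful sketch that does honor __all__ and skips underscore-prefixed names when __all__ is absent. The helper name star_import and the in-memory module demo are made up for this illustration; this is an approximation of the documented behavior, not what the interpreter literally runs.

```python
import types

def star_import(module, namespace):
    """Copy a module's public names into namespace, roughly like 'import *'."""
    names = getattr(module, "__all__", None)
    if names is None:
        # No __all__: take every name that does not start with an underscore.
        names = [n for n in vars(module) if not n.startswith("_")]
    for n in names:
        # This copies the *current* binding; later rebinding is not tracked.
        namespace[n] = getattr(module, n)

# A hypothetical module built in memory so the sketch is self-contained.
demo = types.ModuleType("demo")
demo.public = 1
demo._private = 2

target = {}
star_import(demo, target)
```

Here target ends up holding only the name 'public'; the underscore name and the module's own dunder attributes are left alone.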

There are still two completely separate module namespaces, yours and the namespace of module; you have just copied a bunch of things from the module namespace into yours under the same name (or just some things, if you're doing 'from module import A, B'). Functions and classes from module are still using their module namespace, even if a reference to some or all of them has been copied into your module.
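A small sketch of the 'two separate namespaces' point. The module greetmod is a hypothetical stand-in that I build in memory purely so the example runs on its own; in real code it would be an ordinary file.

```python
import sys
import types

# 'greetmod' is a made-up name for "the other module".
greetmod = types.ModuleType("greetmod")
exec(
    "greeting = 'hello'\n"
    "def greet():\n"
    "    return greeting\n",
    greetmod.__dict__,
)
sys.modules["greetmod"] = greetmod

from greetmod import greet, greeting

# Rebinding the name here changes only our own namespace.
greeting = "goodbye"

# greet() still looks up 'greeting' in greetmod's namespace, not in ours.
result = greet()
```

Even though greet was copied into our namespace, it keeps resolving its globals in the module where it was defined, so result is 'hello'.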

(As a corollary to this, things from module mostly can't refer to anything from your module namespace. This is easy to see since you can't have circular imports; if you're importing module to get at its namespace, it can't be importing you to get at yours. (Yes, there are odd ways around this.))

One reason why this matters is that if functions or classes from module update stuff in their module namespace, you may or may not pick it up in your own module. For example, consider the following code in some other module:

gvar = 10
def setit(newval):
    global gvar
    gvar = newval

The gvar that you see in your own module will forever be '10', no matter what calls to setit() have been made. However, code in the other module will see a different value for gvar.
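To make this concrete, here is a self-contained sketch. The module name othermod is hypothetical, and I construct it in memory from the source above only so that the example can be run as a single file.

```python
import sys
import types

# Stand-in for "some other module"; 'othermod' is a made-up name.
source = """
gvar = 10
def setit(newval):
    global gvar
    gvar = newval
"""
othermod = types.ModuleType("othermod")
exec(source, othermod.__dict__)
sys.modules["othermod"] = othermod

from othermod import gvar, setit

setit(20)

local_view = gvar             # our copied binding, made at import time
module_view = othermod.gvar   # the module's own binding, updated by setit()
```

After the call, local_view is still 10 while module_view is 20: setit() rebound the name in othermod's namespace, and nothing ever touches the copy in ours.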

Not all sorts of updates will do this, of course. If gvar is a dictionary and code just adds, changes, and deletes keys in it, everyone will see the same gvar. The illusion of a shared namespace can hold up, but it is ultimately only an illusion and it can be fragile. (And unless you already know Python well, it isn't necessarily easy to see where and when it's going to break down.)
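The difference between mutating an object and rebinding a name can be sketched like this. Everything named here (confmod, settings, enable_debug, reset_settings) is invented for the illustration.

```python
import sys
import types

# Hypothetical module holding a dictionary and two ways of "updating" it.
confmod = types.ModuleType("confmod")
exec(
    "settings = {'debug': False}\n"
    "def enable_debug():\n"
    "    settings['debug'] = True      # mutates the existing dict\n"
    "def reset_settings():\n"
    "    global settings\n"
    "    settings = {'debug': False}   # rebinds the name to a new dict\n",
    confmod.__dict__,
)
sys.modules["confmod"] = confmod

from confmod import settings, enable_debug, reset_settings

enable_debug()
shared_after_mutation = settings is confmod.settings   # same dict object
seen_here = settings["debug"]                          # mutation is visible

reset_settings()
shared_after_rebind = settings is confmod.settings     # now two different dicts
```

As long as the other module only mutates the dictionary, both names refer to one object and the illusion holds. The moment it assigns a fresh dictionary to the name, our copy is left pointing at the old one.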

Sidebar: An additional bit of possible weirdness

There are some situations where a module's namespace is more or less overwritten wholesale; the obvious case is reload() of the module (importlib.reload() in Python 3). If you reload() a module that has been the subject of 'from module import ...', all of those bare imports are now stale; they still refer to the old objects and have not been updated themselves. You can get into very odd situations this way (especially considering what reloading a module really does).
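A minimal sketch of the reload case, assuming Python 3's importlib.reload(). Reloading needs a real source file, so this writes a throwaway module (the name reloadme_demo is made up) into a temporary directory first.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny throwaway module to disk so that it can be reloaded.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "reloadme_demo.py"), "w") as f:
    f.write("def func():\n    return 'hi'\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

import reloadme_demo
from reloadme_demo import func

# Reloading re-executes the source in the existing module namespace,
# which creates a brand new function object for 'func'.
importlib.reload(reloadme_demo)

stale = func is not reloadme_demo.func
```

Even with identical source, stale comes out True: the bare name func still refers to the function object from before the reload, while reloadme_demo.func is the new one.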

python/FromImportBindingIssue written at 23:59:45

I wish I could split up code more easily in Python

This really starts with some tweets:

This Python program has grown to almost 1500 lines. I think I need an intervention, or better data structures, or something.
I also wish it was easier and more convenient to split up a Python program across multiple source files (it's one way Go wins).

The best way to split up a big program is to genuinely modularize it. In other words, find separate pieces of functionality that can be cleanly extracted and turn them into Python modules, in separate files. There are still issues with your main program actually finding the modules, but this can be worked around (even though it is and remains annoying).

However, this assumes that you have a modular structure to start with, with things sensibly separated. If your program started off as a little 200 line thing and then grew step by step into a 1500 line monster (especially iteratively), you may not necessarily have this. That's where Python makes things a little bit awkward. Splitting things up into separate files fundamentally puts them in separate modules and thus separate namespaces; in order to do it, you need to be able to pull your code apart in this way. If your code isn't in this state already you have some degree of rewriting ahead of you, and in the meantime you have a 1500 line Python file.

(In theory you can do 'from modname import *'. In practice this is only faking a single namespace and the fakery can break down in various ways.)

Go may be less elegant here (and Go certainly makes it harder to have separate namespaces), but you can slice a big source file up into several separate ones while keeping them all co-mingled as one module, all using bits and pieces from each other. Sometimes this is more convenient and expedient, even if it may be uglier.

With that said, Python has excellent reasons to require every separate file to be a separate module. To summarize very quickly, it's tied to how you don't just load a file of Python source code, you run it (with things like function and class definitions actually being executable statements, and possibly other interesting things happening). This is a straightforward model that's quite appropriate for an interpreted language, but it imposes certain constraints.
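As a small illustration of 'you run it', here is a sketch showing that def and class are statements executed in order along with everything else. The names order, helper, and Thing are invented for the example.

```python
order = []

order.append("start")

def helper():
    # The body runs only when helper() is called, not when 'def' executes.
    order.append("helper called")

order.append("helper defined")

class Thing:
    # A class body is executed immediately as part of the class statement.
    order.append("class body ran")

order.append("end")
```

Running this leaves order as start, helper defined, class body ran, end. The def statement merely bound a name to a function object, while the class body actually ran on the spot.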

python/SplittingProgramProblems written at 01:43:45



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.