My avoidance of Python global variables

August 30, 2010

I spent part of today writing a quick one-off data conversion program. The core of it was a function that filtered items from a list through a number of things in order to sort them into the right category. Once the dust settled on all of the sorting needed, the function had quite a lot of stock arguments, things that didn't vary from call to call in my program. In fact, an unwieldy number of them.

There are at least three vaguely Pythonic options for how to deal with this (plus what I actually did), but what interests me in retrospect is the one answer that I didn't even think about: global variables.

There are all sorts of reasons to avoid global variables in general, but this was a one-off program and, if I'm being honest, global variables are what all of those stock parameters really were. I was making them local variables in the calling function and then passing them in to the classifying function not so much because it was a good idea but because that's what I do in Python. I just don't use global variables very much even when they'd arguably make sense, and when I do use them I feel irritated.

As best I can tell, what does it is the pesky global keyword. Having to declare variables global any time I want to rebind them adds just enough extra friction to using global variables in practice that I would rather not bother and instead pass lots of things around as parameters. I generally resort to global variables only when passing the same information as parameters would add arguments to too many layers of function calls.

(This is the situation where you have four or five layers of function calls and some of the stuff down at the leaves wants to gather some expensive piece of information only once. The nominally logical thing to do is to call the 'gather information' function once at the start of your program and then pass the parameter all the way down to the leaves, but that means you have to pass the information object through all of the intermediate layers, where all it does is clutter up parameter lists. Really, you want to put it in a global variable, especially if you have several different clusters of these functions that want different chunks of information; passing the information they need down as parameters doesn't scale.)
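
To make that situation concrete, here is a minimal sketch with made-up names (HOST_INFO, gather_host_info() and the three layers are all hypothetical stand-ins, not code from my program). The information is gathered once, stored in a global, and read at the leaf without the intermediate layers ever mentioning it.

```python
# All names here are hypothetical; this only illustrates the shape of the problem.
HOST_INFO = None  # filled in once by main()


def gather_host_info():
    # Stand-in for an expensive lookup that should only happen once.
    return {"web1": "rack-a", "db1": "rack-b"}


def leaf(name):
    # Reading a global needs no 'global' statement.
    return HOST_INFO.get(name, "unknown")


def middle(names):
    # Intermediate layer: no information object cluttering its parameters.
    return [leaf(n) for n in names]


def top(names):
    return middle(names)


def main():
    global HOST_INFO  # required here because main() rebinds the global
    HOST_INFO = gather_host_info()
    return top(["web1", "db1", "mail1"])


result = main()
```

The parameter-passing version would add an extra argument to top() and middle() purely so that leaf() can see it.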

Part of the friction is the annoyance of the extra line in any function that will rebind the global variable. But another part is just having to think about it at all, partly because I sort of consider global to be a wart (especially because I know what the bytecode is doing behind the scenes).

(Global's not really a wart, but that's another entry.)
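
For anyone who hasn't run into the friction described above, this is a small sketch of it (the counter and function names are invented for illustration). Reading a module-level name works anywhere, but a function that assigns to it must declare it global first; otherwise Python treats the name as local to that function.

```python
# Hypothetical names; this only demonstrates the rule about 'global'.
counter = 0


def read_counter():
    # Reading a module-level name needs no declaration.
    return counter


def bump_without_global():
    # The assignment makes 'counter' local to this function, so the read
    # on the right-hand side raises UnboundLocalError.
    counter = counter + 1


def bump_with_global():
    global counter  # the extra line of friction
    counter = counter + 1


try:
    bump_without_global()
except UnboundLocalError:
    failed = True
else:
    failed = False

bump_with_global()
```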

Sidebar: the three options that I am thinking of

The three Python options that immediately come to mind are:

  • embed the classifying function into its caller as a closure, giving it direct access to all of what used to be the stock parameters. This feels like a hack to me, and I don't like the extra level of indentation.

  • make the classifying function a method on a class that holds all of the stock parameters as instance variables. It's probably the classical solution but it feels completely artificial to me.

  • make a structure to hold all of the stock parameters, then just pass the structure instead of all of the parameters separately.
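
As a rough sketch of what these three options look like side by side, here is a toy classifier with two invented stock parameters (small_limit and big_limit; my real program had more, and different ones). The namedtuple in the third version is just one way to make a structure.

```python
from collections import namedtuple


def size_of(item, small_limit, big_limit):
    # Shared toy rule so the three versions are directly comparable.
    if item < small_limit:
        return "small"
    if item >= big_limit:
        return "big"
    return "medium"


# Option 1: a closure that sees the caller's local variables directly.
def convert_with_closure(items, small_limit, big_limit):
    def classify(item):
        return size_of(item, small_limit, big_limit)
    return [classify(i) for i in items]


# Option 2: a class with the stock parameters as instance variables.
class Classifier:
    def __init__(self, small_limit, big_limit):
        self.small_limit = small_limit
        self.big_limit = big_limit

    def classify(self, item):
        return size_of(item, self.small_limit, self.big_limit)


# Option 3: a structure that is passed as a single argument.
Stock = namedtuple("Stock", ["small_limit", "big_limit"])


def classify_with_struct(item, stock):
    return size_of(item, stock.small_limit, stock.big_limit)


items = [1, 5, 50]
by_closure = convert_with_closure(items, 3, 10)
by_class = [Classifier(3, 10).classify(i) for i in items]
by_struct = [classify_with_struct(i, Stock(3, 10)) for i in items]
```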

Since this was a quick hack, I was lazy and did the poor man's structure: I made a tuple with all of the stock parameters and just passed in the tuple (and then unpacked it in the classifying function). This is less aesthetically pleasing than a structure, but also less code, and it is the obvious next step when one's parameter list spirals out of control and most of it is the same from call to call.

(My eventual code had two arguments that varied from call to call and six that were the same, packed into a tuple. I'm sure that this is a code smell, but it was a quick hack.)
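
The shape of the tuple approach is roughly the following. This is a sketch with invented parameters and rules, not my actual conversion code; only the two-varying, six-stock split matches what I described.

```python
# Hypothetical parameter names and classification rules.
def classify(item, source, stock):
    # Unpack the six stock values; the order must match where the tuple is built.
    min_len, max_len, skip_prefix, aliases, known, default_cat = stock
    name = aliases.get(item, item)
    if name.startswith(skip_prefix):
        return "skipped"
    if not (min_len <= len(name) <= max_len):
        return "badlen"
    if name in known:
        return source + ":" + known[name]
    return default_cat


def convert(items, source):
    # Build the stock tuple once in the caller, then pass it on every call.
    stock = (2, 8, "#", {"pear": "apple"},
             {"apple": "fruit", "kale": "veg"}, "other")
    return [classify(item, source, stock) for item in items]


result = convert(["apple", "pear", "#note", "x", "rock"], "csv")
```

The weak point is the unpacking line: nothing checks that the tuple's order matches, which is part of why a real structure is more pleasing.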


Comments on this page:

From 77.22.218.57 at 2010-08-31 09:53:53:

Another pythonic way for the case of expensive information-gathering which is needed by several (leaf-)functions is to put a memoization decorator around the information-gathering function, so that the results are only calculated on the first access.
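
A minimal sketch of this memoization idea, assuming Python 3's functools.lru_cache as the decorator (the comment does not name a specific one) and using made-up function names:

```python
import functools

calls = []  # records each real computation, only to show it happens once


@functools.lru_cache(maxsize=None)
def gather_info():
    # Stand-in for the expensive information gathering.
    calls.append(1)
    return {"web1": "rack-a"}


def leaf_one():
    return gather_info()["web1"]


def leaf_two():
    return len(gather_info())


first = leaf_one()
second = leaf_two()
```

Every caller gets the same cached dictionary object back, so the leaves should treat it as read-only.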

From 58.171.140.142 at 2010-08-31 12:32:12:

There's some discussion about this at Moving data around your code - different approaches, via Planet Debian.

James

From 64.101.44.136 at 2010-09-01 14:45:35:

If the variable is only bound once and not changed later, it isn't stateful, and so doesn't have the problems associated with global variables.

For instance, if you bind some environment variables or command line arguments at global scope to variables, but don't allow mutation of them later, that's fine. They are effectively constants.

What's bad is when you have a piece of state that could be mutated by any line of code in your program. The point of most kinds of abstraction, if you boil it down, is to reduce the amount of code where a value could be mutated to an easily verifiable subset.

From 216.131.118.57 at 2010-11-14 04:51:35:

`global' irritates me, too. I'd (1) pass in a dictionary, or (2) use default arguments (if the argument type is immutable), or (3) decorate the function with caching (if the argument type is mutable; in this case default arguments irritate me). ['jihan917<at>yahoo<dot>com'.replace('<at>', '@').replace('<dot>', '.')]

Written on 30 August 2010.

Last modified: Mon Aug 30 22:47:07 2010