Wandering Thoughts archives

2011-03-31

A slightly unobvious trap with 'from module import *'

If you are already being lazy, it is easy to drift into the habit of doing 'from yourmodule import *' in your own code, especially in situations where you really are going to use everything from your module in the code that you're writing (for example, importing your Django app's models into your view code for it). As it happens, there is a little trap for you waiting here.

When they write 'from <X> import *', most people are probably thinking that it imports everything they've defined in module X into their current module. But this is not quite what it does. What it actually does is import every public name in module X's namespace (every name that doesn't start with an underscore) into your namespace. Now, most of what's in module X's namespace is what you defined in it. But the namespace also includes whatever you imported in module X; all of those things have now been silently imported into your namespace too.
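As a minimal sketch of this, the following builds a throwaway module in memory and then does a wildcard import from it. The module name 'demo_x' and the function helper() are made up for illustration, and the y_namespace dictionary stands in for module Y.

```python
import sys
import types

# Build a hypothetical module X in memory; it imports socket for its own use.
demo_x = types.ModuleType("demo_x")
exec("import socket\n\ndef helper():\n    return 'defined in demo_x'\n",
     demo_x.__dict__)
sys.modules["demo_x"] = demo_x

# This dictionary plays the part of module Y's namespace.
y_namespace = {}
exec("from demo_x import *", y_namespace)

# Both X's own function and the module X imported are now visible in Y.
print(sorted(name for name in y_namespace if not name.startswith("__")))
```

The listing includes 'socket' alongside 'helper', even though Y never imported socket itself.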

(Yes, yes, __all__ can cut this off. I don't think people use __all__ very much for internal modules and internal imports like this, unless they are the sort of person who eschews 'from <X> import *' in the first place.)
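For illustration, here is a small sketch of __all__ cutting this off. As before, the module name 'demo_x_all' and helper() are invented for the example and the dictionary stands in for the importing module.

```python
import sys
import types

# A hypothetical module that imports socket but only exports helper().
demo_x_all = types.ModuleType("demo_x_all")
exec("import socket\n\n__all__ = ['helper']\n\ndef helper():\n    return 'ok'\n",
     demo_x_all.__dict__)
sys.modules["demo_x_all"] = demo_x_all

# The importing module's namespace.
importer = {}
exec("from demo_x_all import *", importer)

# Only the names listed in __all__ come across; socket stays behind.
print(sorted(name for name in importer if not name.startswith("__")))
```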

At this point it's quite easy to just start using some of these silently imported things in your code without realizing that you haven't explicitly imported them. When dealing with a bunch of directly imported things (eg, 'from what.util import SomeThing') and a bunch of your own code, it's easy to lose track of what you have and haven't imported yet. When you write the code and it works, you're of course going to assume that you explicitly imported whatever you're using, because that's the only way the code could work, right?

Now the problem is that your code in module Y is quietly depending on the inner workings of module X. If you revise module X so that it no longer needs the SomeThing utility bit and so remove the import, X will still work but suddenly this other module Y breaks. Oops.
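The breakage can be sketched with two in-memory modules. Everything here ('demo_x2', 'demo_y2', helper(), family(), and the make_module() helper) is hypothetical; the point is only that Y's source never changes between the two runs.

```python
import sys
import types


def make_module(name, source):
    """Create (or replace) an in-memory module from source text."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    return mod


# Module Y uses socket but never imports it; it relies on X's import.
Y_SOURCE = "from demo_x2 import *\n\ndef family():\n    return socket.AF_INET\n"

# First version of X happens to import socket.
make_module("demo_x2", "import socket\n\ndef helper():\n    return socket.gethostname()\n")
mod_y = make_module("demo_y2", Y_SOURCE)
worked_before = mod_y.family() is not None

# X is revised so that it no longer needs socket; Y is untouched.
make_module("demo_x2", "def helper():\n    return 'localhost'\n")
mod_y = make_module("demo_y2", Y_SOURCE)
broke_after = False
try:
    mod_y.family()
except NameError as exc:
    broke_after = True
    print("module Y broke:", exc)
```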

(The extreme case of this is forgetting to import an entire module. Then you might wind up with all sorts of code in module Y making free use of, say, 'socket.<whatever>', despite there being nary an 'import socket' to be seen. This is the sort of thing that leaves you staring at the revision history in your version control system and wondering how the code ever worked in the first place.)

A more perverse version of this has intermediate steps; module X is imported wholesale into module Y, which is in turn imported wholesale into module Z, which is where you accidentally use SomeThing without importing it. After this happens, there's a whole raft of changes to either X or Y that can break module Z.

(I'm sure that your imagination can come up with even more extreme and odd scenarios.)

ImportAllImportsAll written at 23:10:09

2011-03-28

Why you should avoid 'from module import whatever'

Every so often I get to re-learn vaguely painful lessons.

For whatever reason, Django has a distinct coding style, as expressed in things like their tutorial documentation. When I wrote my Django application recently I generally followed this style, partly because I took bits and pieces straight from the tutorial (because it was the easy approach). One part of the Django style is a heavy use of 'from module import SomeThing', or even 'from module import *', and I copied this for my own code. Everything worked fine and it certainly was convenient.

Then somewhat later I went back to the code and found myself staring at a section of it, wondering just where a particular function came from. Fortunately, context and naming made it fairly obvious that it wasn't a standard Django function, but I had to do a file search to determine whether it was from my views.py or my models.py.

(And I had it easy, since I only had two files that it could have come from.)

In a nutshell, this is why namespace-less imports are bad: they hide where names come from, stripping them of context. Context is a good thing, because we fallible humans can only hold so many things in our heads; the more explicit context, the less we have to keep track of ourselves. Even partial module names help (where you do 'from module import dog' and then use 'dog.SomeThing'); if nothing else, it tells you that this particular name doesn't come from the current file and it gives you an idea of where to start looking.
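A small sketch of the difference, using the standard library's os.path as a stand-in for 'module' and 'dog' (the choice of module is mine, purely for illustration):

```python
# Partial module name: uses of 'path.<name>' show it comes from elsewhere.
from os import path

# Namespace-less import: 'basename' looks like it could be defined locally.
from os.path import basename

qualified = path.basename("/etc/passwd")
bare = basename("/etc/passwd")
print(qualified, bare)
```

Both calls do the same thing; only the first tells a later reader where to go looking.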

(In the best case the partial module name is both unique and distinctive, so it generally gives you the full context on the spot.)

Some people will object that specifying even a partial module name results in too much typing. My response is that this just means you need shorter names.

(Yes, coming up with good short names is hard. No one said API design was easy.)

PS: I don't particularly fault Django for this particular element of their style; it's consistent and fits their overall goals and does save a certain amount of more or less make-work.

UseModuleNamespaces written at 22:12:20

2011-03-22

How to add and use additional fields on Django model formsets

Suppose that you have a model formset and for some reason you want to add an additional field to each individual form (or perhaps you want to reinterpret a model field into something that is more user-friendly), a field that is of course not in your model schema.

At one level, adding custom fields or custom field handling to a model formset is relatively simple once you know what to do. At another level, the question is how to get access to information about the field. The normal way of dealing with a model formset is:

if formset.is_valid():
  instances = formset.save(commit=False)
  for thing in instances:
    ....

The problem is that thing is a model instance, not a form, and our new field appears only in the form; since it is not in the model schema, Django cannot copy it to the model instance it derives from the form information. When you're using a model form or model formset, the only thing that Django does with fields that are not in the model is validate them and then (effectively) throw them away.

(This can sometimes be useful, for example an 'I agree to these terms' checkbox. If you make this a boolean field and require it, the field and thus the form will not validate until it is ticked.)

If we want access to non-model fields in a model formset, we need to directly iterate the forms of the formset instead of just iterating the model instances. These are available through formset.forms but this has all of the forms in the formset, including ones that haven't been modified or used; we need to exclude them. The way to do this is:

if formset.is_valid():
  for form in formset.forms:
    if not form.has_changed():
      continue
    thing = form.save(commit=False)
    ... process ...

You now have direct access to both the form itself and the corresponding model instance, so you can check the form for your extra fields and do whatever processing you need.

Note that you do not have to specifically check that the form itself is valid. Because the entire formset validated, we know that any changed form is itself valid. Unchanged forms may well not be valid if this was a formset for entering new data, since they will still be blank.

(This applies to Django 1.2.5, and my disclaimer is that I am new to Django so this may well not be considered the Django-correct way to do this particular thing.)

PS: as far as I can see, you do not want to use formset.cleaned_data here. Although it exists, it's the basic form data with no clear way to turn it into a model instance and it still includes all of the unchanged or blank forms in the formset.

Sidebar: the actual problem in concrete

What I've written here sounds very abstract and you might be wondering why anyone would want to do something like this, so let's make it concrete with my Django application, our account request management system.

Account requests normally have to be approved by their sponsor; however, staff can approve requests on behalf of the sponsor. Staff can also enter a bunch of new account requests, which are normally not pre-approved and need the sponsor's approval. Suppose that we want to add a checkbox to the form to say 'mark this request as approved when it gets created', to save staff from the annoyance of creating a bunch of new requests and then immediately going off to approve them all.

This checkbox is not a model schema field directly (although ticking it results in a different value for a schema field). I suppose that with a lot of effort we could create some sort of custom widget mapping that turns the 'status of request' model schema field (normally a three way choice) into a boolean tickbox (unticked makes the status 'Pending', ticked makes it 'Approved'), but I rather think that my approach here is simpler.

DjangoModelformsetsMoreFields written at 13:39:17

Some notes on doing things with Django model formsets

Django's model formsets are not well documented, at least not in the Django documentation I've found on their website. Oh, the API docs say more or less what parameters things like modelformset_factory() take, but they won't tell you how you should use them. In particular the documentation I've seen doesn't say very much about how to customize what appears in your form elements and so on.

So here is what I know:

The form argument to modelformset_factory() is used to construct the class for individual form elements. It should inherit from forms.ModelForm like regular customized forms, but unlike regular forms it should not have an internal Meta class; the Meta class (or its equivalent) will be added by the model formset construction process. Customized form classes can alter the default look and behavior of schema fields by defining form fields as usual, and they can also define validation and cleaning functions. Since form field validation is more powerful than schema field validation, you may want to override fields to, eg, make them into forms.RegexField fields with appropriate regular expressions. Or just to improve the labels and error messages.

(Yes, the need for this is a pain in the rear. If you want user friendly validation and error messages, you can wind up overriding nearly the entire set of model fields. Of course this pain exists for ordinary model forms as well.)

The formset argument to modelformset_factory() is used to construct the class for the overall formset. It should inherit from BaseModelFormSet (from django.forms.models). What I have used this for is a clean() method that makes sure that no two newly-created account requests have the same login. I believe that any clean() function you use should start out by calling the superclass clean().

The fields argument to modelformset_factory() is a list (in the broad sense) of what additional fields from the model should be included in the individual forms. Similarly, the exclude argument is the list of what additional fields should be excluded. Note that this is additional fields; if you have a custom form class, any fields it defines explicitly are always included. You do not need to list them in fields, and you cannot make them go away by listing them in exclude. If you need to include custom fields only some of the time, you will need multiple form classes. Yes, this is annoying, especially if you have a lot of variants (and there may be a better way that involves more magic).

(You can sort of see the implementation showing through here.)

For future reference (given that Django changes over time), this is all applicable to Django 1.2.5.

Sidebar: how I find out what fields have changed in edited forms

In a regular form (even a model form) you can inspect form.changed_data to see what fields have been edited. This is awkward to do in a modelformset, because you do not have convenient access to the individual forms that have been changed. How I get around this is the following, somewhat hacky code:

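if formset.is_valid():
  instances = formset.save(commit=False)
  # changed_objects is a list of (instance, changed field names) pairs
  cdict = dict(formset.changed_objects)
  for thing in instances:
    # newly created instances do not appear in changed_objects
    changed = cdict.get(thing, [])
    ....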

(In my application I need to take special action when various fields are modified, plus I like having audit records that say what fields were edited.)

DjangoModelFormsetNotes written at 00:40:10



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.