Wandering Thoughts archives

2011-02-27

Django's forms are security-smart the way you want them to be

In reply to my desire for read-only form fields, a commentator on Reddit wrote (in part):

You don't really save much code by having read only inputs. In the example used (inputs only editable if blank) you would still need server side checking that these values are not modified. [...]

This is incorrect (at least for a competent implementation of read only fields). As a modern framework, Django is smart about this so it never takes the network's word for what's supposed to be in a form.

I'm sure that there are some frameworks that read the submitted form fields from the HTTP POST request and put them all into your model instance (or the object holding form data). Django does not work like that. The only things that will appear in validated form data (and will be saved in model instances) are form fields (and field values) that you told Django to expect. It doesn't matter if your model has a field and the HTTP POST tries to set the field; Django will not pass it through unless your form includes the field (and it's a writeable field, once you can have read-only fields).

You do not have to write any extra code or make any extra checks in your forms processing to do this. It is all handled for you by the Django forms code, the way you want it to be, and everything just works right. All you have to do is not touch anything that Django says doesn't validate (and not read raw POST field data; always use the cleaned data).
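
To make the principle concrete, here is a minimal plain-Python sketch of the allowlist idea. This is not Django's actual code; the function name, the field names, and the sample POST data are all made up for illustration.

```python
# A sketch of the principle only; Django's real forms code is more involved.
def clean_submission(expected_fields, post_data):
    """Keep only the submitted values for fields the form declares."""
    return {name: post_data[name]
            for name in expected_fields
            if name in post_data}

# Hypothetical form that declares two fields. Assume the underlying
# model also has an 'is_admin' field that the form never offers.
EXPECTED = ("name", "email")

# A hand-crafted POST that tries to set the extra field anyway.
post = {"name": "Alice", "email": "alice@example.org", "is_admin": "1"}

cleaned = clean_submission(EXPECTED, post)
print(cleaned)
```

The extra 'is_admin' value is simply never looked at, which is why no additional server-side check is needed for it.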

Since this is such an obvious and important security-related thing, I expect that all modern web frameworks behave this way. We are no longer in the days when your PHP code slops the form POST data straight into the database; everyone knows better by now.

By the way, in a typical Django application that is generating dynamic forms (with fields sometimes included or excluded), the form that matters for what gets accepted in an HTTP POST is the Django form that is generated when the POST is processed, not the form that was generated when the HTML page and form were generated and shown to the user (which may have happened some time ago if the user sat there for a while). It's possible for these two versions of the form to differ from each other if things have changed in the meantime.

(It's also possible for the underlying record data to have changed, of course, but this is a general issue with all web applications.)
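
A rough plain-Python sketch of that situation, using the 'only editable while blank' rule as the example. The record layout and the helper names are hypothetical, and a real Django view would build a form object instead of a list of names.

```python
def editable_fields(record, candidates):
    """Hypothetical rule: a field may be set only while it is still blank."""
    return [name for name in candidates if not record.get(name)]

def process_post(record, candidates, post_data):
    """Accept values only for fields that are editable right now."""
    # The field list is recomputed from the record as it is at POST time,
    # not remembered from when the page was rendered.
    allowed = editable_fields(record, candidates)
    accepted = {n: post_data[n] for n in allowed if n in post_data}
    record.update(accepted)
    return accepted

record = {"owner": "", "room": ""}
FIELDS = ("owner", "room")

# Page is rendered: both fields are blank, so both are shown as editable.
shown = editable_fields(record, FIELDS)

# While the user sits on the page, someone else fills in 'owner'.
record["owner"] = "bob"

# The stale form is submitted with values for both fields.
accepted = process_post(record, FIELDS, {"owner": "mallory", "room": "B12"})
```

Here the late submission can still set 'room' but cannot overwrite 'owner', because 'owner' is no longer in the form that exists when the POST is handled.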

DjangoSmartForms written at 00:04:58

2011-02-22

Why I want read-only form fields in Django

One of the things that Django has is a forms handling system, especially for forms populated from your schema models. One of the commonly requested features for it is the ability to have read-only form fields, things that will be displayed in your form but cannot be changed. So far, the official Django answer is that this is a bad idea and you don't want to do that (see for example here).

(One reason I've read for this view is that Django people consider a form to be by definition for inputting data, not displaying it.)

Well, I disagree with this view. The advantage of well supported read-only fields is uniformity; you can treat all of your form fields exactly the same, whether or not they're read-write or read-only fields. This is especially important if you are changing which fields are read-only and which are read-write on the fly depending on things like user permissions or the state of an object (for example, you might have fields which can only be set if they are initially blank).

Having this uniformity means that you can lay out all fields using a simple '{% for field in form %}' loop (or simply rely on the default form rendering). As I've experienced myself, without this uniformity things get annoying very fast; I've been basically reduced to laying out forms specific field by specific field, with conditional logic about whether some fields are included as form fields or rendered as text from the model instance. Among other downsides, this is very brittle in the face of changes to the form and model fields, since both the view logic and the form template now know about all of them.

(This points out another annoying limitation of Django templates; it's not straightforward to loop over a subset of form or model instance fields. This means that the moment you move away from doing the same thing with all fields, you usually have to completely hand-code it in your template. Or at least there is no clear and obvious way to do this, although it's relatively simple to build generic functions that generate lists to feed to template rendering.)
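
As an illustration of the sort of generic function I mean, here is a small sketch that splits field names into editable and read-only lists and turns the read-only ones into (label, value) pairs. The function names and the dict standing in for a model instance are assumptions for the example; in a real view you would put the results into the template context and loop over them there.

```python
def split_fields(all_fields, read_only):
    """Partition field names into (editable, fixed), preserving order."""
    editable = [f for f in all_fields if f not in read_only]
    fixed = [f for f in all_fields if f in read_only]
    return editable, fixed

def display_rows(instance, names):
    """Build (label, value) pairs for fields to be shown as plain text."""
    # 'instance' is a plain dict here, standing in for a model instance.
    return [(n.replace("_", " ").capitalize(), instance[n]) for n in names]

# Hypothetical record where 'state' and 'first_seen' may not be edited.
instance = {"name": "apps1", "state": "active", "first_seen": "2011-01-05"}
editable, fixed = split_fields(("name", "state", "first_seen"),
                               {"state", "first_seen"})
rows = display_rows(instance, fixed)
```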

All of this is multiplied if you are using a model-driven formset instead of a single-record form. With model formsets, it'd be much easier if you could give the formset a list of fields that are included and fields that are read-only. (The Django admin interface actually shows just how much easier this can make life.)

I can get by without read-only form fields. They'd just make my life easier if they existed as a well-supported, well-integrated option.

(I'm aware that you can more or less implement them today as an add-on, but for me the problem is that they aren't integrated into things like straightforward model forms and model formsets. You wind up having to do things by hand and the magic starts multiplying. About the time that I am defining classes on the fly, I start to think that I'm working too hard and there should be a simpler way.)

DjangoWhyReadonlyFields written at 00:55:03

2011-02-10

Using Django forms with HTTP GETs

All of the Django form examples and documentation that I've seen on the Django website (and in casual reading elsewhere) talk about using them for POST-based requests. This is the usual way to use forms in general, especially complicated data-driven forms, and anyway the whole REST style prefers putting things in the actual URL instead of in GET query parameters. But sometimes you really want to use GET-based forms; my case today was changing the sort order of a table of data.

(I know that good modern web developers do this in Javascript, perhaps on the fly, instead of with explicit forms (or at least without requiring the user to do explicit form submission). I'm a bit behind the modern web age.)

Django forms can be used with GET-based form submission pretty much just the same as with POST-based form submission. The form information is available in request.GET just as it is with request.POST, and you construct the bound form in the same way:

form = YourForm(request.GET)

The usual form CSRF protection is not required on GETs, and it is not necessary as long as all of your GET-based URLs are idempotent and free of side effects (which they certainly should be).

The actual form class is defined in the same way as for POST-based forms, and used in templates in the same way too. For obvious reasons I don't think that it makes any sense to use a model-based form. I also don't think it makes sense to try to use a formset; with a formset of any decent size, you are going to get very big URLs and there are size limits.

The one slightly non-obvious trick is deciding when to construct a bound form and when to construct an unbound form (ie, detecting when the user has submitted the form versus when they're looking at the initial page). POST-based forms tell the two apart based on whether the request is a GET (initial page view) or a POST (form submission), but this doesn't work for GET-based forms for the obvious reason. The approach I am using is to check whether a form field appears as a query parameter in request.GET:

if request.method == 'GET' and 'param1' in request.GET:
    form = YourForm(request.GET)
    ...

(I check the request method too because I am notably paranoid.)

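Almost any form field will do; a legitimate form submission will include all of them, even if some of them have blank contents. (The exception is things like checkboxes, which browsers leave out of the submission entirely when they are unchecked, so pick a different field to test for.) You need to check for form.is_valid() before looking at the form data, as usual.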
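
The same check can be sketched outside of Django with nothing but the standard library. The function name and the field names here are hypothetical; the point is only that a field submitted with blank contents still shows up as a query parameter, while the initial page view has no such parameter at all.

```python
from urllib.parse import parse_qs

def is_form_submission(method, query_string, marker_field):
    """True if this request looks like a submission of the GET form."""
    # keep_blank_values=True so that a field submitted with empty
    # contents (eg 'first=') still counts as being present.
    params = parse_qs(query_string, keep_blank_values=True)
    return method == "GET" and marker_field in params

# Initial page view: no query string, so show an unbound form.
initial = is_form_submission("GET", "", "first")

# Form submitted with the first field left blank: still a submission.
submitted = is_form_submission("GET", "first=&second=name", "first")
```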

Sidebar: the cheap trick to do changeable queryset ordering

Construct a tuple of choices values of the form:

SORTS = (('', '(none)'),
         ('field1', 'Friendly'),
         ('-field2', 'Labels'),
         ....)

Use this as the choices parameter in one or more forms.ChoiceField fields in your form (more than one field lets users have multiple levels of ordering, so you can do things like order first by state and then by name).

When you are using the form, do something like the following:

order = []
for field in ('first', 'second', 'third'):
    # A blank value means '(none)' was picked for this level.
    if form.cleaned_data[field]:
        order.append(form.cleaned_data[field])

if order:
    qs = qs.order_by(*order)

(Then you continue on to display the objects from the queryset in your template.)

I believe that this is safe and will not let people edit the URL by hand to feed arbitrary fields and so on to .order_by(); form validation ensures that only values from your SORTS list of choices are accepted.
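
To show why restricting values to the choices list matters, here is a plain-Python sketch of the same allowlist check done by hand. It is not how Django implements choice validation; the choice values, the level names, and the ValueError are assumptions for the example (in Django the form would simply fail to validate).

```python
# Hypothetical choices, in the same (value, label) shape as SORTS above.
SORTS = (("", "(none)"),
         ("name", "Name"),
         ("-created", "Newest first"))
ALLOWED = {value for value, _label in SORTS}

def build_ordering(submitted, levels=("first", "second", "third")):
    """Turn submitted sort levels into a list of ordering values."""
    order = []
    for level in levels:
        value = submitted.get(level, "")
        # Anything not in the choices list is rejected outright, so a
        # hand-edited URL cannot name an arbitrary field.
        if value not in ALLOWED:
            raise ValueError("invalid sort value: %r" % (value,))
        if value:
            order.append(value)
    return order

order = build_ordering({"first": "name", "second": "", "third": "-created"})
print(order)
```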

DjangoFormsAndGet written at 01:36:07

2011-02-02

Django and primary keys versus surrogate keys

What Django does to my primary key problem is magnify the effects of changes and errors in two ways. First, the admin interface allows convenient direct insertion of data into the database, including immediately creating foreign key dependencies. People sometimes make mistakes when typing things in; the fewer steps between typing in something and having it wind up in the database, the fewer chances you have of catching your error. In a slower environment I might have noticed some of my typos after I had written the file of loader commands or SQL but before I had run it to insert things into the database, or perhaps at least before I started pointing foreign key relationships at those mistakes.

(Since straightforward SQL makes you re-type the foreign key value when using it, I'd also have had another chance to notice that I was making a stupid typo.)

Second, Django's admin interface makes it equally direct and simple to delete data and to cascade this deletion through to dependent records (and yes, it shows you everything that will be deleted and asks if you're sure). However, it has no equivalent way of doing a mass update to fix a primary key mistake (where you change all records pointing to the old record to point to the new one); you have to drop into Python (or direct SQL) to do that.
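
For illustration, here is roughly what that mass update looks like as direct SQL, sketched with Python's sqlite3 module and an in-memory database. The tables, column names, and the mistyped login are all made up for the example; this is not my actual schema, and a Django version would go through the model layer instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is turned on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE owner (login TEXT PRIMARY KEY)")
conn.execute("""CREATE TABLE machine (
                    name TEXT PRIMARY KEY,
                    owner_login TEXT NOT NULL REFERENCES owner(login))""")

# A typo'd primary key that other records already point at.
conn.execute("INSERT INTO owner VALUES (?)", ("jsmtih",))
conn.executemany("INSERT INTO machine VALUES (?, ?)",
                 [("apps1", "jsmtih"), ("apps2", "jsmtih")])
conn.commit()

def repoint(conn, old, new):
    """Create the corrected record, move references to it, drop the old one."""
    with conn:  # commit on success, roll back if any step fails
        conn.execute("INSERT INTO owner VALUES (?)", (new,))
        conn.execute("UPDATE machine SET owner_login = ? WHERE owner_login = ?",
                     (new, old))
        conn.execute("DELETE FROM owner WHERE login = ?", (old,))

repoint(conn, "jsmtih", "jsmith")
rows = conn.execute(
    "SELECT name, owner_login FROM machine ORDER BY name").fetchall()
```

The order of the three steps matters: the corrected record has to exist before anything can point at it, and the old record can only be deleted once nothing references it any more.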

(In the SQL database I'm used to, attempting to do something that broke a foreign key relationship would normally error out, forcing you to stop and fix the whole situation. Possibly the Django admin interface can be tweaked to do that too; I didn't look closely.)

I don't think that Django is wrong for doing either of these things. Both are simply the consequence of making a better, more convenient interface for database administration operations.

DjangoPrimarySurrogate written at 02:33:07



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.