Wandering Thoughts archives

2022-03-19

Some problems that Python's cgi.FieldStorage has

In my entry on our limited use of the cgi module, I praised cgi.FieldStorage as a nice simple way to write Python CGIs that deal with parameters, especially for POST forms. Unfortunately there are some dark sides to cgi.FieldStorage (apart from any bugs it may have), and in fairness I should discuss them. Overall, cgi.FieldStorage is probably safe for internal usage, but I would be a bit wary of exposing it to the Internet in hostile circumstances. The ultimate problem is that in the name of convenience and just working, cgi.FieldStorage is pretty trusting of its input, and on the general web one of the big rules of security is that your input is entirely under the control of an attacker.

So here are some of the problems that cgi.FieldStorage has if you expose it to hostile parties. The first broad issue is that FieldStorage doesn't have any limits:

  • it allows people to upload files to you, whether or not you expected this; the files are written to the local filesystem. Modern versions of FieldStorage do at least delete the files when the Python garbage collector destroys the FieldStorage object.

  • it has no limits on how large a POST body it will accept or how long it will wait to read a POST body in (or how long it will wait to upload files). Some web server CGI environments may impose their own limits on these, especially time, but an attacker can probably at least flood your memory.

    (The FieldStorage init function does have some parameters that could be used to engineer some limits, with additional work like wrapping standard input in a file-like thing that imposes size and time limits. For size limits you can also pre-check the Content-Length.)

Then there is the general problem that GET and POST parameters are not actually really like a Python dict (or any language's form of it). All dictionary like things require unique keys, but attackers are free to feed you duplicate ones in their requests. FieldStorage's behavior here is not well defined, but it probably takes the last version of any given parameter as the true one. If something else in your software stack has a different interpretation of duplicate parameters, your CGI and that other component are actually seeing two different requests. This is a classic way to get security vulnerabilities.

(FieldStorage also has liberal parsing by default, although you can change this with an init function parameter. Incidentally, none of the init function parameters are covered in the cgi documentation; you have to read help() or the cgi.py source.)

Broadly speaking, cgi.FieldStorage feels like a product of an earlier age of web programming, one where CGIs were very much a thing and the web was a smaller and ostensibly friendlier place. For a more or less intranet application that only has to deal with friendly input sent from properly programmed browsers, it's still perfectly good and is unlikely to blow up. For general modern Internet usage, well, not so much, even if you're still using CGIs.

(Wandering Thoughts is still a CGI, although with a lot of work involved. So it can be done.)

CGIFieldStorageIssues written at 21:47:21;


Page tools: See As Normal.
Search:
Login: Password:

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.