2022-03-19
Some problems that Python's cgi.FieldStorage has
In my entry on our limited use of the cgi module, I praised cgi.FieldStorage as a nice simple way to write Python CGIs that deal with parameters, especially for POST forms. Unfortunately there are some dark sides to cgi.FieldStorage (apart from any bugs it may have), and in fairness I should discuss them. Overall, cgi.FieldStorage is probably safe for internal usage, but I would be a bit wary of exposing it to the Internet in hostile circumstances. The ultimate problem is that in the name of convenience and just working, cgi.FieldStorage is pretty trusting of its input, and on the general web one of the big rules of security is that your input is entirely under the control of an attacker.
So here are some of the problems that cgi.FieldStorage has if you expose it to hostile parties. The first broad issue is that FieldStorage doesn't have any limits:
- it allows people to upload files to you, whether or not you expected this; the files are written to the local filesystem. Modern versions of FieldStorage do at least delete the files when the Python garbage collector destroys the FieldStorage object.
- it has no limits on how large a POST body it will accept or how long it will wait to read a POST body in (or how long it will wait to upload files). Some web server CGI environments may impose their own limits on these, especially time, but an attacker can probably at least flood your memory.
(The FieldStorage init function does have some parameters that could be used to engineer some limits, with additional work like wrapping standard input in a file-like thing that imposes size and time limits. For size limits you can also pre-check the Content-Length.)
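As a rough illustration of the sort of wrapping I mean, here is a minimal sketch of a Content-Length pre-check and a file-like wrapper with a size budget and a deadline. The names LimitedReader and content_length_ok are made up for this example, and the numbers you'd pass in are up to you; none of this comes from the cgi module itself.

```python
import io
import time


class LimitedReader:
    """Hypothetical file-like wrapper with a byte budget and a deadline."""

    def __init__(self, raw, max_bytes, max_seconds):
        self.raw = raw
        self.remaining = max_bytes
        self.deadline = time.monotonic() + max_seconds

    def _clamp(self, size):
        # Never ask for more than one byte past the remaining budget, so
        # an oversized body is detected without being read in full.
        if size is None or size < 0 or size > self.remaining + 1:
            return self.remaining + 1
        return size

    def _account(self, data):
        self.remaining -= len(data)
        if self.remaining < 0:
            raise ValueError("request body too large")
        # The deadline is only checked after a read returns; a read that
        # blocks forever is not interrupted by this.
        if time.monotonic() > self.deadline:
            raise TimeoutError("request body took too long")
        return data

    def read(self, size=-1):
        return self._account(self.raw.read(self._clamp(size)))

    def readline(self, size=-1):
        return self._account(self.raw.readline(self._clamp(size)))


def content_length_ok(environ, max_bytes):
    """Cheap pre-check of the declared body size before reading anything."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        return False
    return 0 <= length <= max_bytes
```

In a CGI you would check content_length_ok(os.environ, ...) first and then hand something like LimitedReader(sys.stdin.buffer, ...) to FieldStorage as its fp argument. Since the deadline is only checked between reads, a client that simply stops sending still has to be dealt with by the web server or something like an alarm signal.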
Then there is the general problem that GET and POST parameters are not really like a Python dict (or any language's form of it). All dictionary-like things require unique keys, but attackers are free to feed you duplicate ones in their requests. FieldStorage doesn't pick a winner for you here; for a repeated key, indexing the form gives you a list of items and getvalue() gives you a list of strings instead of a single string, so your code either has to choose (with getfirst(), for example) or gets surprised. If something else in your software stack has a different interpretation of duplicate parameters, your CGI and that other component are actually seeing two different requests. This is a classic way to get security vulnerabilities.
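To make the duplicate parameter problem concrete, here is a small sketch that uses urllib.parse.parse_qsl from the standard library rather than cgi itself. The three helper functions are hypothetical stand-ins for policies that different layers of a stack might have; the point is only that the same query string can come out as two different requests.

```python
from urllib.parse import parse_qsl


def first_wins(query):
    """Collapse duplicates by keeping the first value seen for each key."""
    result = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        result.setdefault(key, value)
    return result


def last_wins(query):
    """Collapse duplicates by letting later values overwrite earlier ones."""
    return dict(parse_qsl(query, keep_blank_values=True))


def reject_duplicates(query):
    """Refuse the request outright if any key appears more than once."""
    result = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key in result:
            raise ValueError("duplicate parameter: %r" % key)
        result[key] = value
    return result


query = "user=alice&role=guest&role=admin"
print(first_wins(query))  # {'user': 'alice', 'role': 'guest'}
print(last_wins(query))   # {'user': 'alice', 'role': 'admin'}
```

If a front end checks the request with one of the first two policies and the CGI acts on the other, an attacker gets to pick which 'role' each of them sees. Rejecting duplicates for parameters that should only appear once is the boring but safe option.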
(FieldStorage also has liberal parsing by default, although you can change this with an init function parameter. Incidentally, none of the init function parameters are covered in the cgi documentation; you have to read help() or the cgi.py source.)
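As far as I can tell from the source, for URL-encoded data FieldStorage passes its strict_parsing and max_num_fields settings on to urllib.parse.parse_qsl, so the difference between liberal and strict parsing can be sketched with that function directly. The parse_strict helper and its default limit of 50 fields are assumptions for the example, not anything cgi provides.

```python
from urllib.parse import parse_qsl


def parse_liberal(query):
    """Default behavior: malformed fields are silently skipped."""
    return parse_qsl(query)


def parse_strict(query, max_fields=50):
    """Raise ValueError on malformed fields or on too many fields."""
    return parse_qsl(
        query,
        keep_blank_values=True,
        strict_parsing=True,
        max_num_fields=max_fields,
    )


sloppy = "a=1&junk&b=2"
print(parse_liberal(sloppy))  # [('a', '1'), ('b', '2')]

try:
    parse_strict(sloppy)
except ValueError as exc:
    print("rejected:", exc)
```

The liberal version quietly drops the field with no '=', which is friendly to sloppy clients and also means you never find out that someone is sending you odd input.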
Broadly speaking, cgi.FieldStorage feels like a product of an earlier age of web programming, one where CGIs were very much a thing and the web was a smaller and ostensibly friendlier place. For a more or less intranet application that only has to deal with friendly input sent from properly programmed browsers, it's still perfectly good and is unlikely to blow up. For general modern Internet usage, well, not so much, even if you're still using CGIs.
(Wandering Thoughts is still a CGI, although with a lot of work involved. So it can be done.)