A little gotcha with os.path.join

March 25, 2006

For my sins, I am one of those people who doesn't always read Python's fine documentation carefully enough. For example, I wrote a large chunk of code using the helpful os.path.join() before I noticed a little gotcha: it's very helpful. In particular:

Joins one or more path components intelligently. If any component is an absolute path, all previous components are thrown away, and joining continues.

(Right there in black and white in the fine documentation.)

This means that you cannot do:

fpath = os.path.join(root, untrusted)

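At least, you can't do this safely and have it do what you probably think it does: if untrusted is an absolute path, root is silently dropped and fpath points wherever untrusted says.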
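To make the gotcha concrete, here is a small sketch for Python 3.5 or later. The paths and the safe_join() helper are my own illustration, not something os.path provides, and the printed results assume a Unix-like system. One plausible way to guard the join is to resolve both paths and check that the result still lives under root:

```python
import os.path

root = "/srv/files"        # hypothetical trusted base directory
untrusted = "/etc/passwd"  # hypothetical attacker-supplied name

# An absolute component throws away everything before it.
fpath = os.path.join(root, untrusted)
print(fpath)  # /etc/passwd on a Unix-like system

# Joining continues after the absolute component.
print(os.path.join(root, "/tmp", "x"))  # /tmp/x on a Unix-like system


def safe_join(root, untrusted):
    """Return untrusted joined under root; raise ValueError if it escapes."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, untrusted))
    # commonpath() compares whole components, so /srv/files2 does not
    # count as being inside /srv/files.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("path escapes root: %r" % (untrusted,))
    return candidate
```

This rejects absolute paths and "../" tricks alike, since both end up outside root after resolution. It is only a sketch; whether resolving symlinks with realpath() is what you want depends on your situation.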

The other gotcha to remember for os.path.join is that no matter how convenient it looks on a Unix machine, URLs are not paths. If you use os.path.join() on URLs and ever run on a Windows machine, the results are not likely to be pleasant, because it joins with backslashes there. You probably want the urlparse module's urljoin() function (urllib.parse.urljoin() in Python 3), but note that it behaves much like os.path.join() when handed an absolute second part, so:

>>> from urlparse import urljoin   # Python 3: from urllib.parse import urljoin
>>> urljoin("http://host/a/b", "/d/e/f")
'http://host/d/e/f'

This is convenient if you are expecting it.
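If you want to see the Windows problem without a Windows machine, the standard library exposes both flavours of os.path under their own names. The following is a rough illustration in Python 3 (where urljoin() lives in urllib.parse); the base URL is just a made-up example:

```python
import ntpath     # the implementation behind os.path on Windows
import posixpath  # the implementation behind os.path on Unix
from urllib.parse import urljoin  # Python 3 name for urlparse.urljoin

base = "http://host/a/b"

print(posixpath.join(base, "c"))  # http://host/a/b/c
print(ntpath.join(base, "c"))     # http://host/a/b\c
print(urljoin(base, "c"))         # http://host/a/c
print(urljoin(base + "/", "c"))   # http://host/a/b/c
```

Note that urljoin() also differs from os.path.join() for relative parts: without a trailing slash on the base, the last segment is replaced instead of extended.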


Last modified: Sat Mar 25 02:59:15 2006