Wandering Thoughts: Recent Entries For 2009/01/03

Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web.

2009-01-03

How to help programmers (parts 2 and 3): os.environ and sys.argv

As it happens, the os.listdir() problem is just the tip of the iceberg of Python 3's Unix problems. Here are two more ways that it helps Unix programmers, from the release notes:

Some system APIs like os.environ and sys.argv can also present problems when the bytes made available by the system is not interpretable using the default encoding. Setting the LANG variable and rerunning the program is probably the best approach.

Since the release notes are not explicit, let me fill them in with what happens in each case.

If you have environment variables with un-decodable contents, Python 3 will pretend that they don't exist (and in fact they don't as far as it is concerned; they never made it into the os.environ data structure). This is worse than the os.listdir() case, because there is no way to work around it in your Python program; the behavior is hard-coded into the C source of the posix module. The only good news is that Python 3 doesn't remove these environment variables from the environment it passes to programs it executes via things like os.system() and os.popen().
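
To make 'un-decodable' concrete, here is a minimal sketch that models the behavior described above in pure Python. It is not the posix module's actual C code; the helper name decode_environ_strict, the sample variables, and the choice of UTF-8 as the default encoding are all assumptions made for illustration.

```python
def decode_environ_strict(raw_env, encoding="utf-8"):
    """Model of the described behavior: keep only the variables whose
    name and value both decode cleanly, and silently drop the rest.

    raw_env maps bytes names to bytes values, which is what Unix
    actually hands to a process. (Hypothetical helper, not a real API.)
    """
    decoded = {}
    for name, value in raw_env.items():
        try:
            decoded[name.decode(encoding)] = value.decode(encoding)
        except UnicodeDecodeError:
            # The variable simply never makes it into the mapping.
            continue
    return decoded


# Made-up sample environment: LEGACY_NAME holds Latin-1 bytes,
# which are not valid UTF-8.
raw = {
    b"HOME": b"/home/user",
    b"LEGACY_NAME": b"caf\xe9",
}
env = decode_environ_strict(raw)
print(sorted(env))
```

In this model LEGACY_NAME vanishes without any warning, while decoding the same bytes as Latin-1 keeps it. That is presumably what the release notes' advice about setting LANG is getting at.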

For sys.argv, any un-decodable command line arguments (such as oddly encoded filenames) cause your Python program to abort with a message like 'Could not convert argument 2 to string'. This happens whether or not you ever import the sys module, as it is hard-coded very early on in CPython's startup. For bonus points, the error message makes no attempt to identify what is producing it (it doesn't even mention that it is being produced by Python 3).
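
The contrast with the environment case can be sketched the same way: instead of dropping the bad entry, everything stops at the first failure. This is only a model in Python, assuming UTF-8 as the default encoding; the helper name decode_argv_strict and the sample command line are made up, and how the real startup code numbers its arguments is an assumption here.

```python
def decode_argv_strict(raw_argv, encoding="utf-8"):
    """Model of the described startup behavior: decode every argument
    up front and give up at the first one that fails.

    Returns (argv, None) on success or (None, message) on failure.
    (Hypothetical helper; the real check lives in CPython's C startup
    code.)
    """
    argv = []
    for index, arg in enumerate(raw_argv):
        try:
            argv.append(arg.decode(encoding))
        except UnicodeDecodeError:
            # Assumption: arguments are counted from the first entry at 0.
            return None, "Could not convert argument %d to string" % index
    return argv, None


# Made-up command line: the third entry is a Latin-1 encoded filename.
raw = [b"prog.py", b"-v", b"r\xe9sum\xe9.txt"]
argv, error = decode_argv_strict(raw)
print(error)
```

Nothing in the resulting message names the program, the interpreter, or the offending bytes, which is exactly the complaint.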

(System administrators and anyone else who deals with complex, multi-layered systems have a special sort of affection for unidentified error messages.)

As Ian Bicking noted in the comments on the os.listdir() problem, the real solution here is alternate bytes-based interfaces to both os.environ and sys.argv that (at least on Unix) would be the 'real' versions. But that would require Python 3 admitting that Unix is not all Unicode, which seems unlikely right now.

python/ArgvEnvironProblem written at 00:24:02

This is part of CSpace, and is written by ChrisSiebenmann.
Twitter: @thatcks
