Wandering Thoughts archives

2009-12-10

How not to copy a file to standard output in Python

Suppose that you are quickly bashing together a CGI that as part of its job has to spit out a file. When I needed to do this recently, I reflexively wrote more or less:

fp = open(fname, "r")
for line in fp:
    print line

(Because this was reading from a file, my usual objections to using file iteration don't apply.)

Somewhat later I got a politely phrased report from the person that this was for to the effect that the PDFs this CGI was supposed to hand out to people were corrupt. He even helpfully reported that testing with wget said that the files were some number of bytes larger than they should be, to the tune of one byte per line in the original PDF, which pointed me right at my stupid bug.

The problem is, of course, that all of the forms of reading lines at a time from a file keep the terminating newline on the line, and then print adds another one. The easiest solution is to use sys.stdout.write() instead of print.

(For some people, the easiest solution would be to use 'print line,' but I've never used that syntax feature and I don't particularly like it. I would rather use sys.stdout.write() just so that I'm explicit about it.)
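To make the byte arithmetic concrete, here is a minimal sketch of the bug and the fix. It uses in-memory streams instead of a real file and sys.stdout so that the result can be checked; the function names and the sample data are made up for illustration, and the first function only imitates what 'print line' does rather than calling print itself.

```python
import io

def copy_with_print_semantics(src, out):
    # Imitates 'print line': the line (which still carries its own
    # newline) is written, and then one more newline is added.
    for line in src:
        out.write(line)
        out.write(b"\n")

def copy_lines(src, out):
    # The fix: write each line exactly as it was read.
    for line in src:
        out.write(line)

# Hypothetical three-line sample standing in for the PDF.
data = b"%PDF-1.4\nfirst\nsecond\n"

buggy = io.BytesIO()
copy_with_print_semantics(io.BytesIO(data), buggy)

fixed = io.BytesIO()
copy_lines(io.BytesIO(data), fixed)

# One extra byte per line in the original, as the wget test reported.
extra_bytes = len(buggy.getvalue()) - len(data)
```

With three lines in the sample, extra_bytes comes out as 3, while the copy_lines() output is byte-for-byte identical to the input.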

In theory the more efficient way is to use .read() plus .write() to copy the file in large but not memory-busting chunks. I was going to say that this requires handling short writes and buffering, but that's not correct; unlike the underlying stdio fwrite() routine, .write() on file objects always does a full write. Instead, using this sort of buffering just requires slightly more worrying about efficiency than I was doing at the time.
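A rough sketch of the chunked approach might look like the following. The function name and the 64 KB chunk size are arbitrary choices for illustration, not anything from the original CGI, and the loop leans on the point above that .write() on an ordinary buffered file object writes everything it is given.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical size: large, but not memory-busting

def copy_in_chunks(src, out, chunk_size=CHUNK_SIZE):
    """Copy src to out in fixed-size chunks; return the bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            # An empty read means end of file.
            break
        out.write(chunk)
        total += len(chunk)
    return total

# Self-check with in-memory streams; the payload spans several chunks.
payload = b"x" * 200000 + b"\nend\n"
sink = io.BytesIO()
copied = copy_in_chunks(io.BytesIO(payload), sink)
```

In the CGI itself, src would be the opened file and out would be standard output. The standard library's shutil.copyfileobj() does essentially the same loop.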

(In theory this code has a second bug; I should be opening the file in binary mode just in case. In practice, I ignore binary mode; I am not writing Python code that will ever run on Windows machines.)
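For completeness, a binary-safe version could be sketched like this. The send_file() name is hypothetical, and the sys.stdout.buffer fallback is an assumption about running under Python 3, where standard output is a text stream with a byte-oriented stream underneath; under Python 2, sys.stdout itself accepts byte strings.

```python
import io
import os
import shutil
import sys
import tempfile

def send_file(fname, out=None):
    # Assumption: on Python 3 the byte stream is sys.stdout.buffer;
    # on Python 2 there is no such attribute and sys.stdout is used.
    if out is None:
        out = getattr(sys.stdout, "buffer", sys.stdout)
    # "rb" means no newline translation and no text decoding.
    with open(fname, "rb") as fp:
        shutil.copyfileobj(fp, out)

# Self-check: bytes that text mode could mangle must survive untouched.
raw = b"%PDF\r\n\xff\xfe binary junk\n"
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(raw)
    sink = io.BytesIO()
    send_file(path, sink)
finally:
    os.remove(path)
```

Passing an explicit out stream is only there to make the function easy to test; a CGI would normally call send_file(fname) and let it default to standard output.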

HowNotToCopyFile written at 00:28:18

2009-12-05

What version of Python is included in various current OSes

For my own curiosity, here is a rundown of what version of Python is in various current OS distributions, along with whether or not a version of Python 3 is available as an optional package.

(The version of Python is my best guess at what you get if you run plain 'python' at a command line.)

OS                                   Python version                      Optional Python 3?
Solaris 10 update 8                  2.4.4 (2.6 available in Blastwave)  No
Red Hat Enterprise 5 (and CentOS 5)  2.4.3                               No (it's not in EPEL)
Ubuntu 8.04 LTS                      2.5.2                               No
Ubuntu 9.10                          2.6.4                               Yes
Debian 5.0 (Lenny)                   2.5.2                               No
Debian unstable                      2.5.4                               No (but there's a version in 'experimental')
Fedora 12                            2.6.2                               No (but it will be in Fedora 13)
FreeBSD 8.0                          2.6.2 (I think)                     Yes
Mac OS X 10.4.11 (Tiger)             2.3.5                               no?
Mac OS X 10.5.8 (Leopard)            2.5.1                               no?
Mac OS X 10.6 (Snow Leopard)         2.6.1                               yes, apparently

(I apologize if I have slighted your favorite OS or Linux distribution; this is the subset of things that I either have machines running or know how to check. Feel free to add data in the comments.)

I care about long-term supported OSes like RHEL and Ubuntu LTS because those are what we run. The front-runner short-term OSes at least show which way the wind is blowing for their longer-term compatriots, so it's a pretty sure bet that the next version of Ubuntu LTS will have Python 2.6.x (or better) and some version of Python 3, and it's likely that the next version of RHEL will have 2.6.x+ and Python 3 as well.

(Solaris 10 is unlikely to ever update its version of Python, because Solaris 10 pretty much never updates anything. And no one has any idea at this point if there will be a 'Solaris 11' and if so, what it will look like or have.)

My understanding is that you need Python 2.6 if you're even going to start developing for a future Python 3 migration. Obviously having an optional version of Python 3 is even better. The slow uptake of new Python 2.x versions (and the very slow addition of optional Python 3 packages) is one reason that I am not very sanguine about Python 3's chances of general adoption any time soon, or about plans to stop developing Python 2.x.

PythonVersions written at 00:16:53



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.