A Python quoting irritation

October 6, 2006

I was writing code today where I needed to turn a single backslash ('\') into two backslashes ('\\') in order to quote it (so that the shell wouldn't eat it, through a combination of annoyances). My first attempt was:

s = s.replace(r"\", r"\\")

To my surprise, this gave me an error, and a peculiar one: Python reported 'SyntaxError: invalid token', pointing at the closing ')'. It took a bunch of head-scratching to figure out that Python was blowing up on the first string. I tried it with triple-quoted strings and that didn't work either (but with a different error).

Ultimately, this is because Python uses backslash in strings for two separate jobs: quoting end of string characters, and introducing special characters like \n. The former is active all the time; it is only the latter that is turned off by the r string modifier.
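A small sketch of the two jobs, as I understand them. The variable names and the is_valid_source helper are made up for illustration, and the exact SyntaxError message depends on your Python version, so the code only checks whether an error happens at all:

```python
# In a raw string, a backslash before a quote still stops the quote from
# ending the string, but the backslash itself stays in the result.
raw_quote = r"\""
print(len(raw_quote))         # 2: a backslash and a double quote
print(raw_quote == '\\"')     # True

# The special-character escapes are the part that the r prefix turns off.
print(len("\n"))              # 1: a newline character
print(len(r"\n"))             # 2: a backslash and the letter n


def is_valid_source(src):
    """Hypothetical helper: True if src compiles as Python source."""
    try:
        compile(src, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The text handed to compile() here is:  s = r"\"
# The backslash quotes the closing quote mark, so the string never ends.
print(is_valid_source('s = r"\\"'))    # False
```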

Where things get really confusing is the handling of '\\' in r strings. The double-backslash stops the second backslash from escaping anything it normally would have, but is not turned into '\' in the parsed string; it remains intact as '\\'.
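To make that concrete, here is a quick check of what a double backslash turns into with and without the r prefix (the variable names are only for illustration):

```python
# In a raw string, both backslashes survive into the parsed string.
raw_pair = r"\\"
print(len(raw_pair))            # 2

# In an ordinary string, the same source text collapses to one backslash.
plain_pair = "\\"
print(len(plain_pair))          # 1

# So r"\\" is the same string as the ordinary literal "\\\\".
print(raw_pair == "\\\\")       # True
```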

So the ultimate answer I wound up with is:

s = s.replace("\\", r"\\")

You can write the second string as "\\\\", but that sort of thing annoys the heck out of me so I wound up leaving it as an r string.
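For what it's worth, here is a minimal sketch that wraps the replace call in a function and confirms that both spellings of the second string do the same thing. The function name and the sample string are made up for the example:

```python
def double_backslashes(s):
    """Return s with every single backslash turned into two."""
    return s.replace("\\", r"\\")


# Sample input: the three characters a, backslash, b.
sample = "a\\b"
doubled = double_backslashes(sample)

print(len(sample))     # 3
print(len(doubled))    # 4: a, backslash, backslash, b
print(doubled)         # a\\b

# The all-ordinary-string spelling gives the same result.
print(doubled == sample.replace("\\", "\\\\"))    # True
```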

(This is all actually documented in the string literal section of the reference manual. If it is covered in the tutorial, I read it sufficiently long ago to have forgotten it.)

All of this (coupled with wrestling with shell quoting, which is how I wound up dealing with this) has reminded me of how much I hate complicated quoting schemes and the need for quoting in general.

Written on 06 October 2006.

Last modified: Fri Oct 6 18:04:58 2006