A Python quoting irritation

October 6, 2006

I was writing code today where I needed to turn a single backslash ('\') into two backslashes ('\\') in order to quote it (so that the shell wouldn't eat it, through a combination of annoyances). My first attempt was:

s = s.replace(r"\", r"\\")

To my surprise, this gave me an error, and a peculiar one: Python reported 'SyntaxError: invalid token' at the closing ')'. It took a bunch of head-scratching to figure out that Python was blowing up on the first string. I tried it with triple-quoted strings and that didn't work either (but with a different error).

Ultimately, this is because Python uses backslash in strings for two separate jobs: quoting end-of-string characters (the quote marks), and introducing special characters like \n. The former is active all the time; it is only the latter that is turned off by the r string modifier.
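The two jobs can be seen separately in a small sketch. The helper name is_valid_literal is made up for illustration, and the backslash is built with chr(92) only to keep the example itself free of quoting puzzles; the exact error text Python gives will vary by version, so the sketch only checks whether the literal compiles at all.

```python
# Job two (special characters) is switched off by the r modifier.
print(len("\n"))   # 1: a single newline character
print(len(r"\n"))  # 2: a backslash followed by 'n'

# Job one (protecting a quote) is still active in an r string,
# and the backslash stays in the result.
print(len(r"\""))  # 2: a backslash followed by a double quote


def is_valid_literal(source):
    """Return True if source compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


BACKSLASH = chr(92)
ends_in_one = 'r"' + BACKSLASH + '"'      # the source text of my first attempt's string
ends_in_two = 'r"' + BACKSLASH * 2 + '"'  # the source text of an r string with two backslashes

print(is_valid_literal(ends_in_one))  # False: the backslash protects the closing quote
print(is_valid_literal(ends_in_two))  # True
```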

Where things get really confusing is the handling of '\\' in r strings. The double-backslash stops the second backslash from escaping anything it normally would have, but is not turned into '\' in the parsed string; it remains intact as '\\'.
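A quick check of this, as a minimal sketch (the variable names are just for illustration):

```python
raw = r"\\"    # r string: both backslashes survive
cooked = "\\"  # ordinary string: the pair collapses to one backslash

print(len(raw))           # 2
print(len(cooked))        # 1
print(raw == cooked * 2)  # True
```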

So the ultimate answer I wound up with is:

s = s.replace("\\", r"\\")

You can write the second string as "\\\\", but that sort of thing annoys the heck out of me so I wound up leaving it as an r string.
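For what it's worth, here is the working version wrapped in a function so it can be tried out. The name double_backslashes and the sample string are made up for illustration; only the replace() call comes from the code above.

```python
def double_backslashes(s):
    """Turn every single backslash in s into two backslashes."""
    return s.replace("\\", r"\\")


sample = "a\\b"  # three characters: 'a', one backslash, 'b'
result = double_backslashes(sample)

print(len(sample))  # 3
print(len(result))  # 4: 'a', two backslashes, 'b'

# The two spellings of the replacement string are the same string.
print(r"\\" == "\\\\")  # True
```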

(This is all actually documented in the string literal section of the reference manual. If it is covered in the tutorial, I read it sufficiently long ago to have forgotten it.)

All of this (coupled with wrestling with shell quoting, which is how I wound up dealing with this) has reminded me of how much I hate complicated quoting schemes and the need for quoting in general.


Comments on this page:

By DanielMartin at 2006-10-13 00:00:22:

Perl behaves this way too - although it actually does slightly better than python, in that '\\' turns into a string with one backslash, not two, which is consistent with '\'' being a single-character string.

Personally, I wish perl behaved the way the shell does in this regard: inside single quotes (or whatever) nothing escapes. You want to include quotes? Use a different construct. (Perl's got plenty of quoting constructs; python has at least two choices for quotes)

It's just such a simple system.

By cks at 2006-10-13 10:47:09:

Python turns '\\' into a single backslash, which is what you'd expect. The one that I find weird is Python's handling of r"\\", because the backslash both does and doesn't escape things. The first backslash escapes the second backslash so the second backslash doesn't escape the quote, but both backslashes remain in the string. (I am used to quoting characters getting eaten.)

Even the Bourne shell has too many and too broken quoting rules. (To see this, look how ugly the results are when you try to work out a way to quote some arbitrary input.)

By DanielMartin at 2006-10-15 23:48:11:

I'd only expect python to turn '\\' into a single backslash because python treats '-quoted strings (almost) the way perl treats "-quoted strings.

Perl, however, treats '-quoted strings to mean "no escape processing here"; that is, the effect you get in python with r'' strings. Therefore, the proper analogue of '\\' in perl is r'\\' in python, which is as messed up as you mention.

Written on 06 October 2006.
Last modified: Fri Oct 6 18:04:58 2006