A Python quoting irritation
I was writing code today where I needed to turn a single backslash ('\') into two backslashes ('\\') in order to quote it (so that the shell wouldn't eat it, through a combination of annoyances). My first attempt was:
s = s.replace(r"\", r"\\")
To my surprise, this gave me an error, and a peculiar one: Python reported 'SyntaxError: invalid token' (at the closing ')'). It took a bunch of head-scratching to figure out that Python was blowing up on the first string. I tried it with triple-quoted strings and that didn't work either (but with a different error).
Ultimately, this is because Python uses backslash in strings for two separate jobs: quoting end of string characters, and introducing special characters like \n. The former is active all the time; it is only the latter that is turned off by the r string modifier.
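A small sketch of those two jobs may make this concrete. The variable names are made up for illustration, and the failing source line is built at runtime and handed to compile() so that the example file itself stays valid:

```python
# Job two: introducing special characters. The r modifier turns this off.
newline_len = len("\n")       # one character: a newline
raw_newline_len = len(r"\n")  # two characters: a backslash and an 'n'

# Job one: quoting the end-of-string character. This stays on in r strings;
# the backslash stops the quote from ending the string, and both characters
# end up in the result.
raw_quote = r"\""
raw_quote_len = len(raw_quote)

# So an r string cannot end in a single backslash. Build the source text
#   s = r"\"
# piece by piece and try to compile it.
bad_source = 's = r"' + "\\" + '"'
try:
    compile(bad_source, "<example>", "exec")
    raw_backslash_compiles = True
except SyntaxError:
    raw_backslash_compiles = False

print(newline_len, raw_newline_len, raw_quote_len, raw_backslash_compiles)
```

The exact SyntaxError message varies between Python versions, which is why the sketch only checks that compilation fails.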
Where things get really confusing is the handling of '\\' in r strings. The double-backslash stops the second backslash from escaping anything it normally would have, but is not turned into '\' in the parsed string; it remains intact as '\\'.
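To see the difference directly, here is a short comparison of the same two-character source text written as a normal string and as an r string (the names normal and raw are just for illustration):

```python
normal = "\\"   # escape processing on: the pair collapses to one backslash
raw = r"\\"     # r string: both backslashes survive into the parsed string

normal_len = len(normal)
raw_len = len(raw)

# The r string is the same value as four backslashes in a normal string.
same_as_four = (raw == "\\\\")

print(normal_len, raw_len, same_as_four)
```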
So the ultimate answer I wound up with is:
s = s.replace("\\", r"\\")
You can write the second string as "\\\\", but that sort of thing annoys the heck out of me so I wound up leaving it as an r string.
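As a quick check that the final version does what I wanted, here is a sketch that wraps it in a function. The name double_backslashes and the sample string are hypothetical, not part of my actual code:

```python
def double_backslashes(s):
    """Return s with every single backslash turned into two."""
    # First argument: a normal string holding one backslash.
    # Second argument: an r string holding two backslashes.
    return s.replace("\\", r"\\")

sample = "one" + "\\" + "two"   # contains exactly one backslash
doubled = double_backslashes(sample)

print(sample)
print(doubled)
```

Each backslash in the input adds one character to the output, so the result here is one character longer than the sample.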
(This is all actually documented in the string literal section of the reference manual. If it is covered in the tutorial, I read it sufficiently long ago to have forgotten it.)
All of this (coupled with wrestling with shell quoting, which is how I wound up dealing with this) has reminded me of how much I hate complicated quoting schemes and the need for quoting in general.