Trailing text, a subtle gotcha with Go's fmt.Sscanf
I've written some Go code that wanted to do some simple scanf-like
parsing of strings. When I did this, I peered at the fmt package
documentation for the Sscanf function and confidently wrote something
like the following code:
_, e := fmt.Sscanf(input, "%d:%d", &hr, &min)
if e != nil {
    return ..., e
}
This code has a bug, or perhaps some people would call it an
unintended feature (for me it's definitely a bug). Namely, if you
feed this code the input string '10:15 this is trailing text',
you will not get an error message. Your code will parse the 10:15
part of the input string and silently ignore the rest, or more
exactly Sscanf() will.
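To make the problem concrete, here is a minimal sketch of it. The parseLoose helper is a hypothetical name, not something from my real code; it just wraps the Sscanf call above so that the result is easy to inspect.

```go
package main

import "fmt"

// parseLoose is a hypothetical helper that wraps the Sscanf call
// from the text so the returned error can be examined.
func parseLoose(input string) (hr, min int, err error) {
	_, err = fmt.Sscanf(input, "%d:%d", &hr, &min)
	return hr, min, err
}

func main() {
	hr, min, err := parseLoose("10:15 this is trailing text")
	// The trailing text is silently ignored, so err is nil here.
	fmt.Println(hr, min, err)
}
```

The two numbers come back as 10 and 15 and the error is nil, which is exactly the behavior I didn't want.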
At this point you might wonder how to either force Sscanf to
produce an error on trailing text or detect that you have trailing
text. As far as I can tell there is no straightforward way, but
there are two options depending on how paranoid you want to be (and
where you get your input string from).
The simple option is to add an explicit newline to your format
string:
_, e := fmt.Sscanf(input, "%d:%d\n", &hr, &min)
This will parse an input string of '10:15' (with no trailing
newline) without returning an error, and will detect most cases of
trailing input by returning an error. It won't detect the relatively
perverse case of something such as '10:15\n and more', because
the '\n' in the input matches the expected newline and then
Sscanf stops looking.
(At the moment you can stack more than one \n on the end of your
format string and still parse a plain '10:15', so you can add
some more caution and/or paranoia if you want. Sufficiently perverse
input can always get past you, though, because as far as I can see
there is no way to tell Sscanf that what you really mean is an
EOF.)
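Here is a small sketch that runs the newline version of the format string against the three inputs discussed above. The parseWithNewline name is a hypothetical wrapper used only for illustration.

```go
package main

import "fmt"

// parseWithNewline is a hypothetical wrapper around the
// newline-terminated format string from the text.
func parseWithNewline(input string) (hr, min int, err error) {
	_, err = fmt.Sscanf(input, "%d:%d\n", &hr, &min)
	return hr, min, err
}

func main() {
	inputs := []string{
		"10:15",                       // plain input, no newline
		"10:15 this is trailing text", // ordinary trailing text
		"10:15\n and more",            // the perverse case
	}
	for _, in := range inputs {
		_, _, err := parseWithNewline(in)
		// Report whether each input produced an error.
		fmt.Printf("%q rejected: %v\n", in, err != nil)
	}
}
```

Only the middle input should be rejected. The first is accepted as intended, and the last one slips through because its newline satisfies the format.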
The complicated hack is to add an extra string match to your format string and look at how many items were successfully parsed:
n, _ := fmt.Sscanf(input, "%d:%d%s", &hr, &min, &junk)
if n != 2 {
    return ..., errors.New("bad input")
}
Among other drawbacks, we have to ignore the error that Sscanf
returns; it doesn't tell us whether or not the input was good (for
good input it is non-nil, because the %s finds nothing to match), and
when it has an error value it may be meaningless for our caller.
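As a sketch, the counting hack can be wrapped in a function that produces its own error instead of passing along the one from Sscanf. The parseCounted name and the error text are hypothetical choices for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// parseCounted is a hypothetical wrapper for the counting hack:
// the format asks for a third item, and input is only considered
// good when exactly two items were scanned.
func parseCounted(input string) (hr, min int, err error) {
	var junk string
	// The error from Sscanf is deliberately ignored; only the
	// count of successfully scanned items is used.
	n, _ := fmt.Sscanf(input, "%d:%d%s", &hr, &min, &junk)
	if n != 2 {
		return 0, 0, errors.New("bad input")
	}
	return hr, min, nil
}

func main() {
	for _, in := range []string{"10:15", "10:15 this is trailing text", "10"} {
		hr, min, err := parseCounted(in)
		fmt.Printf("%q: %d %d %v\n", in, hr, min, err)
	}
}
```

A count of 3 means there was trailing text and a count below 2 means the time itself was malformed, so both cases are folded into the same error here.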
My suspicion is that in cases like this I am probably pushing
Sscanf too far and it's actually the wrong tool for the job. In
most cases the right answer is probably matching things with regular
expressions so that I can directly say what I mean. Or, in this
case, just using time.ParseInLocation even though it's
less convenient and I'd have to do a bunch of manipulation on
the result.
(Regular expressions are probably slower than Sscanf and I'd have
to use strconv to turn the results into numbers, but my code here
is not exactly performance critical.)
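For comparison, here is a rough sketch of both alternatives. The function names, the regular expression, and the "15:04" layout are my assumptions about what this would look like, not code I actually wrote. Note that the time-based version also rejects out-of-range values such as an hour of 99, which the "%d:%d" format happily accepts.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// clockRe is anchored at both ends, so anything before or after
// the hh:mm part makes the whole match fail.
var clockRe = regexp.MustCompile(`^(\d+):(\d+)$`)

// parseRegexp is a hypothetical regexp-based version.
func parseRegexp(input string) (hr, min int, err error) {
	m := clockRe.FindStringSubmatch(input)
	if m == nil {
		return 0, 0, errors.New("bad input")
	}
	// Atoi can still fail if the digit string overflows an int.
	if hr, err = strconv.Atoi(m[1]); err != nil {
		return 0, 0, err
	}
	if min, err = strconv.Atoi(m[2]); err != nil {
		return 0, 0, err
	}
	return hr, min, nil
}

// parseClock is a hypothetical version built on the time package.
// time.UTC is used only to keep the example deterministic.
func parseClock(input string) (hr, min int, err error) {
	t, err := time.ParseInLocation("15:04", input, time.UTC)
	if err != nil {
		return 0, 0, err
	}
	return t.Hour(), t.Minute(), nil
}

func main() {
	for _, in := range []string{"10:15", "10:15 this is trailing text", "10:15\n and more"} {
		_, _, reErr := parseRegexp(in)
		_, _, tmErr := parseClock(in)
		fmt.Printf("%q: regexp rejected %v, time rejected %v\n", in, reErr != nil, tmErr != nil)
	}
}
```

Both versions reject ordinary trailing text as well as the perverse newline case, because each one has to account for the entire input string.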