Solving unexposed types and the limits of duck typing
When I ran into the issue of the re module not exposing its types, I considered several solutions to my underlying
problem of distinguishing strings from compiled regular expressions.
For various reasons, I wound up picking the solution that was the least
annoying to code; I decided whether something was a compiled regular
expression by checking to see if it had a .match attribute.
This is the traditional Python approach to the problem; don't check
types as such, just check to see if the object has the behavior that
you're looking for. However, there's a problem with this, which I can
summarize by noting that .match() is a plausible method name for a
method on a string-like object, too.
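As a minimal sketch of how the attribute test can go wrong (GlobString is a hypothetical class made up for illustration, not something from a real package, and looks_compiled is just my name for the check):

```python
import re

def looks_compiled(obj):
    # The attribute test: anything with a .match attribute is
    # treated as a compiled regular expression.
    return hasattr(obj, "match")

class GlobString(str):
    # Hypothetical string-like class that happens to have its own match().
    def match(self, other):
        return self == other

print(looks_compiled(re.compile("a+b")))    # True
print(looks_compiled("a+b"))                # False
print(looks_compiled(GlobString("a+b")))    # True, although it is a string
```

The last case is the collision: the object is string-like, but the check treats it as a compiled regular expression.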
Duck typing by checking for attribute names only works when you can be reasonably confident that the attributes you're looking for are relatively unique. Unfortunately, nicely generic method names are likely to be popular, because they are simple and broadly applicable, which means that you risk inadvertent collisions.
(A casual scan of my workstation's Python packages turns up several
packages with classes with match() methods. While I doubt that
any of them are string-like, it does show that the method name is
reasonably popular.)
You can improve the accuracy of these checks by testing for more than one attribute, but this rapidly gets both annoying and verbose.
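A rough sketch of what a multi-attribute test might look like; the particular set of attribute names is my own illustrative choice (they are all attributes that compiled regular expressions have), not a canonical list:

```python
import re

# Illustrative selection of compiled-regexp attributes; the more names
# listed here, the less likely an unrelated object has all of them.
REGEX_ATTRS = ("match", "search", "pattern", "flags", "groupindex")

def looks_compiled(obj):
    # True only if the object has every one of the listed attributes.
    return all(hasattr(obj, name) for name in REGEX_ATTRS)

print(looks_compiled(re.compile("a+b")))   # True
print(looks_compiled("a+b"))               # False
```

Even in this small form you can see the verbosity creeping in, and the list of names still has to be maintained by hand.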
(I'm sure that I'm not the first person to notice this potential drawback.)
Sidebar: the solutions that I can think of
Here are all of the other solutions that I can think of offhand:
- extract the type from the re module by hand: CReType = type(re.compile("."))
- invert the check by testing to see if the argument is a string, using isinstance() and types.StringTypes, and assume that it is a compiled regexp if it isn't.
- just call re.compile() on everything, because it turns out it's smart enough to notice if you give it a compiled regular expression instead of a string.
I didn't discover the last solution until I wrote this entry. It's now tempting to revise my code to use it instead of the attribute test, especially since it would make the code shorter.
(This behavior is not officially documented, which is a reason to avoid it.)
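For concreteness, here is a sketch of the three alternatives side by side. The function names are hypothetical, and I'm assuming Python 3, where types.StringTypes no longer exists, so the second version checks against (str, bytes) as a stand-in. The third relies on the pass-through behavior of re.compile() as seen in CPython, which as noted is not something the documentation promises.

```python
import re

# 1. Extract the compiled regexp type from the re module by hand.
CReType = type(re.compile("."))

def is_compiled_by_type(obj):
    return isinstance(obj, CReType)

# 2. Invert the check: anything that is not a string is assumed to be
#    a compiled regexp. (str, bytes) is an assumed Python 3 substitute
#    for Python 2's types.StringTypes.
def is_compiled_by_elimination(obj):
    return not isinstance(obj, (str, bytes))

# 3. Just call re.compile() on everything. In CPython, handing it an
#    already compiled regexp (with no flags) gives back the same object.
def as_compiled(obj):
    return re.compile(obj)

pat = re.compile("a+b")
print(is_compiled_by_type(pat), is_compiled_by_type("a+b"))   # True False
print(is_compiled_by_elimination(pat))                        # True
print(as_compiled(pat) is pat)                                # True
```

Note that the second version will cheerfully claim that an integer or a list is a compiled regexp; it only works if you trust callers to pass one of the two expected kinds of things.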