In Python, strings are infinitely recursively iterable
Yesterday I posed the side question of what
happened when you called the following code with
def flatten2(inlst): olst =  for i in inlst: try: it = iter(i) except TypeError: it = None if it is None: olst.append(i) else: olst.extend(flatten2(i)) return olst
The intent of this code is to recursively flatten iterable things.
What I had expected to get when I called it with a string was
'b', 'c', 'd', 'e', 'f'], ie that it flattened the string instead
of leaving it alone (because strings are iterable). What I actually
got (and what you can get) is a RecursionError of 'maximum recursion
depth exceeded'. So why did this recurse endlessly?
The answer is that strings are not just iterable, they are infinitely
recursively iterable. With normal iterable containers, iterating
the container yields non-iterable items unless you've explicitly
put in iterable ones (such as a list inside another list); assuming
that you have not cleverly embedded a cycle in your container, our
recursive flattening will eventually bottom out and finish. This
is not the case for Python strings. When you iterate a multi-character
"abcdef", you get a sequence of single-character
"a" "b" "c" "d" "e" "f". However, these are still strings
and so they are still iterable; when you iterate the
you get back another single-character string
"a", which is also
iterable. And so
flatten2() chases down an infinite recursion of
trying to unpack single-character strings into non-iterable items,
which it will never succeed at.
Fixing this without side effects is a bit annoying. It's not enough
to immediately check that
inlst is length 1 and just return it
if so, because this fails on
flatten2(["abcdef",]) too, for that matter). I think you have to
check explicitly for length 1 strings (and bytes) and just return
them, which of course leaves you exposed to any other infinitely
recursively iterable types (should there be any other out there).
(I was about to gripe about
repr()'s representation of single
character strings but then I tested and it uses single quotes for
all strings, not just single-character ones. I need to remember
that in Python there is no 'character' type that's different from
strings, unlike in other languages such as Go, and so single quotes
still mean 'string'.)
Sidebar: It's trivially possible to create containers with cycles
>>> a = [1, 2, 3] >>> a.append(a) >>> print(a) [1, 2, 3, [...]]
More involved versions are left as an exercise for the reader.
Without contemplating it too hard, I think that you can only create cycles with mutable containers, and then only with some of them. Sets are mutable, for example, but they can only contain hashable items, which are generally immutable.
Comments on this page:Written on 26 February 2017.