An illustration of the speed advantage of Python builtins

February 29, 2008

As a followup to my entry on the two sorts of languages, I decided to actually try implementing the string .find() method in Python to see how much slower it is. The Python version uses the brute force C style approach, roughly:

for i in range(0, len(src)-len(sub)+1):
    if src[i:i+len(sub)] == sub:
       return i
return -1

I actually did three versions, each with a different if test; the other two versions used src[i:].startswith(sub) and src.startswith(sub, i). The last strikes me as pretty close to cheating, since it delegates a great deal of the work to builtins.

I benchmarked each version on various lengths of src strings against a five character sub string, none of which contained the sub string or any characters from it. I'll skip making a big table, and just say that the fastest of the three versions ranged between 2.9 times slower than builtin .find() (for 8 characters) and 373 times slower (at 32K), and was already 35 times slower at only 128 characters.

The other versions were worse, sometimes much worse; at its slowest, the worst of the three versions was 17,554 times slower than builtin .find(), and at only 1K it was still 458 times slower (versus a mere 174 times slower for the fastest version).

(I was surprised by which version turned out to be the slowest, although in hindsight I shouldn't have been.)

Comments on this page:

From at 2008-03-01 01:37:59:

So as a long time C person, I assumed the big problem was that you were creating new string objects as a side effect of the loop and maybe incurring overhead from non-local accesses. So my first guess was:

def myfind1(src, sub):
   lensub = len(sub)
   max = len(src) - lensub + 1
   i = 0
   j = 0
   while True:
       while i < max and src[i] != sub[0]:
           i += 1
       if i > max:
       j = 0
       while src[i] == sub[j]:
           i += 1
           j += 1
           if j == lensub:
               return i - j
       i -= j
       i += 1
   return -1

...which is roughly what you'd write in C (although it requires a more verbose syntax). It performs a little better than the src[i:i+len] version, and significantly better than the other two.

But then implied the hybrids:
def myfind015(src, sub):
   lensub = len(sub)
   for i in xrange(0, len(src)-lensub+1):
       if src[i] == sub[0] and src[i:i+lensub] == sub:
          return i
   return -1

def myfind025(src, sub):
   for i in xrange(0, len(src)-len(sub)+1):
       if src[i] == sub[0] and src[i:].startswith(sub):
          return i
   return -1

...which both outperformed my attempt (but still worse than src.find(sub), unsurprisingly). Interestingly the two param. .startswith() was still worse with the hybrid approach.

If I was feeling more intelligent I might make an observation that all of the above involve a lot of python code, so all each are really testing is how the specific python interpreter happens to optimise the three attempts.

By cks at 2008-03-01 23:39:23:

My guess is that the hybrid versions perform better because they defer making an expensive string copy (by making a relatively cheap one). You could probably make them go slightly faster by putting sub[0] in a variable instead of recreating it all the time.

(CPython may optimize creation of single-character strings; I wouldn't be surprised if it was a pretty common operation.)

Written on 29 February 2008.
« My likely Firefox 3 extensions
Speed surprises in reimplementing the .find() string method »

Page tools: View Source, View Normal, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Feb 29 23:43:19 2008
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.