An illustration of the speed advantage of Python builtins

February 29, 2008

As a followup to my entry on the two sorts of languages, I decided to actually try implementing the string .find() method in Python to see how much slower it is. The Python version uses the brute force C style approach, roughly:

for i in range(0, len(src)-len(sub)+1):
    if src[i:i+len(sub)] == sub:
       return i
return -1

I actually did three versions, each with a different if test; the other two versions used src[i:].startswith(sub) and src.startswith(sub, i). The last strikes me as pretty close to cheating, since it delegates a great deal of the work to builtins.

I benchmarked each version on various lengths of src strings against a five character sub string, none of which contained the sub string or any characters from it. I'll skip making a big table, and just say that the fastest of the three versions ranged between 2.9 times slower than builtin .find() (for 8 characters) and 373 times slower (at 32K), and was already 35 times slower at only 128 characters.

The other versions were worse, sometimes much worse; at its slowest, the worst of the three versions was 17,554 times slower than builtin .find(), and at only 1K it was still 458 times slower (versus a mere 174 times slower for the fastest version).

(I was surprised by which version turned out to be the slowest, although in hindsight I shouldn't have been.)

Written on 29 February 2008.
« My likely Firefox 3 extensions
Speed surprises in reimplementing the .find() string method »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Feb 29 23:43:19 2008
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.