Wandering Thoughts archives

2008-02-29

An illustration of the speed advantage of Python builtins

As a followup to my entry on the two sorts of languages, I decided to actually try implementing the string .find() method in Python to see how much slower it is. The Python version uses the brute force C style approach, roughly:

for i in range(0, len(src)-len(sub)+1):
    if src[i:i+len(sub)] == sub:
       return i
return -1

I actually did three versions, each with a different if test; the other two versions used src[i:].startswith(sub) and src.startswith(sub, i). The last strikes me as pretty close to cheating, since it delegates a great deal of the work to builtins.

I benchmarked each version on various lengths of src strings against a five character sub string, none of which contained the sub string or any characters from it. I'll skip making a big table, and just say that the fastest of the three versions ranged between 2.9 times slower than builtin .find() (for 8 characters) and 373 times slower (at 32K), and was already 35 times slower at only 128 characters.

The other versions were worse, sometimes much worse; at its slowest, the worst of the three versions was 17,554 times slower than builtin .find(), and at only 1K it was still 458 times slower (versus a mere 174 times slower for the fastest version).

(I was surprised by which version turned out to be the slowest, although in hindsight I shouldn't have been.)

python/BuiltinsSpeedIllustration written at 23:43:19; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.