An illustration of the speed advantage of Python builtins
As a followup to my entry on the two sorts of languages, I decided to actually try
implementing the string .find()
method in Python to see how much
slower it is. The Python version uses the brute force C style approach,
roughly:
for i in range(0, len(src)-len(sub)+1): if src[i:i+len(sub)] == sub: return i return -1
I actually did three versions, each with a different if
test; the other two versions used src[i:].startswith(sub)
and
src.startswith(sub, i)
. The last strikes me as pretty close to
cheating, since it delegates a great deal of the work to builtins.
I benchmarked each version on various lengths of src
strings against a
five character sub
string, none of which contained the sub
string or
any characters from it. I'll skip making a big table, and just say that
the fastest of the three versions ranged between 2.9 times slower than
builtin .find()
(for 8 characters) and 373 times slower (at 32K), and
was already 35 times slower at only 128 characters.
The other versions were worse, sometimes much worse; at its slowest,
the worst of the three versions was 17,554 times slower than builtin
.find()
, and at only 1K it was still 458 times slower (versus a mere
174 times slower for the fastest version).
(I was surprised by which version turned out to be the slowest, although in hindsight I shouldn't have been.)
|
|