Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web.
2006-02-20

More on regular expression performance

My problem with benchmarking is that it's tedious (and finicky). Good benchmarking mostly consists of twiddling my thumbs while my tests run, often yet again after some minor tweak or improvement or additional case. I can kill days running tests and staring at numbers and not really get anywhere (and sometimes I've had to).

All that said, after my entry on some regular expression performance surprises, I got curious enough to do more detailed micro-benchmarking. With Daniel Martin's assistance I was able to check Perl as well as Python, and we turned up some surprises.

What I looked at was various versions of constant or near-constant alternates; the '

Things I've found:
Python compiles regexps in Python code, so you can look at this and examine several steps of the result. The most interesting thing to look at is probably the output of

I am a bit peeved at the performance differences between the Perl and Python regexp engines. It used to be that if a Python program was too slow, I could immediately conclude a Perl (or Ruby, etc.) version would also be too slow, which simplified life. (Indeed, I could usually conclude that any regexp-based version would be too slow, because I figured everyone had highly optimized regexp engines.)

Oh well. As I said myself, measure, don't guess.

(And thank you, Daniel Martin, for writing the Perl version of the benchmark tests.)

Sidebar: The experimental setup details

The test environments:
I've put up the rebench.py and rebench.pl test programs, if anyone wants to run the tests themselves.
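For a feel of what this sort of micro-benchmark looks like, here is a minimal sketch in Python. It is not rebench.py: the word list, the test line, and the function names are all made up for illustration, and the three variants (one regexp with alternates, one regexp per constant string, plain substring tests) are only an assumption about the kind of comparison involved.

```python
import re
import timeit

# Hypothetical test data; not the patterns or input used in the real tests.
WORDS = ["alpha", "bravo", "charlie", "delta"]
LINE = "x" * 200 + " delta " + "y" * 200

# One regexp with every constant string as an alternate.
combined = re.compile("|".join(re.escape(w) for w in WORDS))
# One compiled regexp per constant string, tried in turn.
separate = [re.compile(re.escape(w)) for w in WORDS]


def match_combined(line):
    return combined.search(line) is not None


def match_separate(line):
    return any(r.search(line) for r in separate)


def match_plain(line):
    # No regexps at all: plain substring tests.
    return any(w in line for w in WORDS)


def bench(func, line=LINE, number=20000, repeat=5):
    # Take the minimum of several runs to reduce noise from other processes.
    return min(timeit.repeat(lambda: func(line), number=number, repeat=repeat))


if __name__ == "__main__":
    for func in (match_combined, match_separate, match_plain):
        print(f"{func.__name__:16s} {bench(func):.4f}s")
```

The numbers will vary with the machine, the Python version, and the input, so treat any single run as a hint and rerun after every tweak.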
python/MoreRegexpPerformance written at 01:34:31
This is part of CSpace, and is written by ChrisSiebenmann.