Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web.
|
2006-02-17 Some regular expression performance surprisesPart of the fun of performance tuning is all of the unexpected things you find out. Since I spent some of yesterday doing work in the guts of DWiki's DWikiText to HTML conversion, what I've been finding is some regular expression performance surprises. My big surprise was the relative performance of ' Hopefully everyone reading this knows that the Perl regexp engine
(which is more or less what's used in Python) checks ' Update: Daniel Martin has pointed out that Python doesn't actually
use the Perl regexp engine, just the Perl regexp syntax and semantics.
The CPython But what I suspect is less well known is something I stumbled over some years ago, when looking at the performance of our Usenet anti-spam filters: constant alternatives are significantly slower than separate regular expressions (for non-matching searches, but I suspect that this holds for matching searches too). Given:
It appears that current Pythons run ' The results flip if the matches are not constant; ' (And ' This leads to the obvious moral, which is also the first rule of performance tuning: measure, don't guess. (I'm still kind of bummed about the constant strings alternative performance drop, but I suppose if I was really annoyed I should be contributing patches to fix it.) (3 comments.)
python/RegexpPerformanceSurprises written at 02:40:48; Add Comment
|
These are my WanderingThoughts GettingAround This is part of CSpace, and is written by ChrisSiebenmann. * * * Atom feeds are available; see the bottom of most pages. Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web |