Wandering Thoughts archives

2006-03-25

A little gotcha with os.path.join

For my sins, I am one of those people who doesn't always read Python's fine documentation carefully enough. For example, I wrote a large chunk of code using the helpful os.path.join() before I noticed a little gotcha: it's very helpful. In particular:

Joins one or more path components intelligently. If any component is an absolute path, all previous components are thrown away, and joining continues.

(Right there in black and white in the fine documentation.)

This means that you cannot do:

fpath = os.path.join(root, untrusted)

At least, you can't do this safely and have it do what you probably think it does.
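To make the failure mode concrete, here's a quick demonstration (the root and untrusted values are hypothetical, and safe_join is my own illustrative helper, not a standard library function):

```python
import os.path

root = "/srv/dwiki/pages"   # hypothetical page root
untrusted = "/etc/passwd"   # attacker-supplied path component

# The absolute component silently discards root:
fpath = os.path.join(root, untrusted)
# fpath is now '/etc/passwd', not anything under root.

# One way to hedge: normalize the result and insist it stays under root.
def safe_join(root, comp):
  res = os.path.normpath(os.path.join(root, comp))
  if res != root and not res.startswith(root + os.sep):
    return None
  return res
```

Here safe_join rejects both '/etc/passwd' and sneakier relative escapes like '../x', while ordinary relative components pass through unmolested.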

The other gotcha to remember for os.path.join is that no matter how convenient it looks on a Unix machine, URLs are not paths. If you use os.path.join() on URLs and ever run on a Windows machine, the results are not likely to be pleasant. You probably want the urlparse module's urljoin() function, but note that it behaves much like os.path.join when handed absolute second parts, so:

>>> urljoin("http://host/a/b", "/d/e/f")
'http://host/d/e/f'

This is convenient if you are expecting it.

OsPathJoinGotcha written at 02:59:15

2006-03-23

We apologize for the disruption in syndication feeds

In the process of enabling caching in DWiki, I managed to illustrate the importance of thorough testing by failing to properly cache the real titles of entries and then not noticing, because the one thing I didn't test was that the Atom feeds stayed the same. (I carefully and somewhat obsessively tested that the rendering of all of the other pages did stay the same. I even have that automated.)

This caused the syndication feeds for WanderingThoughts to revert to plain filename based titles for eight hours or so, until I noticed (yet another example of ReadYourFeed; I did eventually, but not fast enough). So everyone just saw a gratuitous feed change or two, which I apologize for.

The effects on the LiveJournal feed were a little bit more dramatic. Despite the only change to the feed being the titles, LiveJournal decided to revive two entries from March 7th and 8th and arbitrarily give them today's date, effectively regurgitating stale content on the readers of the LiveJournal feed. This irritates me, but probably not as much as it irritates you.

(I call it arbitrary because the entries have proper (old) timestamps in the Atom feed, and they're not at the top of the Atom feed either. LiveJournal's syndication stuff mystifies me on a regular basis. Hopefully it is not doing something as crazy as using the entry title as its internal identifier.)

The incident has also pointed out some problems in my caching implementation. In particular, I need to version the cached data so that a new DWiki version using a cache with old data silently ignores it instead of exploding spectacularly. (Alternately I need a more robust way of storing the data, because sooner or later I may forget to bump the data version number during an upgrade and then have the kabooms anyways.)
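A minimal sketch of what I mean by versioning, assuming pickled cache files (all of the names here are made up for illustration, not DWiki's actual code):

```python
import pickle

CACHE_VERSION = 3   # bump whenever the cached data format changes

def cache_store(fname, data):
  with open(fname, "wb") as fp:
    # Store the version tag alongside the data itself.
    pickle.dump((CACHE_VERSION, data), fp)

def cache_load(fname):
  try:
    with open(fname, "rb") as fp:
      ver, data = pickle.load(fp)
  except (OSError, IOError, EOFError, ValueError, TypeError, pickle.PickleError):
    return None
  if ver != CACHE_VERSION:
    return None   # old or foreign data: silently ignore it
  return data
```

A load that returns None just means 'recompute and re-store', which degrades performance instead of exploding.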

Caches are hard. Let's go shopping for faster CPUs.

FeedWhoops written at 04:44:05

2006-03-16

A tricky memoization decorator

This is the kind of hack where I present the Python code first and explain it later:

def memo1(func):
  def _wrap(self):
    r = func(self)
    # Rewrite this instance's method into a lambda
    # that just returns the already-computed result.
    setattr(self, func.func_name, lambda: r)
    return r
  return _wrap

class Foo(object):
  ...
  def invar(self):
    return complex(self.field1) or \
           complex2(self.field2)
  invar = memo1(invar)

Here, invar is a method that computes some function of the object, with a value that never changes over the life of a given object. The computation is more expensive than we like, so we don't want to call it more than once, but it may never be called so it's not worth precomputing the result in __init__. Clearly, memoization to the rescue.

Ordinary memoization would store the result in a variable somewhere, and check it, and so on. We don't bother; instead, memo1 stores the result by rewriting the object's invar method to be a lambda expression that just returns the value, which is about as efficient as you can get. (It introspects the function it's wrapping to find out what name to rewrite, which rules out certain method tricks.)

One subtlety here is that the lambda takes no arguments, not a self argument that it ignores. A function that is added to an object instead of a class is not subject to the normal process of method lookup, so does not get called with any implicit self.
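You can check this for yourself with a toy class (hypothetical names):

```python
class A(object):
  pass

a = A()
# A function stored directly on the instance is returned by attribute
# lookup as-is; it is not turned into a bound method, so no self is
# passed when it's called.
a.greet = lambda: "hi"
result = a.greet()   # works; the lambda takes zero arguments
```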

I would rather do the memoization in something like memo1 than in invar itself, because doing it in a decorator keeps the invar function itself clean and obvious. While the lambda rewrite version does perform somewhat better than the traditional memoization scheme, the bigger attraction of it to me is that it's just simpler; the code doesn't have to play games to find a storage spot for the result. (Tastes are quite likely to differ here.)

Unfortunately there is no way to pull this trick with properties; once a variable in a class is a property, you cannot turn it into an ordinary variable for a single object. In some situations doing this would be nice, because calling a lambda function is a bit over two times slower than accessing an object variable on my machine.
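To see why, here is a demonstration that a property, being a data descriptor, always wins over the instance dictionary (toy class, hypothetical names):

```python
class C(object):
  @property
  def invar(self):
    return 42

c = C()
# Planting a value in the instance dictionary does nothing; the
# property (a data descriptor) takes precedence on attribute lookup.
c.__dict__['invar'] = 99
via_lookup = c.invar   # still 42, not 99

# Plain assignment fails outright, since the property has no setter.
try:
  c.invar = 99
  assigned = True
except AttributeError:
  assigned = False
```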

Sidebar: a more traditional memo1

Because I wrote it to do timing tests, here's a version of memo1 that doesn't use lambda rewriting:

def memo1(func):
  # Note: s lives in memo1's closure, so it is shared by
  # every instance of the class; that's fine for timing
  # tests against a single object, but not in general.
  s = []
  def _wrap(self):
    if s:
      return s[0]
    r = func(self)
    s.append(r)
    return r
  return _wrap

There is probably a cleverer place to store the return value.
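One candidate is the instance itself, which also makes the memoization properly per-object (the single s list above is shared by every instance of the class). A sketch, written with the modern __name__ spelling of func_name:

```python
def memo1(func):
  # Store the result as an attribute on the instance, so
  # each object memoizes its own value.
  attr = "_memo_" + func.__name__
  def _wrap(self):
    try:
      return getattr(self, attr)
    except AttributeError:
      r = func(self)
      setattr(self, attr, r)
      return r
  return _wrap

class Foo(object):
  calls = 0   # instrumentation: how often invar really runs
  def invar(self):
    Foo.calls += 1
    return 42
  invar = memo1(invar)
```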

TrickyDecorator written at 02:22:15

2006-03-09

Closures versus classes for capturing state

Possibly like a lot of other people, I have a 'thread pool' Python module for doing decoupled asynchronous queries of things. The basic interface looks like this:

wp = WorkPool(HOWMANY)
for w in what:
  wp.enqueue(wrap_up(w))
wp.waitDone()
res = wp.getanswers()

The important operation is .enqueue(<THING>), which queues up <THING> to be called in a worker thread; its return value is saved, to be disgorged later by .getanswers().

To keep things simple, <THING> is called with no arguments. But usually what I really want to do is to call the same function with a bunch of different arguments, for example calling is_in_sbl for a bunch of IP addresses. To make this work, I need to capture the argument (or arguments) and create a simple callable object that remembers them; in this example, that's done by wrap_up.

There are two ways of wrapping up this sort of state in Python: classes (okay, objects) and closures. In the class-based approach, wrap_up is actually a class; its __init__ method saves w, and its __call__ method uses the saved w to do the actual function call. By contrast, in a closure-based approach wrap_up is a plain function that creates and returns closures, like this:

def wrap_up(w):
  return lambda: real_func(w)

The class approach is the more obvious and natural one, because the state saving is explicit. But when I wrote two structurally identical programs, the first one using a class and the second one with closures, I found that I plain liked closures better.

For me, the drawback of the class based approach is that it's too verbose. The same wrap_up would be:

class wrap_up:
  def __init__(self, w):
    self.w = w
  def __call__(self):
    return real_func(self.w)

(and it gets longer if you have more than one argument.)

I have a visceral reaction to verbosity; in cases like this it feels like make-work. The retort is that closures are far more magic than classes for most people and it's better to avoid magic, even if the result is more verbose.

For now, my view is that programming should be pleasant, not annoying, and that the verbosity of the class-based approach to state capture annoys me. So closures it is.

(Perhaps the real answer is that WorkPool should have an .enqueue_call(func, *args, **kwargs) method, as well as the plain .enqueue() one. This strays towards the Humane Interfaces arguments, though, so I'm a bit nervous.)
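Such a method would be short, since it's just the closure trick moved inside the class. Here's a sketch using a minimal synchronous stand-in for WorkPool (the real one hands things to worker threads):

```python
class MiniPool(object):
  # Hypothetical stand-in; just enough to show the relationship
  # between .enqueue() and .enqueue_call().
  def __init__(self):
    self.results = []

  def enqueue(self, thing):
    # The real WorkPool would run thing() in a worker thread.
    self.results.append(thing())

  def enqueue_call(self, func, *args, **kwargs):
    # A closure captures func and its arguments, so .enqueue()
    # still only ever sees a zero-argument callable.
    self.enqueue(lambda: func(*args, **kwargs))

  def getanswers(self):
    return self.results
```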

CapturingState written at 02:50:11

2006-03-04

os.walk can be surprisingly slow

Yesterday's tracing made one particular bit of DWiki's page rendering pop out at me; almost 40% of the time for the page was going just to create the '(Previous | Next)' links at the bottom.

Because DWiki is entirely filesystem based, its 'metadata' is the state of the filesystem. Thus, creating the previous and next links requires walking WanderingThoughts' directory hierarchy to find out the order of the entries. Once I saw the times, I immediately started wondering if I was doing this traversal as fast as I could be.

The natural way to do directory traversal is with the os module os.walk generator. DWiki's traversal needed to generate both filenames and their modification times, so the core of the code was roughly:

for rt, ds, fs in os.walk(dpath):
  for f in fs:
    fn = os.path.join(rt, f)
    yield os.path.getmtime(fn), fn

I benchmarked this against three alternate versions: a recursive list version, an iterative list version, and an iterative generator version. All of them are a bit less than twice as fast as the os.walk version, with the generator version slightly faster than the others. (To my surprise.)

(Generator-ness or lack thereof isn't important for DWiki, and some other requirements make the iterative version simpler. In my test program the generator versions are converted to lists as part of the timing.)

Further experimentation showed that the os.path.getmtime call accounted for about 30% of the time of the os.walk version. Apparently in Python, stat() isn't as speedy as I might have hoped. Thus the major speedup in my alternate versions is that they can stat() each entry only once; the os.walk version has to stat each entry twice, once in os.walk to see if it's a directory and once in my code to get the modification time.
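For the curious, the iterative generator version looks roughly like this (a sketch, not my exact test code): each entry is lstat()'d exactly once, and that one stat result answers both 'is this a directory?' and 'what's the modification time?'.

```python
import os, stat

def walk_mtimes(dpath):
  pending = [dpath]
  while pending:
    d = pending.pop()
    for name in os.listdir(d):
      fn = os.path.join(d, name)
      st = os.lstat(fn)   # the only stat() for this entry
      if stat.S_ISDIR(st.st_mode):
        pending.append(fn)
      elif stat.S_ISREG(st.st_mode):
        yield st.st_mtime, fn
```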

Doing os.walk without returning the modification time still wasn't as fast as my alternate versions, although it was much closer. I suspect that the remaining slowdown is because os.walk is a recursive generator, which means it has to trickle results up through several levels, and I add another level of yield on top.

Then I spent the rest of today's hacking time coding a neat scheme for pre-parsing DWiki's templates to speed them up, only to find out that it was slower than the current brute force template processing. That's sometimes how it goes with performance tuning; your best improvements are from ten minute hacks, and your hour long nice work gets discarded in the end.

SlowOsWalk written at 03:09:22

2006-03-03

Visualizing dynamic program flow

Someone I know has been talking to me recently about bravely becoming the first outside user of DWiki. As part of this, I realized he might like to have practical documentation about how DWiki templates get used in practice for things like producing this blog. Which made me realize that I'd forgotten how the template flow goes.

I could have re-read the templates and traced the execution path, but then I had a bright idea: why not make the program do this for me? So I instrumented template processing to dump all the templates used, properly nested to show the expansion structure.

Once I had this visualization, more questions started popping out: what work was being done by each template? How long was it taking? Augmenting the tracing to add them wasn't too hard, and somewhat to my surprise it started bringing out useful information; the performance information has been particularly useful. It's one thing to profile an application, but another thing to see where in the flow of execution the time is going (instead of where in the code).

Visualizing program flow this way is sort of a variant of debugging via print; instead of trying to figure out how the program got somewhere, you're interested in seeing where it went in general.

To illustrate what I'm talking about, here's an example. This is the flow of templates for the web page for a single blog entry (this one in specific):

  1. dwiki.tmpl:
    1. html/dtd
    2. struct/head.tmpl: $var $var
      1. html/css: $var
      2. syndication/discovery.tmpl: atom::autodisc
      (1.3 msec for struct/head.tmpl)
    3. struct/header.tmpl:
      1. struct/breadcrumbs: breadcrumbs
      2. struct/loginstate.tmpl: $var
    4. dwiki/overrides.tmpl:
      1. Overrides/blog/all.tmpl:
        1. dwiki/standard.tmpl:
          1. views/normal-file.tmpl:
            1. wikiauth.tmpl: wikitext[4.1ms]
              1. comment/tools.tmpl: comment::countlink Empty
              (5.2 msec for wikiauth.tmpl)
            (5.5 msec for views/normal-file.tmpl)
          (6 msec for dwiki/standard.tmpl)
        2. Overrides/blog/sidebar.tmpl:
          1. struct/readme-dir.tmpl: inject::upreadme[9.4ms] (9.5 msec)
          search::enter (10 msec)
        (17 msec for Overrides/blog/all.tmpl)
      (18 msec for dwiki/overrides.tmpl)
    5. struct/datecrumbs.tmpl: blog::datecrumbs blog::prevnext[18ms] (18 msec)
    6. struct/bottom.tmpl:
      1. bits/pagetools.tmpl: pagetools[1.6ms] (1.7 msec)
      2. hist/locker.tmpl: hist::lockedby Empty
      3. struct/searchbox.tmpl: search::enter
      4. struct/loginbox.tmpl: auth::loginbox
      5. syndication/feeds.tmpl: atom::feeds[1ms] (1.1 msec)
      6. bits/lastmod.tmpl: lastmodified
      7. site-sig.tmpl
      (5.2 msec for struct/bottom.tmpl)
    (45 msec for dwiki.tmpl)

(Reading this and this will be helpful.)

The rough format is 'N. <template file>: <renderer> ...', interspersed with sub-templates; $var marks variable expansion, and Empty means the template ended up generating nothing (often because it wasn't applicable to the page). Times are omitted if they're less than a millisecond.

At 25 templates and 45 milliseconds, this is a fairly typical DWiki page. The sidebar means that WanderingThoughts has a more complex template cascade than ordinary pages, but frankly I like small modular templates. (A lot of them are only a line or two long.)

(And yes, I know that there are a number of inefficient bits in here. Figuring out how to speed DWiki up is on the to-do list, although I had not realized quite how time consuming the blog::prevnext renderer was before now.)

VisualizingProgramFlow written at 03:01:13



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.