I've been hit by the startup overhead of small programs in Python

August 5, 2017

I've written before about how I care about the resource usage and speed of short running programs. However, that care has basically been theoretical. I knew this was an issue in general and it worried me because we have short running Python programs, but it didn't impact me directly and our systems didn't seem to be suffering as a result of it. Even DWiki running as a CGI was merely kind of embarrassing.

Today, I turned a hacky personal shell script into a better done production ready version that I rewrote in Python. This worked fine and everything was great right up to the point where I discovered that I had made this script a critical path in invoking dmenu on my office workstation, which is something that I do a lot (partly because I have a very convenient key binding for it). The new Python version is not slow as such, but it is slower, and it turns out that I am very sensitive to even moderate startup delays with dmenu (partly because I type ahead, expecting dmenu to appear essentially instantly). With the old shell script version, this part of dmenu startup took around one to two hundredths of a second; with the new Python version, things now takes around a quarter of a second, which is enough lag to be perceptible and for my type-ahead to go awry.

(This assumes that my machine is unloaded, which is not always the case. Active CPU load, such as installing Ubuntu in a test VM, can make this worse. My dmenu setup actually runs this program five times to extract various information, so each individual run is taking about five hundredths of a second.)

Profiling and measuring short running Python programs is a bit challenging and I've wound up resorting to fairly crude tricks (such as just exiting from the program at strategic points). These tricks strongly suggest that almost all of the extra time is going simply to starting Python, with a significant amount of it spent importing the standard library modules I use (and all of the things that they import in turn). Simply getting to the quite early point where I call argparse's parse_args ArgumentParser method consumes almost all of the time on my desktop. My own code contributes relatively little to the slower execution (although not nothing), which unfortunately means that there's basically no point in trying to optimize it.

(On the one hand, this saves me time. On the other hand, optimizing Python code can be interesting.)

My inelegant workaround for now is to cache the information my program is producing, so I only have to run the program (and take the quarter second delay) when its configuration file changes; this seems to work okay and it's as least as fast as the old shell script version. I'm hopeful that I won't run into any other places where I'm using this program in a latency sensitive situation (and anyway, such situations are likely to have less latency since I'm probably only running it once).

In the longer run it would be nice to have some relatively general solution to pre-translate Python programs into some faster to start form. For my purposes with short running programs it's okay if the translated result has somewhat less efficient code, as long as it starts very fast and thus finishes fast for programs that only run briefly. The sort of obvious candidate is Google's grumpy project; unfortunately, I can't figure out how to make it convert and build programs instead of Python modules, although it's clearly possible somehow.

(My impression is that both grumpy and Cython have wound up focused on converting modules, not programs. Like PyPy, they may also be focusing on longer running CPU-intensive code.)

PS: The new version of the program is written in Python instead of shell because a non-hacky version of the job it's doing is more complicated than is sensible to implement in a shell script (it involves reading a configuration file, among other issues). It's written in Python instead of Go for multiple reasons, including that we've decided to standardize on only using a few languages for our tools and Go currently isn't one of them (I've mentioned this in a comment on this entry).

Comments on this page:

By Twirrim at 2017-08-05 12:22:18:

I've been noticing something similar with an entry in my .bashrc on OS X. Every time I spawn a new local terminal (happens a lot), I get this palpable pause. At the moment I can't realistically remove the entry, may have to spend some time profiling it and maybe porting it to Go, or working out some other way around it.

By Twirrim at 2017-08-05 12:24:16:

One other thought.. you could always turn it in to a python persistent service, providing an API, and pull the information back via curl, or something ;)

Written on 05 August 2017.
« The problem with distributed authentication systems for big sites
Our decision to restrict what we use for developing internal tools »

Page tools: View Source, View Normal, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sat Aug 5 00:59:49 2017
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.