Wandering Thoughts archives

2017-08-11

Some notes from my brief experience with the Grumpy transpiler for Python

I've been keeping an eye on Google's Grumpy Python to Go transpiler more or less since it was introduced because it's always been my great white hope for speeding up my Python code more or less effortlessly (and I like Go). However, until recently I had never actually tried to do anything much with it because I didn't really have a problem that it looked like a good fit for. What changed is that I finally got hit by the startup overhead of small programs.

As mentioned in that entry, my initial attempts to use Grumpy weren't successful, because how to actually use Grumpy for anything beyond toys is basically not documented today. Because sometimes I'm stubborn, I kept banging my head against the wall for long enough until I hacked together how to bring up my program, which gave me the chance to get some real world results. Basically the process went like this:

  • build Grumpy from source following their 'method 2' process (using the Fedora 25 system version of Go, not my own build, because Grumpy very much didn't work with the latter).
  • have Grumpy translate my Python program into a module, which was possible because I'd kept it importable.
  • hack grumprun to not delete the Go source file it creates on the fly based on your input. grumprun is in Python, which makes this reasonably easy.
  • feed grumprun a Python program that was 'import mymodule; mymodule.main()' and grab the Go source code it generated (now that it wasn't deleting said source code afterward). This gave me a Go program that I could build into a binary that I could keep and then run with command line arguments.

Unfortunately it turns out that this didn't do me any good. First, the compiled binary of my Grumpy-transpiled Python code also took about the same 0.05 of a second to start and run as my real Python code. Second, my code immediately failed because Grumpy has not fully implemented Python set()s; in particular, it doesn't have the .difference() method. This is not listed in their Missing features wiki page, but Grumpy is underdocumented in general.

(As a general note, Grumpy appears to be in a state of significant churn in how it operates and how you use it, which I suppose is not particularly surprising. You can find older articles on how to use Grumpy that clearly worked at the time but don't work any more.)

This whole experience has unfortunately left me much less interested in Grumpy. As it is today, Grumpy's clearly not ready for outside people to do anything with it, and even in the future it may well never be good at the kind of things I want it for. Building fast-starting and fast-running programs may not ever be a Grumpy priority. Grumpy is an interesting experiment and I wish Google the best of luck with it, but it clearly can't be my great hope for faster, lighter-weight Python programs.

My meta-view of Grumpy is that right now it feels like an internal Google (or Youtube) tool that Google just happens to be developing in a public repository for us to watch.

(In this particular case my fix was to hand-write a second version of the program in Go, which has been part irritating and part interesting. The Go version runs in essentially no time, as I wanted and hoped, so the slow startup of the Grumpy version is not intrinsic to either Go or the problem. My Go version will not be the canonical version of this program for local reasons, so I'll have to maintain it myself in sync with the official Python version for as long as I care enough to.)

Sidebar: Part of why Grumpy is probably slow (and awkward)

It's an interesting exercise to look at the Go code that grumpc generates. It's not anything like Go code as you'd conventionally write it; instead, it's much closer to CPython bytecode that has been turned into Go code. This faithfully implements the semantics of (C)Python, which is explicitly one of Grumpy's goals, but it means that Grumpy has a significant amount of overhead over a true Go solution in many situations.

(The transpiler may lower some Python types and expressions to more pure Go code under some circumstances, but scanning the generated output for my Python program suggests that this is uncommon to rare in the kind of code I write.)

Grumpy codes various Python types in pure Go code, but as I found with set, some of their implementations are incomplete. In fact, now that I look I can see that the only Go code in the entire project appears to be in those types, which generally correspond to things that are implemented in C in CPython. Everything else is generated by the transpiling process.

GrumpyBriefExperience written at 02:36:07; Add Comment

2017-08-05

I've been hit by the startup overhead of small programs in Python

I've written before about how I care about the resource usage and speed of short running programs. However, that care has basically been theoretical. I knew this was an issue in general and it worried me because we have short running Python programs, but it didn't impact me directly and our systems didn't seem to be suffering as a result of it. Even DWiki running as a CGI was merely kind of embarrassing.

Today, I turned a hacky personal shell script into a better done production ready version that I rewrote in Python. This worked fine and everything was great right up to the point where I discovered that I had made this script a critical path in invoking dmenu on my office workstation, which is something that I do a lot (partly because I have a very convenient key binding for it). The new Python version is not slow as such, but it is slower, and it turns out that I am very sensitive to even moderate startup delays with dmenu (partly because I type ahead, expecting dmenu to appear essentially instantly). With the old shell script version, this part of dmenu startup took around one to two hundredths of a second; with the new Python version, things now takes around a quarter of a second, which is enough lag to be perceptible and for my type-ahead to go awry.

(This assumes that my machine is unloaded, which is not always the case. Active CPU load, such as installing Ubuntu in a test VM, can make this worse. My dmenu setup actually runs this program five times to extract various information, so each individual run is taking about five hundredths of a second.)

Profiling and measuring short running Python programs is a bit challenging and I've wound up resorting to fairly crude tricks (such as just exiting from the program at strategic points). These tricks strongly suggest that almost all of the extra time is going simply to starting Python, with a significant amount of it spent importing the standard library modules I use (and all of the things that they import in turn). Simply getting to the quite early point where I call argparse's parse_args ArgumentParser method consumes almost all of the time on my desktop. My own code contributes relatively little to the slower execution (although not nothing), which unfortunately means that there's basically no point in trying to optimize it.

(On the one hand, this saves me time. On the other hand, optimizing Python code can be interesting.)

My inelegant workaround for now is to cache the information my program is producing, so I only have to run the program (and take the quarter second delay) when its configuration file changes; this seems to work okay and it's as least as fast as the old shell script version. I'm hopeful that I won't run into any other places where I'm using this program in a latency sensitive situation (and anyway, such situations are likely to have less latency since I'm probably only running it once).

In the longer run it would be nice to have some relatively general solution to pre-translate Python programs into some faster to start form. For my purposes with short running programs it's okay if the translated result has somewhat less efficient code, as long as it starts very fast and thus finishes fast for programs that only run briefly. The sort of obvious candidate is Google's grumpy project; unfortunately, I can't figure out how to make it convert and build programs instead of Python modules, although it's clearly possible somehow.

(My impression is that both grumpy and Cython have wound up focused on converting modules, not programs. Like PyPy, they may also be focusing on longer running CPU-intensive code.)

PS: The new version of the program is written in Python instead of shell because a non-hacky version of the job it's doing is more complicated than is sensible to implement in a shell script (it involves reading a configuration file, among other issues). It's written in Python instead of Go for multiple reasons, including that we've decided to standardize on only using a few languages for our tools and Go currently isn't one of them (I've mentioned this in a comment on this entry).

StartupOverheadProblem written at 00:59:49; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.