2017-08-11
Some notes from my brief experience with the Grumpy transpiler for Python
I've been keeping an eye on Google's Grumpy Python to Go transpiler more or less since it was introduced because it's always been my great white hope for speeding up my Python code more or less effortlessly (and I like Go). However, until recently I had never actually tried to do anything much with it because I didn't really have a problem that it looked like a good fit for. What changed is that I finally got hit by the startup overhead of small programs.
As mentioned in that entry, my initial attempts to use Grumpy weren't successful, because how to actually use Grumpy for anything beyond toys is basically not documented today. Because sometimes I'm stubborn, I kept banging my head against the wall for long enough until I hacked together how to bring up my program, which gave me the chance to get some real world results. Basically the process went like this:
- build Grumpy from source following their 'method 2' process (using the Fedora 25 system version of Go, not my own build, because Grumpy very much didn't work with the latter).
- have Grumpy translate my Python program into a module, which
was possible because I'd kept it
import
able. - hack
grumprun
to not delete the Go source file it creates on the fly based on your input.grumprun
is in Python, which makes this reasonably easy. - feed
grumprun
a Python program that was 'import mymodule; mymodule.main()
' and grab the Go source code it generated (now that it wasn't deleting said source code afterward). This gave me a Go program that I could build into a binary that I could keep and then run with command line arguments.
Unfortunately it turns out that this didn't do me any good. First, the
compiled binary of my Grumpy-transpiled Python code also took about
the same 0.05 of a second to start and run as my real Python code.
Second, my code immediately failed because Grumpy has not fully
implemented Python set()
s; in particular, it doesn't have the
.difference()
method. This is not listed in their Missing
features
wiki page, but Grumpy is underdocumented in general.
(As a general note, Grumpy appears to be in a state of significant churn in how it operates and how you use it, which I suppose is not particularly surprising. You can find older articles on how to use Grumpy that clearly worked at the time but don't work any more.)
This whole experience has unfortunately left me much less interested in Grumpy. As it is today, Grumpy's clearly not ready for outside people to do anything with it, and even in the future it may well never be good at the kind of things I want it for. Building fast-starting and fast-running programs may not ever be a Grumpy priority. Grumpy is an interesting experiment and I wish Google the best of luck with it, but it clearly can't be my great hope for faster, lighter-weight Python programs.
My meta-view of Grumpy is that right now it feels like an internal Google (or Youtube) tool that Google just happens to be developing in a public repository for us to watch.
(In this particular case my fix was to hand-write a second version of the program in Go, which has been part irritating and part interesting. The Go version runs in essentially no time, as I wanted and hoped, so the slow startup of the Grumpy version is not intrinsic to either Go or the problem. My Go version will not be the canonical version of this program for local reasons, so I'll have to maintain it myself in sync with the official Python version for as long as I care enough to.)
Sidebar: Part of why Grumpy is probably slow (and awkward)
It's an interesting exercise to look at the Go code that grumpc
generates. It's not anything like Go code as you'd conventionally
write it; instead, it's much closer to CPython bytecode that has been turned into Go code. This
faithfully implements the semantics of (C)Python, which is explicitly
one of Grumpy's goals, but it means that Grumpy has a significant
amount of overhead over a true Go solution in many situations.
(The transpiler may lower some Python types and expressions to more pure Go code under some circumstances, but scanning the generated output for my Python program suggests that this is uncommon to rare in the kind of code I write.)
Grumpy codes various Python types in pure Go
code, but as I found with set
, some of their implementations are
incomplete. In fact, now that I look I can see that the only Go
code in the entire project appears to be in those types, which
generally correspond to things that are implemented in C in CPython.
Everything else is generated by the transpiling process.
Link: Linux Load Averages: Solving the Mystery
Brendan Gregg's Linux Load Averages: Solving the Mystery (via, and) is about both the definition and the history of load average calculations in Linux. Specifically:
Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics – but on Linux there's some mystery around them. Linux load averages track not just runnable tasks, but also tasks in the uninterruptible sleep state. Why? I've never seen an explanation. In this post I'll solve this mystery, and summarize load averages as a reference for everyone trying to interpret them.
In the process of doing this, Brendan Gregg goes back to TENEX (including its source code) for the more or less original load average. Then he chases down the kernel patch from October 1993 that changed Linux's load averages from purely based on the size of the run queue to including processes in disk wait. It goes on from there, including some great examples of how to break down a load average to see what's contributing what (using modern Linux tracing tools, which Gregg is an expert on). The whole thing is really impressive and worth reading.
(Gregg's discussion is focused on Linux alone. For a cross-Unix view, I've written entries on when the load average was added to Unix and the many load averages of different Unix strains. In the latter entry I confidently asserted that Linux's load average included 'disk wait' processes from the start, which Gregg's research has revealed to be wrong.)