Wandering Thoughts archives

2017-08-11

Some notes from my brief experience with the Grumpy transpiler for Python

I've been keeping an eye on Google's Grumpy Python to Go transpiler more or less since it was introduced because it's always been my great white hope for speeding up my Python code more or less effortlessly (and I like Go). However, until recently I had never actually tried to do anything much with it because I didn't really have a problem that it looked like a good fit for. What changed is that I finally got hit by the startup overhead of small programs.

As mentioned in that entry, my initial attempts to use Grumpy weren't successful, because how to actually use Grumpy for anything beyond toys is basically not documented today. Because sometimes I'm stubborn, I kept banging my head against the wall for long enough until I hacked together how to bring up my program, which gave me the chance to get some real world results. Basically the process went like this:

  • build Grumpy from source following their 'method 2' process (using the Fedora 25 system version of Go, not my own build, because Grumpy very much didn't work with the latter).
  • have Grumpy translate my Python program into a module, which was possible because I'd kept it importable.
  • hack grumprun to not delete the Go source file it creates on the fly based on your input. grumprun is in Python, which makes this reasonably easy.
  • feed grumprun a Python program that was 'import mymodule; mymodule.main()' and grab the Go source code it generated (now that it wasn't deleting said source code afterward). This gave me a Go program that I could build into a binary that I could keep and then run with command line arguments.

Unfortunately it turns out that this didn't do me any good. First, the compiled binary of my Grumpy-transpiled Python code also took about the same 0.05 of a second to start and run as my real Python code. Second, my code immediately failed because Grumpy has not fully implemented Python set()s; in particular, it doesn't have the .difference() method. This is not listed in their Missing features wiki page, but Grumpy is underdocumented in general.

(As a general note, Grumpy appears to be in a state of significant churn in how it operates and how you use it, which I suppose is not particularly surprising. You can find older articles on how to use Grumpy that clearly worked at the time but don't work any more.)

This whole experience has unfortunately left me much less interested in Grumpy. As it is today, Grumpy's clearly not ready for outside people to do anything with it, and even in the future it may well never be good at the kind of things I want it for. Building fast-starting and fast-running programs may not ever be a Grumpy priority. Grumpy is an interesting experiment and I wish Google the best of luck with it, but it clearly can't be my great hope for faster, lighter-weight Python programs.

My meta-view of Grumpy is that right now it feels like an internal Google (or Youtube) tool that Google just happens to be developing in a public repository for us to watch.

(In this particular case my fix was to hand-write a second version of the program in Go, which has been part irritating and part interesting. The Go version runs in essentially no time, as I wanted and hoped, so the slow startup of the Grumpy version is not intrinsic to either Go or the problem. My Go version will not be the canonical version of this program for local reasons, so I'll have to maintain it myself in sync with the official Python version for as long as I care enough to.)

Sidebar: Part of why Grumpy is probably slow (and awkward)

It's an interesting exercise to look at the Go code that grumpc generates. It's not anything like Go code as you'd conventionally write it; instead, it's much closer to CPython bytecode that has been turned into Go code. This faithfully implements the semantics of (C)Python, which is explicitly one of Grumpy's goals, but it means that Grumpy has a significant amount of overhead over a true Go solution in many situations.

(The transpiler may lower some Python types and expressions to more pure Go code under some circumstances, but scanning the generated output for my Python program suggests that this is uncommon to rare in the kind of code I write.)

Grumpy codes various Python types in pure Go code, but as I found with set, some of their implementations are incomplete. In fact, now that I look I can see that the only Go code in the entire project appears to be in those types, which generally correspond to things that are implemented in C in CPython. Everything else is generated by the transpiling process.

python/GrumpyBriefExperience written at 02:36:07; Add Comment

Link: Linux Load Averages: Solving the Mystery

Brendan Gregg's Linux Load Averages: Solving the Mystery (via, and) is about both the definition and the history of load average calculations in Linux. Specifically:

Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics – but on Linux there's some mystery around them. Linux load averages track not just runnable tasks, but also tasks in the uninterruptible sleep state. Why? I've never seen an explanation. In this post I'll solve this mystery, and summarize load averages as a reference for everyone trying to interpret them.

In the process of doing this, Brendan Gregg goes back to TENEX (including its source code) for the more or less original load average. Then he chases down the kernel patch from October 1993 that changed Linux's load averages from purely based on the size of the run queue to including processes in disk wait. It goes on from there, including some great examples of how to break down a load average to see what's contributing what (using modern Linux tracing tools, which Gregg is an expert on). The whole thing is really impressive and worth reading.

(Gregg's discussion is focused on Linux alone. For a cross-Unix view, I've written entries on when the load average was added to Unix and the many load averages of different Unix strains. In the latter entry I confidently asserted that Linux's load average included 'disk wait' processes from the start, which Gregg's research has revealed to be wrong.)

links/LinuxLoadAveragesMystery written at 00:01:14; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.