RPC is surprisingly expensive

April 24, 2007

Once, long ago, I did a distributed computing project course where I tried to speed up a drawing program by moving its expensive floating point calculations (I believe it needed to calculate square roots for something) to a machine with a much faster CPU and floating point. We used Sun's RPC/XDR for this because it was both a great match for what we were doing and the obvious choice at the time, among other reasons.

Rather to our surprise the drawing program just didn't get much faster, even in the most favorable situation we could contrive (running on a Sun 3/50 without a floating point unit and sending the calculations to a much faster Sun 3 with an FPU). At the time the project sort of fizzled out amidst head-scratching, but now I can see that what I should have done (and what would have made an interesting report) was to dive into figuring out why it wasn't going much faster.

(I am pretty sure that if I could rewind time I would find the finger of blame pointed squarely at the overhead of RPC/XDR marshalling and demarshalling, although other interesting things might have also come up.)

However, ever since then I have had it firmly imprinted on the back of my head that RPC can be surprisingly slow, and thus can be a source of hidden overhead in systems. Since all sorts of things involve 'RPC' in some form, even as general synchronous query/response message passing, this can be very useful to remember.
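As a rough illustration of that per-call overhead, here is a minimal sketch in Python. It is not the original program and does not use Sun RPC: it assumes a toy 'server' thread on the other end of a local socket pair and uses the struct module to encode doubles as 8-byte big-endian values, the way XDR does. All of the function names are made up for this example. It compares the average cost of a local square root against one that is marshalled, sent, answered, and demarshalled.

```python
import math
import socket
import struct
import threading
import time

DOUBLE = struct.Struct(">d")  # 8-byte big-endian IEEE double, as XDR encodes it


def recv_exact(sock, n):
    """Read exactly n bytes, or return b"" if the peer closed the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return b""
        buf += chunk
    return buf


def serve(sock):
    """Toy 'server': demarshal a double, take its square root, marshal the reply."""
    while True:
        request = recv_exact(sock, DOUBLE.size)
        if not request:
            break
        (x,) = DOUBLE.unpack(request)
        sock.sendall(DOUBLE.pack(math.sqrt(x)))
    sock.close()


def remote_sqrt(sock, x):
    """Client stub: marshal, send, block for the reply, demarshal."""
    sock.sendall(DOUBLE.pack(x))
    (result,) = DOUBLE.unpack(recv_exact(sock, DOUBLE.size))
    return result


def per_call_seconds(fn, values):
    """Average wall-clock seconds per call of fn over values."""
    start = time.perf_counter()
    for v in values:
        fn(v)
    return (time.perf_counter() - start) / len(values)


def compare(n=10000):
    """Return (local, remote) average seconds per square root."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server,), daemon=True)
    worker.start()
    values = [float(i) for i in range(1, n + 1)]
    try:
        local = per_call_seconds(math.sqrt, values)
        remote = per_call_seconds(lambda v: remote_sqrt(client, v), values)
    finally:
        client.close()  # the server thread sees EOF and exits
        worker.join(timeout=5)
    return local, remote


if __name__ == "__main__":
    local, remote = compare()
    print(f"local:  {local * 1e6:.2f} us/call")
    print(f"remote: {remote * 1e6:.2f} us/call")
```

There is no real network here, so whatever gap this shows comes from marshalling, system calls, and handing control back and forth between the two ends. A real RPC would add network latency on top, and the numbers will vary a lot by machine.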

For example, and what brought this to mind, consider the issue of memcached versus caching in SQL servers. Since normal SQL queries are already pretty time consuming, the wire format used to talk to the SQL server is probably designed more for things like system independence than for high-speed, low-overhead marshalling. By contrast, with memcached you can store data blobs in a format that is as close to your memory layout as possible, and thus get demarshalling overhead down very low.
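To make the contrast concrete, here is a small sketch in Python. It talks to neither memcached nor any SQL server, and neither encoding is a real wire protocol: the comma-separated decimal text is a made-up stand-in for a 'system independent' format, and the raw bytes of an array of C doubles stand in for a blob kept close to its memory layout. The function names are invented for the example.

```python
import array
import time


def encode_text(values):
    """Portable stand-in: decimal text, comma separated (a made-up format)."""
    return ",".join(repr(v) for v in values).encode("ascii")


def decode_text(blob):
    """Parse every field back into a float; the work scales with the text."""
    return [float(field) for field in blob.decode("ascii").split(",")]


def encode_native(values):
    """Native stand-in: the raw in-memory bytes of an array of C doubles."""
    return array.array("d", values).tobytes()


def decode_native(blob):
    """Copy the bytes straight back into an array; nothing is parsed."""
    out = array.array("d")
    out.frombytes(blob)
    return out


def seconds_per_decode(decode, blob, repeats=100):
    """Average wall-clock seconds for one decode of blob."""
    start = time.perf_counter()
    for _ in range(repeats):
        decode(blob)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    values = [i / 7.0 for i in range(10000)]
    text_blob = encode_text(values)
    native_blob = encode_native(values)
    print("text decode:  ", seconds_per_decode(decode_text, text_blob))
    print("native decode:", seconds_per_decode(decode_native, native_blob))
```

The native blob is only valid for readers with the same byte order and double size as the writer, which is exactly the system independence that the text form buys. Whether demarshalling is a significant share of the total cost in a real application is something only measurement there can settle.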

Comments on this page:

From at 2007-04-25 10:57:46:

I don't buy that marshalling/demarshalling is going to outweigh network latency as the source of any problem, at least without decently collected data. And while XDR isn't "the best" serialization protocol, it's not slow. ASN.1 is significantly more complicated, and I don't hear the LDAP/SNMP guys complaining about ASN.1 encoding speed overhead (complexity overhead, yes).

Personally I think a lot of RPC stuff fails because calling a function locally is vastly different than sending some data across the network and waiting for a response ... and RPC pretends they behave identically.

I also find it hard to believe the SQL problem is in the marshalling layer; it's not like they are turning everything into XML tokens or something. Much more likely are: 1) memcached doesn't have the complexity of an RDBMS. 2) memcached is often closer, in terms of the network, to the app. 3) memcached is closer, in terms of required usage, to the app.

James Antill - http://www.and.org/
By cks at 2007-04-25 21:33:20:

The program was trying to do the distribution over a local network, although it was 10 megabit/second Ethernet (which wasn't old back when we were trying this); at this point I can't remember if we tried timing null RPC operations to see how fast they could go. Unfortunately all I can do by now is speculate, since the environment where I could measure is long gone.

The 'network transparency is impossible in practice' failings of RPC are another issue, one that I agree with you on.

Written on 24 April 2007.

Last modified: Tue Apr 24 22:55:11 2007