Another advantage of Python builtins

July 28, 2008

I've talked before about the speed advantage that Python builtins have. But speed isn't the only way that Python privileges things written at the C level; as dict.setdefault() illustrates, Python makes a useful atomicity guarantee for them that it does not make for methods written in Python itself.

Does this guarantee matter? I think that it does, because it is simultaneously useful and cheap. A concurrent program can avoid locking when dealing with shared data built carefully from builtin types, and duplicating the effects of this in your own Python code would be fairly expensive, especially in a non-threaded program.
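
As a minimal sketch of what this looks like in practice (assuming CPython, plain string keys, and a hypothetical helper name, shared_setdefault, that is mine and not from any library), several threads can call dict.setdefault() on one shared dict with no lock, and each should come away with the same stored object:

```python
import threading


def shared_setdefault(n_threads=8):
    """Race several threads on dict.setdefault() without any locking.

    Hypothetical helper for illustration only. It leans on CPython's
    behavior for builtin dicts with string keys, not on a documented
    language guarantee.
    """
    shared = {}
    results = []
    barrier = threading.Barrier(n_threads)

    def worker():
        # Line the threads up so they hit setdefault() close together.
        barrier.wait()
        # Every thread offers its own fresh list; the dict keeps only one,
        # and setdefault() returns whichever list ended up stored.
        results.append(shared.setdefault("key", []))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared, results
```

If the atomicity holds, every entry in results is the very same list object as shared["key"]. The results.append() call is itself leaning on the same property of builtin lists.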

(Given the limitations of Python concurrency and thus that most Python programs aren't threaded, performing very well in a non-threaded environment is quite important in practice.)

This also illustrates once again that it can matter a lot to know how things are implemented. If you are writing a threaded program, knowing whether method calls on a shared data structure are concurrency-safe or need to be guarded with locks is vital. If a module's documentation gives you no information on thread safety (and few do), you really need to know how it is implemented, and a straightforward Python implementation of something is not at all equivalent to a C implementation of it.
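
To make the contrast concrete, here is a rough sketch of a setdefault() written in Python itself. The class and method names (PySetdefaultDict, setdefault_unlocked) are hypothetical and exist only for this illustration. The unlocked version is a check-then-act sequence spread over several bytecodes, so another thread can run in the middle of it; the safe version has to pay for a lock on every call, threaded program or not.

```python
import threading


class PySetdefaultDict:
    """Hypothetical mapping wrapper with setdefault() written in Python."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def setdefault_unlocked(self, key, default):
        # Not atomic: a thread switch can happen between the membership
        # test and the store, so two threads can both decide the key is
        # missing and each store (and return) a different default.
        if key not in self._data:
            self._data[key] = default
        return self._data[key]

    def setdefault(self, key, default):
        # The lock makes the whole check-then-store sequence one unit,
        # at the cost of acquiring and releasing it on every call.
        with self._lock:
            return self.setdefault_unlocked(key, default)
```

In a single thread both methods behave the same; the difference only shows up under concurrent callers, which is exactly why this kind of bug tends to survive light testing.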

(There are some tricky cases where a module is effectively implemented partly in C and partly in Python. Fortunately most such commingled modules seem to do relatively little in Python.)


Comments on this page:

From 65.102.131.198 at 2008-07-28 12:06:29:

Is there actually a guarantee in CPython that they're atomic, or is that a side-effect of implementation? For that matter, is it guaranteed that they won't be coded in Python in the future. High performance atomic variables (even if they're limited to a small set of types) are very useful, I'd just be slightly worried that a new release would change the semantics in a way that might not surface without very heavy testing. My paranoia about threading bugs is probably a bit much here and the trade-off (if there is one) seems like it might be worth it.

In terms of portability I really don't know what the state of current Python implementations is, so whether this changes it in a way that's worth caring about isn't clear to me. Any pointers?

- Kate

From 65.102.131.198 at 2008-07-28 12:16:13:

Hmm, I'm missing a question mark in there...

Just to clarify, I'm not as down on this as I may have sounded. I've both had to deal with the original Ruby interpreter's molasses-racing-glaciers threading and gone to great lengths to avoid Perl's initial chaos with threads (thankfully over), so some kind of fast and simple atomic variable is very appealing.

By cks at 2008-07-28 23:30:17:

The long answer is in BuiltinsConcurrencyGuarantee.

My personal answer is that it depends on whether one is a pragmatist or a legalist. The legalist answer is that no, there is no guarantee, because there is no statement in the documentation making the guarantee. The pragmatist answer is that yes, there is a guarantee, because the documentation is entirely silent and this is how CPython has behaved for a long time, so people have come to count on it.

(I am a pragmatist so I say 'yes', although with caveats covered in BuiltinsConcurrencyGuarantee.)

Written on 28 July 2008.


Last modified: Mon Jul 28 00:27:30 2008