Wandering Thoughts archives

2012-11-25

More thoughts on why Python doesn't see much monkey-patching

In yesterday's entry I advanced the idea that a significant part of why Python doesn't see much monkey patching (and Ruby does) is that you can't monkey patch a lot of fundamental Python classes and types because they're implemented in C. In sober hindsight, I think that I'm understating two intangibles (one of which I touched on in passing) in favour of a technical explanation.

First, I think that the effects of culture matter more than I initially thought. Culture plays a strong role in shaping how we write code in a language, both explicitly, in what people tell you to do and not do, and implicitly, in things like what tutorials present and what approaches they take to solve problems. To invert the saying that you can write Fortran in any language, the reality is that people don't, and culture is a good part of why.

Second is the issue of syntax. Ruby has a clear and direct syntax for adding your own methods to outside classes, while Python does not. In Python you have to go out of your way to modify a class after it's been created, and the syntax is at least somewhat awkward and indirect. It's probably not longer (in lines) than the Ruby version, but it's certainly less direct and obvious. Since I'm a big believer that syntax matters, I think that this can't help but have an effect in the two languages. I don't know if Ruby's syntax steers people towards monkey patching, but how it's both non-obvious and a pain in Python has to steer people away from it.

Sidebar: the syntax in each language

In Ruby, monkey patching looks like this:

class SomeClass
  def newfunction
    [whatever]
  end
end

(I believe this is correct; I'm taking it from online examples, since I'm not a Ruby programmer.)

I think that this is the exact same Ruby syntax as you use when defining a full class yourself.

In Python, the equivalent is:

def newfunction(self, ...):
  [whatever]
SomeClass.newfunction = newfunction

This is shorter but less direct, involves more repetition, and doesn't clearly mark the new function as purely a method of the class (to be really equivalent to the Ruby version we should do 'del newfunction' to clean it out of our namespace). Full-scale lambdas would make this slightly better because then you wouldn't need to define the function separately and then glue it on to the class, but it would still look nothing like the way you define classes themselves.
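To make the Python pattern concrete, here is a minimal runnable sketch. The class Greeter and the method shout are made-up names used purely for illustration; Greeter stands in for a class that some other module defined.

```python
class Greeter:
    """Stand-in for a class defined elsewhere (hypothetical name)."""

    def __init__(self, name):
        self.name = name


def shout(self):
    # An ordinary function; it only acts as a method once it is
    # attached to the class below.
    return "HELLO, " + self.name.upper() + "!"


Greeter.shout = shout  # the actual monkey patch
del shout              # optional: remove the helper name from our namespace

result = Greeter("world").shout()
```

The patch applies to every instance of the class, including ones created before the assignment, because method lookup goes through the class at call time.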

MonkeyPatchingIntangibles written at 22:45:42

The limits of monkey patching in Python

One of the things I find interesting is the question of why monkey-patching is a common thing in the Ruby world but not in the Python world. Certainly a significant part of this is cultural (Python culture is very much against monkey patching, Ruby culture seems to be completely accepting of it), but it's hard for me to believe that that's the whole story. People do things to solve problems and I don't believe that Python is magically without the problems that cause people to monkey-patch Ruby. I've recently come up with not so much a theory as an observation on this.

One of the things that people do with Ruby monkey-patching is adding convenience methods to core Ruby classes, things like strings and arrays. You can't do this in Python, because Python monkey-patching has a fundamental limit: you can't monkey-patch anything written in C. Well, you can't monkey-patch it unless it went out of its way to let you, and most modules written in C don't.
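As a small sketch of what this limit looks like in practice (assuming CPython; the method name shout and the subclass MyStr are hypothetical), assigning a new attribute to a C-implemented type such as str raises TypeError, while a subclass defined in Python accepts the same assignment:

```python
def shout(self):
    # Hypothetical convenience method we would like every string to have.
    return self.upper() + "!"


try:
    str.shout = shout  # attempt to patch a builtin type implemented in C
    patched_builtin = True
except TypeError:
    # CPython refuses to set attributes on builtin/extension types.
    patched_builtin = False


class MyStr(str):
    """A subclass written in Python, which can be patched normally."""


MyStr.shout = shout
subclass_result = MyStr("hi").shout()
```

The subclass is not a real substitute for the Ruby approach, since string literals and strings returned by other code are still plain str objects without the new method.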

There are two aspects to this. To start with, a fair number of the interesting Python classes and modules are written in C, including (of course) all of the fundamental types. These are all de facto sealed from modification, and given that Python's culture is against monkey-patching it's unlikely that a proposal to change that would be accepted (to put it one way). This means that you simply can't do a fair amount of the monkey patching that happens in Ruby.

More generally, that C-level things can't be monkey patched makes it at least somewhat dangerous to patch anything that might plausibly be turned into a C module even if it isn't one right now. This is probably especially likely to happen to popular data structures (any number of which have made journeys from Python to C). If such a transition does happen, your monkey patching will immediately fail and you'll have to find another way.

(I don't know if this really acts as a disincentive, though, because I'm not sure that people who might monkey patch such things are aware of this issue.)

MonkeyPatchingLimitation written at 03:38:21

2012-11-19

Python 3's print() annoys me (although maybe it shouldn't)

One of a number of changes in Python 3 that I have a visceral unhappy reaction to is the replacement of the print statement by the print() function. On the one hand I sort of understand the 'computer science' perspective on changing it to a function; it moves Python's explicit statements closer to being purely intrinsic language operators. On the other hand, I just don't like it.

I think that part of the reason I don't like it is that it remains more or less magic while hiding that magic. When print was a statement, it was clear that it was pretty unusual and special. Turning print() into a function does not really make it any less magical (or central), but it sort of pretends otherwise in my eyes; the magic is now hidden behind a function call. Part of this feeling probably comes because print() is a built-in function, which makes it specially privileged; I'd consider sys.print() to be less magical (although more stupid; printing things is a pretty common and important operation, so short names for it matter).

On a pragmatic level, though, I'm clearly wrong; Python 3 print() is significantly less magical than Python 2 print, at least in CPython. CPython 2 has special bytecodes for printing things and print compiles down to them, while in Python 3 uses of print() compile to ordinary (global) function calls (and you can even redefine print() yourself if you want to be perverse). The arguments and special behaviors are also more regular and more easily discoverable (as a function, print() has in-Python documentation that you can see with help()).
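A short Python 3 sketch of these points follows. The helper name run_with_replacement is made up for illustration; the opcode name PRINT_ITEM is one of the CPython 2 printing bytecodes, and the check simply confirms that the running interpreter has no such opcode.

```python
import builtins
import dis
import io

# print() is an ordinary callable with regular keyword arguments.
buf = io.StringIO()
builtins.print("a", "b", sep="-", end="!\n", file=buf)
formatted = buf.getvalue()


def run_with_replacement():
    """Shadow print() with our own function inside this scope."""
    captured = []

    def print(*args, **kwargs):
        # Record the text instead of writing it anywhere.
        captured.append(" ".join(str(a) for a in args))

    print("hello", 42)  # calls the replacement, not the builtin
    return captured


# Python 3 has no dedicated bytecode for printing values with print.
has_print_opcode = "PRINT_ITEM" in dis.opmap
```

Running help(print) in an interactive session shows the same keyword arguments (sep, end, file) that are used above.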

(That print() is a real function means that code making heavy use of it can apply the usual sorts of optimizations, and also that it's intrinsically somewhat slower than the Python 2 version, since it necessarily involves a function name lookup and a function call. It's unlikely that this will matter to any real code, but you never know.)

PS: I know, I know. Even Python 2 has plenty of magical functions in the global namespace, things like type() or str(). I don't claim that my annoyance at print() is rational, and I probably wouldn't be annoyed if Python had always had a print() function. Which implies that part of my annoyance may be due to what I see as a basically pointless change of a statement into a function, one that forces a whole bunch of noisy code changes for no really compelling reason. (There is an entire rant about language changes without strong reasons that could go here.)

Also, I just tend to think that print reads better because it stands out more than yet another function call.

Sidebar: why C's printf() doesn't irritate me similarly

The simple version is that printf() is clearly a library function. A large part of the reason that this works in C and doesn't work in Python is namespaces, in that Python has them and C doesn't. In C everyone can dump functions into what in Python is the global namespace; in Python, modules cannot do this, which makes builtin names like print() special. Another reason this works in C is that producing output is clearly out of scope of the core C language, which is very limited and confined (it doesn't even have dynamic memory allocation).

Python3PrintAnnoyance written at 02:39:11

2012-11-11

A reminder: string concatenation really is string concatenation

Once upon a time when I was starting to write Python, I scribbled down the following code:

def warn(s):
  sys.stderr.write(sys.argv[0] + ": " + s + "\n")

(More or less. My actual code had an error and so didn't even work.)

Many Python programmers are wincing, because of course string concatenation is both somewhat inefficient and not the idiomatic way to do this; you should be using % string formatting. But there's another somewhat more subtle reason to avoid code like this, one that I ran into recently when I stumbled over this code the hard way by having it blow up in my face.

The surrounding code went something like this:

try:
  o, r = getopt.getopt(....)
except getopt.error, cause:
  warn(cause)
  ....

This failed. You see, the subtle problem with string concatenation is that it really is string concatenation. Unlike % formatting, it will not try to str() objects to convert them to strings; if they are not strings already, it just fails. It is of course easy to overlook this if you usually give your code actual strings; passing in a non-string object that can be stringified may be an uncommon corner case that you don't test explicitly.
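Here is a minimal sketch of the difference, written for Python 3 rather than the Python 2 of the original code. The function names warn_concat and warn_format are made up for illustration, and they return the message instead of writing it to sys.stderr so that the result is easy to inspect.

```python
import getopt


def warn_concat(prog, s):
    # Plain concatenation: works only if s is already a string.
    return prog + ": " + s + "\n"


def warn_format(prog, s):
    # %s formatting calls str() on its arguments for us.
    return "%s: %s\n" % (prog, s)


# Provoke a real getopt exception object with an unknown long option.
try:
    getopt.getopt(["--bogus"], "", [])
except getopt.GetoptError as cause:
    err = cause

try:
    warn_concat("prog", err)
    concat_failed = False
except TypeError:
    # str + exception object is not allowed.
    concat_failed = True

message = warn_format("prog", err)
```

Passing the arguments to % as an explicit tuple also sidesteps a separate surprise where a lone tuple argument is unpacked as multiple format values.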

This code actually exposes an interesting effect of Python's slow changes between Python 1.x and Python 2. Back in the old days exceptions actually were strings instead of objects that can be stringified, and so this code could work when fed one of those exceptions. I wrote the program this code appears in back in 2003 or earlier and I believe we were still using Python 1.5 at the time (although it wasn't the current version even then); the 1.5.2 version of the getopt module appears to still have been using string exceptions. So this might have been less crazy back then than it appears now (although it was still the wrong way to do it, plus my actual implementation had a bug).

StringConcatIsStringConcat written at 01:56:48



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.