Why I don't like Python 3 dropping the comparison function for sorting
One of the changes that Python 3 has made is that, to quote the documentation:
builtin.sorted() and list.sort() no longer accept the cmp argument providing a comparison function. Use the key argument instead. [...]
I feel unreasonably annoyed about this change. At least on the surface there's no obvious reason why; basically all of the uses of a comparison function I've ever used are to pick a specific field out, and that's handled much better by the key argument. However, I've recently figured out what irritates me about this: it couples data and behavior too closely.
In the new world, there are three ways to create a sort ordering. If
your ordering depends on explicit fields (possibly modified), you can
use a straightforward key function. If the ordering of a data element
is strictly computable from a single element (for example, a 'distance'
metric that's easy to determine), you can use a key function which
synthetically computes an element's ordering and returns it. And if
neither of these holds and you can only really determine a relative
ordering, you can define a __lt__
method on your objects.
The problem with the last approach is that, of course, you can only have
one __lt__
method and thus only one sort ordering. What's happened
is that you've been forced to couple the raw data with the behavior of
a particular sort ordering. Getting around this requires various hacks,
such as synthetic wrapper objects with different __lt__
functions.
(The other problem is that your data needs to be actual objects. While this is usually the case for anything complex enough that you only can do a relative ordering, sometimes you're getting the data from an outside source and it would be handy to leave it in its native form.)
While this is only a theoretical concern for me, it still irritates me a bit that Python 3 has chosen to move towards closer, less flexible coupling between data and ordering. I maintain that the two are separate and we can see this in the fact that there are many possible orderings for complex data depending on what you want to do with it.
By the way, I can see several reasons why Python 3 did this and I sympathize with them (even if I still don't like dropping cmp). The Python 3 documentation notes that key is more efficient since it's called only once per object you're sorting. On top of that, it's relatively easy to make mistakes with complex cmp functions that create inconsistent ordering, which potentially causes sorting algorithms to malfunction mysteriously.
Comments on this page:
|
|