The clarity drawback of allowing comparison functions for sorting
I've written before about my unhappiness that Python 3 dropped support for using a comparison function. Well, let me take that back a bit, because I've come around to the idea that there are some real drawbacks to supporting a comparison function here. Not drawbacks in performance (which are comparatively unimportant here) but drawbacks in code clarity.
DWiki's code is sufficiently old that it only uses .sort() with cmp functions simply because, well, that's what I had (or at least that's what I was used to). As a result, in two widely scattered spots in different functions its code base contains the following lines:
    def func1(...):
        ....
        dl.sort(lambda x,y: cmp(y.timestamp, x.timestamp))
        ....

    def func2(...):
        ....
        coms.sort(lambda x,y: cmp(x.time, y.time))
        ....
Apart from the field name, did you see the difference there? I didn't today while I was doing some modernization in DWiki's codebase and converted both of these to the '.sort(key=lambda x: x.FIELD)' form. The difference is that the first is a reverse sort, not a forward sort, because it flips x and y in the cmp().
(This code predates .sort() having a reverse= argument or at least my general awareness and use of it.)
And that's the drawback of allowing or using a sort comparison function: it's not as clear as directly saying what you mean. Small things in the comparison function can have big impacts and they're easy to overlook. By contrast, my intentions and what's going on are clearly spelled out when these things are rewritten into the modern form:
    dl.sort(key=lambda x: x.timestamp, reverse=True)
    coms.sort(key=lambda x: x.time)
Anyone, a future me included, is much less likely to miss the difference in sort order when reading (or skimming) this code.
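As a minimal sketch of how the two modern forms behave, here is a small self-contained example. The Entry type and its sample data are hypothetical stand-ins invented for illustration, not DWiki's real objects; only the .sort() calls mirror the ones above.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects being sorted; only the
# field name matters for this illustration.
Entry = namedtuple("Entry", ["name", "timestamp"])

dl = [Entry("a", 20), Entry("b", 10), Entry("c", 30)]

# Reverse (newest first) sort, with the direction stated outright.
dl.sort(key=lambda x: x.timestamp, reverse=True)
newest_first = [e.name for e in dl]

# Forward (oldest first) sort of the same data.
dl.sort(key=lambda x: x.timestamp)
oldest_first = [e.name for e in dl]

print(newest_first)
print(oldest_first)
```

The only thing that differs between the two calls is the explicit reverse=True, which is much harder to skim past than swapped argument names.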
I now feel that, for exactly this reason, in practice you want to avoid using a comparison function as much as possible even if one exists. Try very hard to directly say what you mean instead of hiding it inside your cmp function unless there's no way out. A direct corollary of this is that sorting interfaces should try to let you directly express as much as possible instead of forcing you to resort to tricks.
(Note that there are some cases where you must use a comparison function in some form (see especially the second comment).)
PS: I still disagree with Python 3 about removing the cmp argument entirely. It hasn't removed the ability to have custom sort functions; it's just forced you to write a lot more code to enable them and the result is probably even less efficient than before.
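For illustration, here is a rough sketch of what a custom comparison sort looks like in Python 3, using the standard library's functools.cmp_to_key. The Entry type, the sample data, and the newest_first name are hypothetical, and the cmp() helper has to be written by hand because Python 3 has no built-in cmp().

```python
from collections import namedtuple
from functools import cmp_to_key

# Hypothetical stand-in for the objects being sorted.
Entry = namedtuple("Entry", ["name", "timestamp"])

def cmp(a, b):
    # Python 3 has no built-in cmp(); this is the usual replacement.
    # It returns -1, 0, or 1.
    return (a > b) - (a < b)

def newest_first(x, y):
    # The same flipped-argument comparison as the old reverse sort.
    return cmp(y.timestamp, x.timestamp)

dl = [Entry("a", 20), Entry("b", 10), Entry("c", 30)]

# cmp_to_key wraps the comparison function in a key object so that
# .sort(key=...) can use it.
dl.sort(key=cmp_to_key(newest_first))
names = [e.name for e in dl]
print(names)
```

The comparison function still works, but it now needs an import, a hand-rolled cmp(), and a wrapper call, and the sort direction is just as buried as it was before.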