Remembering that Python lists can use tuples as the sort keys

July 10, 2018

I was recently moving some old Python 2 code to Python 3 (due to a recent decision). This particular code is sufficiently old that it has (or had) a number of my old Python code habits, and in particular it made repeated use of list .sort() with comparison functions. Python 3 doesn't support this; instead you have to tell .sort() what key to use to sort the list. For a lot of the code the conversion was straightforward and obvious because it was just using a field from the object as the sort key. Then I hit a comparison function that looked like this:

def _pricmp(a, b):
  apri = a.prio or sys.maxint
  bpri = b.prio or sys.maxint
  if apri != bpri:
      return cmp(apri, bpri)
  return cmp(a.totbytes, b.totbytes)

I stared at this with a sinking feeling, because this comparison function wasn't just picking a field, it was expressing logic. Losing complex comparison logic is a long standing concern of mine, so I was worried that I'd finally run into a situation where I would be forced into unpleasant hacks.

Then I remembered something obvious: Python supports sorting on tuples, not just single objects. Sorting on tuples compares the two tuples field by field, so you can easily implement the same sort of tie-breaking secondary comparison that I was doing in _pricmp. So I wrote a simple function to generate the tuple of key fields:

def _prikey(a):
  apri = a.prio or sys.maxint
  return (apri, a.totbytes)

Unsurprisingly, this just worked (including the tie-breaking, which actually comes up fairly often in this particular comparison). It's probably even somewhat clearer, and it certainly avoids some potential comparison function mistakes

(It's also shorter, but that's not necessarily a good thing.)

PS: Python has supported sorting tuples for a long time but I don't usually think about it, so things had to swirl around in my head for a bit before the light dawned about how to solve my issue. There's a certain mental shift that you need to go from 'the key= function retrieves the key field' to 'the key= function creates the sort key, but it's usually a plain field value'.

Written on 10 July 2018.
« TLS Certificate Authorities and 'trust'
Some thoughts on performance shifts in moving from an iSCSI SAN to local SSDs »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Tue Jul 10 00:43:25 2018
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.