Ramblings on handling optional arguments in Python

July 14, 2011

Pete Zaitcev:

Every time I think switching to Ruby, smth pops like [Detecting unspecified method arguments in Ruby]

This gives me a platform to spring off from. Python has the same 'problem' as Ruby here, ultimately because their language designers have picked the same solutions to the problem of optional arguments for functions. Roughly speaking, I can think of three ways of handling optional arguments in a language: declare them explicitly, use default values, or explicitly handle at least some of the argument list yourself.

A language that lets you explicitly declare optional arguments can also support an explicit and direct check to see which optional arguments were supplied. I believe that Lisps have historically supported this approach. However, this is extra syntax and an extra language feature; both Python and Ruby have opted not to have this and to only support optional function arguments indirectly, through either of the other two approaches.

Handling part of the argument list yourself leaves you to decode it into actual variables (present and absent). This is annoying, so most people wind up using arguments with default values; essentially this pre-decodes the optional portion of the argument list for you. But it does leave you with a problem, one that Lisp people note explicitly in their documentation: since you're using the argument's default value as a signal that the caller didn't supply it, you need a way of distinguishing between the argument not being supplied and your caller supplying a value that happens to also be your 'argument was not supplied' default value.
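As a rough sketch of the difference between the last two approaches, here is the same optional argument handled both ways. The function names and the default of 80 are made up for illustration; they are not from any real API.

```python
# Hypothetical functions for illustration only.

def connect_manual(host, *args):
    """Decode the optional part of the argument list by hand."""
    if len(args) > 1:
        raise TypeError("connect_manual() takes at most 2 arguments")
    if args:
        port, port_given = args[0], True
    else:
        port, port_given = 80, False
    return host, port, port_given


def connect_default(host, port=80):
    """Let a default value pre-decode the optional argument."""
    # Inside the function, connect_default("h") and connect_default("h", 80)
    # look exactly the same.
    return host, port


print(connect_manual("example.org"))      # ('example.org', 80, False)
print(connect_manual("example.org", 80))  # ('example.org', 80, True)
print(connect_default("example.org") == connect_default("example.org", 80))  # True
```

The hand-decoded version knows whether the caller supplied the port but pays for it in boilerplate; the default-value version is shorter but has lost that information.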

This is a specific instance of a general situation where you need a sentinel value that can be distinguished from all valid values. Since Python is dynamically typed, often the simplest way to get such a value is to create one yourself and the simplest way to do that is just to create a new instance of object so that you get a unique value:

no_arg = object()  # a unique sentinel that no caller can supply by accident
def optarg(a, b=no_arg):
  if b is no_arg:
    # the caller did not supply b
    ...

(You might as well use is here, because this is one of the rare cases where you really do want object identity instead of object equality.)
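To illustrate why identity is the safer test, here is a small sketch with a deliberately pathological class (hypothetical, invented for this example) whose instances claim to be equal to everything:

```python
class AgreesWithEverything(object):
    """Hypothetical class whose instances compare equal to anything."""
    def __eq__(self, other):
        return True


no_arg = object()


def was_supplied_eq(b=no_arg):
    # == consults b's own __eq__ method, so the argument can mislead us.
    return not (b == no_arg)


def was_supplied_is(b=no_arg):
    # 'is' compares object identity, which no class can override.
    return b is not no_arg


value = AgreesWithEverything()
print(was_supplied_eq(value))  # False: wrongly treated as 'not supplied'
print(was_supplied_is(value))  # True
```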

You don't need to create your own new sentinel value if you can come up with a convenient existing value that no caller will ever supply (perhaps because it's invalid). None is a popular choice for this role, although I tend to think it's not an ideal option.

(The problem with None as a sentinel value is that it's easy for a None value to creep into your program through various bugs, oversights, or just other functions that return None under various unusual circumstances. The result is a peculiarly hard-to-spot situation where how the code reads is not how the code actually works; you think that you're calling optarg() with two arguments because that's what's in the source code, but in actuality you're calling it with one. If the effects of this are indirect and only become visible much later, this could be quite a head-scratcher of a bug.)
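A minimal sketch of how this can happen, assuming a made-up settings dictionary and two toy functions that simply report how many arguments they think they received:

```python
no_arg = object()


def count_args_none(a, b=None):
    # None doubles as the 'b was not supplied' signal.
    return 1 if b is None else 2


def count_args_sentinel(a, b=no_arg):
    # Only the private sentinel means 'b was not supplied'.
    return 1 if b is no_arg else 2


settings = {"colour": "red"}   # hypothetical data
size = settings.get("size")    # the key is missing, so this is None

# Both calls are written with two arguments in the source.
print(count_args_none("x", size))      # 1
print(count_args_sentinel("x", size))  # 2
```

With the private sentinel, the stray None is at least passed through as a real value, so any resulting failure happens close to where the None was used.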

If you use this idiom a lot, it might be worthwhile to create a function so that you can use a clear name:

def unique_value():
  return object()

no_arg = unique_value()

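(Trivia: there is a tiny reason to make this a function instead of a subclass of object: a bare object() instance has no instance dictionary, while instances of an ordinary subclass get one unless the class defines __slots__.)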

These sentinel values are not completely ideal; for example, they are not self-documenting if you display them during debugging. But doing better requires more verbosity and repetition, at least in Python.
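One possible shape for a more self-documenting sentinel is sketched below. The Sentinel class is hypothetical, not something from the standard library, and it shows the cost: more code, and the name has to be written twice.

```python
class Sentinel(object):
    """Hypothetical helper: a unique value that prints its own name."""
    __slots__ = ("_name",)  # avoid a per-instance __dict__

    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return "<%s>" % self._name


no_arg = Sentinel("no_arg")  # the name is repeated here


def optarg(a, b=no_arg):
    return "b omitted" if b is no_arg else "b supplied"


print(repr(no_arg))     # <no_arg>
print(optarg(1))        # b omitted
print(optarg(1, None))  # b supplied
```

Because the class does not define __eq__, instances still compare by identity, so two sentinels created with the same name remain distinct.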

(This is one of the cases where it would be nice to have a real built-in and fully supported symbol type, and yes, syntactic sugar does matter.)

(This issue came up in passing before in DefaultArgumentsTrick.)

Sidebar: on unique values in Python

When I say 'unique value', I mean 'some object that we can reliably distinguish from all other objects in the Python universe'. Note that certain sorts of built in values (and thus objects, because in Python everything is an object) that you might think are unique are not in fact unique in CPython, because the CPython interpreter plays tricks behind your back. A full discussion of these tricks is an entry in and of itself, but the short version is that you're safe from them if you use a mutable type. Instances of object are not really mutable in a conventional sense (as I discovered and then much later figured out why), but they're close enough for this.
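A small sketch of the distinction, using only built-in types. The last line deliberately has no stated result, because whether equal immutable values share one object is an implementation detail:

```python
a = object()
b = object()
print(a is b)          # False: each object() call makes a distinct object

x = []
y = []
print(x == y, x is y)  # True False: equal lists, but separate objects

# For immutable values the interpreter is free to reuse a single object,
# so do not rely on the outcome of this identity test.
s = "".join(["no", "_arg"])
t = "no_arg"
print(s == t)          # True
print(s is t)          # may be True or False
```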


Comments on this page:

From 24.227.244.203 at 2011-07-14 12:01:56:

A tiny reason? Not creating instance dicts is a HUGE reason if you'll be using lots of these.

By cks at 2011-07-14 12:13:00:

You don't have to create instance dictionaries in a subclass, but that means that you have to use __slots__:

class unique_value(object):
  __slots__ = ()

Subclassing object does have the advantage that your unique values will at least give some clue to what they are when you print them during debugging.

From 82.247.112.90 at 2011-07-15 06:20:42:

Interesting point, about using a special no_arg value because None is so ubiquitous that it can mistakenly creep up in the code. Thanks!

From 65.172.155.230 at 2011-07-15 11:23:12:

The problem with doing this, instead of using None, is that you'll then get into situations where you have to call the function indirectly and do both cases. So instead of doing:

   def foo(blah, a, b, c):
     # blah
     ret = bar(a, b, c)
     # blah

...you'll now need to do:

   def foo(a, b, use_c=False, c=None):
     # blah
     if use_c:
         ret = bar(a, b, c)
     else:
         ret = bar(a, b)
     # blah

...and this is just for a single argument. As soon as you hit 2 arguments you'll want to hurt yourself, more than that and you'll probably start doing stuff to find the "no_object" values.

Written on 14 July 2011.

Last modified: Thu Jul 14 02:11:47 2011