Why things should be in Python's standard library

September 23, 2010

There's another advantage of having things in the standard library: if they're in the standard library, they'll get used. This is what you want when the standard library module is the right solution to its problem (and a terrible idea if the module is a bad solution), especially when there are worse alternatives.

There are a lot of problems that have a few good solutions and a lot of bad ones; a classic example is parsing XML (using regular expressions versus a real parser). If one of the good solutions is not in the standard library but you can build a bad solution from other standard library bits, it's pretty much guaranteed that lots of people will build versions of the bad solution, because people solve a lot of their problems using whatever tools the standard library gives them.
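To make the XML case concrete, here is a minimal sketch of both roads. The sample document and the two function names are invented for illustration; only `re` and `xml.etree.ElementTree` are real standard library modules. The regular expression version looks fine on tidy input, but it misses an element whose attribute sits on the next line and happily picks up one that has been commented out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample input: the second <item> is split across lines and
# the third is inside a comment.
DOC = """<items>
  <item name="a">one</item>
  <item
      name="b">two &amp; three</item>
  <!-- <item name="c">commented out</item> -->
</items>"""


def names_with_regex(text):
    """The tempting approach: pattern-match on the raw text."""
    return re.findall(r'<item name="([^"]*)">', text)


def names_with_parser(text):
    """The real-parser approach using the standard library."""
    root = ET.fromstring(text)
    return [item.get("name") for item in root.iter("item")]


if __name__ == "__main__":
    print("regex: ", names_with_regex(DOC))
    print("parser:", names_with_parser(DOC))
```

On this input the regular expression reports `a` and `c`, while the parser reports `a` and `b`. The two versions are about the same length, which is the point: when the right tool is already in the standard library, reaching for it costs nothing extra.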

(Some of the people will start down the bad road because they don't know any better and don't realize how much pain they're getting themselves into. Some will do it because it works well enough for their current needs and it is the fast way to solve their problem.)

For a sufficiently common problem, especially if the wrong ways are sufficiently unproductive (or are sufficiently bad ideas), the conclusion is that you want an implementation of the right way to be in the standard library so that people will use it instead of slapping together yet another cringe-inducing implementation of a bad solution.

(If this is an obscure problem, well, you can't put everything in the standard library.)

One short way of putting this is that the standard library should have good implementations of common things that are (too) easy to get wrong.

(By the way, it does no good to say that people should not be so foolish as to tackle these problems the wrong way. This may be technically correct, but it does not solve the social problem of getting people to produce good code.)

Sidebar: the drawback of complexity

I cheated a bit. It's not enough that the standard library module be the right way to solve its problem; that alone won't get people to use it over the alternatives. It must be the easiest way to solve its problem (or at least look like it). There are two sides to this: the amount of code you have to write with and without the module, and the amount of mental work you need to do in order to put together a solution, i.e. the module's complexity.

The module can usually win on the amount of code you have to write (if it can't, it has serious problems). Some standard library modules have not been too successful on the complexity front, though.

(Examples can help. Especially examples of doing simple things.)


Comments on this page:

From 82.152.15.113 at 2010-09-23 04:46:53:

Generally, I agree with you about whether things should be in the standard library (stdlib) rather than live as third-party packages. I believe that infrastructure-type functionality (i.e. of the sort which will be used by a lot of libraries, rather than just applications) definitely belongs in the stdlib; anything else leads to unnecessary pain when integrating multiple libraries together.

Disclosure: I'm the maintainer of the logging package in the stdlib, but that's not the only reason I also like things to be in the stdlib - "batteries included" has worked very well for me, especially when I was starting out with Python.

A couple of things you said particularly caught my eye:

It must be the easiest way to solve its problem (or at least look like it).

and

Some standard library modules have not been too successful on the complexity front, though.

The logging package comes in for some pretty bad press as regards the second point, though I think I have tried reasonably hard to achieve the quality expressed in the first point.

In trying to understand why people perceive the logging package as complex and hard to use when it is fairly thoroughly documented and demonstrably reasonably easy to use, I am coming to the conclusion that some people are strongly swayed by aesthetic rather than practical considerations. Those people have decided that the stdlib way is not the way for them, on aesthetic grounds, even though from a functional and ease-of-use viewpoint the component provided is quite fit for purpose. (In some cases, the Not-Invented-Here mentality also plays into their world-view.)

If people feel that any stdlib component is sub-par for whatever reason, they should feel free to engage with the relevant maintainers to put their points of view across - in comp.lang.python or via bugs.python.org - because while not every incoming suggestion will be accepted, there will be a fair number which will be, leading to an improved stdlib for all.

From 207.61.230.154 at 2010-09-23 09:42:36:

I think the multiprocessing module is an example of what you're talking about: a module that was included in the stdlib after everyone was cobbling together their own (often inferior) solutions for the problem domain.

Written on 23 September 2010.
