Two sorts of languages

February 27, 2008

Here is a thesis:

There are two sorts of languages: languages where writing your own code is (usually) almost as fast as using the builtin features, and languages where the builtins are an order of magnitude faster than doing the same work in your own code.

C is an example of the first sort of language; it is merely silly, not laughably absurd, to reimplement your own version of strchr(). Python is an example of the second sort of language; it's utterly stupid to write your own version of the builtin string .find() method.

(There are degrees of this, depending on how much slower attempts to reimplement the builtins wind up.)
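As a rough illustration of the Python side of this, here is a minimal sketch that times a hand-written substring search against the builtin .find() method. The names naive_find and compare are made up for this example, and the size of the gap it reports will vary with the interpreter, the machine, and the input.

```python
import timeit


def naive_find(haystack: str, needle: str) -> int:
    """Hand-written stand-in for str.find(): index of the first match, or -1."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        # Compare one window of the haystack against the needle.
        if haystack[i:i + m] == needle:
            return i
    return -1


def compare(haystack: str, needle: str, number: int = 5):
    """Return (builtin_seconds, naive_seconds) for `number` repeated searches."""
    builtin = timeit.timeit(lambda: haystack.find(needle), number=number)
    naive = timeit.timeit(lambda: naive_find(haystack, needle), number=number)
    return builtin, naive


if __name__ == "__main__":
    # Hypothetical test input: a long run of filler with the match at the end.
    text = "a" * 100_000 + "needle"
    builtin, naive = compare(text, "needle")
    print(f"builtin .find(): {builtin:.6f}s")
    print(f"naive_find():    {naive:.6f}s")
```

Both functions return the same answers; only the running time differs, and that difference is the whole point.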

This difference matters a lot for how you write programs. In the second sort of language, the way to make things fast is to use the builtins as much as possible by finding some way to express your problem in their terms; programs often spend a lot of time and effort connecting builtins together and mapping back and forth between them. Programs in the first sort of language are usually written in a much more direct way; people use the builtins, but mostly when they're convenient.

(For example, if you want to find the first occurrence in a memory block of a byte with a value between (varying) A and B, a C programmer will probably write a memchr-alike but a Python programmer will probably build and match a regular expression on the fly.)
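The Python half of that example might look something like the following sketch. The function names are invented for illustration; the first version builds a one-character-class regular expression from the two bounds at call time, and the second is the direct loop that a C programmer would write, shown in Python for comparison.

```python
import re


def find_byte_in_range(block: bytes, lo: int, hi: int) -> int:
    """Index of the first byte b with lo <= b <= hi, or -1, using the re module."""
    if not (0 <= lo <= hi <= 255):
        raise ValueError("need 0 <= lo <= hi <= 255")
    # Build a bytes pattern such as b"[\x41-\x5a]" on the fly. Using \xHH
    # escapes means we never have to worry about bytes like ']' or '^'.
    pattern = re.compile(b"[\\x%02x-\\x%02x]" % (lo, hi))
    match = pattern.search(block)
    return match.start() if match else -1


def find_byte_in_range_loop(block: bytes, lo: int, hi: int) -> int:
    """The 'memchr-alike' approach: examine each byte in our own code."""
    for i, b in enumerate(block):
        if lo <= b <= hi:
            return i
    return -1
```

The two return the same results. The regular expression version is the less direct of the two, but it pushes the per-byte work down into the builtin matching engine instead of doing it in interpreted code.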

The difference also places the implementors and the users of the first sort of languages on a much more level playing field. In the second sort of language, there are strong limits on what you can do as a mere user; you can never create your own extension that runs as fast as the builtins, which means that user-level extensions are always to some degree second class citizens.

Comments on this page:

From at 2008-02-29 01:53:11:

Well said, but perhaps I'll add that the same old truth holds for C as well when it comes to algorithms and fundamental parts of the standard library and operating system: use queue(3) instead of rolling your own, probably slower and more error-prone, implementation of doubly linked lists; do not ever try to implement your own encryption algorithm; why write a directory traversal function when you have fts(3); it is foolish to try to write your own malloc implementation; and so on.

- drear.

By cks at 2008-03-01 13:57:03:

The situation is qualitatively different with C. In C, rewriting the standard library is merely a waste of your time, and it is feasible to do this if you want or need somewhat different features. In something like Python, rewriting the standard builtins as anything more than a proof of concept is completely impractical because of the performance implications.

As a result, people have written things like alternate regular expression libraries in C, and they have not in Python.

Written on 27 February 2008.

Last modified: Wed Feb 27 23:54:55 2008