2012-06-30
What Linux distributions should do to help their Python 3 transitions
After writing yesterday's entry, I've wound up with some opinions on what Linux distributions should be doing as part of any transition to Python 3. This isn't aimed at getting the programs they package ported to Python 3 (that's really up to the distributions and to the authors of the programs), but instead at encouraging general adoption of Python 3 among the people who use the distributions and who may write some Python code.
(I could generalize this to all operating system distributions, but I suspect that only some Linux distributions even care about this.)
First, if you haven't already created it, there should be a
/usr/bin/python2. I suspect that everyone already has this; certainly
Fedora and Ubuntu do.
Now, make a packaging policy that all packaged Python 2 programs must
specifically use 'python2' (eg, start with '#! /usr/bin/python2')
instead of plain 'python'. As part of the transition you're eventually
going to want to let users change things so that /usr/bin/python is
actually Python 3, and every Python 2 program that expects Python 2
to be /usr/bin/python is an obstacle to that. You can't do anything
about user programs, but you can and should make sure that your programs
aren't in the way.
(Once you do this, where /usr/bin/python points should be put under
the control of your distribution's system for handling alternates.)
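As a rough illustration of what this packaging policy means mechanically, here is a minimal Python sketch that rewrites an unversioned '#!' line so that it names python2. The function name pin_to_python2 is made up for this example, and also handling '#!/usr/bin/env python' is an assumption that goes beyond the plain '/usr/bin/python' case discussed above. Lines that already name a specific version are left alone.

```python
import re

# Matches an unversioned Python '#!' line such as '#!/usr/bin/python',
# '#! /usr/bin/python -tt' or '#!/usr/bin/env python'. Versioned names
# (python2, python2.7, python3, ...) fail the lookahead and do not match.
_UNVERSIONED = re.compile(r"^(#!\s*(?:/usr/bin/env\s+|\S*/)python)(?=\s|$)")


def pin_to_python2(first_line):
    """Return first_line with an unversioned 'python' changed to 'python2'.

    Any other line, including one that already names a Python version,
    is returned unchanged.
    """
    return _UNVERSIONED.sub(r"\g<1>2", first_line, count=1)
```

A packaging check could apply this to the first line of every script in a package and flag (or fix) any line that changes.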
Next, you should arrange for your standard OS installs (what people get if they pick the defaults for, say, a Gnome desktop or a web server) to install Python 3 alongside Python 2 if Python 2 is installed by the standard install. Or to put it another way, any standard install should wind up with either no version of Python installed or both Python 2 and Python 3 installed; you should not have just Python 2 installed. The goal here is to encourage your users who use Python (or who may start using Python) to use Python 3 instead of Python 2 by at least making Python 3 available by default.
Packaged Python modules and extensions are a tricky situation. What you'd like is for there to be module parity between Python 2 and Python 3 where possible, assuming that you have Python 3 installed at all; when you install a Python module, it installs both the Python 2 and Python 3 versions at the same time. Unfortunately I don't think that any current package management system is this smart. The Debian package system at least has a 'recommended packages' option, so I think that all Python 2 modules could list any Python 3 equivalent as a recommended package.
(Again this has the goal of getting rid of obstacles to people writing Python 3 code; you want to avoid a situation where Python 2 is more attractive than Python 3 because it has a bunch of modules installed but your Python 3 install is very bare-bones.)
Sidebar: the thorny issue of /usr/bin/python
The following is not going to be a popular opinion, but it's what I
think is practical: if Python 2 is installed on the system at all,
/usr/bin/python should point to it by default (although you should
allow the sysadmin to change it if they want).
Why is simple: all of the user-written Python programs and scripts
and whatnot that are out there in the world. They're all currently
using /usr/bin/python and expecting to get Python 2; if this becomes
Python 3 someday, the people with these programs are going to be what
sysadmins call 'very irritated'. If the system doesn't even have
Python 2 installed, you have a good excuse for it being Python 3;
/usr/bin/python can't run Python 2, because there's no Python 2 to
run. Otherwise there's no excuse.
(Yes, yes, making /usr/bin/python be Python 3 would smack everyone
in the nose with the existence of Python 3 and might encourage some
people to port their code to Python 3. Let me assure you that by and
large people do not appreciate being smacked in the nose for their
own good.)
2012-06-29
The magnitude of the migration to Python 3, illustrated
I was just reading Nick Coghlan's Python 3 Q & A and ran across this:
Support in enterprise Linux distributions is also a key point for uptake of Python 3. Canonical have already shipped a supported version (Python 3.2 in Ubuntu 12.04 LTS) with a stated goal of eliminating Python 2 from the live install CD for 12.10. A Python 3 stack has existed in Fedora since Fedora 13 and has been growing over time, but Red Hat has not made any public statements regarding the possible inclusion of that stack in a future version of RHEL.
To give some other perspectives on the transition, I'll note that Ubuntu already has a tentative plan to move their Python 2 stack into the community supported universe repositories and only officially support Python 3 for their 14.04 release.
(It should be noted here that based on the release schedule, Ubuntu 14.04 is currently scheduled for April 2014, with a feature freeze probably happening around six months earlier.)
I'm afraid that my reaction to this involves a certain amount of grim laughter. To explain why, let's talk about the magnitude of the effort involved in making this sort of transition for a Linux distribution; in particular, I am going to look at Ubuntu 12.04 and Fedora 17.
On an Ubuntu 12.04 machine more or less configured as a desktop and with
a decent package selection installed, there are around 240 executables
installed that use Python 2, from around 90 different Ubuntu packages,
with over a hundred programs that are directly run by people (ie are in
/bin, /sbin, /usr/bin, or /usr/sbin). Now some of these are part
of Python or otherwise tied to it, but there are a fair number that are
not, including some large and important packages like Mercurial.
On my Fedora 17 workstation (which has quite a number of packages) there
are around 310 executables installed that use Python 2, from over 130
packages; over 200 programs are in /usr/bin or /usr/sbin. Again
there are large and significant packages involved, including a fair
number of important system management packages (especially yum, in
many ways the core package management system for Fedora).
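A rough way to produce counts like these is to look at the '#!' line of every file in the relevant directories. The following Python sketch shows that approach; it is not necessarily how the numbers above were obtained, and the function names are made up for illustration. It only finds scripts that name a Python interpreter on their first line, so it misses things like shell wrappers that run Python indirectly.

```python
import os


def shebang_interpreter(path):
    """Return the interpreter name from a file's '#!' line, or None."""
    try:
        with open(path, "rb") as f:
            first = f.readline(256)
    except OSError:
        return None
    if not first.startswith(b"#!"):
        return None
    words = first[2:].decode("ascii", "replace").split()
    if not words:
        return None
    name = os.path.basename(words[0])
    # With '#!/usr/bin/env python' the real interpreter is the next word.
    if name == "env" and len(words) > 1:
        name = os.path.basename(words[1])
    return name


def python_scripts(directories):
    """Map interpreter name -> list of script paths, Python interpreters only."""
    found = {}
    for d in directories:
        try:
            names = sorted(os.listdir(d))
        except OSError:
            continue  # directory is missing or unreadable
        for n in names:
            path = os.path.join(d, n)
            if not os.path.isfile(path):
                continue
            interp = shebang_interpreter(path)
            if interp and interp.startswith("python"):
                found.setdefault(interp, []).append(path)
    return found
```

Calling python_scripts(["/bin", "/sbin", "/usr/bin", "/usr/sbin"]) and taking the length of each list gives a per-interpreter count of directly runnable programs; mapping those back to packages would need the distribution's own package tools.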
But the bad news is not done yet. On both Ubuntu 12.04 and Fedora 17, there are no non-Python packages that use Python 3. Zero. Zip. None. The only Python programs that use Python 3 come from Python 3 itself. And if you want another bit of bad news, neither Fedora 17 nor Ubuntu 12.04 even installs Python 3 by default. New, stock installed systems are Python 2 only. This is not a migration that is in progress; this is a migration that hasn't even started yet.
(By the way, as far as I am concerned this means that Ubuntu 12.04 can't be fairly described as 'shipping Python 3', merely as having it available.)
Sidebar: why not installing Python 3 by default matters
Imagine a new Python user on a Fedora or Ubuntu system. The most
convenient version of Python for them to start using is one that's
already on their system (especially the one that's called 'python').
The more you need to know and the more you need to do to use another
version, the less likely you are to use it or try it out. Right now, as
a new Python user you have to go out of your way to know about Python 3,
install Python 3, and then use it. By contrast you can use Python 2 by
just typing 'python'.
(Among other things this affects people who are casually curious about how things are in Python 3. Casual curiosity doesn't survive work.)
Or in short: installing Python 3 by default makes it enticing. Not installing Python 3 by default makes it unenticing.
2012-06-19
How I'm doing AJAX with Django in my web app
Django has native AJAX support for serializing model objects and query results to and from JSON (or XML); various people have then added various featureful packages on top of this. For my web app all of this was overkill and too complex. What I needed for my limited client side form validation was a simple AJAX endpoint service, one that reported whether or not a potential login name was acceptable by returning a JSON dictionary with both a validity flag and a message to display to the user about the situation.
(I decided that I wanted to explicitly mark whether the login was valid or not rather than have client side code try to infer it from, say, whether or not there was a message. The latter seemed excessively fragile when being explicit was simple and cheap.)
So here's what I did (with the disclaimer that this may horrify
experienced Django developers). First off, for these form fields you're
going to want to pull as much of the field validation logic as possible
out of clean_<field> functions in your regular form classes and into
reusable functions somewhere. Fortunately I was already mostly doing
this because I had several forms that all had a login field that they
had to validate.
(I couldn't extract all of the logic because I'd made the login field
a RegexField with a maximum length. This may have been a mistake;
automatic field validation is very convenient until it isn't.)
The main validation endpoint is exposed as a REST-style URL that
looks like '/check/login/<proposed login>'. This is connected to
a plain Django view function with a single parameter, the login.
The view function duplicates the RegexField's regular expression
check, then if that passes calls my general login field validation
function, sets up a dictionary based on the result, turns it into JSON
with simplejson.dumps() (simplejson is from django.utils), and
finally returns that as the HTTP response (with the content type set
appropriately). The client side code picks it up from there and does
appropriate client side magic.
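To make the shape of this concrete, here is a minimal sketch of the logic inside such a view, written as plain Python so that it runs without Django. Everything specific in it is an assumption for illustration: the regular expression, the length limit, the set of taken logins, the function names, and the 'valid' and 'message' key names are all hypothetical. It also uses the standard library's json module in place of the simplejson from django.utils mentioned above.

```python
import json
import re

# Hypothetical stand-ins; the real pattern, limit and checks are not given.
LOGIN_RE = re.compile(r"^[a-z][a-z0-9]*$")
MAX_LOGIN_LEN = 8
TAKEN_LOGINS = {"root", "admin"}


def validate_login(login):
    """Shared validation used by forms and the endpoint.

    Returns an error message, or None if the login is acceptable.
    """
    if login in TAKEN_LOGINS:
        return "That login is already in use."
    return None


def check_login_payload(login):
    """Build the JSON text the endpoint would return for a proposed login."""
    # Repeat the form field's own regexp and length check first.
    if len(login) > MAX_LOGIN_LEN or not LOGIN_RE.match(login):
        msg = "Logins must be short and use only lower case letters and digits."
    else:
        msg = validate_login(login)
    # Explicit validity flag plus a message for the client to display.
    result = {"valid": msg is None, "message": msg or ""}
    return json.dumps(result)
```

In an actual Django view, the string returned by check_login_payload() would be wrapped in an HttpResponse with a JSON content type.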
(jQuery normally wants to form parameterized JSON URLs using a query parameter, ie something like '/check/login/?<proposed login>'. Since I didn't know how to handle that in Django and didn't want to work it out, I punted and forced a more REST-ish URL with some hackery. I actually like the REST-ish URL better than the jQuery one.)
This left me with a small problem: how was the client side JavaScript
code supposed to know the (base) URL for the validation endpoint? My
solution is a hack. First, I created a second 'endpoint' view that
is attached to the plain '/check/login/' URL (without a login
parameter). This view does nothing, because its only purpose in life is
being a valid argument for the url tag in templates. Then in the HTML
template for the form page I added a JavaScript snippet that just had:
var loginBaseUrl = "{% url requests.views.checklogin_url %}";
The actual JavaScript code (which comes from a separate file at a
separate URL) then uses loginBaseUrl instead of trying to hard-code
the URL of the endpoint. The advantage of this hack is that the endpoint
URL is automatically adjusted for whatever environment the Django app is
running in (production, my testing, whatever) without my file of actual
JavaScript code having to be run through Django template processing or
any other form of (pre)processing.
(If you had a lot of endpoints you could come up with a more elaborate JavaScript data structure to hold this information and maybe a more elaborate scheme for propagating it around. Or at least something that doesn't involve random global JavaScript variables stuffed into your page HTML. However, this was a quick hack and it works, and I'm new to JavaScript so I get to do things the crude and direct way for at least a bit.)
PS: both my JavaScript code and our environment are so small that I'm not bothering to do any minimization, bundling, or the like of the code right now. In a larger environment I imagine that such 'ready it for deployment' processing of the JavaScript would be the natural place to insert endpoint URLs and the like.
2012-06-17
A realization about whether I can contribute to Python development
I semi-recently read Hynek Schlawack's My road to the Python commit bit, which includes an encouraging call to the readers to get involved in Python development. Reading the article briefly left me all fired up to start doing this, but shortly afterwards cold reality came crashing down on me as I realized that despite any enthusiasm I have, I can't really get involved in Python development in any useful way.
The problem is a simple one: I don't use Python 3 now and I'm rather unlikely to use it any time soon. At the same time, the Python developers have made it very clear that Python 2 is a dead end that is not being developed further and that Python 3 is the future; in fact, for Python development, Python 3 is the present. The conclusion is clear: if you want to contribute to Python development in any meaningful way, you need to be using and working with Python 3. Since I'm only working with Python 2, my ability to contribute to Python development is thus minimal.
(The counter argument is that it's still useful to triage bugs for Python 2, because some of them might get fixed in Python 2.7 point releases. The problem with this is that it doesn't change the fact that Python 2 is a dead end and working on dead ends can easily be described as 'demotivational'. This is especially so when the result of triaging a real bug may just be 'sorry, that's not severe enough to be fixed in Python 2'.)
This kind of makes me sad. Regardless of how crazy it would be for me to respond to Hynek's call (I am overcommitted as it is), knowing that I can't really do anything even if I wanted to is a little bit depressing. Partly it's depressing because it once again shows me how the world of Python development is pulling further and further away from the world that I operate in.
(Of course the real solution to this is to start working with Python 3. But that's hard for me for various reasons, including that a lot of the stuff that I work with is still Python 2 only and will probably never change. It's relatively rare that I start a totally green-field Python program.)