Django, the timesince template filter, and non-breaking spaces

February 3, 2016

Our Django application uses Django's templating system for more than just generating HTML pages. One of the extra things is generating the text of some plaintext email messages. This trundled along for years, and then a Django version or two ago I noticed that some of those plaintext emails had started showing up not as plain ASCII but as quoted-printable with some embedded characters that did not cut and paste well.

(One reason I noticed is that I sometimes scan through my incoming email with plain less.)

Here's an abstracted version of such an email message, with the odd bits italicized:

The following pending account request has not been handled for at least 1 week.

  • <LOGIN> for Some Person <user@somewhere>
    Sponsor: A professor
    Unhandled for 1 week, 2 days (since <date>)

In quoted-printable form the spaces in the italicized bits were =C2=A0 (well, most of them).

I will skip to the punchline: these durations were produced by the timesince template filter, and the =C2=A0 is the utf-8 representation of a nonbreaking space, U+00A0. Since either 1.5 or 1.6, the timesince filter and a couple of others now use nonbreaking spaces after numbers. This change was introduced in Django issue #20246, almost certainly by a developer who was only thinking about the affected template filters being used in HTML.

In HTML, this change is unobjectionable. In plain text, it does any number of problematic things. Of course there is no option to change this or to control this behavior. As the issue itself cheerfully notes, if you don't like this change or it causes problems, you get to write your own filter to reverse it. Nor is this documented (and the actual examples of timesince output in the documentation use real spaces).

Perhaps you might say that documenting this is unimportant. Wrong. In order to find out why this was happening to my email, I had to read the Django source. Why did I have to do that? Because in a complex system there are any number of places where this might have been happening and any number of potential causes. Django has both localization and automatic safe string quotation for things you insert in templates, so maybe this could have been one or both in action, not a deliberate but undocumented feature in timesince. In the absence of actual documentation to read, the code is the documentation and you get to read it.

(I admit that I started with the timesince filter code, since it did seem like the best bet.)

Is the new template filter I've now written sufficient to fix this? Right now, yes, but of course not necessarily in general in the future. Since all of this is undocumented, Django is not committed to anything here. It could decide to change how it generates non-breaking spaces, switch to some other Unicode character for this purpose, or whatever. Since this is changing undocumented behavior Django wouldn't even have to say anything in the release notes.

(Perhaps I should file a Django bug over at least the lack of documentation, but it strikes me as the kind of bug report that is more likely to produce arguments than fixes. And I would have to go register for the Django issue reporting system. Also, clearly this is not a particularly important issue for anyone else, since no one has reported it despite it being a three year old change.)

Written on 03 February 2016.
« You aren't entitled to good errors from someone else's web app
Some notes on SMF manifests (on OmniOS) and what goes in them »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Feb 3 23:42:32 2016
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.