Wandering Thoughts archives

2005-06-16

AJAX vs Dialups

AJAX is short for 'Asynchronous JavaScript and XML', the common term for the technology behind highly interactive web sites like Google Maps and Google Mail. Given that the features AJAX enables (from the large to the small) are very appealing to designers, we're pretty much guaranteed to see more and more use of it on web sites.

But please don't reach for AJAX too fast, because there is such a thing as being too interactive.

AJAX's interactivity comes through communication, and communication takes bandwidth. While it'd be nice if everyone coming to your web site had lots of bandwidth, it's not true (unless you want to make it true by driving away everyone else).

Let's take an example: using AJAX to implement incremental searches. The search box on your web pages uses AJAX to notice when I start typing and does a callback to your web server so it can show me matching results; once I've typed enough to pull up what I want, I can just go there.

So I start typing, entering 'p'. Lightning-fast, your highly interactive AJAX wakes up and sends the request back to your web server. Of course there are a lot of pages that match such a broad criterion, so the reply is not short (the RD light of my modem goes on solid). As I add a 'y' and a 't' the whole process repeats, possibly colliding with the data transfer for the initial 'p' in the process.

This hypothetical web site's great interactivity hasn't helped me, it's frustrated me. Search has turned into a laggy experience where I have to wait for the application to catch up to my typing. The slower a typist I am, the worse it may be; if I type fast I have at least a chance of outracing the AJAX over-interactivity.

So: don't be too interactive. If your AJAX needs results from your web server, you probably can't keep up with the user's interactions in real time. Don't try; wait a bit, let the user get a bit of a head start, give some feedback every so often, and reserve your big efforts for when the user has paused. (Pauses in user input are your big hint that the user is waiting for you now.)
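
The 'wait a bit, reserve your big efforts for when the user has paused' advice amounts to debouncing. A minimal sketch in Python (browser-side code would use the same timer-reset logic in JavaScript; the send_query callback here is hypothetical):

```python
import threading

class Debouncer:
    """Only fire the callback once the user has paused for `delay` seconds."""
    def __init__(self, callback, delay=0.3):
        self.callback = callback
        self.delay = delay
        self.timer = None

    def keystroke(self, text):
        # The user is still typing: cancel any pending search and
        # restart the countdown with the latest input.
        if self.timer is not None:
            self.timer.cancel()
        self.timer = threading.Timer(self.delay, self.callback, args=(text,))
        self.timer.start()
```

Each keystroke resets the timer, so only the input as it stands after a pause ever generates a request to the server.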

Google Suggest shows another solution to this: don't return interactive results until they're small enough to be useful. (In a search interface I do ask that you put up some feedback to the effect of 'searching for "py": too many results to show in the sidebar', so that I can tell the difference between lots of results and no results.)
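
A sketch of that threshold approach, with hypothetical names (the search callable and the limit are stand-ins, not anything Google Suggest actually exposes):

```python
def suggest(prefix, search, limit=10):
    """Return matches only when there are few enough to be useful;
    otherwise return a short status line the sidebar can display,
    so the user can tell 'too many results' apart from 'none'."""
    matches = search(prefix)
    if len(matches) > limit:
        return 'searching for "%s": too many results to show' % prefix
    return matches
```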

Whichever you choose, people on dialups (like me at home on my poky 28.8K PPP link) will thank you for considering them. And you may discover that there are more of us than you thought, along with the people using your web site from halfway around the world, the unfortunates stuck behind choked up corporate Internet links, and so on.

Unsurprisingly I'm not the only person (or the first person) writing about this general issue; for example, Markus Baker's discussion is here.

You can read about other AJAX design issues here and here. (And this entirely neglects the collection of practical issues one faces when implementing AJAX in the presence of network delays.) Note to self: AJAX is complicated in practice.

You can read more about AJAX in the Wikipedia article.

AJAXvsDialups written at 23:03:47

2005-06-14

Pitfalls in generating Last-Modified:

Every HTTP reply from a web server can include a Last-Modified: header, which theoretically tells interested parties when the web page was last modified. This is really something that works best when the web server is just sending out static files; when it is generating dynamic content, like DWiki does, things get interesting.

The major use of Last-Modified: is to decide when a browser already has a current copy of the web page and doesn't need to fetch it again. Thus, with dynamic pages built from many pieces the Last-Modified: time needs to be the most recent modification time for all of the pieces. Then when any of the pieces that make a page are updated, changing the page's appearance, the page's Last-Modified: time will change and the browser will fetch a new copy.
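
That rule is straightforward to implement; a sketch, assuming you have the list of every file that went into building the page:

```python
import os
from email.utils import formatdate

def last_modified(paths):
    """Return an HTTP-format Last-Modified value that is the newest
    modification time among all the pieces of the page."""
    newest = max(os.path.getmtime(p) for p in paths)
    # usegmt=True gives the HTTP date format,
    # e.g. 'Tue, 09 Nov 2004 11:33:20 GMT'
    return formatdate(newest, usegmt=True)
```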

This means DWiki can't just use the page's modification time (which is what gets shown in the 'Last Modified:' line at the bottom of most CSpace pages). DWiki pages are built from a cascade of templates and pieces, so as it builds a web page DWiki keeps track of the most recent modification time of all the files involved; change one template, and the updated time is automatically propagated through the system.

Or it would if there weren't some complications.

Authentication Soup

Being logged in to a DWiki, and who you're logged in as, not just can but will change the appearance of pages. It's not just big things, like being able to see a page's contents at all; it's everything from DWiki saying 'Welcome, <whoever>' in the top right corner down to whether you get a login form or a logout button. So if you log in or out and then refresh pages in your browser, the pages had better change to look right for your current status; otherwise users start wondering whether their login or logout actually worked.

In order to support Last-Modified: with authentication, DWiki would have to somehow arrange to track the last time you logged in or out of the DWiki. While this is theoretically possible, it would be a bunch of work and would involve trying to send a cookie to every visiting browser (and I refuse to do the latter).

Instead DWiki mostly just punts when authentication is enabled; regular DWiki pages get served without any Last-Modified: header. Fortunately modern browsers have another, better header, ETag:, that they can use instead of Last-Modified: to see if they need to refresh a page.

Page List Soup

The other complication is easy to state: what's the modification time of a list of files?

Lists of files come up in several places in DWiki, most importantly when generating Atom syndication feeds. Atom feeds also complicate life because of two factors:

  • the Atom feed format requires some kind of 'most recently updated' timestamp.
  • the ETag: header's value is some identifying hash of the HTTP response's contents. So if the contents keep changing (because, say, you stamp the Atom feed's 'most recently updated' time with the current time), the ETag: header keeps changing too, and everyone keeps re-fetching Atom feeds and pages even when nothing has really changed.

(Also, the RSS/Atom feed reader I use doesn't use ETag:, only Last-Modified:, so I have been trying to support Last-Modified: in my Atom feeds.)

The simple approach is to make the Last-Modified: value the modification time of the most recently modified file in the list. Unfortunately this doesn't change when files are added to or removed from the middle of the list, which makes it useless for most of DWiki's purposes.

At the moment DWiki folds in the modification times of all the directories it scans when looking at files during Atom feed generation (thereby missing directories that currently have no files in them at all). At other times it just punts.
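
A sketch of that folding, with hypothetical names (this is an illustration of the approach, not DWiki's actual code):

```python
import os

def list_mtime(files, dirs):
    """Modification time of a list of files: the newest mtime among the
    files themselves and the directories that were scanned to find them."""
    times = [os.path.getmtime(p) for p in files]
    # A directory's own mtime changes when entries are added or removed,
    # which catches removals that no surviving file's mtime reflects.
    times.extend(os.path.getmtime(d) for d in dirs)
    return max(times) if times else None
```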

Summary For Client Authors

If you're thinking of writing a feed reader client or a web browser, I have this to say: please just use the ETag: header. Since it's some hash of the HTTP response's data, it's easy to generate and always accurate about whether or not the response has changed. Last-Modified: is essentially an approximation in everything except relatively simple situations or programs that go to obsessive amounts of work.
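
To show how easy the server side is, here is a sketch assuming you have the full response body in hand (the particular hash algorithm is an arbitrary choice; a client sends the ETag back in its If-None-Match header):

```python
import hashlib

def make_etag(body):
    """Derive an ETag from a hash of the response bytes."""
    return '"%s"' % hashlib.sha1(body).hexdigest()

def not_modified(body, if_none_match):
    """True if the client's If-None-Match value (or None if absent)
    matches the current response, i.e. we can answer 304 Not Modified."""
    return if_none_match is not None and if_none_match == make_etag(body)
```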

LastModifiedPitfalls written at 00:50:46

2005-06-11

Writing DWiki has been a very educational process. Mostly it has been educational about all sorts of irritations that I was previously happily ignorant of.

Take HTTP redirects, for example. (Please.)

To be fully specification-compliant, an HTTP redirect must be to a different URL than the current one, and must be to an absolute URL: it must redirect to http://host/some/where, not just /some/where. (Perhaps common browsers all accept relative redirects, but at least lynx complains about them.)

issue #1: when absolute URLs can't be

This presents a small problem for a program like DWiki: just what is the absolute URL of a DWiki page? The host is relatively easy, since modern HTTP requests include the host name (it's how name-based virtual hosts work).

But ... what about the port? Not every web server lives on port 80, especially a DWiki running in standalone test mode.

In theory the absolute URL should include the port (unless it's the default). In practice, every program I've tried gleefully adds the port itself if it is a non-standard port and you're referring to the same hostname. If you naively generate redirects of http://host:port/..., what most programs try to get is http://host:port:port/..., which doesn't work too well.

Presumably people who want to run two web servers on the same host on different ports just lose.

Maybe this is even documented somewhere. (I jest; I looked, and failed to find anything obvious in the RFCs.)

Update, much later: I was completely mistaken here; see HostMistake.

issue #2: did you say different URL?

Why, yes. Different URL. Why is this irritating? Let's take logging in to this wiki as an example.

Login forms need to be POST forms, not GETs, because one does not want the password sitting in plaintext in URLs. The natural way to do it is to let the login form POST to the current page, which then just redisplays itself. Unfortunately if you then ask your browser to reload the resulting page (perhaps to see an updated edit of it), your browser warns you that you're about to resubmit a POST form and are you sure?

So: what we want to happen is to POST to the current URL, which instead of redisplaying itself in a POST context immediately redirects back to the GET version of itself.

Which is where it becomes very irritating that HTTP redirects have to go to a different URL.

DWiki 'solves' this issue by making up synthetic page names for processing logins (and logouts, which have the same problem). Fortunately it can guarantee that certain page names in its URL space will never be valid real DWiki pages, so it just uses some of them.

To get back to the page you were just reading, the login and logout forms add a hidden field to the form to say what the old page was. (Which means that the form has to be generated dynamically, because it's different on each page.)
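
A sketch of this whole flow, with hypothetical names (the 'oldpage' field name and the handler shape are illustrations, not DWiki's actual interface): the login form POSTs to a synthetic page, and on success the handler answers with a redirect back to the page named in the hidden field, so a browser reload of the result is a plain GET.

```python
def handle_login(form, authenticate):
    """form: dict of POSTed fields; authenticate: callable that checks
    the credentials and returns True or False. Returns a
    (status, headers, body) triple in WSGI style."""
    oldpage = form.get("oldpage", "/")
    if authenticate(form.get("user"), form.get("password")):
        # Redirect back to the GET version of the original page, so
        # reloading it doesn't prompt to re-submit the POST.
        return ("303 See Other", [("Location", oldpage)], b"")
    return ("403 Forbidden", [], b"login failed")
```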

HTTPRedirects written at 19:44:27

I find myself quite irritated with CSS lately, because I have been trying to be a good modern web-boy and style this place with CSS.

The problem with CSS is what it leaves out. Take a very simple example: this blog. Like many blogs, I want a two-column layout: blog entries on the left, a small sidebar about the blog on the right.

No problem in CSS: use two <div>s, set a width: or a min-width:, and then set one <div> as float: left; or so. And this works, as long as neither column overflows. Unfortunately, a) I am guaranteed to someday write a blog entry with a long unbreakable line, since I am going to quote code periodically and b) the user can make their browser window pretty darn narrow.

What I want to happen in this situation is for CSS to shrug and enlarge the whole thing so that the user has to scroll sideways to see the entire page. What I can get is one of:

  • the over-long line is truncated and undisplayable.
  • the thing containing the over-long line is truncated but grows a scrollbar, so at least I can theoretically read it.
  • the over-long line scribbles itself all over the sidebar.
  • my sidebar stops being a sidebar and suddenly becomes a top-bar or a bottom-bar.

No, blech, ptui, and 'ha, you jest' respectively.

Those evil bad table things that we aren't supposed to use for layout? Surprise. They get this right.

So if you look at the HTML source for the blog, you will see a giant table.

(Now, perhaps there is some magic way to do this in CSS that actually works right and that no one mentions or uses. If so, please tell me about it. I would like to use CSS if I can.)

CSSIrritation written at 19:15:44

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.