Wandering Thoughts archives

2012-06-21

How I want to do redirects in Apache, especially in .htaccess files

One of the irritating things about Apache's handling of redirects is that it makes what I suspect is the most common case the most verbose and potentially the most complex one to handle, especially if you want to put it in a .htaccess file instead of in the main Apache configuration. Since I just had to go through working this out for our use, I'm going to write it down in case I need it later.

Suppose that you have an Apache server and you want to redirect the local URL /~user/ off to http://someserver/url2/ by putting a .htaccess file in the user's web home directory; possibly you are even redirecting your own home page somewhere. Then what you generally want in the .htaccess file is:

RedirectMatch permanent /~user/.* http://someserver/url2/

This redirects /~user/ and all URLs under it to the single URL http://someserver/url2/, with a permanent redirection (status code 301) instead of Apache's default of temporary redirection (status code 302).

It's worth talking about what happens with several variants. First:

Redirect permanent /~user/ http://someserver/url2/

This is the same as the following RedirectMatch version, which may make what's going on clearer:

RedirectMatch permanent /~user/(.*) http://someserver/url2/$1

The difference is that this redirects a request for, say, /~user/bar/ off to http://someserver/url2/bar/ instead of to the base URL we provided. If you're going to do this you should know that Redirect has the same issue as the ProxyPass directive does: the trailing /'s on the URL-path and on the target URL need to match (either both have one or neither does), because the rest of the request is appended as-is.
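To make the difference between the appending and non-appending forms concrete, here is a rough JavaScript model of the two directives as described above. This is a sketch for reasoning about which URL comes out, not Apache's implementation: the function names are made up for illustration, JavaScript regular expressions stand in for Apache's, and details such as query strings and repeated slashes are ignored.

```javascript
// Hypothetical model of "Redirect URL-path target": a prefix match on the
// request path, with whatever follows the prefix appended to the target.
function modelRedirect(urlPath, target, requestPath) {
  if (!requestPath.startsWith(urlPath)) return null;
  const rest = requestPath.slice(urlPath.length);
  // A URL-path without a trailing slash only matches whole path components.
  if (!urlPath.endsWith("/") && rest !== "" && !rest.startsWith("/")) return null;
  return target + rest;
}

// Hypothetical model of "RedirectMatch regex target": an unanchored regex
// search; only explicit $N references carry anything over to the target.
function modelRedirectMatch(regex, target, requestPath) {
  const m = new RegExp(regex).exec(requestPath);
  if (m === null) return null;
  return target.replace(/\$(\d)/g, (whole, n) => m[Number(n)] || "");
}

const target = "http://someserver/url2/";
console.log(modelRedirectMatch("/~user/.*", target, "/~user/bar/"));
// prints http://someserver/url2/
console.log(modelRedirect("/~user/", target, "/~user/bar/"));
// prints http://someserver/url2/bar/
console.log(modelRedirectMatch("/~user/(.*)", target + "$1", "/~user/bar/"));
// prints http://someserver/url2/bar/
```

In this model, the plain Redirect and the RedirectMatch with an explicit (.*) and $1 produce the same result, while the RedirectMatch without a capture sends everything to the single base URL.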

The trickier case is what you need to put in the URL-path (the second argument here) in a .htaccess file in a subdirectory. Suppose you write the redirect in your .htaccess as:

Redirect permanent / http://someserver/

What you will get from a request for /~user/ is a redirection to http://someserver/~user/. Despite this .htaccess being located at /~user/ (logically speaking) and only applying to URLs under it, everything after the / in the original URL-path has been taken off and put on the end of the target URL. Even if this actually works and is what you want, it's confusing and you should write the redirection with the real URL-path and URL it applies to spelled out explicitly.

(You can create even more confusing versions with RedirectMatch.)

This is mentioned in the documentation for the Redirect directive, although with different amounts of clarity and thoroughness depending on the version of Apache. See, for example, the Apache 2.4 Redirect documentation. And yes, this is inconsistent with how RewriteRule matches the URL-path.

As with rewrite rules, if you want to redirect just the main URL and not any sub-URLs you need to be explicit about this with a RedirectMatch:

RedirectMatch permanent /~user/$ http://someserver/url2/

This redirects only requests for /~user/ itself; a URL like /~user/page.html will either fail with a 404 or wind up showing the real page on your server (if it exists).

(Because the Redirect directives always use the full URL-path, all of this applies to them regardless of whether they're in .htaccess files or in the main Apache configuration. All that being in a .htaccess file does for Redirect et al is limit what URL-paths they can apply to.)

Sidebar: Redirect and partial matches

Unlike rewrite rules, a Redirect only matches whole components, so that a Redirect for /~user/foo will not match a request for /~user/foobar. This is mentioned in the documentation from Apache 2.2 onwards and probably applied even before then (but I don't have any Apache 2.0 or earlier servers to test with). This does not apply to RedirectMatch, which will match partial components (so if you have a URL-path of /~user/foo it will match /~user/foobar).

You might wonder what happens to the 'bar' portion when you have a partial-component match in RedirectMatch. Why, I'm glad you asked. The answer is that it disappears (it is not appended to the target URL). In fact the behavior of RedirectMatch is basically that there is an implicit '.*' on the end of the URL-path, so my recommended RedirectMatch can in fact be rewritten as:

RedirectMatch permanent /~user/ http://someserver/url2/

Once again, I recommend being explicit and not using this trick.
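The whole-component rule for Redirect and the lack of one for RedirectMatch can be sketched in a few lines of JavaScript. As before, this is only an illustrative model under my own assumptions (made-up function names, JavaScript regular expressions standing in for Apache's), not a description of Apache's code.

```javascript
// Hypothetical model: would "Redirect urlPath ..." apply to this request?
function redirectApplies(urlPath, requestPath) {
  if (!requestPath.startsWith(urlPath)) return false;
  const next = requestPath.charAt(urlPath.length);
  // Either the prefix already ends in "/", or the match must stop exactly
  // at the end of the request or at a "/" (a component boundary).
  return urlPath.endsWith("/") || next === "" || next === "/";
}

// Hypothetical model: would "RedirectMatch regex ..." apply to this request?
// The regex is searched for anywhere in the path unless it is anchored.
function redirectMatchApplies(regex, requestPath) {
  return new RegExp(regex).test(requestPath);
}

console.log(redirectApplies("/~user/foo", "/~user/foobar"));       // prints false
console.log(redirectApplies("/~user/foo", "/~user/foo/bar"));      // prints true
console.log(redirectMatchApplies("/~user/foo", "/~user/foobar"));  // prints true
console.log(redirectMatchApplies("/~user/foo$", "/~user/foobar")); // prints false
```

The last line is the reason for being explicit: once you write the anchors and the '.*' yourself, what the pattern matches is visible in the pattern.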

ApacheRedirectHtaccess written at 00:52:37

2012-06-13

Some tricky bits in in-browser form mangling and validation

In the process of writing client side JavaScript to modify and validate forms, I've run into a number of tricky bits and gotchas. In my usual tradition I'm going to write them down here, if only so that I can look them up again in a year or two when I've forgotten them all again.

Like many web apps, ours has a core (server-side) form processing flow where we first display the initial blank form, then on form submission we validate it and if there are any errors we re-display the form with your submitted values filled in and any error messages showing. Repeat as many times as necessary until you either give up or submit a fully valid form. Our forms also have the common 'reset form to initial values' button; on the initial form the browser will blank all fields, and otherwise it resets the field values to their initial (potentially erroneous) state and values. In general, most of the complications in client side form handling are ultimately caused by these intermediate 'submitted and redisplayed with errors' forms.

The first gotcha is the proper interaction of client side validation with form resets. On form reset, what you need to do is not blank out any error messages or 'that's good' notes you're showing, but restore the initial value of the status area. This is because you may be handling an intermediate form submission where the field had an error and thus the status area started out showing a (server side generated) error message. Since form reset restores the erroneous field value, you also need to restore the error message to go with it.

(There are various techniques for saving the initial error messages, ranging from doing it in JavaScript to simply changing a CSS style to hide them. I saved the HTML text in a JS variable because I'm a programmer.)
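A minimal sketch of the 'save the HTML in a JS variable' approach might look like the following. The helper names and the "field-status" class in the wiring comment are hypothetical; the only thing the functions assume is that each status area has an innerHTML property, as DOM elements do.

```javascript
// Record each status area's initial (possibly server-generated) HTML.
function saveInitialStatus(statusAreas) {
  const saved = new Map();
  for (const area of statusAreas) saved.set(area, area.innerHTML);
  return saved;
}

// On reset, restore what the page started with instead of blanking it.
function restoreInitialStatus(saved) {
  for (const [area, html] of saved) area.innerHTML = html;
}

// Browser wiring, assuming status areas carry a made-up "field-status" class
// and resetButton is the form's type="reset" button:
//   const saved = saveInitialStatus(document.querySelectorAll(".field-status"));
//   resetButton.addEventListener("click", () => restoreInitialStatus(saved));
```

Because the initial HTML is captured at page load, this does the right thing for both the blank initial form (where it restores an empty status area) and a redisplayed form with server-side errors.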

The next gotcha is the issue of hiding a form field that isn't currently applicable to the state of the form; for example, hiding a field for the graduate student number if the person requesting an account isn't a graduate student. There are two complications.

To start with, the rule for hiding fields is that you can't hide a form field that has an error (unless you know that your server side validation will ignore the field on form submission because it's not applicable). If you aren't doing client side validation of the field, this is based on the initial presence or absence of an error message (which may be inapplicable to the current state of the field, but it's the best you can do); if you are doing client side validation you can use the current state of the field if you have a valid result.

(If field validation requires a server callback you may not have valid results right now.)
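Written out as code, the hiding rule might look something like this sketch. The property names are my own illustrative assumptions, and treating a still-pending server callback as 'do not hide' is a conservative choice on my part rather than something the rule requires.

```javascript
// Decide whether a currently inapplicable field may be hidden.
// All property names here are illustrative assumptions.
function canHideField(field) {
  // The server will skip this field when it is inapplicable, so an error
  // in it can never block the submission.
  if (field.serverIgnoresWhenInapplicable) return true;
  // A definite client-side result reflects the field's current contents.
  if (field.clientResult === "valid") return true;
  if (field.clientResult === "invalid") return false;
  // A server callback is still in flight: assume the worst for now.
  if (field.clientResult === "pending") return false;
  // No client-side validation at all: fall back on whether the page was
  // rendered with an error message for this field.
  return !field.hadInitialError;
}

console.log(canHideField({ hadInitialError: true }));                        // prints false
console.log(canHideField({ hadInitialError: true, clientResult: "valid" })); // prints true
```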

Next, the existence of redisplayed forms means that you need to check the initial state of the form before hiding (or revealing) fields. For example, you could be redisplaying an incomplete account request form where the person's indicated that they're a graduate student; in that case, you definitely don't want to automatically hide the graduate student number field. Form resets complicate this further because the straightforward way of hooking into them happens before the form reset, so you will see the current value of the fields instead of the original value that you need to decide what to show or hide for the after-reset form state.

(The straightforward way of hooking into form reset is to attach to the click event for any type="reset" form field. Possibly there is a better way.)
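One way around seeing the pre-reset values is to not look at the current values at all. Input elements expose defaultChecked and defaultValue, which are standard DOM properties holding the values the page was rendered with, and those are what a reset will restore. The sketch below uses them; the function names and the graduate student form elements are hypothetical, and select elements would need similar handling through their options' defaultSelected property.

```javascript
// What a checkbox, radio button or text input will hold once the reset
// has been applied, regardless of what it holds right now.
function valueAfterReset(input) {
  if (input.type === "checkbox" || input.type === "radio") {
    return input.defaultChecked;
  }
  return input.defaultValue;
}

// Hypothetical form: show the student number row only if the "graduate
// student" checkbox will be ticked after the reset.
function syncAfterReset(gradCheckbox, studentNumberRow) {
  studentNumberRow.hidden = !valueAfterReset(gradCheckbox);
}

// Browser wiring, from the reset button's click handler:
//   resetButton.addEventListener("click", () =>
//     syncAfterReset(gradCheckbox, studentNumberRow));
```

Another option, if you would rather read the live field values, is to defer the work with setTimeout(..., 0) so that it should run only after the browser has finished resetting the fields.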

PS: unless you're familiar with the tricky bits in this sort of stuff (and confident that you're going to stay that way), I strongly recommend leaving yourself comments in the source code about them. This is probably not an issue for people who program client side JavaScript all the time, but I'm probably always going to be a periodic dabbler in this area.

Sidebar: On server side validation of inapplicable form fields

You might think that of course the server shouldn't bother validating form fields that are inapplicable given the other options that you've picked in the form. There are two problems with this. First, as a pragmatic matter many server side form validation frameworks simply don't do this. They validate fields bottom-up, first checking that the individual fields pass validation independently and only then letting you validate the entire collection of fields together. By the time you get control of the validation process an inapplicable field may already have failed its own basic validation.

(This is certainly the case with Django's form handling.)
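To illustrate the general shape of the problem, here is a small generic model of bottom-up validation. It is written in JavaScript purely for illustration and is not Django's API or any particular framework's; the function, the field names and the error messages are all made up.

```javascript
// Generic bottom-up validation: every field is checked on its own first,
// and only then does form-level code get a look at the whole submission.
function validateBottomUp(values, fieldChecks, formCheck) {
  const errors = {};
  for (const [name, check] of Object.entries(fieldChecks)) {
    const message = check(values[name]);
    if (message) errors[name] = message;
  }
  return formCheck ? formCheck(values, errors) : errors;
}

const fieldChecks = {
  name: (v) => (v ? null : "Name is required"),
  studentNumber: (v) => (/^\d*$/.test(v) ? null : "Must be all digits"),
};

// Not a graduate student, but junk was left in the student number field.
const submitted = { name: "Pat", isGradStudent: false, studentNumber: "n/a" };

// The per-field pass reports an error for studentNumber even though the
// field does not apply to this submission.
console.log(validateBottomUp(submitted, fieldChecks));

// Form-level code that takes over and discards errors on inapplicable fields.
const dropInapplicable = (values, errors) => {
  const result = { ...errors };
  if (!values.isGradStudent) delete result.studentNumber;
  return result;
};
console.log(validateBottomUp(submitted, fieldChecks, dropInapplicable));
```

The point of the second call is that the interdependency lives in code the programmer wrote for this one form, not in anything the generic validation pass knows about.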

Second, even as a theoretical matter this would clearly be a more complex approach because you now have to specify the interdependencies between fields (in some way short of writing fully custom form validation code for each form). There are all sorts of possible interdependencies and complexities here; you probably can't support them all and even supporting some of them gets you into an increasing mess of complexity. And you're doing all of this to your framework to support a rare case (when an inapplicable field has invalid content; normally an inapplicable field will be untouched).

On the whole it's much simpler for a framework to not try to support this with general code but instead just provide a way to let the programmer fully take over form validation themselves.

BrowserFormGotchas written at 15:33:07

Client side form validation can let people explore their options

Here's something important that I just realized about in-browser form validation: it lets people check their options out or, to phrase it differently, it lets people change their mind.

Yeah, sure, technically plain forms let you change your mind too. But the problem with a plain form that asks people to make a choice, like picking a login name, is that the only way you have to find out if the login you're considering is taken or available is by submitting the form, and submitting the form is making a commitment. Not only have you found out that the login is available, but you've committed yourself to it. That's not something that promotes exploration.

But in-browser form validation is different. Because in-browser form validation is not a form submission, you're not committed to whatever you put in a field that the system accepts; you can go back and change your mind (at least in any sane design). This makes it a great way to let people experiment or check several options to see what's available. They can change their mind as many times as they want and see the effects if they do.

As an obvious note: to make this work well your client side validation needs to not just report errors but also provide positive feedback, so that people know that their choice is a good one (well, acceptable to the system at least). And it should be relatively obvious feedback, because otherwise people may miss it.

(I more or less lucked into doing this in my web application; I put in visible positive feedback partly because it vaguely seemed like a good idea and partly because it let me see that my client side code was actually working and getting the right answers from my AJAX callback to the server.)
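As a sketch of what 'report the good news too' can look like in code, here is a small function that turns the outcome of a login check into something to display next to the field. The shape of the result object ({ valid, taken }), the CSS class names and the message wording are all assumptions for illustration; they are not what my application actually uses.

```javascript
// Turn the result of a login check into something to show beside the field.
// The result shape ({ valid, taken }) and the CSS class names are assumptions.
function loginFeedback(login, result) {
  if (!result.valid) {
    return { cssClass: "status-error", text: `"${login}" is not a valid login.` };
  }
  if (result.taken) {
    return { cssClass: "status-error", text: `"${login}" is already taken.` };
  }
  // Say so explicitly when the choice is fine, rather than showing nothing.
  return { cssClass: "status-ok", text: `"${login}" is available.` };
}

console.log(loginFeedback("cks", { valid: true, taken: false }).text);
// prints "cks" is available.
```

When putting the text on the page, setting the status element's textContent rather than its innerHTML avoids treating whatever the person typed as HTML.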

A related thing that in-browser JavaScript can help with is making any relationships and dependencies between form fields and form options clear to people (or at least clearer). Sure, your web page says which bits are mandatory and which bits are optional and that you should only fill in field B if you ticked option A, but we all know that not everyone is going to read all of that and get it right. In-browser JavaScript can just manipulate the form to make all of that directly visible; if option A isn't ticked, you hide field B entirely and so on.

(This can be extended to much more elaborate multi-section forms or drastically changing field options if you need to, although you're starting to run into the limits of graceful degradation and perhaps UI issues.)
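The simple 'hide field B unless option A is ticked' case is only a few lines. In this sketch the element names are hypothetical, and hidden is the standard HTMLElement property; since the row is only ever hidden by JavaScript, a browser without JavaScript simply sees the full form.

```javascript
// Show field B's row only while option A is ticked.
function syncFieldB(optionA, fieldBRow) {
  fieldBRow.hidden = !optionA.checked;
}

// Browser wiring: sync once at page load (which also covers redisplayed
// forms where option A is already ticked) and again on every change.
//   syncFieldB(optionA, fieldBRow);
//   optionA.addEventListener("change", () => syncFieldB(optionA, fieldBRow));
```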

ExploreWithClientValidation written at 00:31:52

2012-06-11

What bits of a form are useful to check on the client side

I spent part of today adding more JavaScript to our account request system, in the form of some client side form validation code. As part of this I wound up considering which bits of form validation are worth doing in the browser.

Of course, for some people the answer is 'all of them'. This may be justified or worthwhile in some environments, but my view is that I don't have the time or energy to write a second copy of form validation code (and any needed backend services to support it) for all fields. Until the day comes when I can write one set of validation code and have my framework magically generate both client side and server side code from it (along with the AJAX callback services necessary to support it), I'm going to need to pick and choose 'high value' form validation on the client side.

My conclusion is that the fields that are really worth adding client side validation for are the fields that people can't completely check for themselves just from inspection. This sounds abstract and obscure, so let me give a concrete example. Our 'request an account' form asks for both your name and the login that you want; on the server, we validate that you supplied a name and that the login you asked for is a valid login that is not already taken (or requested). Someone filling in the form can easily see whether or not they've supplied their name, but they have no way of knowing whether or not the login they've picked is already taken. Client side validation of the login gives people immediate feedback on this without them having to go through the slow process of form submission and examining the results. As a side effect, it lets them check several different logins so they can pick the one they like best (because, unlike form submission, they are not committed to a choice that the system accepts).

(By contrast, client side validation of the name field would just be telling people 'hey, the name field is still blank', which is something they can see for themselves if they look.)

A related approach would be to figure out what problems the submitted forms are most likely to have, on the grounds that these give the highest payoff from early detection and feedback (in the form of users having to go through fewer cycles of 'submit form, get errors, try to fix errors, repeat'). I think that this is going to get you more or less the same results unless your form is unclear (or confusing), but the advantage is that if you wanted to you could drive this from logging actual validation errors on the server.
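If you did log validation errors on the server, finding the fields with the highest payoff is a simple tally. The sketch below assumes the log has already been parsed into entries of the form { field, error }, which is a format I have made up for the example.

```javascript
// Count how often each field failed server-side validation.
// Each log entry is assumed to look like { field: "login", error: "taken" }.
function tallyFieldErrors(entries) {
  const counts = new Map();
  for (const { field } of entries) {
    counts.set(field, (counts.get(field) || 0) + 1);
  }
  // Most frequently failing fields first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sampleLog = [
  { field: "login", error: "taken" },
  { field: "name", error: "blank" },
  { field: "login", error: "invalid" },
];
console.log(tallyFieldErrors(sampleLog)[0]);
// prints [ 'login', 2 ]
```

The fields at the top of the list are the candidates for client side checks, subject to how much work each check would take to duplicate in the browser.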

(This is obvious but worth saying: client side validation is never a replacement for server side validation, it's only a supplement. You always have to have full server side validation, so the question is how much immediate in-browser validation you add on top of it.)

WhatToValidateClientside written at 23:27:59


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.