A gotcha with <textarea>
Textareas are one of those treacherous areas of web programming, because it is really easy to get them 95% right and then never notice that you've fumbled the remaining 5%. The problem area is textareas with initial content, for example blog comment previews; what almost completely works is to just put the raw content into the textarea in your HTML. This approach makes intuitive sense and even works fine if you test by throwing markup in, like '<h1>this is a test</h1>'.
There are only two noticeable problems with this, both of them obscure:
- any valid entity references in the text will be decoded to their real character values, so '&lt;' turns into '<'.
- if there's a literal '</textarea>' in the text, it will become the end of the textarea (and your page layout may explode).
Since most people using your website don't do either of these, the simple solution works almost all of the time.
The real problem is that people develop the wrong mental model of what
<textarea> does. They think (just as I once thought) that <textarea>
means 'the browser treats as completely literal all the initial content
that I insert here'. The defect with this mental model is exposed by
putting a '</textarea>' in the initial content you insert into a
textarea: how is the browser supposed to tell the </textarea> you
inserted (that it is supposed to ignore) apart from the real </textarea>
in your HTML that closes the textarea? The answer is that it can't, and
thus that the mental model is wrong.
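To make the ambiguity concrete, here is a rough sketch in Python. It is a deliberately simplified model of what a browser does, not a real HTML parser: it assumes the textarea ends at the first closing tag it finds. The function name naive_textarea and the sample comment are made up for illustration.

```python
def naive_textarea(content: str) -> str:
    """Build a textarea by pasting the raw content in (the broken approach)."""
    return "<textarea>" + content + "</textarea>"


# Hypothetical user input that happens to contain a closing tag.
comment = "I typed </textarea> in my comment"
page = naive_textarea(comment)

# Simplified model: the textarea ends at the first closing tag seen,
# because nothing distinguishes the user's tag from the template's.
start = len("<textarea>")
end = page.find("</textarea>", start)

seen_by_browser = page[start:end]
leftover = page[end + len("</textarea>"):]

print(repr(seen_by_browser))  # what ends up inside the textarea
print(repr(leftover))         # what spills out into the rest of the page
```

In this model the textarea holds only the text before the user's closing tag, and everything after it (including the closing tag you actually wrote) lands in the page as ordinary HTML.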
What is actually going on is that browsers treat the contents of
<textarea> as what the HTML 4.01 specification calls 'document text',
in which character entities are allowed and interpreted (technically
markup is forbidden; in practice browsers treat it as literal text). It
has to be this way; since HTML has no other quoting mechanism besides
character entities, allowing character entities is the only way to
escape your inserted '</textarea>' so it doesn't terminate the textarea.
This means that you need to quote at least some things in your textarea
initial content; minimally '&' and '<', but if you already have a
general HTML quoting function (and you should), just use it and be done.
(The browser will strip this quoting when it creates the actual initial
contents, and thus you will get back the unquoted version when the user
POSTs for the next round.)
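
As one possible illustration of the quoting step, here is a minimal sketch using Python's standard html module as the general HTML quoting function. The helper name render_textarea and the field name "comment" are my own assumptions, and html.unescape is used only as a stand-in for the entity decoding a browser performs.

```python
import html


def render_textarea(initial: str, name: str = "comment") -> str:
    """Return a <textarea> whose initial content is HTML-quoted.

    render_textarea and the default field name are illustrative only.
    """
    return '<textarea name="{}">{}</textarea>'.format(
        html.escape(name, quote=True),  # attribute values need quotes escaped
        html.escape(initial),           # escapes &, <, > (and quotes, harmlessly)
    )


# Both problem cases at once: an entity reference and a literal closing tag.
raw = "1 &lt; 2 and </textarea> are both literal here"
markup = render_textarea(raw)
print(markup)

# Stand-in for the browser: decoding the quoted text gives back the original,
# which is what the user would see and what would be POSTed back.
round_tripped = html.unescape(html.escape(raw))
print(round_tripped == raw)
```

The quoted markup contains exactly one real '</textarea>', the one that closes the element, and the '&lt;' the user typed survives as the literal characters '&lt;' rather than being turned into '<'.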