Wandering Thoughts archives


The HTML <pre> element doesn't do very much

These days I don't do too much with HTML, so every so often I wind up in a situation where I have to reach back and reconstruct things that once were entirely well known to me. Today, I wound up talking with someone about the <pre> element and what you could and couldn't safely put in it, and it took some time to remember most of the details.

The simple version is that <pre> doesn't escape markup, it only changes formatting, although many simple examples you'll see only use it on plain text so it's not immediately clear. Although it would be nice if <pre> was a general container that you could pour almost arbitrary text into and have it escaped, it's not. If you're writing HTML by hand and you have something to put into a <pre>, you need to escape any markup and HTML entities (much like a <textarea>, although even more so). Alternately, you can actually use this to write <pre> blocks that contain markup, for example links or text emphasis (you might deliberately use bold inside a <pre> to denote generic placeholders that the reader fills in with their specifics).

As with <textarea>, it's easy to overlook this for straightforward cases and to get away without doing any text escaping, especially in modern browsers. A lot of the command lines or code or whatever that we often put into <pre> don't contain things that can be mistaken for HTML markup or HTML entities, and modern browsers will often silently re-interpret things as plain text for you if they aren't validly formatted entities or markup. I myself have written and altered any number of <pre> blocks over the past few years without ever thinking about it, and I'm sure that some of them included '<' or '>' and perhaps '&' (all as part of Unix command lines).

(The MDN page on <pre> includes an example with unescaped < and >. If you play around with similar cases, you'll probably find that what is rendered intact and what is considered to be an unrecognized HTML element that is silently swallowed is quite sensitive to details of formatting and what is included within the '< ... >' run of raw text. Browsers clearly have a lot of heuristics here, some of which have been captured in HTML5's description of tag open state. In HTML5, anything other than an ASCII alpha after the '<' makes it a non-element (in any context, not just in a <pre>).)

I don't know how browser interpretation of various oddities in <pre> content is affected by the declared or assumed HTML DOCTYPE or HTML version the browser assumes, but I wouldn't count on all of them behaving the same outside, perhaps, of HTML5 mode (which at least has specific rules for this). Of course if you're producing HTML with tools instead of writing it by hand, the tools should take care of this for you. That's the only reason that Wandering Thoughts has whatever HTML correctness it does; my DWikiText to HTML rendering code takes care of it all for me, <pre> blocks included.

web/PreDoesVeryLittle written at 21:03:51; Add Comment

Go channels work best for unidirectional communication, not things with replies

Once, several years ago, I wrote some Go code that needed to manipulate a shared data structure. At this time I had written and read less Go code than I have now, and so I started out by trying to use channels and goroutines for this. There would be one goroutine that directly manipulated the data structure; everyone else would ask it to do things over channels. Very rapidly this failed and I wound up using mutexes.

(The pattern I tried is what I have since seen called a monitor goroutine (via).)

Since then, I have come to feel that this is one regrettable weakness of Go channels. However nice, useful, and convenient they are for certain sorts of communication patterns, Go channels do not give you very good ways of implementing a 'RPC' communication pattern, where you make a request of another goroutine and expect to get an answer back, since there is no direct way to reply to a channel message. In order to be able to reply to the sender, your monitor goroutine must receive a unique reply channel as part of the incoming request, and then things can start getting much more complicated and tangled from there (with various interesting failure modes if anyone ever makes a programming mistake; for example, you really want to insist that all reply channels are buffered).

My current view is that Go channels work best for unidirectional communication, where either you don't need an answer to the message you've sent or it doesn't matter which goroutine in particular receives and processes the 'reply' (really the next step), so you can use a single shared channel that everyone pulls messages from. Implementing some sort of bidirectional communication between specific goroutines with channels is generally going to be painful and require a bunch of bureaucracy that will complicate your code (unless all of the goroutines are long-lived and have communication patterns that can be set up once and then left alone). This makes the "monitor goroutine" pattern a bad idea simply for code clarity reasons, never mind anything else like performance or memory churn.

(This is especially the case if you have a bunch of different requests to send to the one goroutine, each of which can get a different reply, because then you need a bunch of different channel types unless you're going to smash everything together in various less and less type-safe ways. The more methods you would implement on your shared data structure, the more painful doing everything through a monitor goroutine will be.)

I'm not sure there's anything that Go could do to change this, and it's not clear to me that Go should. Go is generally fairly honest about the costs of operations, and using channels for synchronization is more expensive than a mutex and probably always will be. If you have a case where a mutex is good enough, and a shared data structure is a great case, you really should stick with simple and clearly correct code; that it performs well is a bonus. Channels aren't the answer to everything and shouldn't try to be.

(Years ago I wrote Goroutines versus other concurrency handling options in Go about much the same issues, but my thinking about what goroutines were good and bad at was much less developed then.)

(This entry was sparked by reading Golang: Concurrency: Monitors and Mutexes, A (light) Survey, because it made me start thinking about why the "monitor goroutine" pattern is such an awkward one in Go.)

programming/GoChannelsAndReplies written at 01:02:59; Add Comment

Page tools: See As Normal.
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.