Wandering Thoughts archives


All syndication formats use XML

I'm pretty sure that this isn't the first time I've seen people be grumpy about Atom because it's an XML-based format. Unfortunately, I have bad news for such people; to put it one way, it's XML all the way down.

More directly, all syndication feed formats, both Atom and RSS's many variants, are XML-based (including the versions of RSS that are based on RDF, since the RDF used is itself XML-based). In RSS's case this goes beyond a light structural level; you can routinely find RSS feeds that have <![CDATA[...]]> sections and other significant XML-isms, so you cannot treat what's inside an element as plain text (or HTML) and just strip the surrounding tags off with a regexp.
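As a minimal sketch of why the regexp approach falls down, here is a made-up RSS fragment (the feed text and both function names are invented for illustration) pulled apart once with a regexp and once with the XML parser in Python's standard library. The regexp hands back the raw CDATA wrapper; the parser hands back the actual text.

```python
import re
import xml.etree.ElementTree as ET

# A hypothetical feed, invented for this example. The item title uses a
# CDATA section, which real RSS feeds routinely do.
FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title><![CDATA[Tips & <b>tricks</b>]]></title>
    </item>
  </channel>
</rss>
"""

def titles_by_regexp(text):
    # The naive approach: grab whatever sits between the title tags.
    return re.findall(r"<title>(.*?)</title>", text, re.S)

def titles_by_parser(text):
    # A real XML parser unwraps CDATA sections (and decodes entities).
    root = ET.fromstring(text)
    return [t.text for t in root.iter("title")]

if __name__ == "__main__":
    print(titles_by_regexp(FEED))
    print(titles_by_parser(FEED))
```

The second regexp result still has the `<![CDATA[` and `]]>` wrapper attached, and you would have to handle entity references such as `&amp;` by hand as well.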

Equally, no syndication format is really XML in real life, in that attempting to parse feeds in any of these formats with a strict XML parser will not infrequently give you errors (cf this comic). And that is before you consider using a validating parser that actually checks feeds against the relevant syndication format specification (you can see how your favorite feeds would score at feedvalidator.org). In practice you can produce a feed in any syndication format with string bashing and have it consumed, despite errors, by most feed readers.
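To illustrate how string bashing goes wrong, here is a small sketch; `bash_feed` and `strict_parse_error` are hypothetical helpers written for this example, not anyone's real feed generator. It pastes titles into a skeletal RSS document without escaping them and then asks Python's standard library XML parser, which is strict about well-formedness, what it thinks. The sketch only checks well-formedness; it does not validate against any feed specification.

```python
import xml.etree.ElementTree as ET

def bash_feed(title, item_titles):
    # String-bashed feed generation: values are pasted in with no escaping.
    parts = ['<rss version="2.0"><channel>', "<title>%s</title>" % title]
    for item_title in item_titles:
        parts.append("<item><title>%s</title></item>" % item_title)
    parts.append("</channel></rss>")
    return "".join(parts)

def strict_parse_error(text):
    # Return the parser's complaint, or None if the text is well-formed XML.
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None

if __name__ == "__main__":
    print(strict_parse_error(bash_feed("Example", ["A plain title"])))
    print(strict_parse_error(bash_feed("Example", ["Q&A session"])))
```

The first feed parses cleanly; the second is rejected because of the bare `&`, which is exactly the sort of error that a lenient feed reader will usually shrug off.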

(Actually, I don't know for sure that Google Reader accepts invalid syndication feeds. I'd expect it to, but one can never be sure; online aggregators have been surprisingly picky in the past.)

My overall opinion of the relative merits of Atom and RSS remains unchanged. However, there's little reason to switch if RSS meets your needs and doesn't cause problems; feed readers, aggregators, and so on are going to support both for the indefinite future.

tech/RSSisXML written at 01:49:55

