== How well do some Atom feed fetchers do conditional GETs?

'[[Conditional GET|http://fishbowl.pastiche.org/2002/10/21/http_conditional_get_for_rss_hackers]]' is the HTTP technique used to save bandwidth by not re-fetching unchanged pages. Using conditional GET is especially important for things that fetch syndication feeds (RSS or Atom), because people usually check feeds much more often than they revisit web pages. ([[This|http://www.kbcafe.com/rss/rssfeedstate.html#httpisnotstateless]] is another good reference for syndication feed reader authors.)

WanderingThoughts has [[a lot of syndication feeds|ADynamicSitePeril]] and the main ones are quite big. Recently, partly prompted by [[issues with MSNbot|MSNbotCrazyRSSBehavior]], I decided to take a look at what was fetching my syndication feeds and how well they did conditional GET. So I looked at my data for about the past week, chosen in part because I recently added detailed logging about what conditional GET related headers get sent by things fetching my Atom feeds.

(First, I have to say that I like having readers and we have a lot of spare bandwidth. If your syndication feed reader does badly here, it is absolutely not a request for you to unsubscribe.)

Conditional GET can be done with _ETag_ / _If-None-Match_, or with _If-Modified-Since_; _ETag_ is better. Perfect scores go to the feed fetchers that always use it: [[SharpReader|http://www.sharpreader.net/]], [[Bloglines|http://www.bloglines.com/]], [[LiveJournal|http://www.livejournal.com]], [[Feedster Crawler|http://www.feedster.com/]], and [[NetNewsWire|http://ranchero.com/netnewswire/]].

A few feed fetchers lose some points from the East German judge:

* [[liferea|http://liferea.sf.net/]] lost out on a perfect score because while it always uses _If-Modified-Since_, it only sometimes uses _If-None-Match_ (only if it's fetched a changed feed since the program was started; it doesn't store the _ETag_ value in its persistent database).
* [[Yahoo Slurp|http://help.yahoo.com/help/us/ysearch/slurp]] and [[PubSub-RSS-Reader|http://www.pubsub.com/]] only use _If-Modified-Since_, which works but is not ideal.

The 'nice try, but...' award goes to:

* [[Rojo 1.0|http://www.rojonetworks.com/Rob]], who support _ETag_ but unfortunately make up their own timestamps for _If-Modified-Since_, and send both headers. This doesn't work, for reasons explained [[here|IfModSinceTimestampProblem]] and [[here|ETagAndIfModSinceInteraction]].
* [[BlogSearch|http://www.icerocket.com/]], which sends _If-None-Match_ but stripped of the quotes that DWiki's _ETag_ value has. (This may be RFC-compliant, in which case I need to fix DWiki.)

A number of syndication feed fetchers don't support conditional GET; they don't even bother to send _If-Modified-Since_ headers, and always wind up re-fetching my syndication feeds (when they fetch the main one, this is 300K or so a shot). They are:

* everyone's friend [[MSNbot|MSNbotCrazyRSSBehavior]], who is by far the most active fetcher of my Atom feeds.
* 'madicon RSS Reader', which appears to be a syndication feed reader addon for Lotus Notes. Working in the Notes environment may make it difficult to store the per-feed information necessary to support conditional GET.
* '[[Waggr_Fetcher)|]]', http://www.waggr.com/; this appears to be a web-based feed reader.
* [[kinjabot|http://www.kinja.com]], another web-based aggregator thing.
* [[FeedFetcher-Google|http://www.google.com/feedfetcher.html]] and 'Googlebot/2.1' (fetching as a browser); these surprised me, because I expected Google to do better.
* [[BlogPulse|http://www.blogpulse.com/]], although to be fair it only visited three times in the last week. (It's an interesting blog search engine; I wish it indexed WanderingThoughts more. Unfortunately they want an email address to submit blog URLs, which is an immediate turnoff these days.)
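
For feed reader authors, here is a rough sketch of the client side of the well-behaved case described above: remember the _ETag_ and _Last-Modified_ values the server sent, keep them somewhere that survives a restart, and send them back verbatim as _If-None-Match_ and _If-Modified-Since_. It is Python using only the standard library. The function names and the JSON state file are made up for illustration, and it is not a description of how any of the programs listed here actually work.

```python
import json
import urllib.error
import urllib.request
from pathlib import Path


def load_validators(state_path):
    """Return the saved validators for a feed, or {} if nothing is stored yet."""
    p = Path(state_path)
    if not p.exists():
        return {}
    return json.loads(p.read_text())


def save_validators(state_path, response_headers):
    """Persist the validators from a 200 response so they survive restarts."""
    saved = {}
    etag = response_headers.get("ETag")
    if etag:
        # Stored exactly as received, surrounding quotes included.
        saved["etag"] = etag
    last_modified = response_headers.get("Last-Modified")
    if last_modified:
        # The server's own timestamp string, not one we invent locally.
        saved["last_modified"] = last_modified
    Path(state_path).write_text(json.dumps(saved))
    return saved


def conditional_headers(saved):
    """Build the request headers that make a GET conditional."""
    headers = {}
    if "etag" in saved:
        headers["If-None-Match"] = saved["etag"]
    if "last_modified" in saved:
        headers["If-Modified-Since"] = saved["last_modified"]
    return headers


def fetch_feed(url, state_path):
    """Return the feed body, or None if the server answered 304 Not Modified."""
    saved = load_validators(state_path)
    request = urllib.request.Request(url, headers=conditional_headers(saved))
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
            save_validators(state_path, response.headers)
            return body
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Unchanged since last fetch; keep using the cached copy.
            return None
        raise
```

Keeping the validators on disk is what avoids losing the _ETag_ across program restarts, and echoing the server's _Last-Modified_ string unchanged is what avoids the made-up timestamp problem.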