Browser feed reader addons don't seem to do very well on caching

December 18, 2024

Over on the Fediverse, I said something disappointing:

Browser addons for syndication feed fetching appear to be the absolute worst for frequent feed fetching and ignoring everything the server says about this. They'll ignore Cache-Control hints for Atom syndication feeds, ignore HTTP 429 errors, ignore any retry timing in said headers (not surprising), and keep trying every few minutes. I am sorely disappointed.

(Or at least I assume this is from addons, based on the user-agent strings.)

It's dangerous to assume too much from HTTP user agent strings in this day and age, but many of the user-agent strings that exhibit this behavior are plausible browser ones, often for current versions of the browser, and they often come from what appear to be 'end user' IP addresses, instead of things like cloud server IPs. Firefox is the dominant browser represented in these user-agent strings, although Chrome and Safari also show up; however, there are lots of possible explanations for this, including that perhaps RSS addons are more popular in Firefox than in other browsers.

(If I were an energetic person like rachelbythebay, I would try out a bunch of Firefox feed reader addons to identify the flawed ones. I'm not that energetic.)

You'd certainly hope that browser feed reader addons would benefit from the browser's general cache management and so on, but apparently not very much. Some addons appear to at least make conditional requests, even if they don't respect the feed fetching timing information exposed in Cache-Control headers, but others don't even manage that, and will hammer Wandering Thoughts with unconditional GET requests every few minutes. I don't think any of them react to HTTP 429 responses, or at least if any do, they're drowned out by all of the ones that clearly don't (some have been getting 429s for an extended period and are still showing up every few minutes).
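For concreteness, here is a minimal sketch of what making conditional requests involves: saving the ETag and Last-Modified values from a feed's last 200 response and sending them back on the next poll, so the server can answer with a cheap 304 Not Modified. The helper name and the cached-record shape are my own illustration, not code from any actual addon.

```javascript
// Build conditional request headers from a previously cached response,
// so the server can answer 304 Not Modified instead of resending the
// whole feed. This is an illustrative sketch, not any addon's real code.
function conditionalHeaders(cached) {
  const headers = {};
  if (cached && cached.etag) {
    headers["If-None-Match"] = cached.etag;
  }
  if (cached && cached.lastModified) {
    headers["If-Modified-Since"] = cached.lastModified;
  }
  return headers;
}

// After a 200 response, an addon would save the feed's ETag and
// Last-Modified; on the next poll it sends them back:
const cached = {
  etag: '"abc123"',
  lastModified: "Wed, 18 Dec 2024 10:00:00 GMT",
};
const hdrs = conditionalHeaders(cached);
// hdrs["If-None-Match"] is '"abc123"', and a well-behaved poller that
// gets a 304 back keeps its cached copy of the feed.
```

Even this minimal version spares the server the body of every unchanged feed, which is most fetches for most feeds.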

I don't know to what extent this is simply coding decisions in the addons and to what extent browser APIs don't make it easy to do the right thing. However, the modern fetch() API appears to default to respecting 'Cache-Control: max-age=...' information, although perhaps addon authors are forcing either the 'no-cache' or the 'reload' cache mode. If I understand things right, the 'no-cache' mode would create the constant flood of conditional GET requests, while the 'reload' mode would create the constant unconditional GET requests.
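To illustrate what honoring max-age would take if an addon did it by hand, here is a sketch that extracts max-age from a Cache-Control header value and computes the earliest allowed next poll. The parsing helper is mine, written for illustration; the fetch() cache modes in the comments are the standard ones.

```javascript
// Extract max-age from a Cache-Control header value and compute when
// the next poll is allowed. Illustrative only, not any addon's code.
function nextFetchAfter(cacheControl, nowMs) {
  const m = /(?:^|[,\s])max-age=(\d+)/.exec(cacheControl || "");
  if (!m) {
    return nowMs; // no max-age: no server-imposed wait
  }
  return nowMs + Number(m[1]) * 1000;
}

// A feed served with 'Cache-Control: max-age=3600' should not be
// re-fetched for an hour:
const next = nextFetchAfter("public, max-age=3600", Date.now());

// With fetch(), the default cache mode already respects this; an addon
// that forces a mode loses the benefit:
//   fetch(url)                        // default: honors max-age
//   fetch(url, { cache: "no-cache" }) // always revalidates: conditional GETs
//   fetch(url, { cache: "reload" })   // always refetches: unconditional GETs
```

This matches the behavior described above: 'no-cache' produces the stream of conditional GETs, 'reload' the stream of full unconditional ones.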

(I don't know if there's any browser API support that more or less automatically handles a Retry-After header value on HTTP 429 errors, or if addons would have to do that entirely themselves (which means that they most likely don't).)
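To give a sense of what 'entirely themselves' means, here is a sketch of parsing a Retry-After value by hand. The header can carry either a number of seconds or an HTTP-date, so an addon has to handle both forms; the helper name is mine and hypothetical.

```javascript
// Parse a Retry-After header value (delta-seconds or HTTP-date) into a
// delay in milliseconds from 'nowMs'. Illustrative sketch only.
function retryAfterMs(value, nowMs) {
  const v = String(value).trim();
  if (/^\d+$/.test(v)) {
    return Number(v) * 1000; // delta-seconds form, e.g. "120"
  }
  const when = Date.parse(v); // HTTP-date form
  if (!Number.isNaN(when)) {
    return Math.max(0, when - nowMs);
  }
  return null; // unparseable: caller should pick its own backoff
}

// 'Retry-After: 120' on a 429 means wait at least two minutes before
// trying the feed again:
retryAfterMs("120", Date.now()); // 120000
```

Nothing in this is hard, but it is the kind of bookkeeping that silently doesn't happen unless an addon author deliberately does it.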

PS: It's possible to do this correctly even with very basic tools such as curl, as covered in MacKenzie's Fetching RSS Feeds Respectfully With curl (which I found out about via the author emailing me, which was great; it's a nice piece of work).
