Thinking about how to tame the interaction of conditional GET and caching

November 20, 2024

Due to how I do caching here, Wandering Thoughts has a long-standing weird HTTP behavioral quirk where a non-conditional GET for a syndication feed can get a different answer than a conditional GET. One (technical) way to explain this issue is that the cache validity interval for non-conditional GETs is longer than the cache validity interval for conditional GETs. In theory this could be the complete explanation of the issue, but in practice there's another part to it, which is that DWiki doesn't automatically insert responses into the cache on a cache miss.

(The cache is normally only filled for responses that were slow to generate, either due to load or because they're expensive. Otherwise I would rather dynamically generate the latest version of something and not clutter up cache space.)
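To make the quirk concrete, here is a minimal sketch of the two behaviors described above: a validity interval that depends on the request type, and a cache that is only filled when generation was slow. This is not DWiki's actual code; the names (`FeedCache`, `serve`, `maybe_store`), the interval values, and the slowness threshold are all assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical numbers; the real intervals and threshold are not stated here.
COND_TTL = 60          # seconds a cached feed is valid for conditional GETs
UNCOND_TTL = 15 * 60   # longer validity for unconditional GETs
SLOW_THRESHOLD = 0.5   # only cache responses that took at least this long


@dataclass
class Entry:
    body: str
    stored_at: float


class FeedCache:
    def __init__(self):
        self.entries = {}

    def lookup(self, key, now, conditional):
        """Return a cached body if it is still valid for this request type."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        ttl = COND_TTL if conditional else UNCOND_TTL
        if now - entry.stored_at < ttl:
            return entry.body
        return None

    def maybe_store(self, key, body, now, gen_seconds):
        """Fill the cache only when generating the response was slow."""
        if gen_seconds >= SLOW_THRESHOLD:
            self.entries[key] = Entry(body, now)


def serve(cache, key, now, conditional, generate):
    """generate() returns (body, seconds_taken); a stand-in for real rendering."""
    cached = cache.lookup(key, now, conditional)
    if cached is not None:
        return cached
    body, gen_seconds = generate()
    # A cache miss does not automatically refill the cache.
    cache.maybe_store(key, body, now, gen_seconds)
    return body
```

With these made-up numbers, a feed cached at time 0 is too old for a conditional GET at time 120, which gets a freshly generated (and, if it was fast, uncached) copy, while an unconditional GET at the same moment still gets the old cached copy.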

There are various paths that I could take, but which ones I want to take depends on what my goals are, and I'm actually not entirely certain about that. If my goal is to serve responses to unconditional GETs that are as fresh as possible but come from cache for as long as possible, what I should probably do is make conditional GETs update the cache when a cached version of the feed exists and would still have been served to an unconditional GET. I've already paid the cost to dynamically generate the feed, so I might as well serve it to unconditional GET requests. However, in my current cache architecture this would have the side effect of causing conditional GETs to get that newly updated cached copy for the conditional GET cache validity period, instead of generating the very latest feed dynamically (which is what happens today).
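As a rough sketch of this first option and its side effect (again with hypothetical names and interval values, and a plain dictionary standing in for the real cache), a conditional GET that had to regenerate the feed writes the result back only if there is an existing entry that an unconditional GET would still have been served:

```python
# Hypothetical intervals, in seconds.
COND_TTL = 60
UNCOND_TTL = 15 * 60


def serve_conditional(cache, key, now, generate):
    """cache maps key -> (body, stored_at); generate() returns a fresh body."""
    entry = cache.get(key)
    if entry is not None and now - entry[1] < COND_TTL:
        return entry[0]
    body = generate()
    # Refresh only an entry that exists and is still valid for unconditional GETs.
    if entry is not None and now - entry[1] < UNCOND_TTL:
        cache[key] = (body, now)
    return body


def serve_unconditional(cache, key, now, generate):
    entry = cache.get(key)
    if entry is not None and now - entry[1] < UNCOND_TTL:
        return entry[0]
    return generate()
```

The side effect shows up right after a refresh: because the entry's timestamp is now current, the next conditional GET within `COND_TTL` is served the cached copy rather than a newly generated feed.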

(A sleazy approach would be to backdate the newly updated cache entry by the conditional GET validity interval. My current code architecture doesn't allow for that, so I can avoid the temptation.)
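For illustration only, the backdating trick amounts to storing the refreshed entry with a timestamp that is already one conditional GET validity interval in the past. The names and numbers below are hypothetical, and this is a sketch of the idea rather than anything the current code can do:

```python
# Hypothetical intervals, in seconds.
COND_TTL = 60
UNCOND_TTL = 15 * 60


def backdated_timestamp(now):
    """Timestamp to store so the entry already looks expired to conditional GETs."""
    return now - COND_TTL


def is_valid(stored_at, now, conditional):
    ttl = COND_TTL if conditional else UNCOND_TTL
    return now - stored_at < ttl
```

An entry stored this way is immediately invalid for conditional GETs, so they keep getting dynamically generated feeds, while unconditional GETs are served it for `UNCOND_TTL - COND_TTL` seconds.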

On the other hand, the entire reason I have a different (and longer) cache validity interval for unconditional GET requests is that in some sense I want to punish them. It's a deliberate feature that unconditional GETs receive stale responses, and the more stale the response the better. Even though updating the cache with a current response I've already generated is more or less free, doing it cuts against this goal, both in general and in the specifics. In practice, Wandering Thoughts sees frequent enough conditional GETs for syndication feeds that making conditional GETs refresh the cached feed would effectively collapse the two cache validity intervals into one, which I can already do without any code changes. So if this is my main goal for cache handling of unconditional GETs of my syndication feed, the current state is probably fine and there's nothing to fix.
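A toy simulation can show why frequent conditional GETs would collapse the two intervals. It assumes conditional GETs arrive at a fixed spacing and that each one refreshes the cached feed whenever the entry is too old for conditional GETs but still valid for unconditional ones. The function name and all of the numbers are made up for illustration.

```python
def max_cached_age(cond_ttl, uncond_ttl, cond_gap, duration):
    """Oldest cached-entry age seen at any conditional GET, in seconds.

    Conditional GETs arrive every cond_gap seconds from time 0 up to duration.
    """
    stored_at = 0
    worst = 0
    for now in range(0, duration, cond_gap):
        age = now - stored_at
        worst = max(worst, age)
        # Too old for conditional GETs, still served to unconditional ones:
        # regenerate the feed and refresh the cached copy.
        if cond_ttl <= age < uncond_ttl:
            stored_at = now
    return worst


# 60 s conditional interval, 900 s unconditional interval, a conditional GET
# every 25 s, simulated for an hour.
print(max_cached_age(60, 900, 25, 3600))  # prints 75
```

Under these assumptions the cached copy never gets older than about the conditional GET interval plus one request gap, so the nominal 900-second interval for unconditional GETs never comes into play.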

(A very approximate number is that about 15% of the syndication feed requests to Wandering Thoughts are unconditional GETs. Some of the offenders should definitely know and do better, such as 'Slackbot 1.0'.)

Written on 20 November 2024.


Last modified: Wed Nov 20 22:41:53 2024