How (some) syndication feed readers deal with HTTP to HTTPS redirections
It's now been a bit over a year since Wandering Thoughts switched from HTTP to HTTPS, of course with a pragmatic permanent redirect from the HTTP version to the HTTPS version. In theory syndication feed readers should notice permanent HTTP redirections and update their feed fetching information to just get the feed from its new location (although there are downsides to doing this too rapidly).
In practice, apparently, not so much. Looking at yesterday's stats from this server, there are 6,700 HTTP requests for Atom feeds from 520 different IP addresses. Right away we can tell that a number of these IPs made a lot of requests, so they're clearly not updating their feed source information. Out of those IPs, 30 of them did not make HTTPS requests for my Atom feeds; in other words, they didn't even follow the redirection, much less update their feed source information. The good news is that these IPs are only responsible for 102 feed fetch attempts, and that a decent number of these come from Googlebot, Google Feedfetcher (yes, still), and another web spider of uncertain providence and intentions. The bad news is that this appears to include some real syndication feed readers, based on their user agents, including Planet Sysadmin (which is using 'Planet/1.0'), Superfeedr, and Feedly.
The IPs that did at least seem to follow the HTTP redirection have a pretty wide variety of user agents. The good or bad news is that this includes a number of popular syndication feed readers. It's good that they're at least following the HTTP redirection, but it's bad that they're both popular and not updating feed source information after over a year of permanent HTTP redirections. Some of these feed readers include CommaFeed, NewsBlur, NetNewsWire, rss2email, SlackBot, newsbeuter, Feedbin, Digg's Feed Fetcher, Gwene, Akregator, and Tiny Tiny RSS (which has given me some heartburn before). Really, I think it's safer to assume that basically no feed readers ever update their feed source information on HTTP redirections.
As it turns out, the list of user agents here comes with a caveat. See the sidebar.
(Since it's been more than a year, I have no way to tell how many feed readers did update their feed source information. Some of the people directly fetching the HTTPS feeds may have updated, but certainly at least some of them are new subscribers I've picked up over the past year.)
At one level, this failure to update the feed source is harmless; the HTTP to HTTPS redirection here can and will continue basically forever without any problems. At another level it worries me, both for Wandering Thoughts and for blogs in general, because very few things on the web are forever and anything that makes it harder to move blogs around is worth concern. Blogs do move, and very few are going to be able to have a trail of HTTP redirections that lives forever.
(Of course the really brave way to move a blog is to just start a new one and announce it on the old one. That way it takes active interest for people to keep reading you; you'll lose the ones who aren't actually reading more (but haven't removed you from their feed reader) and the ones who decide they're not interested enough.)
Sidebar: Some imprecision in these results
Without more work than I'm willing to put in, I can't tell when a HTTPS request from a given IP is made due to following a redirection from a HTTP request. All I can say is that an IP address that made one or more HTTP requests also made some HTTPS requests. I did some spot checks (correlating the times of some requests from specific IPs) and they did look like HTTP redirections being followed, but this is far from complete.
The most likely place where I'd be missing a feed reader that doesn't follow redirections is shared feed reader services (ie, replacements for Google Reader). There it would be easy for one person to have imported the HTTP version of my feed and another person to have added the HTTPS version later, quite likely causing the same IP fetching both HTTPS and HTTP versions of my feed and leading me to conclude that it did follow the redirection.
I have some evidence that there is some amount of this sort of feed duplication, because as far as I can tell I see more HTTPS requests from these IPs than I do HTTP ones. Assuming my shell commands based analysis is correct, I see a number of cases where per-IP request counts are different, in both directions (more HTTPS than HTTP, more HTTP than HTTPS).
(This is where it would be really useful to be able to pull all of these Apache logs into a SQL database in some structured form so I could sophisticated ad-hoc queries, instead of trying to do it with hacky, awkward shell commands that aren't really built for this.)