2016-06-15
How (some) syndication feed readers deal with HTTP to HTTPS redirections
It's now been a bit over a year since Wandering Thoughts switched from HTTP to HTTPS, of course with a pragmatic permanent redirect from the HTTP version to the HTTPS version. In theory syndication feed readers should notice permanent HTTP redirections and update their feed fetching information to just get the feed from its new location (although there are downsides to doing this too rapidly).
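For illustration, this sort of permanent redirection takes only a little bit of web server configuration. Here is a sketch of how it might look in Apache; this is not my actual configuration, and the hostname is a stand-in:

    # HTTP virtual host: permanently redirect everything to HTTPS.
    <VirtualHost *:80>
        ServerName www.example.org
        # mod_alias's Redirect with 'permanent' sends a 301 response.
        Redirect permanent / https://www.example.org/
    </VirtualHost>

A well-behaved feed reader that sees the resulting 301 should eventually start fetching the HTTPS URL directly instead of re-requesting the HTTP one forever.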
In practice, apparently, not so much. Looking at yesterday's stats from this server, there are 6,700 HTTP requests for Atom feeds from 520 different IP addresses. Right away we can tell that a number of these IPs made a lot of requests, so they're clearly not updating their feed source information. Out of those IPs, 30 did not make any HTTPS requests for my Atom feeds; in other words, they didn't even follow the redirection, much less update their feed source information. The good news is that these IPs are only responsible for 102 feed fetch attempts, and that a decent number of these come from Googlebot, Google Feedfetcher (yes, still), and another web spider of uncertain provenance and intentions. The bad news is that this appears to include some real syndication feed readers, based on their user agents, including Planet Sysadmin (which is using 'Planet/1.0'), Superfeedr, and Feedly.
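To give a concrete picture of the sort of log analysis this involves, here is a sketch in shell commands. The log file names and the feed path are made-up stand-ins, and it assumes Apache's common/combined log format, where the client IP is the first field and the request path is the seventh:

    # IPs that requested the feed over HTTP, and over HTTPS.
    awk '$7 == "/blog/atom.xml" { print $1 }' access-http.log | sort -u >http-ips
    awk '$7 == "/blog/atom.xml" { print $1 }' access-https.log | sort -u >https-ips

    # IPs that fetched over HTTP but never over HTTPS, ie the ones
    # that apparently didn't even follow the redirection.
    comm -23 http-ips https-ips

(comm -23 prints the lines that appear only in the first file; both inputs must be sorted, which 'sort -u' already guarantees here.)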
The IPs that did at least seem to follow the HTTP redirection have a pretty wide variety of user agents. The good or bad news is that this includes a number of popular syndication feed readers. It's good that they're at least following the HTTP redirection, but it's bad that they're both popular and not updating feed source information after over a year of permanent HTTP redirections. Some of these feed readers include CommaFeed, NewsBlur, NetNewsWire, rss2email, SlackBot, newsbeuter, Feedbin, Digg's Feed Fetcher, Gwene, Akregator, and Tiny Tiny RSS (which has given me some heartburn before). Really, I think it's safer to assume that basically no feed readers ever update their feed source information on HTTP redirections.
As it turns out, the list of user agents here comes with a caveat. See the sidebar.
(Since it's been more than a year, I have no way to tell how many feed readers did update their feed source information. Some of the people directly fetching the HTTPS feeds may have updated, but certainly at least some of them are new subscribers I've picked up over the past year.)
At one level, this failure to update the feed source is harmless; the HTTP to HTTPS redirection here can and will continue basically forever without any problems. At another level it worries me, both for Wandering Thoughts and for blogs in general, because very few things on the web are forever and anything that makes it harder to move blogs around is worth concern. Blogs do move, and very few are going to be able to have a trail of HTTP redirections that lives forever.
(Of course the really brave way to move a blog is to just start a new one and announce it on the old one. That way it takes active interest for people to keep reading you; you'll lose the ones who aren't actually reading you any more (but haven't removed you from their feed reader) and the ones who decide they're not interested enough.)
Sidebar: Some imprecision in these results
Without more work than I'm willing to put in, I can't tell when an HTTPS request from a given IP is made due to following a redirection from an HTTP request. All I can say is that an IP address that made one or more HTTP requests also made some HTTPS requests. I did some spot checks (correlating the times of some requests from specific IPs) and they did look like HTTP redirections being followed, but this is far from complete.
The most likely place where I'd be missing a feed reader that doesn't follow redirections is shared feed reader services (ie, replacements for Google Reader). There it would be easy for one person to have imported the HTTP version of my feed and another person to have added the HTTPS version later, quite likely resulting in the same IP fetching both the HTTP and HTTPS versions of my feed and leading me to conclude that it did follow the redirection.
I have some evidence that there is some amount of this sort of feed duplication, because as far as I can tell I see more HTTPS requests from these IPs overall than I do HTTP ones. Assuming my shell-command-based analysis is correct, I see a number of cases where per-IP request counts are different, in both directions (more HTTPS than HTTP, more HTTP than HTTPS).
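As a sketch of what these shell commands look like (again with made-up log file names and feed path):

    # Per-IP feed request counts, written out as 'IP count' lines.
    awk '$7 == "/blog/atom.xml" { c[$1]++ } END { for (ip in c) print ip, c[ip] }' \
        access-http.log | sort >http-counts
    awk '$7 == "/blog/atom.xml" { c[$1]++ } END { for (ip in c) print ip, c[ip] }' \
        access-https.log | sort >https-counts

    # Join on the IP and print the IPs where the counts differ.
    join http-counts https-counts | awk '$2 != $3'

The join output lines here are 'IP http-count https-count', so '$2 != $3' picks out the IPs with different request counts in the two logs.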
(This is where it would be really useful to be able to pull all of these Apache logs into a SQL database in some structured form so I could do sophisticated ad-hoc queries, instead of trying to do it with hacky, awkward shell commands that aren't really built for this.)
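For example, if these logs lived in a SQLite database with a hypothetical requests table of (ip, scheme, path) rows, the sidebar's whole question would collapse into a single query:

    sqlite3 logs.db <<'EOF'
    -- Per-IP feed fetch counts by scheme, showing the IPs where
    -- the HTTP and HTTPS counts disagree.
    SELECT ip,
           SUM(scheme = 'http')  AS http_reqs,
           SUM(scheme = 'https') AS https_reqs
    FROM requests
    WHERE path = '/blog/atom.xml'
    GROUP BY ip
    HAVING http_reqs != https_reqs;
    EOF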
ZFS on Linux has just fixed a long standing little annoyance
I've now been running ZFS on Linux for a while. Over that time, one of the little annoyances of the ZoL experience has been that all ZFS commands required you to be root, even if all you wanted to do was something innocuous like 'zpool status' or 'zfs list'. This wasn't for any particularly good reason and it's not how Solaris and Illumos behave; it was just necessary because the ZoL kernel code itself had no permissions restrictions on anything, for complicated porting reasons. Anyone who could talk to /dev/zfs could do any ZFS operation, including dangerous and destructive ones, so it had to be restricted to root.
Like many people running ZoL, I dealt with this in a straightforward way. To wit, I set up a /etc/sudoers.d/02-zfs file that allowed no-password access to a great big list of ZFS commands that are unprivileged on Solaris, and then I got used to typing things like 'sudo zpool status'. But this was never a really great experience and it's always been a niggling annoyance.
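My real file has a much longer command list, but a minimal sketch of this sort of sudoers setup looks something like the following. The username and command paths here are examples and vary between systems:

    # /etc/sudoers.d/02-zfs: let one user run some read-only ZFS
    # status commands as root without a password.
    cks ALL = (root) NOPASSWD: /sbin/zpool status, \
                               /sbin/zpool list, \
                               /sbin/zfs list

(Listing a command with arguments in sudoers, like '/sbin/zpool status', means sudo allows exactly that invocation and nothing more.)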
I'm happy to report that as of a week or so ago, the latest development version of ZoL has now fixed this issue. Normal non-root users can now run all of the ZFS commands that are unprivileged on Solaris. As part of this, ZoL now supports the normal ZFS 'zfs allow' and 'zfs unallow' for most operations, so you can (if desired) allow yourself or other normal users to do things like create snapshots.
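For instance, you can now do ordinary ZFS permission delegation; a quick sketch, with a made-up pool name and username:

    # As root: delegate snapshot creation (and mounting, which
    # snapshot operations can need) to user 'cks' on this filesystem.
    zfs allow cks snapshot,mount tank/data

    # Afterwards, as that user, this works without sudo:
    zfs snapshot tank/data@before-upgrade

    # And 'zfs allow' with no permissions shows what's delegated:
    zfs allow tank/data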
(Interestingly, poking around at this caused me to re-discover that 'zpool history' is a privileged operation even on Solaris. I guess some bits of my sudoers file are going to stay.)
Things like this are part of why I've been pretty happy to run the development version of ZoL. Even the development version has been pretty stable, and it means that I've gotten a fair number of interesting and nice features well before they made it into one of the infrequent ZoL releases. I don't know how many people run the development version, but my impression is that it's not uncommon.
(I can't blame the ZoL people for the infrequent releases, because they want releases to be high quality. Making high quality releases is a bunch of work and takes careful testing. Plus sometimes the development tree has known outstanding issues that people want to fix before a release. (I won't point you at the ZoL Github issues to see this, because there's a fair amount of noise in them.))