Google Feedfetcher is still fetching feeds and a User-Agent caution
Feedfetcher was Google's feed fetching backend for Google Reader, which as you may remember was shut down on July 1st this year (to generally mixed feelings). At the time of that shutdown Google was pretty definite about how the service was gone, its data was not being retained, and there would be no recovery or resumption possible. One would normally expect that the feed fetching backend would also be shut down at the same time.
Well, no, of course not. This is Google, after all, the new home
of 'we don't care because we don't have to' (cf). Google Feedfetcher
is still pulling my feeds more than four months after the shutdown
of Google Reader. In fact it's worse than that; the claimed readership
numbers listed in its User-Agent have barely budged from the time
when Google Reader was running (this is what is known as a flat-out
lie). As irritating things involving Google go, this is a drop in
the bucket. Still, I've recently decided that I've had enough, so
I've blocked their User-Agent. It turns out that this exposes a
little issue that you may want to think about when you create
User-Agent strings.
Here is the User-Agent header for Google Feedfetcher:
Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 445 subscribers; feed-id=1422824070729197911)
Here is the User-Agent header for Feedly:
Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)
If you block Google Feedfetcher using a case-insensitive match on
'Feedfetcher-Google' you'll probably also block Feedly, unless your
User-Agent parser is really smart. Feedly's header contains
'FeedFetcher-Google', which differs from Google's own string only in
capitalization. It would be easy to miss this when you set up blocks
unless you make a habit of monitoring what they match (and I suspect
that many people don't do that, any more than they have a fancy
User-Agent parser instead of a general regexp engine).
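To make the overlap concrete, here is a minimal sketch in Python that runs the two headers above through two blocking rules. The rule names (`naive`, `anchored`) and the `is_blocked` helper are made up for illustration and are not taken from any particular web server's configuration. The anchored pattern is just one plausible way to narrow the match, and it assumes that Google keeps putting its token at the very start of the header.

```python
import re

# The two User-Agent headers quoted above.
GOOGLE_UA = ("Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; "
             "445 subscribers; feed-id=1422824070729197911)")
FEEDLY_UA = "Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)"

# Naive rule: case-insensitive search anywhere in the header.
naive = re.compile(r"feedfetcher-google", re.IGNORECASE)

# Narrower rule (an assumption, not a standard): only match when the
# header begins with the token, which is where Google puts it.
anchored = re.compile(r"^feedfetcher-google\b", re.IGNORECASE)


def is_blocked(pattern, user_agent):
    """Hypothetical helper: True if the blocking pattern hits this header."""
    return pattern.search(user_agent) is not None


for name, ua in (("Google", GOOGLE_UA), ("Feedly", FEEDLY_UA)):
    print(name, "naive:", is_blocked(naive, ua),
          "anchored:", is_blocked(anchored, ua))
```

The naive rule hits both headers, while the anchored one hits only Google's. Whatever rule you use, it is still worth checking your logs to see what it actually matches.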
By the way, if this happens I would argue that it is more or less
Feedly's fault. There are quite a lot of feed fetchers that do not
feel the need to drop Google Feedfetcher's name in their User-Agent
header, and the way that Feedly is doing this, combined with Google's own
User-Agent formatting, makes it very easy for a match to hit both. If
Feedly wants to communicate the similarity to webmasters reading their
logs, they could have used a different phrasing that would not run this
risk.
(Of course I rather suspect that Feedly actively wanted their feed fetcher to be mistaken for Google Feedfetcher by automated code; it's just that when they planned this they expected that it was going to be a good thing.)