== What your _User-Agent_ header should include and why

I wound up [[having a discussion about this https://github.com/swanson/stringer/issues/278]] in the context of a feed reader and it caused me to have a realization or two, so I've decided to write up my views on this. All of this is mostly from the perspective of a website operator; there are others. There are three different cases: when you are writing a user agent, when you are writing a web robot, and when you are writing a web robot library (which will be used by possibly many web robot operators).

The easiest case is when you're writing a client that will be directly used by real people. Here your _User-Agent_ should identify the software by name and by a URL to your project site, and give a general version number. It should *not* identify the user, either directly by name or indirectly by including additional client fingerprint information such as the platform it's running on. As a side note, your project site should include enough information to convince a suspicious website operator that it is a real client that gets used by real people.

(Some people will object to the version number, but I think it's important to include because it lets me either tell people to upgrade because the upgrade fixes a problem, or tell you that your latest code has some problem. If you leave the version number out, all I can possibly report to your project is 'some version of your software does this bad thing'.)

This is completely different for web robots. For web robots the _User-Agent_ header must contain a clear identification of both your robot and of who is responsible for its operation, ie the URL of a web page describing who you are, what you do, and so on. There should be readable English on the page and a method of contacting you privately (such as email or a contact form). It is vaguely customary to include a version number, but as a website operator I don't care in the least; you might as well always use '/1.0' if you feel a version number is required.

Including this information in your _User-Agent_ is to your benefit, because it encourages website operators to investigate and perhaps report some crawling program instead of [[blocking you out of hand SpiderRobotsTxtHint]] (either by user-agent or by source IPs, or perhaps both). I have much harsher reactions to anonymous robots than I do to ones that are willing to identify themselves.

Note that ~~if you're a company running software from your servers that is poking my websites, you're a robot operator~~. At one level I don't care exactly why you're running the software or how many users it is helping; I still expect it to identify the specific party responsible for itself. Fail to do this and I reach for the block tools. (And yes, this very much applies to feed reader aggregator sites.)

If you're writing a web robot library, you need to somehow force its users to add such a clear identification of themselves into the _User-Agent_ (although including your library's project URL is nice, it is *not* an identification of the responsible party for the robot that is hitting my site). I'd put this into the library's configuration as a mandatory field, or make it an optional setting but with a default value of something like 'UNCONFIGURED, BLOCK THIS ROBOT'. Note that if you supply 'sensible' default values, many of your library's users will never change them.
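To make the first two cases concrete, here is a minimal Python sketch of what such headers can look like. Every name and URL in it is invented for illustration; only the shape of the header matters.

    import urllib.request

    # A client used directly by people: software name, version, and project
    # URL, with nothing that fingerprints the individual user.
    CLIENT_UA = "ExampleReader/1.4 (+https://example.org/examplereader)"

    # A web robot: a robot name plus a URL to a page explaining who runs it,
    # what it does, and how to contact the operator privately.
    ROBOT_UA = "ExampleBot/1.0 (+https://example.org/bot.html; bot@example.org)"

    # Either way, the header is simply sent with each request.
    req = urllib.request.Request("https://example.com/feed",
                                 headers={"User-Agent": CLIENT_UA})
    with urllib.request.urlopen(req) as resp:
        body = resp.read()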
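For the web robot library case, one minimal sketch of the 'mandatory field or hostile default' approach might look like the following; again the class, parameter, and library names are all made up, and the exact API is up to you.

    import urllib.request

    # A deliberately hostile default so that unconfigured deployments are
    # obvious and easy to block.
    DEFAULT_UA = "UNCONFIGURED, BLOCK THIS ROBOT"
    LIBRARY_TOKEN = "examplecrawl/2.1"  # hypothetical library name and version

    class Crawler:
        def __init__(self, operator_ua=DEFAULT_UA):
            # Appending the library's own token is nice to have, but it never
            # substitutes for the operator's identification.
            self.user_agent = f"{operator_ua} {LIBRARY_TOKEN}"

        def fetch(self, url):
            req = urllib.request.Request(url,
                                         headers={"User-Agent": self.user_agent})
            with urllib.request.urlopen(req) as resp:
                return resp.read()

    # An operator who reads the documentation supplies something like:
    crawler = Crawler("ExampleBot/1.0 (+https://example.org/bot.html)")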
(If you're writing a web library for use by real clients, I wouldn't bother having any default _User-Agent_ or putting your library's identification in. Just provide an API for supplying the user agent information and document what's a good idea to put in there. Make using the API mandatory, because otherwise people won't use it. Putting your library information in as well is okay and potentially useful, but your library information alone in the _User-Agent_ is completely useless to website operators, because it tells us nothing about what is actually visiting.)
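A minimal sketch of what 'mandatory' can look like in such a client library, once more with invented names:

    import urllib.request

    LIBRARY_TOKEN = "examplehttp/0.9"  # hypothetical library identification

    class Client:
        def __init__(self, user_agent):
            if not user_agent:
                # Refuse to construct a client without caller identification;
                # if this is optional, most callers will skip it.
                raise ValueError("user_agent is required: identify your software")
            # The library token is a useful extra, but useless on its own.
            self.user_agent = f"{user_agent} {LIBRARY_TOKEN}"

        def get(self, url):
            req = urllib.request.Request(url,
                                         headers={"User-Agent": self.user_agent})
            with urllib.request.urlopen(req) as resp:
                return resp.read()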