2009-07-28

Spammers are quite dedicated in their address scraping

This is one of those entries that require some apparently irrelevant background. The Atom syndication feed format requires that each entry have a unique
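identifier assigned to it (the <id> element). One way to form such an identifier is a tag: URI. While you can read all the gory details here,
the simple version of the format is 'tag:<authorityName>,<date>:<specific>'.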
The authorityName is normally a domain; however, the spec says that you can use
'<id>@<domain>' as well. For reasons beyond the scope of this entry,
I decided to use the second format for the Atom IDs.

(In brief: the advantage of this format is that you don't have to invent a new subdomain for everything you host; you use one domain and have a unique identifier as the <id> bit.)

You can see where this is going. A bit over a month after I started using this format for Atom IDs, I started getting email attempts to 'cspace@<domain>' (which were rejected; there is no requirement that such authorityNames actually be email addresses, and the domain I used doesn't even accept email to start with).

After talking about this with some people, the general speculation is
not that spammers are scraping Atom feeds for
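
To make the mechanics concrete, here is a minimal sketch of how an identifier of the form 'tag:<authorityName>,<date>:<specific>' ends up looking like an email address to a scraper. Everything specific in it is an assumption for illustration: the domain example.org, the date, the entry path, the helper names, and the deliberately naive regular expression. It is not the code of any actual scraper.

```python
import re


def make_tag_uri(authority_name: str, date: str, specific: str) -> str:
    """Build an identifier of the form tag:<authorityName>,<date>:<specific>."""
    return f"tag:{authority_name},{date}:{specific}"


# A deliberately naive "looks like an email address" pattern. This is an
# assumption about how a simple scraper might work, not a known one.
NAIVE_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrape_addresses(text: str) -> list[str]:
    """Return every substring of text that matches the naive pattern."""
    return NAIVE_EMAIL_RE.findall(text)


# Hypothetical values; example.org stands in for the real domain.
domain_form = make_tag_uri("example.org", "2009", "/blog/some-entry")
email_form = make_tag_uri("cspace@example.org", "2009", "/blog/some-entry")

# The plain domain form contains no '@', so nothing is harvested from it.
print(scrape_addresses(f"<id>{domain_form}</id>"))

# The '<id>@<domain>' form yields something that looks like a mail address.
print(scrape_addresses(f"<id>{email_form}</id>"))
```

The point of the sketch is only that a scraper does not need to understand Atom or these identifiers at all; any pattern match for 'something@domain' over the raw feed text is enough to pick the authorityName up.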
This is part of CSpace, and is written by ChrisSiebenmann.