Exploring some spamblogs
I have a certain interest in the behavior of MSNbot, the MSN Search web spider. I'd like to keep track of what other people are saying about it in blogs; the obvious way is a date-based [msnbot] search using Google Blogsearch.
If you do the search you can see why the results leave me less than enthused: they are dominated by a cluster of spamblogs, mostly hosted at blogspot.com. They show up because they're mechanically including articles about search engine behavior and search engine optimization that they appear to pull from ezinearticles.com, most of which appear to have originally been written by Mike Banks Valentine of website101.com (for example, this article has been quite popular).
If we look at a representative posting, we can see that threaded through the web page are images and carefully keyworded captions that link to redirectors under 'clickbank.net' or on 'tietie.ru'; which one is used seems to depend on the page. (Also present are links to other spamblogs in the cluster, URLs from the original articles, and a few outbound links that may be attempts to persuade Google that they're not spamblogs.)
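One rough way to check this pattern on a saved copy of such a page is to tally which links go to the two redirector domains. The following is only a sketch: the sample markup is made up for illustration (it is not copied from a real spamblog), and the function and class names are my own.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# The two redirector domains named above.
REDIRECTOR_DOMAINS = ("clickbank.net", "tietie.ru")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def under_domain(host, domain):
    """True if host is the domain itself or a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def tally_redirector_links(html):
    """Count the links in html whose host falls under each redirector domain."""
    collector = LinkCollector()
    collector.feed(html)
    counts = Counter()
    for href in collector.hrefs:
        host = (urlparse(href).hostname or "").lower()
        for domain in REDIRECTOR_DOMAINS:
            if under_domain(host, domain):
                counts[domain] += 1
    return counts


# Hypothetical sample markup, invented for this example.
sample = """
<a href="http://example.hop.clickbank.net/"><img src="http://static.sxc.hu/a.jpg"></a>
<a href="http://tietie.ru/go?id=1">keyworded caption</a>
<a href="http://www.example.org/article">original article</a>
"""
print(tally_redirector_links(sample))
```

Matching on the hostname suffix rather than on the raw URL text avoids counting a link that merely mentions one of the domains somewhere in its path or query string.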
The images are common across all of the blogs and appear to be stock photos fetched from 'static.sxc.hu', which bills itself as 'the leading free stock photo site'. It's not clear why the spammers use images; they may be attempting to hit Google Image searches too, or maybe Google rates words in image captions higher than words in ordinary text.
Clickbank.net is 'Click Sales Inc', with a primary website at clickbank.com; they seem to be a merchant backend for e-books, software, and other purely digital products. They offer charming services such as having their '100,000 affiliates' drive traffic to your website, and seem to be popular with people who sell things like '33 Days to Online Profits 2004 Edition'. (They also seem popular with people who spam Usenet and Google Groups.)
The tietie.ru URLs are just redirectors to the clickbank.net URLs. I'm not sure why the spammers want to cloak the presence of clickbank.net URLs, but evidently they do.
Another form of Google Blogsearch spam is all of the keywordblogger.net subdomains that show up in the [msnbot] search. While keywordblogger.net (aka pre-views.net, aka preview-search.com) is nominally in the blog searching and indexing business, their real purpose is to generate ad revenues for themselves (ironically including through Google Adwords) by drawing visitors to pages that are loaded with ads and stuff.
Keywordblogger.net seems to operate by copying entries from syndication feeds, 'indexing' them to find various common words like 'database' or 'website', and then re-presenting the search and indexing results as pseudo-blogs in subdomains that they then get Google Blogsearch to index. The syndication feeds from these pseudo-blogs then draw readers to keywordblogger.net web pages full of ads (unlike an honest blog aggregator, their RSS feeds don't point to the original URL for the entries).
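The difference from an honest aggregator can be checked mechanically: look at where each entry's link in the feed points. Here is a minimal sketch, assuming a plain RSS 2.0 feed; the sample feed, its subdomain, and its URLs are hypothetical and only illustrate the shape of the check.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def links_pointing_home(rss_text, aggregator_domain):
    """Return the RSS 2.0 item links whose host is under aggregator_domain."""
    root = ET.fromstring(rss_text.strip())
    suffix = "." + aggregator_domain
    home = []
    for link in root.iterfind("./channel/item/link"):
        url = (link.text or "").strip()
        host = (urlparse(url).hostname or "").lower()
        if host == aggregator_domain or host.endswith(suffix):
            home.append(url)
    return home


# Hypothetical feed, invented for this example.
sample_feed = """
<rss version="2.0"><channel>
<title>database</title>
<item><title>One</title><link>http://database.keywordblogger.net/entry/1</link></item>
<item><title>Two</title><link>http://blog.example.org/2005/11/two</link></item>
</channel></rss>
"""
print(links_pointing_home(sample_feed, "keywordblogger.net"))
```

An aggregator that links back to the original entries would produce an empty list here; a feed where every item lands in the list is sending all of its readers to the aggregator's own pages.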
You can see how little importance they attach to the real blog entries by looking at how they're presented on the web pages: in plain text in small blue type on a gray background, well down the page past all of the ads.
(Presumably keywordblogger.net is going to all of this effort so that they can say that they are a blog search company, and 'just' running ads like all of the rest. I can hope that this is not going to fool Google.)
Update: they're also keywordblogger.com and show up under that name in some Google Blogsearch searches.
Solaris 9 'Power management'
I had another Solaris 9 learning experience today: I came in to find my ssh sessions to my Ultra 10 test machine dead, because the machine was powered off. This was more than a little bit disconcerting, since the last thing I had left it doing was installing the current Solaris 9 patch set. (It took sufficiently long that I'd had to go home before it finished.)
Powering the machine on showed not a normal boot sequence, but a message about restoring the system. This caused me to remember that when I had installed the system, I'd said yes to an offer to have 'power management' software installed. (Unfortunately the installer does not have very many clear explanations of what the software packages all do.)
In the PC world I usually operate in, 'power management' is things like spinning down disks and dropping into low-power CPU states when the machine is idle. In the SPARC world, it turns out that 'power management' is turning the machine off entirely.
Fortunately I was able to find the dtpower program after some quick searching. dtpower doesn't run over a ssh X connection for some reason, so I had to fire up dtlogin, log in, and run it to shut this feature off. (There is probably a way to fire up the X server and the environment from a console login, but starting dtlogin was faster than trying to figure it out.)
(This whole episode is my fault, not Solaris 9's. I should have read the documentation before firing up the installer, and certainly before answering installer questions I didn't fully understand. But at least I've stubbed my toe on this now, in case I ever have to deal with Sparcs that mysteriously power themselves off every so often.)