Turning over a rock on some weird HTTP requests to our web server

February 29, 2016

I recently made the mistake of looking at our Apache access.log, and in fact of watching it live with 'tail -f'. Me being me, I can't just let what I saw sit quietly, so now I'm here to tell you about the big weirdness in it. Put simply, it was a rapid burst of requests that looked like:

IP - - [28/Feb/2016:17:18:38 -0500] "GET /mmievslc.txt HTTP/1.1" 404 [...]
IP - - [28/Feb/2016:17:18:39 -0500] "GET /mmievslc.txt HTTP/1.1" 404 [...]
IP - - [28/Feb/2016:17:18:39 -0500] "GET /mmievslc.txt HTTP/1.1" 404 [...]

When I started digging, I saw multiple IPs making requests like this for multiple different 8-character .txt URLs in the root of our web server (none of which have ever existed). On random spot checks, they almost all happen in bursts (although there can be pauses), and there are a lot of them.

How many? Yesterday, we saw 34,500 such requests (about 10% of the total HTTP requests), from 116 different IPs and for 122 different names. The top three IPs each made over 1,000 requests; the median IP made 233 requests. Every such request had the same user-agent:

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

(Only 50 other requests from 13 different IPs used this user-agent.)
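
For anyone who wants to pull the same sort of numbers out of their own logs, here is a minimal sketch of the counting in Python. It is not the tooling I actually used. It assumes Apache's usual log layout (client IP first, then the quoted request line, then the status code), and it assumes the names are always eight lowercase letters, which is a guess based on the sample lines above. The names LINE_RE and summarize are made up for this illustration.

```python
import re
from collections import Counter
from statistics import median

# Assumption: the log line starts with the client IP and has the quoted
# request line followed by the status code, as in the samples above.
# Assumption: the odd names are exactly eight lowercase letters plus ".txt".
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"GET /(?P<name>[a-z]{8})\.txt HTTP/[\d.]+" (?P<status>\d{3}) '
)

def summarize(lines):
    """Count the odd 404 requests per source IP and per requested name."""
    per_ip = Counter()
    names = set()
    for line in lines:
        m = LINE_RE.match(line)
        # Only count requests that match the pattern and got a 404.
        if m is None or m.group("status") != "404":
            continue
        per_ip[m.group("ip")] += 1
        names.add(m.group("name"))
    counts = sorted(per_ip.values(), reverse=True)
    return {
        "requests": sum(counts),
        "ips": len(per_ip),
        "names": len(names),
        "top3": counts[:3],
        "median_per_ip": median(counts) if counts else 0,
    }
```

Since an open file iterates over its lines, something like summarize(open("access.log")) is enough to run it over a day's log. Note that the pattern will also catch any legitimate 404s for eight-letter .txt files, so the totals are only as good as that guess.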

On a random spot check of IP addresses doing this, I can't find any that aren't in China. Some but not many of the IP addresses are listed in things like the SBL; others come up entirely clean in my blocklist checks.

I spot-checked the IPs doing this yesterday against the IPs doing this today and about two thirds of them are different; checking yesterday against the day before yielded the same result. So there seems to be a different set of sources doing this over time.
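
The day-to-day comparison is just set arithmetic. As a rough sketch (with a hypothetical helper name, and taking 'different' to mean 'seen today but not seen yesterday', which is only one possible reading):

```python
def fraction_new(today_ips, yesterday_ips):
    """Fraction of today's source IPs that did not appear yesterday."""
    today = set(today_ips)
    if not today:
        return 0.0
    # Set difference: IPs present today but absent from yesterday's set.
    return len(today - set(yesterday_ips)) / len(today)
```

A result of around 0.67 would correspond to the 'about two thirds' figure above.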

We have multiple virtual hosts on this web server, and only two of them are affected: the main departmental web server name and another one (which saw a far lower volume of these requests). There's nothing obvious that's different between unaffected hosts and affected ones.

And what makes this really mysterious is that I have no idea what these requests are supposed to accomplish. Are they an attack of some sort? Are they an accidental side effect of other software? Are they being done deliberately in order to create some sort of useful side effect? Are they traffic cloaking or obfuscation of some sort? Who knows. I may have turned over this rock, but I have no idea how to understand what's scuttling around underneath it.

Written on 29 February 2016.