HTTP as it is seen in the wild

December 27, 2006

Out of a somewhat idle curiosity, I decided to do up some numbers for actual HTTP requests against one of the servers here. All of this is using the past 28 days of old logs (plus today's):

289160 total requests
277323 GET
  5722 PROPFIND
  3665 OPTIONS
  2215 POST
   178 HEAD
    39 CONNECT
    18 garbled
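For anyone who wants to do a similar tally, here is a minimal sketch of one way to get a per-method count. It assumes an Apache-style Common Log Format access log, which is not something this entry specifies. The names `LINE_RE`, `REQ_RE`, and `count_methods` are made up for illustration, and "garbled" here simply means that the quoted request did not look like `METHOD target HTTP/x.y`.

```python
import re
from collections import Counter

# Assumed log line shape: ip ident user [time] "request" status size ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(.*?)" (\d{3}) \S+')
# A well-formed request line: METHOD target HTTP/x.y
REQ_RE = re.compile(r'^([A-Z]+) (\S+) HTTP/(\d\.\d)$')

def count_methods(lines):
    """Count requests per HTTP method; unparseable requests count as 'garbled'."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a log line in the assumed format; skip it
        req = REQ_RE.match(m.group(2))
        counts[req.group(1) if req else "garbled"] += 1
    return counts
```

Feeding it an open log file (for example `count_methods(open(path))`, with `path` being whatever your log is called) and printing `counts.most_common()` gives a listing in roughly the shape shown above.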

Most of the requests were successful; 90% got a 2xx or a 3xx response. Of the 256,392 successful GETs, 55,246 (about 21.5%) were successful conditional GETs; I'm not sure whether to consider this good or bad.
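The percentages can be rechecked from the raw counts. This is only arithmetic on figures that appear in this entry (the 2xx and 3xx totals come from the response code sidebar); the variable names are mine.

```python
total_requests = 289160
responses_2xx = 201807
responses_3xx = 59814
successful_gets = 256392
conditional_gets = 55246  # matches the count of 304 responses

# Share of all requests that got a 2xx or 3xx response
success_share = (responses_2xx + responses_3xx) / total_requests
# Share of successful GETs that were successful conditional GETs
conditional_share = conditional_gets / successful_gets

print(f"successful: {success_share:.1%}")            # successful: 90.5%
print(f"conditional GETs: {conditional_share:.1%}")  # conditional GETs: 21.5%
```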

(Unfortunately I don't have enough information to find out how many requests were willing to accept gzip'd results.)

The popularity of PROPFIND and OPTIONS surprised me, but almost all of them turn out to be from just three external IPs, with the lion's share coming from just one. Most of the OPTIONS requests were to /, and most of the PROPFIND requests were to the (nonexistent) /LJF4100, so I suspect that someone's machine is badly misconfigured.

The majority of the HEAD requests were for /, with my Atom syndication feed being the somewhat distant runner-up. Requests came from all over with nothing clearly dominating the results.

(From this I conclude that optimizing HEAD is not really a high priority, which is good because DWiki doesn't optimize it.)

HTTP/1.0 dominated over HTTP/1.1, about 67% to 33%; no one is still making pre-HTTP/1.0 requests. (Apart from our very primitive monitoring system, which I am ignoring for this.)
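A rough sketch of how the protocol version split could be computed from the request lines. It assumes that a pre-HTTP/1.0 request shows up as a two-word request line with no version (the HTTP/0.9 "simple request" form, such as `GET /path`); the function names are hypothetical.

```python
from collections import Counter

def http_version(request_line):
    """Classify a request line by protocol version."""
    parts = request_line.split()
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        return parts[2]
    if len(parts) == 2:
        return "pre-1.0"  # no version given, e.g. "GET /path"
    return "garbled"

def version_shares(request_lines):
    """Return each version's fraction of all request lines."""
    counts = Counter(http_version(r) for r in request_lines)
    total = sum(counts.values())
    return {version: n / total for version, n in counts.items()}
```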

A small number of apparently legitimate people made requests with full 'http://...' URLs (which clients are theoretically only supposed to send to proxies; 396 requests in total). To my surprise, a full third of them used HTTP/1.0; the rest used HTTP/1.1.
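Spotting these requests is straightforward if you already have the request target split out. A small sketch using the standard library's `urllib.parse.urlsplit`; the helper name `is_absolute_target` is an illustration, not anything from the tooling used for this entry.

```python
from urllib.parse import urlsplit

def is_absolute_target(target):
    """True if the request target is a full http(s) URL rather than a path."""
    parts = urlsplit(target)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

An ordinary request for `/blog/` is not flagged, while a request for `http://example.org/blog/` is.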

Requests came from 11,745 different IP addresses. The average number of requests per IP was 24.6, but the median was only 3 (and the mode was 1 request, which does not surprise me). A surprisingly large number of the IPs that made only one request asked for robots.txt (although it was not the most popular such request). As usual, the most active visitor was our internal search engine.
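The gap between the average and the median is what you would expect when a few very active clients sit on top of a long tail of one-request visitors. A minimal sketch of computing these per-IP figures with the standard `statistics` module, assuming you have one IP address per logged request; `per_ip_stats` is a hypothetical name.

```python
from collections import Counter
from statistics import mean, median, mode

def per_ip_stats(ips):
    """Summarize requests per client IP, given one IP per request."""
    per_ip = Counter(ips)
    counts = list(per_ip.values())
    return {
        "ips": len(counts),
        "mean": mean(counts),
        "median": median(counts),
        "mode": mode(counts),
        "busiest": per_ip.most_common(1)[0],  # (ip, request count)
    }
```

With the totals given here, the mean is simply 289,160 / 11,745, or about 24.6.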

Sidebar: POST targets

This server (currently) hosts CSpace (and thus WanderingThoughts), which is what the majority of the POST requests were directed against (1,299 out of the 2,215; I get a fair number of comment spam attempts). A small number of the remainder (126) were legitimate; the rest were bad in various ways, ranging from repeatedly poking nonexistent URLs to various XML-RPC exploit attempts (and one mysterious POST to /).

The most popular POST target was the nonexistent URL path /officescan/cgi/cgiRecvFile.exe, followed by my Recent Comments page.

Sidebar: the breakdown of responses

Distribution of HTTP response codes:

201807 2xx
       199273 200
         2534 206
 59814 3xx
        55246 304
         4106 301
          459 302
 27506 4xx
        13494 404
         8018 403
         5756 405
          234 400
            2 401
            1 416
            1 414
    30 5xx
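A breakdown in this two-level shape (a total per response class, then per-code counts underneath) can be produced along these lines. This is a sketch assuming the status codes have already been extracted as integers; `status_breakdown` and `print_breakdown` are illustrative names.

```python
from collections import Counter, defaultdict

def status_breakdown(statuses):
    """Group per-code counts under their response class ('2xx', '3xx', ...)."""
    by_class = defaultdict(dict)
    for code, n in Counter(statuses).items():
        by_class[f"{code // 100}xx"][code] = n
    return dict(by_class)

def print_breakdown(by_class):
    """Print class totals, each followed by its codes, most common first."""
    for cls in sorted(by_class):
        codes = by_class[cls]
        print(f"{sum(codes.values()):6d} {cls}")
        for code, n in sorted(codes.items(), key=lambda kv: -kv[1]):
            print(f"{n:13d} {code}")
```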

Some of the 404'd URLs are fairly popular, but I'm not going to try to read the tea leaves about that.

Written on 27 December 2006.
