2014-11-22
The effects of a moderate Hacker News link to here
A few days ago my entry on Intel screwing up their DC S3500 SSDs was posted to Hacker News and rose moderately high in the rankings, although I don't think it made the front page (I saw it on the second page at one point). Fulfilling an old promise, here's a report of what the resulting traffic volume looked like.
First, some crude numbers from this Monday onwards for HTTP requests for Wandering Thoughts, excluding Atom feed requests. As a simple measurement of how many new people visited, I've counted unique IPs fetching my CSS file. So the numbers:
| (day) | (that entry) | (other pages) | (CSS fetches) |
| November 17 | 0 | 5041 | 453 |
| November 18 | 18255 | 6178 | 13585 |
| November 19 | 17112 | 10141 | 11940 |
| November 20 | 908 | 6341 | 876 |
| November 21 | 228 | 4811 | 530 |
(Some amount of my regular traffic is robots and some of it is from regular visitors who already have my CSS file cached and don't re-fetch it.)
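For anyone who wants to reproduce the CSS-fetch column against their own logs, here is a minimal sketch of the counting. It assumes Apache's 'combined' log format, and the `css_path` default is a made-up placeholder rather than the real path of my CSS file; the function name is likewise just for illustration.

```python
import re
from collections import defaultdict

# Assumes Apache "combined" log lines, e.g.
# 1.2.3.4 - - [18/Nov/2014:18:24:01 -0500] "GET /x HTTP/1.1" 200 512 "-" "UA"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

def unique_css_ips_per_day(lines, css_path="/static/site.css"):
    """Map each day to the number of distinct IPs that fetched css_path.

    css_path is a hypothetical placeholder; substitute your own CSS URL.
    """
    ips_by_day = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("path") == css_path:
            ips_by_day[m.group("day")].add(m.group("ip"))
    return {day: len(ips) for day, ips in ips_by_day.items()}
```

Feeding it an open log file (`unique_css_ips_per_day(open("access.log"))`, with a file name of your choosing) gives one count per day. As noted above, this undercounts visitors who already have the CSS cached and counts any robots that fetch CSS.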
Right away I can say that it looks like people spilled over from the directly linked entry to other parts of Wandering Thoughts. The logs suggest that this mostly went to the blog's main page and my entry on our OmniOS fileservers, which was linked to in the entry (much less traffic went to my entry explaining why 4K disks can be a problem for ZFS). Traffic for the immediately preceding and following entries also went up, pretty much as I'd expect, but this is nowhere near all of the extra traffic so people clearly did explore around Wandering Thoughts to some extent.
Per-day request breakdowns are less interesting for load than per-minute or even per-second breakdowns. At peak activity, I was seeing six to nine requests for the entry per second and I hit 150 requests for it a minute (for only one minute). The activity peak came very shortly after I started getting any significant volume of hits; things start heating up around 18:24 on the 18th, go over 100 views a minute at 18:40, peak at 19:03, and then by 20:00 or so I'm back down to 50 a minute. Unfortunately I don't have latency figures for DWiki so I don't know for sure how well it responded while under this load.
(Total page views on the blog go higher than this but track the same activity curve. CSpace as a whole was over 100 requests a minute by 18:39 and peaked at 167 requests at 19:05.)
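The per-minute figures above come from bucketing requests by the minute in their timestamps. A minimal sketch of that sort of bucketing, again assuming Apache 'combined' format logs (the function names and the `path` parameter are illustrative, not anything DWiki or Apache provides):

```python
import re
from collections import Counter

# Captures "DD/Mon/YYYY:HH:MM" (the timestamp minus seconds) and the request path.
TS_RE = re.compile(
    r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]*\] "\S+ (\S+)'
)

def requests_per_minute(lines, path=None):
    """Count requests per minute, optionally only those for one exact path."""
    counts = Counter()
    for line in lines:
        m = TS_RE.search(line)
        if m and (path is None or m.group(2) == path):
            counts[m.group(1)] += 1
    return counts

def peak_minute(counts):
    """Return the (minute, count) pair with the highest request count."""
    return max(counts.items(), key=lambda kv: kv[1])
```

Calling `requests_per_minute(lines)` with no `path` gives the whole-site curve, while passing the entry's URL path gives the curve for just that entry. Dropping the `:\d{2}` outside the capture group and capturing the seconds as well would give per-second buckets instead.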
The most surprising thing to me is the amount of extra traffic to things other than that entry on the 19th. Before this happened I would have predicted (and in fact did predict) a much more concentrated load profile, with almost all of the traffic going to the directly linked entry. This is certainly the initial pattern on the 18th, but then something clearly changed.
(I was also surprised by the total amount of traffic and how many people seem to have visited, but that's just on a personal level; it's surprising to me that so many people would be interested in looking at something I've written.)
This set of stats may well still leave people with questions. If so, let me know and I'll see if I can answer them. Right now I've stared at enough Apache logs for one day and I've run out of things to say, so I'm stopping this entry here.
Sidebar: HTTP Referers
HTTP Referers for that entry over the 18th to the 20th are kind of interesting. There were 17,508 requests with an empty Referer, 13,908 from the HTTPS Hacker News front page, 592 from a google.co.uk redirector of some sort, 314 from the t.co link in this HN repeater tweet, and then we're down to a longer tail (including reddit's /r/sysadmin, where it was also posted). The Referers feature a bunch of alternate interfaces and apps for Hacker News and so on (pipes.yahoo.com was surprisingly popular). Note that there were basically no Referers from any Hacker News page except the front page, even though as far as I know the story never made it to the front page. I don't have an explanation for this.
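A Referer tally like this one can be sketched roughly as follows, assuming Apache 'combined' format logs. The function name and the choice to key on scheme, host, and path (dropping any query string) are my own illustration; keying this way keeps the Hacker News front page distinct from its later pages.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Request line, status, size, then the quoted Referer field.
REQ_RE = re.compile(
    r'"\S+ (?P<path>\S+)[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)

def referer_counts(lines, path):
    """Tally Referers for requests to one exact path.

    Empty Referers (logged as "-" or "") are grouped under "(empty)".
    """
    counts = Counter()
    for line in lines:
        m = REQ_RE.search(line)
        if not m or m.group("path") != path:
            continue
        ref = m.group("referer")
        if ref in ("", "-"):
            counts["(empty)"] += 1
            continue
        parts = urlsplit(ref)
        # Drop the query string and fragment so variants group together.
        counts[f"{parts.scheme}://{parts.netloc}{parts.path}"] += 1
    return counts
```

`referer_counts(lines, "/some/entry").most_common(10)` then lists the top sources, where the path is whatever URL path your own entry lives at.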
2014-11-17
Why I need a browser that's willing to accept bad TLS certificates
One of my peculiarities is that I absolutely need a browser that's willing to accept 'bad' TLS certificates, probably for all species of bad that you can imagine: mismatched host names, expired certificates, self-signed or signed by an unknown certificate authority, or some combination of these. There are not so much two reasons for this as two levels of the explanation.
The direct reason is easy to state: lights-out management processors (LOMs). Any decent one supports HTTPS (and you really want to use it), but we absolutely cannot give them real TLS certificates because they all live on internal domain names and we're not going to change that. Even if we could get proper TLS certificates for them somehow, the cost is prohibitive since we have a fair number of LOMs.
(Our ability to get free certificates has gone away for complicated reasons.)
But in theory there's a workaround for that. We could create our own certificate authority, add it as a trust root, and then issue our own properly signed LOM certificates (all our LOMs accept us giving them new certificates). This would reduce the problem to doing an initial certificate load in some hacked up environment that accepted the LOMs' out-of-box bad certificates (or using another interface for it, if and where one exists).
The problem with this is that as far as I know, certificate authorities are too powerful. Our new LOM certificate authority should only be trusted for hosts in a very specific internal domain, but I don't believe there's any way to tell browsers to actually enforce that and refuse to accept TLS certificates it signs for any other domain. That makes it a loaded gun that we would have to guard exceedingly carefully, since it could be used to MITM any of our browsers for any or almost any HTTPS site we visit, even ones that have nothing to do with our LOMs. And I'm not willing to take that sort of a risk or try to run an internal CA that securely (partly because it would be a huge pain in practice).
So that's the indirect reason: certificate authorities are too powerful, so powerful that we can't safely use one for a limited purpose in a browser.
(I admit that we might not go to the bother of making our own CA and certificates even if we could, but at least it would be a realistic possibility and people could frown at us for not doing so.)
2014-11-10
Why I don't have a real profile picture anywhere
Recently I decided that I needed a non-default icon aka profile picture for my Twitter account. Although I have pictures of myself, I never considered using one; it's not something that I do. Mostly I don't set profile pictures on websites that ask for them and if I do, it's never actually a picture of me.
Part of this habit is certainly that I don't feel like giving nosy websites that much help (and they're almost all nosy). Sure, there are pictures of me out on the Internet and they can be found through search engines, but they don't actually come helpfully confirmed as me (and in fact one of the top results right now is someone else). Places like Facebook and Twitter and so on are already trying very hard to harvest my information and I don't feel like giving them any more than the very minimum. For a long time that was all that I needed and all of the reason that I had.
These days I have another reason for refusing to provide a real picture, one involving a more abstract principle than just a reflexive habit towards 'none of your business' privacy. Put simply, I don't put up a profile picture because I've become conscious that I could do so safely, without fear of consequences due to people becoming aware of what I look like. Seeing my picture will not make people who interact with me think any less of me and the views I express. It won't lead to dismissals or insults or even threats. It won't expose me to increased risks in real life because people will know what I look like if they want to find me.
All of this sounds very routine, but there are plenty of people on the Internet for whom this is at least not a sure thing (and thus something that they have to consider consciously every time they make this choice) or even very much not true. These people don't have my freedom to casually expose their face and their name if they feel like it, with no greater consideration than a casual dislike of giving out their information. They have much bigger, much more serious worries about the whole thing, worries that I have the privilege of not even thinking about almost all of the time.
By the way, I don't think I'm accomplishing anything in particular by not using a real picture of myself now that I'm conscious of this issue. It's just a privilege that I no longer feel like taking advantage of, for my own quixotic reasons.
(You might reasonably ask 'what about using your real name?'. The honest answer there is that I am terrible with names and that particular ship sailed a very long time ago, back in the days before people were wary about littering their name around every corner of the Internet.)
PS: One obvious catalyst for me becoming more aware of this issue was the Google+ 'real names' policy and the huge controversy over it, with plenty of people giving excellent arguments about why others had very good reasons not to give out their real names (see e.g. the Wikipedia entry if you haven't already heard plenty about this).
PPS: Yes, I have plenty of odd habits related to intrusive websites.