The pragmatics of an HTTP to HTTPS transition

August 14, 2013

It started on Twitter:

@thatcks: All things considered I've decided it's time my personal website went not just https-available but all-https (with redirects from http).

@zaitcev: wait a moment, didn't you write on your blog how evil it was to redirect http?

Pete Zaitcev is quite correct; I wrote about the issue back in this entry, yet here I am redirecting everything from HTTP to HTTPS myself. There are two answers here. I'll start with the simple one.

The simple answer is that I'm not doing this for security. There's almost nothing 'secure' on my personal website and right now the only person who can do anything security related on it is me (and I can move myself to https directly). I'm doing this for privacy because I feel like making a point, however pointless it is on the grander scale of things.

Of course it would still be a good idea to not redirect from HTTP if you really care about privacy; every HTTP request that gets redirected tells what is now an entirely non-hypothetical eavesdropper the URL that your visitor was requesting and often things like referers. But this is where we run into pragmatic issues, namely backwards compatibility: there are a certain number of HTTP URLs for my site out there and I would like to not break them. Certainly not right away and probably not ever (because cool URLs don't go away).

Security (in the broad sense) is always about tradeoffs. The most secure, most privacy-enhancing option today would be to move my personal website to being a Tor hidden service with no automatic redirection, but as a side effect this would reduce my traffic to essentially nil. The most clearly usable option would be to continue using HTTP and never mind any (quixotic) privacy concerns on behalf of my few visitors. Somewhere in the middle is the right balance between security (in the form of privacy) and real usability by the visitors that I care about. This balance may be different for every situation and thus for every website.

(Part of what makes it different is what will be revealed about your visitors by this sort of initial traffic analysis. Revealing that they go to your login form is a lot different than revealing that they are trying to look up potentially sensitive things.)

Now it's time for the more complicated one: once people have made a HTTP request it's too late for full security. It doesn't matter what reply your web server gives for the request because a non-hypothetical eavesdropper has already seen the requested URL and other associated data (including at least some of the POST body if any). If they care enough they can reverse engineer what your visitor was trying to visit even if your web server denies the URL's (HTTP) existence; this is especially the case if your web server is willing to confirm that the HTTPS version of the URL actually exists.

In short, the real purpose of refusing to redirect HTTP requests is to force people to stop making them in the first place. It adds no real security until (and unless) people do this.
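To make the 'too late' point concrete, here is a minimal Python sketch of roughly what a simple client puts on the wire, unencrypted, before the web server has had a chance to reply with anything. The host, path, and referer are made-up examples, and `build_plain_request` and `visible_to_eavesdropper` are hypothetical helpers written for this illustration, not part of any real software.

```python
def build_plain_request(host, path, referer=None):
    """Return the bytes a simple client would send for a plain HTTP GET."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,
    ]
    if referer is not None:
        lines.append("Referer: %s" % referer)
    lines.append("Connection: close")
    # A blank line ends the header block; all of this travels in cleartext.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def visible_to_eavesdropper(raw_request):
    """Pull out what a passive listener can read from the request bytes."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "path": path,
        "host": headers.get("host"),
        "referer": headers.get("referer"),
    }


# Example values are assumptions chosen for illustration only.
request = build_plain_request(
    "www.example.org", "/blog/sensitive-topic", referer="http://other.example/page"
)
seen = visible_to_eavesdropper(request)
```

Everything in `seen` is available to a listener regardless of whether the server then answers with a redirect, a 404, or nothing at all.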

(It follows that the really secure approach is to shut off your HTTP site entirely; don't even have a web server responding on port 80. If people can't connect they can't send a HTTP request to be snooped on.)

Once people have blown a certain amount of privacy by making that initial HTTP request, how your web server responds is partly a pragmatic question of how effective not redirecting them is going to be in getting rid of those inbound HTTP requests versus how usable you want to be.

In my case my guess is that almost all inbound requests will shift to HTTPS soon even if I do a friendly HTTP to HTTPS redirection and the remaining amount of inbound requests will basically never shift. The requests that will shift come from search engine lookups, links in my syndication feeds being followed, and new links that I and other people tweet, link to, and so on. The requests that won't shift are from the existing links on Twitter, in other people's blog entries, and so on. Even if I broke those links they would be unlikely to go away and to stop generating inbound requests. So on the whole I might as well redirect them; the privacy leak has already happened by the time that my webserver can do anything (assuming I keep a HTTP site at all).


Comments on this page:

From 67.180.195.209 at 2013-08-14 04:58:41:

If you are worried about potential leaks during the http to https 301 you could set the HSTS header. This causes the browser to automatically send all requests to https for a domain without needing to hit the webserver for a redirect.

The other positive is that you gain protection from man-in-the-middle attacks post first visit. This is useful where some countries hijack a connection and forcefully decrypt the connection between the proxy and the user.

Although there is still an exposure risk during the very first visit for a user, or if the HSTS cache expires, it is still better than redirecting every request via a 301.
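As a rough illustration of setting the header, here is a minimal sketch for a Python WSGI application. The `Strict-Transport-Security` header name and its `max-age` directive are the standard ones; the one-year value, the `add_hsts` wrapper, and the demo helpers are assumptions made up for this example, and a real site would more likely set the header in its web server configuration.

```python
HSTS_VALUE = "max-age=31536000"  # one year; an assumed value, pick your own


def add_hsts(app, hsts_value=HSTS_VALUE):
    """Wrap a WSGI app so HTTPS responses carry Strict-Transport-Security."""

    def wrapped(environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Browsers ignore HSTS sent over plain HTTP, so only add it on HTTPS.
            if environ.get("wsgi.url_scheme") == "https":
                headers = [
                    (k, v) for (k, v) in headers
                    if k.lower() != "strict-transport-security"
                ]
                headers.append(("Strict-Transport-Security", hsts_value))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_hsts)

    return wrapped


def demo_app(environ, start_response):
    """A stand-in application used only to exercise the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def response_headers(app, scheme):
    """Call a WSGI app once and return its response headers as a dict."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    body = app({"wsgi.url_scheme": scheme, "REQUEST_METHOD": "GET"}, start_response)
    list(body)  # drain the response iterable
    return captured
```

Calling `response_headers(add_hsts(demo_app), "https")` yields a dict that includes the HSTS header, while the same call with `"http"` does not, since the header only means something when it arrives over HTTPS.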

More info on HSTS: https://www.owasp.org/index.php/HTTP_Strict_Transport_Security

-- Derek

By cks at 2013-08-14 15:08:21:

Thanks for the pointer to HSTS; that's an interesting and worthwhile option (and one I've now added to my server configuration).

It's tempting to add it even to our SSL-only sites to help users who may type URLs directly and leave out the scheme (in which case I think browsers often default to http).

From 75.119.247.31 at 2013-08-14 17:45:34:

If you're exploring SSL/TLS, this site is something that's worth poking around with:

Note the "Do not show the results on the boards" checkbox.

After a bit of trial and error converting a bunch of work systems to HTTPS, I've found the following Apache mod_rewrite rule to create the least hassle:

   RewriteEngine  on
   RewriteCond    %{HTTPS} off
   RewriteRule    (.*) https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

One can put that on just about any system, and it will Just Work. You don't have to worry about VirtualHosts stanzas, or ServerName variables, or if people (or scripts) access the system by IP address. The client is basically told "do the exact same request except with the HTTPS protocol scheme".

Hardcoding any value tends to cause odd results when one has to go in via VPN (or SSH port-forwarding) and the redirect makes assumptions. It's also why I hate when an administrative web page doesn't use relative paths in the HTML, and hardcodes either the hostname or IP of the device. As an IT person there are times I have to jump through hoops to get to an admin web page because of firewalls and such, and I don't want the web page making assumptions about network reachability.
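For systems that are not behind Apache, the same "reuse whatever the client sent" idea can be sketched in Python against a WSGI-style environ. This is only an approximation of the rewrite rule above, not a drop-in equivalent; `https_redirect_target` and `redirect_response` are hypothetical names, and the re-quoting of the path is an assumption about how one would undo WSGI's decoding of `PATH_INFO`.

```python
from urllib.parse import quote


def https_redirect_target(host_header, path, query_string=""):
    """Build the HTTPS URL for the same host, path, and query string."""
    url = "https://" + host_header + path
    if query_string:
        url += "?" + query_string
    return url


def redirect_response(environ):
    """Return (status, headers) for a plain-HTTP request, or None if HTTPS."""
    if environ.get("wsgi.url_scheme") == "https":
        return None
    # Use the Host header exactly as sent (name or IP address); fall back to
    # SERVER_NAME only when the client sent no Host header at all.
    host = environ.get("HTTP_HOST") or environ["SERVER_NAME"]
    # Join SCRIPT_NAME and PATH_INFO to recover the full requested path, then
    # re-quote it because WSGI hands the path over already URL-decoded.
    raw_path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
    path = quote(raw_path or "/", safe="/;=,", encoding="latin1")
    target = https_redirect_target(host, path, environ.get("QUERY_STRING", ""))
    return "301 Moved Permanently", [("Location", target)]
```

Nothing here names a hostname or address, so a request that arrives via an IP address or a port-forwarded name is sent back to that same name over HTTPS.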

From 76.113.49.212 at 2013-08-14 20:53:27:

There is also this answer:

http://www.penny-arcade.com/comic/2007/01/10/

By trs80 at 2013-08-17 00:30:02:

Yep, HSTS is the way to go. Turning off port 80 is only really an option for fresh domains.

From 87.79.78.105 at 2013-08-18 01:51:14:

I wish there were a way to denote in DNS that a domain expects HTTPS only. It’s a much-discussed idea that would extend HSTS-type protection even to the initial request, but it has unfortunately not happened so far.

Aristotle Pagaltzis

From 76.10.173.95 at 2013-08-24 01:16:44:

It follows that the really secure approach is to shut off your HTTP site entirely; don't even have a web server responding on port 80. If people can't connect they can't send a HTTP request to be snooped on.

Having just read this today, I'm wondering if any clients would send an optimistic data packet before getting the RST packet. A single data packet could still contain plenty of sensitive data.

Written on 14 August 2013.
