== Some things I've learned from transitioning a website to HTTPS

A while back I first added a HTTPS version of [[my personal site https://cks.mef.org/]] alongside the existing HTTP version and then [[decided that I was going to actively migrate it to HTTPS PragmaticHTTPtoHTTPS]]. The whole thing has been running for a few months now, so it seems about time to write up some things I've learned from it.

The first set of lessons I learned was about everything on my side, especially my own code. The first layer of problems was code et al with explicit '((http:))' bits in it; it was amazing and depressing how many places I was just automatically doing that (you could call this 'HTTP blindness' if you wanted a trendy term for it). The more subtle problem areas were things like caches, where the HTTP version of a page might differ from the HTTPS version yet I was storing them under the same cache key. I also ran into a situation where I wanted to generate output for a HTTP URL request but use the 'canonical' HTTPS URLs for links embedded in the result; this required adding a feature to DWiki. (There are small sketches of both issues at the end of this entry.)

(I also found a certain amount of other software that didn't handle the transition well. For example, the Fedora 19 version of ``mod_wsgi'' doesn't seem to cope with a single WSGI application group that's served over both HTTP and HTTPS; the _HTTPS_ environment value latches to one value and never changes.)

Once I had my own code working, I got to find out all sorts of depressing things about how other people's code deals with such a transition. In no particular order:

* While search engines did eventually switch over to returning HTTPS results and to crawling only the HTTPS version of my site, it took a surprisingly long time (and the switch may not be complete even now; it's hard to tell).
* Many syndication feed fetchers have not changed to the HTTPS version; they still request a HTTP URL and then get redirected. I will reluctantly concede that [[there are sensible reasons for this behavior RespectingRedirectsDownside]]. It does mean that the HTTP redirects will probably live on forever.
* A certain number of syndication feed fetchers still don't deal with HTTPS feeds, or at least with redirections to them. Yes, really, in 2013. Unfortunately two of these are FeedBurner and the common Planet software, both of which I at least sort of care about. This led to the 'generate the HTTP version but use the canonical HTTPS links' situation for my software.
* Some web spiders don't follow redirects for _robots.txt_. I decided not to redirect for that URL alone rather than block the spiders outright in the server configuration, partly because the former was a bit easier than the latter. (I already totally ban the spiders in _robots.txt_, which is one reason I wanted them to see it; there's a sketch of the 'redirect everything except _robots.txt_' idea at the end of this entry.)

Despite all of this, the process has been relatively straightforward and mostly without problems. To the extent that there were problems, I'm more or less glad to know about them (and to fix my code; it was always broken, I just didn't realize it).
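
As a concrete illustration of the cache-key issue and the 'latched' _HTTPS_ value, here is a minimal Python sketch. This is not DWiki's actual code; the function names and the cache structure are made up for the example. The two points it shows are that the request scheme gets read from the WSGI environ on every request and that the scheme becomes part of the cache key.

    # A minimal sketch, not DWiki's actual code: cache pages under a key
    # that includes the request scheme, and look the scheme up in the
    # WSGI environ on every request instead of saving it once.
    _page_cache = {}

    def request_scheme(environ):
        # Always consult this request's environ; stashing the result in a
        # module-level variable is how you wind up with a 'latched' scheme.
        return environ.get('wsgi.url_scheme', 'http')

    def cache_key(environ, path):
        # The HTTP and HTTPS renderings of a page can differ, so the
        # scheme has to be part of the key.
        return (request_scheme(environ), path)

    def cached_render(environ, path, render):
        key = cache_key(environ, path)
        if key not in _page_cache:
            _page_cache[key] = render(environ, path)
        return _page_cache[key]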
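
The 'serve a HTTP request but embed canonical HTTPS links' feature boils down to something like the following sketch. Again this is illustrative rather than DWiki's real code; the canonical host constant and the function names are assumptions for the example.

    # Illustrative only: embedded links always use the canonical HTTPS
    # form, no matter what scheme the request itself arrived over.
    CANON_SCHEME = 'https'
    CANON_HOST = 'cks.mef.org'

    def canon_url(path):
        # Build the canonical HTTPS form of a local URL path.
        return "%s://%s%s" % (CANON_SCHEME, CANON_HOST, path)

    def feed_link(title, path):
        # Used when generating the HTTP version of a syndication feed
        # for fetchers that can't follow redirects to HTTPS.
        return '<a href="%s">%s</a>' % (canon_url(path), title)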
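
Finally, the 'redirect everything to HTTPS except _robots.txt_' idea looks roughly like this if you express it as WSGI middleware. My real setup does this in the web server configuration instead, so treat this purely as a sketch (among other simplifications, it ignores query strings).

    # A sketch of HTTP-to-HTTPS redirection that leaves /robots.txt
    # alone so that spiders which won't follow redirects still see it.
    # My actual setup does this in the web server, not in WSGI code.
    def https_redirector(app):
        def wrapper(environ, start_response):
            if (environ.get('wsgi.url_scheme') == 'http'
                    and environ.get('PATH_INFO') != '/robots.txt'):
                host = environ.get('HTTP_HOST', 'cks.mef.org')
                loc = 'https://%s%s' % (host, environ.get('PATH_INFO', '/'))
                start_response('301 Moved Permanently',
                               [('Location', loc),
                                ('Content-Type', 'text/plain')])
                return [('Moved to %s\n' % loc).encode('ascii')]
            return app(environ, start_response)
        return wrapper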