How not to use Apache's RewriteRule directive in a reverse proxy

March 5, 2021

Recently we needed to set up a reverse proxy (to one of our user run web servers) that supported WebSocket for a socket.io based user application. Modern versions of Apache have a mod_proxy_wstunnel module for this, and you can find various Apache configuration instructions for how to use it on places like Stackoverflow. The other day I shot my foot off by not following these instructions exactly.

What I wrote was a configuration stanza that looked like this:

RewriteCond %{REQUEST_URI} ^/socket.io [NC]
RewriteCond %{QUERY_STRING} transport=websocket [NC]
# This is where my mistake is:
RewriteRule (.*) "ws://ourhost:port/$1" [P,L]

During some debugging I discovered that this was causing our main Apache server to make requests to the backend server that looked like 'GET //socket.io/.... HTTP/1.1'. The user's application was very unhappy with the leading double slash, as well it might be.

This is my old friend how not to use ProxyPass back in another form. The problem is that we aren't matching the leading slashes between the original path and the proxied path; we're taking the entire path of the request (with its leading /) and putting it on after another slash. The correct version of the RewriteRule, as the Apache documentation will show you, is:

RewriteRule ^/?(.*) "ws://ourhost:port/$1" [P,L]

In my example the '?' in the regular expression pattern is unnecessary since this rewrite rule can't trigger unless the request has a leading slash, but the mod_proxy_wstunnel version doesn't require such a match in its rewrite conditions. On the other hand, I'm not sure I want to enable 'GET socket.io/...' to actually work; all paths in GET requests should start with a slash.

PS: This is a ws: reverse proxy instead of a wss: reverse proxy because we don't support TLS certificates for people running user run web servers (they would be quite difficult to provide and manage). The virtual host that is reverse proxied to a user run web server can support HTTPS, and the communication between the main web server and the user run web server happens over our secure server room network.

Sidebar: How I think I made this mistake

We initially tried to get this person's reverse proxied environment working inside a <Location> block for their personal home page on our main server, where the public path was something like '/~user/thing/'. In this situation I believe that what would be the extra leading slash has already been removed by Apache's general matching, and so the first pattern would have worked. For various reasons we then shifted them over to a dedicated virtual host, with no <Location> block, and so suddenly the '(.*)' pattern was now scooping up the leading / after all.

Written on 05 March 2021.
« Systemd needs (or could use) a linter for unit files
Some views and notes on ZFS deduplication today »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Mar 5 23:57:44 2021
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.