My uncertainty over whether a URL format is actually legal
I was recently dealing with a program that runs in a configuration
that sometimes misbehaves when you ask it to create and display a
link to a relative URL like '/'. My vague memory suggested an
alternative version of the URL that might make the program leave
it alone, one with a scheme but no host, so I tried 'https:/' and
it worked. Then I tried to find out if this is actually a proper
legal URL format, as opposed to one that browsers just make work,
and now I'm confused and uncertain.
The first relatively definite thing that I learned is that file
URLs don't need all of those slashes; a URL of 'file:/tmp'
is perfectly valid and is interpreted the way you'd expect. This
is suggestive but not definite, since the "file" URL scheme is a
pretty peculiar thing.
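As a rough illustration of the file URL case, here is a minimal sketch using the WHATWG `URL` class that Node.js and browsers provide as a global. The helper name `normalizeFileUrl` is made up for this example, and this only shows what one WHATWG-style parser does, not what every program will do.

```typescript
// Hypothetical helper: parse a URL string with the WHATWG URL parser
// and return its normalized (serialized) form.
function normalizeFileUrl(input: string): string {
  return new URL(input).href;
}

// The short form and the three-slash form normalize to the same URL.
console.log(normalizeFileUrl("file:/tmp"));   // file:///tmp
console.log(normalizeFileUrl("file:///tmp")); // file:///tmp
```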
An absolute URL can leave out the scheme; '//mozilla.org/' is a valid URL that means 'the root of mozilla.org in whichever of HTTP and HTTPS you're currently using' (cf). Wikipedia's section on the syntax of URLs claims that the authority section is optional. However, the WHATWG specification's section on URL writing requires anything starting with 'http:' or 'https:' to be written with the host (because scheme relative special URL strings require a host). This also matches the MDN description. I think this means that my 'https:/path' trick is not technically legal, even if it works in many browsers.
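To make the browser behavior a bit more concrete, here is a minimal sketch of how a WHATWG-style parser resolves these forms against the current page, using the global `URL` class in Node.js or a browser. The `example.org` page URLs and the helper name `resolveAgainst` are assumptions for illustration only. The parser accepts 'https:/path' with a validation error rather than rejecting it, which fits with it working in practice while not being a valid URL string.

```typescript
// Hypothetical helper: resolve a reference against the URL of the page
// it appears on, the way a WHATWG-style URL parser would.
function resolveAgainst(reference: string, pageUrl: string): string {
  return new URL(reference, pageUrl).href;
}

// Made-up page URLs, one served over HTTPS and one over HTTP.
const httpsPage = "https://example.org/app/page";
const httpPage = "http://example.org/app/page";

// A scheme-relative URL picks up whatever scheme the page uses.
console.log(resolveAgainst("//mozilla.org/", httpsPage)); // https://mozilla.org/
console.log(resolveAgainst("//mozilla.org/", httpPage));  // http://mozilla.org/

// When the page's scheme matches, 'https:/path' is treated like '/path'.
console.log(resolveAgainst("https:/path", httpsPage)); // https://example.org/path

// When the schemes differ, the parser takes 'path' as the host instead.
console.log(resolveAgainst("https:/path", httpPage)); // https://path/
```

If this sketch is right, the trick only means 'the root of the current site' when the page itself was loaded over the same scheme that the URL names.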
Pragmatically, Firefox, Chrome, Konqueror, and Lynx (all on Linux) support this, but Links doesn't (people are extremely unlikely to use Lynx or Links with this program, of course). Safari on iOS also supports this, which is the extent of my easy testing. Since Chrome on Linux works, I assume that Chrome on other platforms, including Android, will; similarly I assume desktop Safari on macOS will work, and Firefox on Windows and macOS.
(I turned to specifications because I'm not clever enough at Internet search terms to come up with a search that wasn't far, far too noisy.)
PS: When I thought that 'https:/path' might be legal, I wondered if ':/path' was also legal (with the meaning of 'the current scheme, on the current host, but definitely an absolute path'). But that's likely more not legal than 'https:/path' and probably less well supported; I haven't even tried testing it.
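For what it's worth, here is a minimal sketch of what a WHATWG-style parser makes of ':/path', again using the global `URL` class in Node.js. The page URL and the helper name `resolveColonPath` are made up for illustration, and this says nothing about what any particular browser or other program actually does with it.

```typescript
// Hypothetical helper: resolve a reference against a page URL with the
// WHATWG URL parser.
function resolveColonPath(reference: string, pageUrl: string): string {
  return new URL(reference, pageUrl).href;
}

// Made-up page URL for the example.
const currentPage = "https://example.org/app/page";

// With no scheme before the colon, the whole string is treated as an
// ordinary relative path whose first segment is ':'.
console.log(resolveColonPath(":/path", currentPage)); // https://example.org/app/:/path
```

So under this parser ':/path' is an ordinary relative path and does not mean 'an absolute path on the current host'.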
Sidebar: Why I care about such an odd URL
The obvious way to solve this problem would just be to put the host
in the URL. However, this would get in the way of how I test new
versions of the program in question, where I really do want a
URL that means 'the root of the web server on whatever website this
is running on'. Yes, I know, that should be '/', but see above
about something mis-handling this sometimes in our configuration.
(I don't think it's Apache's ProxyPassReverse directive, because the URL is transformed in the HTML, and PPR doesn't touch that.)