Wandering Thoughts


Firefox's WebRender has mixed results for me on Linux

I wrote last week about how WebRender introduced bad jank in my Linux Firefox under some circumstances. However, it turns out that WebRender for me has mixed results even outside of that issue, as I reported on Twitter:

[...] In the bad news, the WebRender Firefox is clearly less responsive on CSS hovers on golangnews.com than the regular one.

(The specific issue I see is that if I wave the mouse up and down the page, the hover highlight can visibly lag behind the mouse position a bit. With WebRender off, this doesn't happen. The laggy performance shows up clearly in the Performance recordings in Web Developer tools, where I can see clear periods of very low FPS numbers and the overall average FPS is unimpressive.)

This is on my home machine, which has integrated Intel graphics (on a decent CPU) and a HiDPI screen. Today I was in the office and so using my office machine, which uses a Radeon RX 550 graphics card (because it's an AMD machine and good AMD CPUs don't have onboard GPUs) and dual non-HiDPI screens, and in very light testing my Firefox was using WebRender and didn't seem as clearly laggy on CSS hovers on golangnews.com as my home machine.

(This isn't quite a fair test because my office machine isn't running quite as recent a build of Nightly as my home machine is.)

At one level, this is unsurprising. On Linux, WebRender has long had block and allow lists that depended both on what sort of graphics you had and what screen resolution you were running at (this was in fact one of the confusing bits of WebRender on Linux, since Firefox didn't make it clear what about your setup was allowing or stopping WebRender). Presumably Mozilla has good reason for these lists, in that how well WebRender performed likely varies from environment to environment, or more exactly from some combination of GPU and resolution to other combinations.

At another level, this is disappointing. Firefox's WebRender is supposed to be a great performance improvement, delivering smooth 60 FPS animation (presumably including CSS effects), but in practice some combination of Firefox WebRender, the Linux X11 graphics stack, and my specific hardware results in clearly worse results than the old way. All of that effort on everyone's part has delivered an outcome that makes me turn off WebRender and plan to ignore it until I have no other choice. This is especially personally disappointing because WebRender is a necessary enabler for things like hardware accelerated video playback.

(I have to confess that I've held my nose and turned to Chrome for the single job of displaying a couple of sites where I really care about smooth video performance. I use Chrome Incognito windows for this, which at least limits some of the damage. I still hold my views on walking away from Chrome, but I'm a pragmatist.)

FirefoxWebRenderMixed written at 00:18:15


Firefox on Linux has not worked well with WebRender for me so far

A while back I wrote about my confusion over Firefox's hardware accelerated video on Linux; as part of that confusion, I attempted to turn on all of the preferences and options necessary to make hardware accelerated video work. Part of the requirements is (or was) forcing on WebRender (also), which is in large part about having the GPU do a lot more web page rendering than it does in Firefox today. Even after I seemed to not get hardware accelerated video, I left WebRender turned on in the Firefox instance I was using for this. Well, at least for a while. After a while I noticed that that Firefox instance seemed more prone to jank than it had been before when I did things like flip back to a Firefox window I hadn't been using for a while. Reverting WebRender back to the default setting of being off fixed the problem.

(I probably turned this off around the time of Firefox 81.)

Very recently, my personal build of Firefox Nightly started experiencing weird jank. Most of my Firefox windows are on my first (primary) fvwm virtual page and performed normally, but the moment I flipped to another virtual page (often to substitute for not having dual monitors right now) the Firefox window (or windows) that was on that new page would get extremely slow to respond. Today I determined that this was due to WebRender recently getting turned on by default in my Nightly environment; forcing WebRender off via gfx.webrender.force-disabled eliminated the problem. I cross-checked about:support output between my old normal Nightly build and the new Nightly build while it had the jank problem and verified that the only difference was WebRender (and hardware rendering) being turned on.
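This sort of cross-check can be done mechanically. What follows is a minimal sketch, assuming you have copied the text of about:support from each build; the sample strings and labels below are made up for illustration and are not real about:support output. It uses Python's difflib to show only the lines that differ.

```python
import difflib


def diff_support_text(old_text, new_text):
    """Return unified-diff lines between two about:support text dumps."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="old-nightly",
            tofile="new-nightly",
            lineterm="",
            n=0,  # no context lines, so only the changed lines are shown
        )
    )


# Made-up sample dumps; in practice you would paste in the real text.
old_dump = "Compositing: Basic\nWindow Protocol: x11\n"
new_dump = "Compositing: WebRender\nWindow Protocol: x11\n"

for line in diff_support_text(old_dump, new_dump):
    print(line)
```

Dropping the context lines keeps the output short, which matters because about:support is long and most of it is identical between two builds.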

(This change is so recent it's not on the WebRender status page, which still says that WebRender is not enabled on large screens like mine on Intel GPUs. The change is bugzilla #1675768.)

Unfortunately this is not a simple problem. It's not an issue of excessive CPU or GPU usage, as far as I can tell, and it's not caused simply by having a Firefox window in an additional fvwm virtual page, because it doesn't happen in a test Firefox profile that's running the same binary. That it happens only if I move virtual pages makes it rather odd, because fvwm actually implements changing between virtual pages by moving all of the windows that aren't supposed to be there to off screen (and moving all of the other ones back on). So a Firefox window on the screen sees no difference in X11 protocol level window coordinates regardless of what virtual page fvwm is displaying (although offscreen Firefox windows will have different coordinates).

(I've also tried running Firefox with the environment variable 'INTEL_DEBUG=perf', from here, and there's no smoking gun. However, the change's bug mentions 'vsync' every so often and as far as I can see there's no way I can check for excessive waits for vsync, which could be one source of stalls.)

PS: Because I use fvwm, bug #1479135 - Black border around popups with non-compositing window manager usually makes it pretty obvious if WebRender is on in one of my Firefox instances (I use uBlock Origin's on the fly element picker a lot, which requires calling up its addon menu, which shows these black borders).

FirefoxWebRenderFailure written at 22:59:02


Firefox has a little handy font-related thing on Unix (or at least Linux)

On Linux in specific and Unix more generally, there is a notion of default system fonts that are known by the magic names of 'serif', 'sans-serif', and 'monospace'. What these magic names map to is determined by a complicated pile of decisions that are spread around the system in the form of Fontconfig (also). These decisions are generally fairly opaque and hard to peer into unless you're relatively experienced with command line tools.
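For people who are comfortable with command line tools, Fontconfig's fc-match program reports what a font name currently resolves to. Here is a minimal sketch that runs it for the three magic names. It assumes fc-match is installed and on your $PATH, and it returns the raw output lines because the exact format can vary between systems.

```python
import shutil
import subprocess

GENERIC_NAMES = ("serif", "sans-serif", "monospace")


def resolve_generic_fonts(names=GENERIC_NAMES):
    """Map each generic font name to the raw output of 'fc-match NAME'.

    Returns an empty dict if fc-match is not available.
    """
    if shutil.which("fc-match") is None:
        return {}
    results = {}
    for name in names:
        proc = subprocess.run(
            ["fc-match", name],
            capture_output=True,
            text=True,
            check=False,
        )
        results[name] = proc.stdout.strip()
    return results


for generic, match in resolve_generic_fonts().items():
    print(f"{generic}: {match}")
```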

Unless you configure it otherwise through Preferences, Firefox normally uses these standard font names if a website doesn't set specific fonts (or uses the generic names, which often show up as the last fallback in CSS). This means that Firefox is exposed to mysterious changes in what your Linux system actually maps these fonts to, which can vary. However, I discovered today that Firefox has a very convenient feature here.

If you go to the general section of Preferences and haven't customized your font choices, you will probably see the "Default font" listed as something like "Default (DejaVu Serif)". What this means is that Firefox is using the default meaning of 'serif', but is also telling you what font it actually maps to. If you go into "Advanced..." and have not customized your font choices, you can see what all three of the magic names map to. This can be very handy for sorting out strange font issues.

(This doesn't quite give you complete details, as it omits the Fontconfig style, which can vary and theoretically may make a difference.)

Firefox's Web Developer tools will give you a version of this for any web page (including ones that don't set any CSS fonts and so are using the system ones). However, as far as I can see the "fonts" pane won't tell you exactly why 'DejaVu Sans' is being used, just that it is (and that it's a system font). I also don't know if this will peer through other Fontconfig font aliases to tell you what they map to, or if it will faithfully report them under their pre-alias names.

(Because I was curious I just checked and Chrome does not show this information in its font preferences; it just tells you you're using the generic names. Chrome also appears to give less information about fonts in its Web Developer tools, but I probably don't know what I'm doing there.)

FirefoxUnixLittleFontBit written at 23:38:08


Web page generation systems should support remapping external URLs

Some web pages and web sites are hand authored, but many more are generated (dynamically or statically) through web page generation systems and content management systems of various sorts. Also, often our writing in these systems has links to external pages; to other people's writing, to reference documentation, to Wikipedia, to whatever. This presents us (the people running web sites and writing on them) a long term problem, because in practice some or many of those external URLs will eventually change.

Today, we don't have good support in our page generation systems for this unfortunate reality of web life. If you find out that an external URL you reference has moved, you generally have to hunt around through all of your content and update it, either completely manually or at best semi-automatically. The unsurprising result of this is that people often don't, even when they know old links have changed; it's simply too much work to go back through everything and fix it all up.

So here's an idea: all of our web page generation systems should support a remapping file (or data source) for external URLs, which would list the old URL and its new replacement. A fancier version could also have site matching, prefix matching or general pattern matching. When you're generating a page and the page has a link pointing to such an old URL, it would automatically get replaced with the new URL. The obvious advantage of this remapping system is that it's less work; the subtle one is that it's automatically universal, with you not having to hunt down every last obscure corner of the site where the URL is mentioned.
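To make the idea concrete, here is a minimal sketch of such a remapping lookup with exact matches and prefix matches. The mapping data, the URLs, and the function name are all hypothetical; a real system would presumably load the mappings from a file or database.

```python
# Hypothetical remapping data: old URL -> new URL.
EXACT_REMAPS = {
    "http://old.example.org/page": "https://new.example.org/page",
}

# Hypothetical prefix remaps: (old prefix, new prefix), checked in order.
PREFIX_REMAPS = [
    ("http://docs.example.com/v1/", "https://docs.example.com/archive/v1/"),
]


def remap_url(url, exact=EXACT_REMAPS, prefixes=PREFIX_REMAPS):
    """Return the replacement for an old external URL, or url unchanged."""
    if url in exact:
        return exact[url]
    for old_prefix, new_prefix in prefixes:
        if url.startswith(old_prefix):
            # Keep whatever followed the old prefix.
            return new_prefix + url[len(old_prefix):]
    return url


print(remap_url("http://old.example.org/page"))
print(remap_url("http://docs.example.com/v1/intro.html"))
print(remap_url("https://unrelated.example.net/"))
```

Exact matches are checked first so that a single moved page can override a broader prefix rule.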

(In some systems it would make sense to automatically edit this change into the source data; generally I think those are systems where the source data is already held in a database by the web generation system and is not edited by people by hand.)

One additional advantage of doing this in the web page generation system instead of in external tools is that the web page generator generally has the best idea if what it's really dealing with is a link target, instead of some other text that happens to mention or include the URL. You probably don't want to rewrite mentions of old URLs in plain text, for example, especially not automatically.
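As an illustration of the difference, this sketch rewrites only the href of links in already generated HTML and leaves the visible text alone. It is a simplification that uses a regular expression over double-quoted href attributes; a real page generator would do this on its own internal representation of links, which is the point of the paragraph above. The function name and URLs are hypothetical.

```python
import re

# Matches the href value of an <a> tag with a double-quoted attribute.
HREF_RE = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def rewrite_link_targets(html, remap):
    """Replace old URLs in link targets only, not in surrounding text."""

    def fix(match):
        old_url = match.group(2)
        return match.group(1) + remap.get(old_url, old_url) + match.group(3)

    return HREF_RE.sub(fix, html)


remap = {"http://old.example.org/x": "https://new.example.org/x"}
page = '<p>See <a href="http://old.example.org/x">http://old.example.org/x</a></p>'
print(rewrite_link_targets(page, remap))
```

The link now points at the new location while the text that mentions the old URL is untouched.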

PS: This remapping should be applied repeatedly, because replacement URLs can themselves get replaced. Yes, sure, theoretically people could go through and update the original mappings again, but let's make it easy and as foolproof as possible. Since link rot is going to happen, we should make it easy to deal with.
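A sketch of repeated application, assuming a simple dictionary of exact old-to-new mappings (the names and URLs are hypothetical). The one thing to be careful of is a loop in the mapping data, so this stops if it sees a URL twice or after a fixed number of hops.

```python
def resolve_final_url(url, mapping, max_hops=10):
    """Follow old -> new mappings until no further replacement applies."""
    seen = {url}
    for _ in range(max_hops):
        target = mapping.get(url)
        if target is None:
            return url  # nothing further to replace
        if target in seen:
            return url  # loop in the mapping data; stop here
        seen.add(target)
        url = target
    return url


mapping = {
    "http://a.example.org/post": "http://b.example.org/post",
    "http://b.example.org/post": "https://c.example.org/post",
}
print(resolve_final_url("http://a.example.org/post", mapping))
```

With this, adding a second mapping for a replacement URL is enough; the original entry does not have to be found and edited.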

(This idea was sparked by Aristotle Pagaltzis linking to a web.archive.org copy of a diveintomark.org entry in a comment on this entry, causing me to realize that I had entries with direct links to diveintomark that needed to be updated to web.archive.org. This shows both how long it can take me to write some Wandering Thoughts entries and how I still haven't gotten around to finding and editing all of those entries (or implementing a remapping file here).)

RemappingExternalLinks written at 23:35:20


Firefox is improving its handling of HTTP Basic Authentication (on Unix)

We're big users of HTTP Basic Authentication, and I use Firefox (and what is effectively Firefox Nightly). Recently I noticed a nice little quality of life improvement in Firefox Nightly's handling of HTTP Basic Authentication, at least for people on Unix.

(I haven't tried to check how Firefox on Windows handles HTTP Basic Authentication, either in released versions or in Nightly.)

Today, if you go to a website that requires HTTP Basic Authentication with even the very latest Firefox 81.0.1, you'll get a very old fashioned modal dialog popup (at least on Unix), with all of the many problems that those have. These days, at least the popup only blocks that Firefox window instead of all Firefox windows, but it does stop you from doing things like interacting with other tabs you have in the window, and if you used "Open Link in New Tab", Firefox force-switches you to that tab (and throws the modal dialog in your face) the moment the HTTP Basic Authentication pops up.

(If you're someone who mostly or entirely uses tabs instead of windows, this means that a HTTP Basic Authentication prompt basically locks your Firefox up.)

At some point recently, Firefox Nightly changed this so that the authentication prompt is effectively a modal thing attached to the tab, not a separate modal dialog popup (in other words, how Chrome behaves on Unix, more or less). This doesn't jump in front of you in quite the same way, is much less annoying, and no longer locks you out of other tabs in that Firefox window until you answer the authentication prompt.

(Firefox still force-switches you into the tab that wants you to authenticate, in contrast to Chrome, which just lets the tab sit there if it's not the current tab.)

On the one hand, this is a nice quality of life improvement that makes HTTP Basic Authentication better for Firefox people on Unix. On the other hand, it turns out that I'm so used to how Firefox works now that the new way feels a bit jarring every time I run into it (which is more often than you'd think, but that's another entry). I'm sure I'll get completely used to it in a few months, though.

FirefoxBasicAuthBetter written at 00:21:31


My take on permanent versus temporary HTTP redirects in general

When I started digging into the HTTP world (which was around the time I started writing DWiki), the major practical difference between permanent and temporary HTTP redirects was that browsers aggressively cached permanent redirects. This meant that permanent redirects were somewhat of a footgun; if you got something wrong about the redirect or changed your mind later, you had a problem (and other people could create problems for you). While there are ways to clear permanent redirects in browsers, they're generally so intricate that you can't count on visitors to do them (here's one way to do it in Firefox).

(Since permanent redirects fix both that the source URL is being redirected and what the target URL is, they provide not one but two ways for what you thought was permanent and fixed to need to change. In a world where cool URLs change, permanence is a dangerous assumption.)

Also, back then, in theory syndication feed readers, web search engines, and other things that care about the canonical URLs of things would use a permanent redirect as a sign to update what that was. This worked some of the time in some syndication feed readers for updating feed URLs, but definitely not always; software authors had to go out of their way to do this, and there were things that could go wrong (cf). Even back in those days, I don't know if web search engines paid much attention to it as a signal.

All of this got me to use temporary redirections almost all of the time, even in situations where I thought that the redirection was probably permanent. That Apache and other things made temporary redirections the default also meant that it was somewhat easier to set up my redirects as temporary instead of permanent. Using temporary redirects potentially meant somewhat more requests and a somewhat longer delay before some people with some URLs got the content, but I didn't really care, not when set against the downsides of getting a permanent redirect wrong or needing to change it after all.

In the modern world, I'm not sure how many people will have permanent HTTP redirects cached in their browsers any more. Many people browse in more constrained environments where browsers are throwing things out on a regular basis (ie phones and tablets), browsers have probably gotten at least a bit tired of people complaining about 'this redirect is stuck', and I'm sure that some people have abused that long term cache of permanent redirects to fingerprint their site visitors. On the one hand, this makes the drawback of permanent redirects less important, but on the other hand this makes their advantages smaller.

Today I still use temporary redirects most of the time, even for theoretically permanent things, but I'm not really systematic about it. Now that I've written this out, maybe I will start to be, and just say it's temporary for me from now onward unless there's a compelling reason to use a permanent redirect.

(One reason to use a permanent redirect would be if the old URL has to go away entirely at some point. Then I'd want as strong a signal as possible that the content really has migrated, even if only some things will notice. Some is better than none, after all.)

PermanentVsTemporaryRedirects written at 23:49:17

Permanent versus temporary redirects when handling extra query parameters on your URLs

In yesterday's entry on what you should do about extra query parameters on your URLs, I said that you should answer with a HTTP redirect to the canonical URL of the page and that I thought this should be a permanent redirect instead of a temporary one for reasons that didn't fit into the entry. Because Aristotle Pagaltzis asked, here is why I think permanent redirects are the right option.

As far as I know, there are two differences in client behavior (including web spider behavior) between permanent HTTP redirects and temporary ones: clients don't cache temporary redirects and don't consider them to change the canonical URL of the resource. If you use permanent redirects, you thus probably make it more likely that web search engines will conclude that your canonical URL really is the canonical URL and they don't need to keep re-checking the other one, at the potential downside of having browsers cache the redirect and never re-check it.
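On the wire, the choice comes down to the status code you send with the Location header. A minimal sketch, with a hypothetical helper function, using the standard library's HTTPStatus names for 301 (permanent) and 302 (temporary):

```python
from http import HTTPStatus


def redirect_response(location, permanent):
    """Return (status code, headers) for a simple HTTP redirect.

    301 is the classic permanent redirect and 302 the classic temporary
    one; 308 and 307 are their method-preserving counterparts.
    """
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return status.value, {"Location": location}


print(redirect_response("https://example.org/entry", permanent=True))
print(redirect_response("https://example.org/entry", permanent=False))
```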

So the question is if you'll ever want to change the redirect or otherwise do something else when you get a request with those extra query parameters. My belief is that this is unlikely. To start with, you're probably not going to reuse other people's commonly used extra query parameters for real query parameters of your own, because other people use them and will likely overwrite your values with theirs.

(In related news, if you were previously using a 's=..' query parameter for your own purposes on URLs that people will share around social media, someone out there has just dumped some pain on top of you. Apparently it may be Twitter instead of my initial suspect of Slack, based on a comment on this entry.)

If you change the canonical URL of the page, you're going to need a redirect for the old canonical URL anyway, so people with the 'extra query parameters' redirect cached in their browser will just get another redirect. They can live with that.

The only remaining situation I can think of where a cached permanent redirection would be a problem would be if you want to change your web setup so that you deliberately react to specific extra query parameters (and possibly their values) by changing your redirects or rendering a different version of your page (without a redirect). This strikes me as an unlikely change for most of my readers to want to make (and I'm not sure how common customizing pages to the apparent traffic source is in general).

(Also, browsers don't cache permanent redirects forever, so you could always turn the permanent redirects into temporary ones for a few months, then start doing the special stuff.)

PS: I don't think most clients do anything much about changing the 'canonical URL' of a resource if the initial request gets a permanent redirect. Even things like syndication feed readers don't necessarily update their idea of your feed's URL if you provide permanent redirects, and web browsers are even less likely to change things like a user's bookmarks. These days, even search engines may more or less ignore it, because people do make mistakes with their permanent redirects.

HandlingExtraQueryParametersII written at 00:10:04


What you should do about extra query parameters on your URLs

My entry on how web server laxness created a de facto requirement to accept arbitrary query parameters on your URLs got a number of good comments, so I want to agree with and magnify the suggestion about what to do about these parameters. First off, you shouldn't reject web page requests with extra query parameters. I also believe that you shouldn't just ignore them and serve the regular version of your web page. Instead, as said by several commentators, you should answer with a HTTP redirect to the canonical URL of the web page, which will be stripped of at least the extra query parameters.

(I think that this should be a permanent HTTP redirect instead of a temporary one for reasons that don't fit within the margins of this entry. Also, this assumes that you're dealing with a GET or a HEAD request.)
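As a rough sketch of the approach, assuming you know the set of query parameters that your page really uses (the function name, parameter names, and URLs here are hypothetical, and this is not DWiki's actual code), the decision could look like this:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_redirect(url, known_params):
    """Return (301, canonical_url) if url has extra query parameters.

    Returns None when there is nothing to strip, in which case the page
    should simply be served. Only meant for GET and HEAD requests.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name in known_params]
    if len(kept) == len(pairs):
        return None
    canonical = urlunsplit(parts._replace(query=urlencode(kept)))
    return 301, canonical


print(canonical_redirect("https://example.org/blog/entry?utm_source=x&s=09", set()))
print(canonical_redirect("https://example.org/blog/?page=2&fbclid=abc", {"page"}))
print(canonical_redirect("https://example.org/blog/?page=2", {"page"}))
```

Using an allow list of known parameters means you don't have to keep up with whatever new tracking parameter appears next.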

Answering with a HTTP redirect instead of the page has two useful or important effects, as pointed out by commentators on that entry. First, any web search engines that are following those altered links won't index duplicate versions of your pages and get confused about which is the canonical one (or downrate you in results for having duplicate content). Second, people who copy and reshare the URL from their browser will be sharing the canonical URL, not the messed up version with tracking identifiers and other gunk. This assumes that you don't care about those tracking identifiers, but I think this is true for most of my readers.

(In addition, you can't count on other people's tracking identifiers to be preserved by third parties when your URLs get re-shared. If you want to track that sort of stuff, you probably need to add your own tracking identifier. You might care about this if, for example, you wanted to see how widely a link posted on Facebook spread.)

However, this only applies to web pages, not to API endpoints. Your API endpoints (even GET ones) should probably error out on extra query parameters unless there is some plausible reason they would ever be usefully shared through social media. If your API endpoints never respond with useful HTML to bare GETs, this probably doesn't apply. If you see a lot of this happening with your endpoints, you might make them answer with HTTP redirects to your API documentation or something like that instead of some 4xx error status.

(But you probably should also try to figure out why people are sharing the URLs of your API endpoints on social media, and other people are copying them. You may have a documentation issue.)

PS: As you might suspect, this is what DWiki does, at least for the extra query parameters that it specifically recognizes.

HandlingExtraQueryParameters written at 23:41:59


URL query parameters and how laxness creates de facto requirements on the web

One of the ways that DWiki (the code behind Wandering Thoughts) is unusual is that it strictly validates the query parameters it receives on URLs, including on HTTP GET requests for ordinary pages. If a HTTP request has unexpected and unsupported query parameters, such a GET request will normally fail. When I made this decision it seemed the cautious and conservative approach, but this caution has turned out to be a mistake on the modern web. In practice, all sorts of sites will generate versions of your URLs with all sorts of extra query parameters tacked on, give them to people, and expect them to work. If your website refuses to play along, (some) people won't get to see your content. On today's web, you need to accept (and then ignore) arbitrary query parameters on your URLs.
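To illustrate the difference between the two approaches, here is a small sketch of strict validation versus accept-and-ignore. This is not DWiki's actual code; the function names, the exception, and the parameter names are made up for the example.

```python
from urllib.parse import parse_qsl


class UnexpectedParameter(Exception):
    """Raised by the strict check when unknown query parameters appear."""


def check_query_strict(query, allowed):
    """Fail the request if the query string has any unknown parameter."""
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    extra = sorted(names - set(allowed))
    if extra:
        raise UnexpectedParameter(", ".join(extra))


def filter_query_lax(query, allowed):
    """Keep the parameters we understand and silently drop the rest."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return [(name, value) for name, value in pairs if name in allowed]


query = "page=2&s=09&utm_source=somewhere"
print(filter_query_lax(query, {"page"}))
try:
    check_query_strict(query, {"page"})
except UnexpectedParameter as err:
    print("strict check rejects:", err)
```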

(Today's new query parameter is 's=NN', for various values of NN like '04' and '09'. I'm not sure what's generating these URLs, but it may be Slack.)

You might wonder how we got here, and that is a story of lax behavior (or, if you prefer, being liberal in what you accept). In the beginning, both Apache (for static web pages) and early web applications often ignored extra query parameters on URLs, at least on GET requests. I suspect that other early web servers also imitated Apache here, but I have less exposure to their behavior than Apache's. My guess is that this behavior wasn't deliberate, it was just the simplest way to implement both Apache and early web applications; you paid attention to what you cared about and didn't bother to explicitly check that nothing else was supplied.

When people noticed that this behavior was commonplace and widespread, they began using it. I believe that one of the early uses was for embedding 'where this link was shared' information for your own web analytics (cf), either based on your logs or using JavaScript embedded in the page. In the way of things, once this was common enough other people began helpfully tagging the links that were shared through them for you, which is why I began to see various 'utm_*' query parameters on inbound requests to Wandering Thoughts even though I never published such URLs. Web developers don't leave attractive nuisances alone for long, so soon enough people were sticking on extra query parameters to your URLs that were mostly for them and not so much for you. Facebook may have been one of the early pioneers here with their 'fbclid' parameter, but other websites have hopped on this particular train since then (as I saw recently with these 's=NN' parameters).

At this point, the practice of other websites and services adding random query parameters to your URLs that pass through them is so widespread and common that accepting random query parameters is pretty much a practical requirement for any web content serving software that wants to see wide use and not be irritating to the people operating it. If, like DWiki, you stick to your guns and refuse to accept some or all of them, you will drop some amount of your incoming requests from real people, disappointing would-be readers.

This practical requirement for URL handling is not documented in any specification, and it's probably not in most 'best practices' documentation. People writing new web serving systems that are tempted to be strict and safe and cautious get to learn about it the hard way.

In general, any laxness in actual implementations of a system can create a similar spiral of de facto requirements. Something that is permitted and is useful to people will be used, and then supporting that becomes a requirement. This is especially the case in a distributed system like the web, where any attempt to tighten the rules would only be initially supported by a minority of websites. These websites would be 'outvoted' by the vast majority of websites that allow the lax behavior and support it, because that's what happens when the vast majority work and the minority don't.

DeFactoQueryParameters written at 00:17:18


In practice, cool URLs change (eventually)

The idea that "cool URLs don't change" has been an article of faith for a very long time. However, at this point we have more than 20 years of experience with the web, and anyone who's been around for a significant length of time can tell you that in practice, cool URLs change all of the time (and I don't mean just minor changes like preferring HTTPS over HTTP). Over a sufficient length of time, internal site page layouts change (sometimes because URL design is hard), people move domains or hosts within a domain, and sometimes cool URLs even go away and must be resurrected, sometimes by hand (through people re-publishing and re-hosting things) and sometimes through the Wayback Machine. This decay in cool URLs is so pervasive and well recognized that we have a term for it, link rot.

(Of course, you're a good person, and your cool URLs don't change. But this is the web and we all link to each other, so it's inevitable that some other people's cool URLs that you link to will suffer from link rot.)

Despite link rot being widely recognized as very real, I think that in many ways we're in denial about it. We keep pretending (both culturally and technically) that if we wish hard enough and try hard enough (and yell at people hard enough), all important URLs will be cool URLs that are unchanging forever. But this is not the case and is never going to be the case, and it's long past time that we admitted it and started dealing with it. Whether we like it or not, it is better to deal with the world of the web as it is.

Culturally, we recite "cool URLs don't change" a lot, which makes it hard to talk about how best to evolve URLs over time, how to preserve content that you no longer want to host, and other issues like that. I don't think anyone's written a best practices document for 'so you want to stop having a web site (but people have linked to it)', never mind what a company can do to be friendly for archiving when it goes out of business or shuts down a service. And that's just scratching the surface; there's a huge conversation to be had about the web over the long term once we admit out loud that nothing is forever around here.

(The Archive Team has opinions. But there are some hard issues here; there are people who have published words on the Internet, not under CC licenses, and then decided for their own reasons that they no longer want those words on the Internet despite the fact that other people like them, linked to them a lot, and so on.)

Technically, how we design our web systems and web environments often mostly ignores the possibility of future changes in either our own cool URLs or other people's. What this means in more tangible terms is really a matter for other entries, but if you look around you can probably come up with some ideas of your own. Just look for the pain points in your own web publishing environment if either your URLs or other people's URLs changed.

(One pain point and sign of problems is that it's a thing to spider your own site to find all of the external URLs so you can check if they're still alive. Another pain point is that it can be so hard to automatically tell if a link is still there, since not all dead links either fail entirely or result in HTTP error codes. Just ask people who have links pointing to what are now parked domains.)
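The first half of that job, collecting the external URLs out of your own generated pages, can be sketched with Python's standard HTML parser. The class name, function name, and host names here are hypothetical, and this deliberately stops short of checking whether the links are alive, since (as noted) an HTTP status code alone doesn't tell you that.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ExternalLinkCollector(HTMLParser):
    """Collect href values of <a> tags that point at other hosts."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                host = urlsplit(value).hostname
                # Relative links have no host and are always internal.
                if host and host != self.own_host:
                    self.links.append(value)


def external_links(html, own_host):
    collector = ExternalLinkCollector(own_host)
    collector.feed(html)
    return collector.links


page = (
    '<a href="/local">x</a> '
    '<a href="https://example.com/a">y</a> '
    '<a href="https://blog.example.org/b">z</a>'
)
print(external_links(page, "blog.example.org"))
```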

CoolUrlsChange written at 00:41:56


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.