Wandering Thoughts

2021-10-18

The cut and paste irritation in "smart" in-browser text editing

It's common for modern websites and browser-based applications (such as Grafana and Prometheus) to have some corner of their experience where you enter text. On Twitter or the Fediverse, you may want to write tweets; in Prometheus and Grafana, you may want to enter potentially complex metrics system expressions to evaluate, and so on. HTML has a basic but perfectly functional set of mechanisms for this, in the <textarea> and '<input type="text" ...>' form elements. However, increasingly often websites and apps feel that these are not good enough, and that what people really want is a more advanced experience that's more like a full-blown local editor.

(Sometimes this means code editor style syntax colouring, highlighting of matching brackets, smart indentation, and autocompletion (as in Grafana and Prometheus). Other times this is augmenting the basic job of text entry with additional features, as in Twitter.)

Unfortunately, quite a lot of the time an important feature seems to get left on the cutting room floor in the process of adding this smart text editing. Namely, good support for cut and paste (in the broad version that includes copying text as well). True, native browser cut and paste has a surprisingly large range of features (especially on Unix), but web editors often fumble some or much of this experience. For instance, on Unix (well, X; I can't speak for Wayland) you can normally highlight some text and then paste it with the middle mouse button. This works fine on normal HTML input objects (because the browsers get it right), but I have seen a wide range of behaviors on both the 'copy' and the 'paste' side with smart text editors. Some text editors only highlight text when you select it and don't let you copy it, especially to other applications; some copy but spray the text with Unicode byte order marks. Sometimes you can't paste in text with the middle mouse button, or at least what you paste will be misinterpreted.

The slower, more formal explicit Copy and Paste operations are more likely to work, but they aren't always a sure thing in things that claim to be text fields. Even when they work, sometimes Ctrl-C and Ctrl-V are intercepted and you can only perform them through menus. And I've seen systems where text would claim to paste but wind up mangled on input.
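
As an illustration of where things can go wrong, here's a minimal sketch of the pattern these smart editors generally follow (the names are hypothetical, not any particular site's code): the browser's native paste is cancelled and the editor re-inserts the text through its own machinery, and any bug in that machinery is where your paste gets dropped or mangled.

    // Hypothetical sketch of a "smart" editor taking over paste; the
    // selector and the insertion logic are made-up stand-ins.
    const editorNode = document.querySelector(".smart-editor");

    function insertIntoModel(text) {
      // Stand-in for the editor's real insertion step; any bug here is
      // where pasted text gets dropped or mangled.
      editorNode.textContent += text;
    }

    editorNode.addEventListener("paste", (event) => {
      event.preventDefault();  // discard the browser's native paste
      insertIntoModel(event.clipboardData.getData("text/plain"));
    });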

Unfortunately, it's easy for me to imagine how this happens during web development. First off, I believe that the Unix style 'select then paste with the middle mouse button' isn't widely supported outside of Unix, so people not developing on Unix can easily miss it entirely. General cut and paste is widely available, but it's also a generally unsexy thing that is neither as obviously attractive nor as frequently used as typing and editing text in a text field. Most of the time you write Tweets by hand or type in metrics system rule expressions (possibly with autocompletion). Copying them from or to elsewhere is much less common, and less common things accumulate bugs and are a lower priority for fixes. I doubt the web developers at these places actively want to break cut and paste; it just happens by all too easy accident.

(This is partly a grump, because I'm tired of my cut and paste not working or only half working unless I remember to do it in a way that I rarely use.)

WebEditingVsCutAndPaste written at 23:20:46

Getting some hardware acceleration for video in Firefox 93 on my Linux home desktop

A bit over a year ago, I wrote about my confusion over Firefox 80's potentially hardware accelerated video on Linux. More recently, I had some issues with Firefox's WebRender GPU-based graphics acceleration, which through some luck led to a quite useful and informative Firefox bug. The upshot of all of this, and some recent experimentation, is that I believe I've finally achieved some hardware acceleration for video playback (although verifying this was a bit challenging).

My home desktop has an Intel CPU and uses the built-in Intel GPU, and I use what is called a non-compositing window manager. In order to get some degree of GPU involvement in video playback in my environment, I need to be using hardware WebRender and generally need to force Firefox to use EGL instead of GLX with the gfx.x11-egl.force-enabled setting in about:config. This results in video playback that the intel_gpu_top monitoring program says is using about as much of the GPU as Chrome does when playing the same video. In the future no setting will be necessary, as Firefox is switching over to EGL by default. This appears to be about as good as I can get right now on my current hardware for the sort of videos that I'm most interested in playing smoothly.
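
For reference, here is that setting expressed as a user.js line (a sketch of my configuration; flipping it by hand in about:config works just as well):

    // user.js sketch: force Firefox to use EGL instead of GLX on X11.
    // Reportedly unnecessary once Firefox defaults to EGL.
    user_pref("gfx.x11-egl.force-enabled", true);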

The Arch Linux wiki has a much longer list of steps to get VA-API acceleration. Using those as guidelines, and after some spelunking of the Firefox source and strategic use of a $MOZ_LOG value of 'PlatformDecoderModule:5,Dmabuf:5', the settings that leave things not reporting that VA-API is disabled (for my test video) are media.ffmpeg.vaapi.enabled set to true and media.rdd-process.enabled set to false. Based on an inspection of the current Firefox source code, using an RDD process disables VA-API unconditionally. The current debugging output you want to see (from the Dmabuf module) is:

D/Dmabuf nsDMABufDevice::IsDMABufVAAPIEnabled: EGL 1 DMABufEnabled 1  media_ffmpeg_vaapi_enabled 1 CanUseHardwareVideoDecoding 1 !XRE_IsRDDProcess 1

If any of those are zero, you are not going to be using VA-API today (and the platform decoder module logging will report that VA-API is disabled by the platform). Unfortunately, Firefox's about:support doesn't seem to say anything about how it's doing video playback (although it does have a section for audio), so we have to resort to this sort of digging.

(You need the Dmabuf logging to determine just why VA-API is disabled by the platform, and then having PlatformDecoderModule logging is helpful to understand what else is going on.)
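
Pulling the pieces together, this is a sketch of what I'm setting (the $MOZ_LOG value goes into Firefox's environment when you start it, not into about:config):

    // user.js sketch of the VA-API related prefs discussed above.
    user_pref("media.ffmpeg.vaapi.enabled", true);   // allow VA-API decoding
    user_pref("media.rdd-process.enabled", false);   // currently, the RDD process disables VA-API
    // Start Firefox with MOZ_LOG="PlatformDecoderModule:5,Dmabuf:5" set in
    // the environment to get the Dmabuf debugging output shown above.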

However, after doing all of this to enable VA-API, it appears that VA-API doesn't accelerate decoding of my test video; the CPU and GPU usage is basically the same whether or not VA-API is theoretically enabled. Nor does Chrome seem to do any better here.

My tentative conclusion from the log output and intel_gpu_top information is that I'm probably now using the GPU in Firefox to actually display the video on screen, instead of blasting the pixels into place with the CPU, but video decoding is probably still not hardware accelerated. Running vainfo says that VA-API acceleration is available in general, so this may be either a general software issue or simply that the particular video format can't have its decoding hardware accelerated.

Firefox93MyVideoAcceleration written at 00:36:33

2021-09-29

My changing (citation) style of external links here on Wandering Thoughts

When I started writing Wandering Thoughts, this blog, I tended to link to external URLs using at best a few words about what they were, and sometimes even less obvious link text and surrounding text than that. This is a style that's still quite in favour on the web, in blogging and other usage, and you can find it used by many people (sometimes even me, still). However, over time I switched to more often quoting the full title of the page I was linking to, and these days my link citation style has moved towards also mentioning the author's name in the surrounding text (a recent example is in my entry on how stack size is invisible in C). This style probably comes across as tilted towards academic writing, but I haven't adopted it because I work at a university.

Instead, the growing amount of information I include about the link is a quiet reaction to the unfortunate fact that over time, an increasing number of URLs will stop working. More and more I've come to feel that the more information I include about a URL, the better for the future, both for me and for other people. If I put in the full title and even the name the author uses, there's a higher chance that some copy of the page can be found in search engines, even if the domain changes, the site is restructured, the form of the URLs all change, and so on.

I feel that this is especially important for future readers, for the simple reason that they're more likely to stumble over broken old links and care about it than I am. It would be nice for me to go through all of Wandering Thoughts to find and fix up broken links, but as a practical matter I don't have either the time or the energy (especially since I'd have to do it on a regular basis, and there's no fully reliable automated way to find broken links). The more information and context on a link I can arm people with, the more chances they have to do something beyond going to the Wayback Machine and hoping.

(One of the drawbacks of this full style of including the current name or pseudonym of the link's author is that people do change their names over time and dislike people still using the old one. When I become aware of such a shift I try to go back to correct my usage, but I have to become aware of it in the first place, which is far from assured.)

ChangingLinkCitationStyle written at 22:13:32

2021-09-19

Microsoft's Bingbot crawler is relentless for changing pages (it seems)

I look at the web logs for Wandering Thoughts every so often. There are many variations from day to day but regardless of what other things change, one thing is as predictable as the sun rising in the morning; every day some MSN Search IP address will be the top single source of traffic, as Bingbot crawls through here. This isn't at all new, as I wrote about Bingbot being out of control back in 2018, but it's somewhere between impressive and depressing just how long this has gone on.

(There are days when Bingbot isn't the top source of traffic, but those are days when someone has turned an abusive crawler loose.)

As it turns out, there is an interesting pattern to what Bingbot is doing. While it's pretty relentless and active in general, one specific URL stands out. Yesterday Bingbot requested the front page of Wandering Thoughts a whopping 1,400 times (today isn't over but it's up to 1,300 times so far). This is a running theme; my blog's front page is by far Bingbot's most requested page regardless of the day.

(Bingbot is also obsessed with things that it can't crawl; today, for example, it made 92 requests for a page that it's barred from with an HTTP 403 response.)

The front page of Wandering Thoughts changes at least once a day (more or less) when a new entry is published, and more often if people leave comments on recent entries (as this updates the count of comments for the entry). However, it doesn't update a hundred times a day even when people are very active with their comments, and Bingbot is being ten times more aggressive than that. I was going to say that Bingbot has other options to discover updates to Wandering Thoughts, such as my Atom syndication feeds, but it turns out that I long ago barred it from fetching a category of URLs here that includes those feeds.

(I have ambivalent feelings about web crawlers fetching syndication feeds. At a minimum, they had better do it well and not excessively, which based on present evidence I suspect Bingbot would not manage.)

Now that I've discovered this Bingbot pattern, I'm tempted to bar it from fetching the front page. The easiest thing to do would be to bar Bingbot entirely, but Bing is a significant enough search engine that I'd feel bad about that (although they don't seem to send me very much search traffic). Of course that might just transfer Bingbot's attention to another of the more or less equivalent pages here that it's currently neglecting, so perhaps I should just leave things as they are even if Bingbot's behavior irritates me.

PS: Of course there could be something else about the front page of Wandering Thoughts that has attracted Bingbot's relentless attention. The reasons for web crawlers to behave as they do are ultimately opaque; all I can really do is come up with reasonable sounding theories.

BingbotAndChangedPages written at 22:02:39

2021-09-06

Why Nutch-based web spiders are now blocked here

Apache Nutch is, to quote its web page, "a well matured, production ready Web crawler". More specifically, it's a web crawler engine, which people can take and use to create web crawlers themselves. However, it has a little issue, which I put in a tweet:

If you write a web crawler engine, you should make it very hard for people to not fill in all of the information for a proper user-agent string (such as a URL explaining the crawling). Apache Nutch, I'm looking at you, given UA's of eg "test search engine/Nutch-1.19-SNAPSHOT".

I have some views on what User-Agent headers should include. Including an explanatory URL for your web crawler is one of my requirements; web crawlers that don't have it and draw my attention tend to get blocked here. In the case of Nutch, my attention was first drawn to a specific aggressive crawler, but then when I started looking, more and more Nutch-based crawlers started coming out of the woodwork, including the example in the tweet, and none of them had proper identification. Since this is a systematic issue with Nutch-based crawlers, I decided that I was not interested in playing whack-a-mole with whatever people came up with next and was instead going to deal with the whole issue in one shot.
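
As an illustration of what I consider proper identification, here's a made-up example of a Nutch-based crawler's User-Agent (not any real crawler's string); the important part is the explanatory URL:

    Example Search Engine (+https://search.example.org/crawler.html)/Nutch-1.19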

This puts Apache Nutch into the same category as other dangerous power tools that are abused too often for my tolerance, much like my block of Wget-based crawling. Someone could use Nutch competently for purposes that I don't object to donating resources to, but the odds are against them, and if they're competent enough, perhaps they will take the '/Nutch-...' string out of their user agent.

People may object that I'm making it harder for new web search engines to get established. This is a nice theory, but it doesn't match the reality of today's Internet; the odds that a new web crawler is going to be used for a new public search engine are almost zero. It's far more likely that at best someone is establishing a new SEO or "brand optimization" company. At worst, the web crawler will be used to find targets for far less desirable activity.

NutchNoMoreHere written at 23:24:08

2021-09-05

Firefox on Linux is still not working well with WebRender for me (again)

Almost a year ago, I had problems with Firefox's WebRender on my Linux machine, including both bad jank and bad performance under some circumstances. Back then I turned it off with the gfx.webrender.force-disabled setting in about:config and forgot about the whole thing. Recently I updated my build of Firefox Nightly to the current one and all of a sudden my problems came back. The reason for this is straightforward; as time marches on, Mozilla invariably forces people into the future, whether or not the future really works. Specifically, Mozilla first renamed the preference I used (in bug 1722055) and then forced WebRender on (in bug 1725388). WebRender is now mandatory, and in many environments it will try to do hardware WebRender (which is the part that doesn't work for me).

(It took me a bit to realize that my issue was probably WebRender again, because bug #1479135 ("Black border around popups with non-compositing window manager") has been fixed and WebRender Firefox is no longer obviously different from non-WebRender Firefox.)

For now, I can still get rid of my jank problems by forcing WebRender into software mode by setting gfx.webrender.software to "true". I'm not clear if the Firefox developers are as strongly against this code as they are against the non-WebRender code, but it wouldn't entirely surprise me if they are and it goes away someday, the problems various Linux users have with hardware WebRender notwithstanding (NVIDIA users appear to be affected by some of them, for example). Or perhaps I am being too grumpy and cynical about Mozilla and Firefox developers.
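
For the record, that workaround as a user.js line (a sketch of what I'm doing, for as long as the pref survives; setting it through about:config works just as well):

    // user.js sketch: force WebRender to its software backend.
    user_pref("gfx.webrender.software", true);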

(Because my issue is so peculiar, as covered in my original entry, I haven't filed any sort of bug about it. My results from filing Firefox bugs are hit or miss even for straightforward ones, and this one is not at all straightforward. I'm not even sure what the exact reproduction conditions are; I'd probably have to do relatively extensive testing.)

Based on looking at various things in these bugs, especially in bug 1725388, I believe that Firefox 93 is the first version that will ship with WebRender only, although Firefox 92 will ship with the preference renamed and so people who have been turning off WebRender will get a preview of what 93 will be like for them. This broadly matches the information currently on the WebRender status page.

Sidebar: My state in the official Firefox 91

In addition to my own build of Firefox Nightly, I also use an official build of Firefox 91 for some things. For historical reasons, this version has been running with $MOZ_X11_EGL set to "1", and it did not exhibit the problem even though Mozilla apparently enabled WebRender by default in Firefox 91 on Linux. When I took out this setting, my official Firefox 91 began having severe jank in the same situation as my original problem (in Firefox windows on fvwm virtual screens other than my first one). Without the environment variable forcing it on, Firefox's about:support reports that X11_EGL is "available by default" but "blocklisted by env: Blocklisted by gfxInfo".

Forcing gfx.webrender.software to "true" in my Firefox 91 profiles appears to fix the problem in light testing. (Time will create a more thorough test, if I keep not seeing the jank. It's very obvious jank, at least.)

FirefoxWebRenderFailureII written at 22:08:27

2021-08-16

Browsers listening to developers or users requires them to be humble

I was recently reading Breaking the web forward (via), which is in part about how Chrome keeps breaking the web because they can. In the process, it says:

Jim Nielsen feels that part of the issue is the lack of representation of web developers in the standardization process. That sounds great but is proven not to work.

One reason that it seems unlikely to work even if you could overcome the issues the article raises is that fundamentally, having meaningful developer representation requires that the browsers actually listen, and that requires browsers (well, their developers) to be individually and collectively humble.

The reality of the modern web is that browsers currently hold all of the power, which practically speaking means that Chrome holds all of the power. What happens is in their hands, and Chrome especially has demonstrated that it will add or remove things regardless of what the standards may or may not say. This will not be changed just by changing who is involved in the standardization process; it will only be changed by Chrome and other browsers deciding to listen and then to change their plans based on what they hear, to do things they didn't plan to do and not do things that they had planned to do.

Listening and changing your plans this way requires humbleness. This humbleness is hard. It means accepting that your carefully developed plans, specifications, and even code shouldn't be shipped, or that things you personally and organizationally think aren't a high enough priority should be worked on anyway, displacing things you care more about. And you need to do this even when you feel that you have solid usage data, great technical reasons, and so on.

Chrome and other browser development teams are many things, but I don't think many people would call them humble. Not in this way. If they were humble in this way, we wouldn't be in the current situation.

(Of course if browsers had been humble in the past they would have wound up implementing any number of bad things and not doing any number of valuable things that some people object vociferously to. And if they did try to be humble, there are any number of groups that would try to take advantage of this; I'm sure the Internet advertising industry would love to put forward some people, for example.)

BrowsersAndListening written at 17:34:59

2021-08-15

Chrome's automatic updates and the power it holds on the web

One of the realities of the modern web is that in general browsers hold pretty much all of the power to dictate how the web develops; it's (currently) browsers that decide what features to implement and what features to deprecate. In practice that means that Chrome (with its fairly dominant browser share) holds the power, including the power to abruptly deprecate things. But as I was thinking about this, it occurred to me that a large part of this power rests on the widespread use and acceptance of Chrome's automatic updates to itself.

Browsers mostly can't move the web forward (for better or worse) without changing themselves. This means that a limiting factor on the speed of changes to the modern web is how fast new versions of browsers can propagate. If you make a change to your browser and it takes a year to get into the hands of 50% of your users, you can't change things on the web very fast. If you can get it into the hands of 80% of your users in six weeks, suddenly you can move a lot faster.

(Both Firefox and Chrome can somewhat change their behavior without a full update, by remotely turning on feature flags or being opted in to previously inactive changes or whatever, but in general many shifts require actual updates.)

Chrome both releases frequently and tries very hard to automatically update itself. This combination drives changes in Chrome into wide use fairly fast, and in turn that is part of what gives Chrome its power on the web; if the Chrome developers decide to do something, they can make it fairly pervasive fairly fast. That gives their changes weight that the changes wouldn't have if they took months to make it to 20% of the (desktop) web.

(Based on looking at stats I could find with a casual Internet search, it looks like new Chrome versions take only a month or two to hit peak usage and then decline rapidly, which is roughly what I'd expect from their general release frequency. Chrome is apparently planning to transition to releases every four weeks, which will drive this even faster.)

PS: Of course automatic updates for browsers have important primary purposes, like promptly getting security improvements into the hands of users. But I'm sure the effects on getting general browser changes into widespread use don't hurt in the eyes of the management that has to fund all of this.

PPS: My understanding is that mobile browsers are somewhat different, especially on Android, and may update more slowly. But desktop browsers still do matter for many people and there Chrome auto-updates on its own schedule.

ChromeAutoupdatesAndPower written at 17:49:06

2021-08-09

The Firefox uMatrix addon is not quite dead (so far)

One of my core Firefox addons is uMatrix, which I use as my primary tool for blocking Javascript and other annoyances. When I most recently wrote about my Firefox addons back at the end of February, people commented to note that uMatrix development has stopped (eg, also). At the time I expressed hopes that uMatrix would keep on working despite this as Firefox evolved, because for me it seems to basically work fine in its current state. However, it recently turned out that uMatrix is not quite as dead as you might think.

In the middle of July, a security researcher posted about a denial of service vulnerability in uBlock Origin, uMatrix, and forks of them. The specific issue is probably not something to worry about for most people even in isolation, so I wasn't too worried that this issue was going to remain in uMatrix. But then not long after, two new versions of uMatrix were released to fix the issue and make other small changes. So despite uMatrix development being ended, Raymond Hill has been willing to fix a security issue in it.

(If you use uBlock Origin together with uMatrix, as I do, I think that you can probably turn off uMatrix's use of blocklists and so avoid this particular issue and any like it in the future. I'd assume that uBlock Origin's own blocks make all blocklists in uMatrix unnecessary. See here for some additional commentary on that that's probably more informed than I am.)

Of course, a one-time fix isn't everything. There's no guarantee that any future security issues will be fixed (especially promptly), or that more work will be done to update uMatrix to work with any future Firefox addon changes that require it. But now we know that uMatrix is not quite dead yet.

Part of my general lack of concern about uMatrix's practical future is that I have a general relaxed attitude about things that have more or less ended their development, and I use any number of them. Not all software is like a shark, where it has to keep moving or die. There's plenty of programs where you can hit the point of diminishing returns, both in improvements and in what you want to do, and in that situation it's both reasonable and natural to declare that you're done and, at most, you'll try to fix future security issues. That uMatrix's author decided they were done developing it is not a surprising or unnatural thing, and since it works for me, I'm happy to keep on using it just as I keep on using other software where the development has more or less finished.

FirefoxUMatrixNotQuiteDead written at 23:19:29

2021-07-30

XHTML pages cause problems for some Firefox addons

These days, XHTML is pretty much a dead issue on the web. But browsers still support it and some people do serve real XHTML web pages (for example). What I've noticed on the occasions that I visit these web pages is that their being XHTML can cause problems for some of my Firefox addons. The most common addon that I notice problems with is FoxyGestures, partly because I believe it has to inject things in order to draw gesture trails and partly because I use it all the time.

(With FoxyGestures, the typical symptoms are that it will draw the gesture trails but no other UI elements, and then when I complete a gesture nothing happens. The web developer console reports XHTML errors.)

On the one hand, this is understandable. It's quite easy to create markup that's valid (or accepted) HTML but not valid XHTML, and Firefox does still insist on your XHTML being valid as far as I know. For content modification addons to work on XHTML pages as well as HTML pages, they need to first be aware of the issue (and care about it) and then either craft markup that works on both or detect XHTML pages and use different markup. There are probably plenty of Firefox addons that quietly don't work on XHTML pages.

(This is part of why I haven't reported this as an issue to the FoxyGestures developer.)

On the other hand, that this issue exists is another suggestion that real XHTML pages are quite uncommon on the web today. If XHTML pages were common, people using addons would be running into them frequently enough for things like this to get reported and then probably fixed. Or alternately, the XHTML issue would be well enough known among addon developers that they'd program around it as a matter of course. Sometimes where bugs exist and persist is a sign of where the dark corners are.

(I've never been a fan of XHTML, but it's been a dead issue for some years, cf.)

PS: Another issue for an addon interacting with a real XHTML web page is that apparently the XHTML DOM isn't the same as the HTML DOM, so some DOM manipulation may fail or produce unexpected results (which may then lead to the addon making alterations that aren't valid XHTML).
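
Here's a minimal sketch of the sort of difference involved (not FoxyGestures' actual code, and the class name is made up): in an XHTML document, injected markup has to be well-formed XML, and it's safest to create elements explicitly in the XHTML namespace.

    // Hypothetical content-script sketch.
    const isXHTML = document.contentType === "application/xhtml+xml";

    // Assigning innerHTML with markup like "<div class=trail><br>" works in
    // an HTML document but throws a parse error in an XHTML one, because it
    // isn't well-formed XML.

    // Creating elements explicitly in the XHTML namespace works in both:
    const ns = "http://www.w3.org/1999/xhtml";
    const overlay = isXHTML
      ? document.createElementNS(ns, "div")
      : document.createElement("div");
    overlay.className = "gesture-overlay";
    document.documentElement.appendChild(overlay);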

FirefoxAddonsVsXHTML written at 00:52:53
