Wandering Thoughts

2019-02-13

An unpleasant surprise with part of Apache's AllowOverride directive

Suppose, not entirely hypothetically, that you have a general directory hierarchy for your web server's document root, and you allow users to own and maintain subdirectories in it. In order to be friendly to users, you configure this hierarchy like the following:

Options SymLinksIfOwnerMatch
AllowOverride FileInfo AuthConfig Limit Options Indexes

This allows people to use .htaccess files in their subdirectories to do things like disable symlinks or enable automatic directory indexes (which you have turned off here by default in order to avoid unpleasant accidents, but which is inconvenient if people actually have a directory of stuff that they just want to expose).

Congratulations, you have just armed a gun pointed at your foot. Someday you may look at a random person's .htaccess in their subdirectory and discover:

Options +ExecCGI
AddHandler cgi-script .cgi

You see, as the fine documentation will explicitly tell you, the innocent-looking 'AllowOverride Options' does exactly what it says on the can: it allows .htaccess files to turn on any option at all through the Options directive. Some of these options are harmless, such as 'Options Indexes', while others are probably things that you don't want people enabling on their own without talking to you first.

(People can also turn on the full 'Options +Includes', which also allows them to run programs through the '#exec' element, as covered in mod_include's documentation. For that matter, you may not want to allow them to turn on even the more modest IncludesNOEXEC.)

To deal with this, you need to restrict what Options people can control, something like:

AllowOverride [...] Options=Indexes,[...] [...]

The Options= list is not just the options that people can turn on; it is also the options that you let them turn off, for example if they don't want symlinks to work at all in their subdirectory hierarchy.
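
As a concrete illustration (this is just a sketch that matches the opening configuration; the exact list is your choice), you might wind up with:

Options SymLinksIfOwnerMatch
AllowOverride FileInfo AuthConfig Limit Options=Indexes,SymLinksIfOwnerMatch Indexes

With this, .htaccess files can use Options only to turn Indexes and SymLinksIfOwnerMatch on or off; anything else, such as 'Options +ExecCGI', gets them a server error instead.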

(It's kind of a pity that Options is such a grab-bag assortment of things, but that's history for you.)

As an additional note, changing your 'AllowOverride Options' settings after the fact may be awkward, because any .htaccess file with a now-disallowed Options setting will cause its entire subdirectory hierarchy to become inaccessible. This may bias you toward very conservative initial settings, with narrow exemptions granted afterward if and when people appeal.

(Our web server is generously configured for historical reasons; it has been there for a long time and defaults were much looser in the past, so people made use of them. We would likely have a rather different setup if we were recreating the content and configuration today from scratch.)

ApacheAOSurprise written at 22:58:41; Add Comment

2019-02-11

Thinking about the merits of 'universal' URL structures

I am reasonably fond of my URLs here on Wandering Thoughts (although I've made a mistake or two in their design), but I have potentially made life more difficult for a future me in how I've designed them. The two difficulties I've given to a future self are that my URLs are bare pages, without any extension on the end of their name, and that displaying some important pages requires a query parameter.

The former is actually quite common out there on the Internet, as many people consider the .html (or .htm) to be ugly and unaesthetic. You can find lots and lots of things that leave off the .html, at this point perhaps more than leave it on. But it does have one drawback, which is that it makes it potentially harder to move your content around. If you use URLs that look like '/a/b/page', you need a web server environment that can serve those as text/html, either by running a server-side app (as I do with DWiki) or by suitable server configuration so that such extension-less files are text/html. Meanwhile, pretty much anything is going to serve a hierarchy of .html files correctly. In that sense, a .html on the end is what I'll call a universal URL structure.
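
To make 'suitable server configuration' concrete, in Apache one way to serve extension-less files as HTML is something like the following minimal sketch (the directory path is made up, and there are other approaches, such as mod_rewrite or MultiViews):

<Directory "/var/www/staticdump">
   # files without a dot in their name get served as HTML
   <FilesMatch "^[^.]+$">
      ForceType text/html
   </FilesMatch>
</Directory>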

What makes a URL structure universal is that in a pinch, pretty much any web server will do to serve a static version of your files. You don't need the ability to run things on the server and you don't need any power over the server configuration (and thus even if you have the power, you don't have to use it). Did your main web server explode? Well, you can quickly dump a static version of important pages on a secondary server somewhere, bring it up with minimal configuration work, and serve the same URLs. Whatever happens, the odds are good that you can find somewhere to host your content with the same URLs.

I think that right now there are only two such universal URL structures: plain pages with .html on the end, and directories (ie, structuring everything as '/a/b/page/'). The specific mechanisms of giving a directory an index page of some kind will vary, but probably most everything can actually do it.
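
In Apache, for example, the directory version needs nothing more than mod_dir's usual index page setting (most stock configurations already have something like it); other web servers have their own equivalents:

DirectoryIndex index.html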

On the other hand, at this point in the evolution of the web and the Internet in general, it doesn't make sense to worry about this. Clever URLs without .html and so on are extremely common, so it seems very likely that you'll always be able to do this without too much work. Maybe one convenient place for publishing your pages won't support it, but you'll be able to find another, or easily find configuration recipes for how to do it on the web server of your choice.

(For example, in doing some casual research for this entry I discovered that Github Pages lets you omit the .html on URLs for things that actually have them in the underlying repository. Github's server side handling of this automatically makes it all work; see this stackoverflow Q&A, and you can test it for yourself on your favorite Github Pages site. I looked at Github Pages because I was thinking of it as an example of the almost-no-effort hosting one might reach for in a pinch, and here it is already supporting what you'd need.)

PS: Having query parameters on your URLs will make your life harder here; you probably need either server-side access to something on the order of Apache's RewriteCond, or to add some JavaScript to all the relevant pages that looks for any query parameters and does magic things with them, either providing the right page content or at least redirecting to a better URL.
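
To make the Apache option concrete, a redirect for one specific query parameter might be sketched like this in the main server configuration (the parameter name and paths are invented for illustration; they are not DWiki's actual ones):

RewriteEngine On
# requests with the (hypothetical) 'showcomments' query parameter...
RewriteCond %{QUERY_STRING} ^showcomments$
# ...get redirected to a (hypothetical) static copy; the trailing '?' drops the query string
RewriteRule ^/blog/(.*)$ /static/comments/$1.html? [R=302,L]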

(DWiki has decent reasons for using query parameters, but I feel like perhaps I should have tried harder or been cleverer.)

UniversalUrlStructures written at 23:00:50; Add Comment

2019-01-11

A new drawback of using my custom-compiled Firefox

For years I've used a custom-compiled Firefox, with various personal modifications. Usually this works okay and I basically don't notice any difference between my version and the official version except that the branding is a bit different (and since I build from the development tree, I'm usually effectively a Firefox version or two ahead). However, I've now run into a new drawback, one that hadn't even crossed my radar until recently.

The short version is that I read a spate of news coverage about what compiler Firefox was using, starting in September with the news that Firefox was switching to clang with LTO, but really picking up steam in December with some comparisons of how Firefox builds made with GCC and with clang stack up (part 1, part 2), and then with Fedora people first considering using clang (with LTO) themselves and then improving GCC so they could stick with it while still getting LTO and PGO (via Fedora Planet/People). All of this got me to try building my own Firefox with LTO (using clang), because once I paid attention, the performance improvement from LTO looked kind of attractive.

I failed. I don't know if it's my set of packages, how my Fedora machines are set up, or that I don't actually know what I'm doing about configuring Firefox to build with LTO (Link-Time Optimization), but the short version is that all of my build attempts errored out and I ran out of energy to try to get it going; my personal Firefox builds are still plain non-LTO ones, which means that I'm missing out on some performance. I'm also missing out on additional performance since I would probably never try to get the PGO (Profile-Guided Optimization) bits working, as that seems even more complicated than LTO.

On the one hand, my impression is that much of the performance benefit is on Javascript-based benchmarks and sites, and in my main Firefox instance I block almost all Javascript from almost everyone (although I'm getting a bit more relaxed about that). If I'm using Google Maps or some other Javascript heavy site, it's in the official Fedora Firefox and very soon that's going to have both PGO and LTO.

On the other hand, there are two cases where I actually do care about Javascript performance in my main Firefox and it's probably a limiting factor. The first is for our new Grafana dashboards; I usually view these in my main browser for convenience, and my typical style of dashboard winds up running rather a lot of Javascript, DOM manipulation, CSS (re)sizing, and so on that takes a visible amount of time and CPU. I don't look at our dashboards all that often, but it would be nice if they were more responsive.

The second and much bigger case is Firefox addons themselves. All WebExtensions addons are completely written in Javascript, and things like uBlock Origin are not small and do an appreciable amount of Javascript computation in the process of blocking all of the other Javascript for me. In fact, uBlock Origin has started using WebAssembly for some of its CPU-intensive internals (currently for a hostname trie, and see also, and there's also WASM lz4 stuff). Improving the performance of addons would basically improve the performance of my Firefox as a whole, since addons potentially run on everything I visit (and both uBlock Origin and uMatrix are probably active on basically every page load).

(LTO and PGO may not improve the performance of WASM and JIT'd Javascript very much, though, and hopefully much addon code is heavily JIT-optimized because it runs so often and is in a more or less permanent context.)

In the long run hopefully I'll be able to build my own version of Firefox with LTO and most of this will be irrelevant (because I'll have most of the performance of official Fedora and Mozilla builds). I'm happy to do it with either GCC or clang, whichever is easier to get going (I'd say 'works better', but I'm honest; I'll pick whichever is less hassle for me). Even if I can't get LTO going, I'm not likely to give up on my custom-compiled Firefox because my patches are fairly important to me. But the whole LTO experience has certainly given me something to think about.

(Chrome is a much more extreme case for differences between official builds and your own work or even Chromium, because only the official Google Chrome versions come with Flash magically built in. There are things that still might need Flash today, although fewer than there used to be. Your Linux distribution's Chromium builds probably come with much less Google surveillance, though.)

CustomFirefoxPerformance written at 01:25:24; Add Comment

2019-01-10

Why I still have a custom-compiled Firefox (early 2019 edition)

For years, I've had a custom-compiled version of Firefox with various personal modifications, generally built from the current development tree. The number of modifications has fluctuated significantly over time; when I first wrote about my history of custom-compiling Firefox in this 2012 entry, it was probably my minimal point for modifications. These days my version has accumulated significantly more changes from the stock version, in large part due to Firefox's switch to WebExtensions. The somewhat unfortunate thing about this increase in changes is that having this custom Firefox is now more than a little bit important to getting the Firefox user interface I really want. Abandoning my custom-compiled Firefox would be something that I'd definitely notice.

The largest set of changes is to deal with Firefox irritations and limitations. In the irritations department, I modify Firefox's current media autoplay code to turn off autoplay for a couple of things that Firefox doesn't otherwise allow you to stop (bare videos and videos with no audio track). In the limitations department, I add a couple of new WebExtensions APIs, which turns out to be surprisingly easy; one provides 'view page in no style', and the other opens your genuine home page (as if you did Control-N), which is not otherwise possible in standard Firefox.

(A WebExt can open about:home, but that is actually about:newtab, not your genuine home page. My actual home page is a file: URL, which can't be opened by WebExt addons.)

My longest standing change is customizing how Firefox's remote access works, which these days also has me customizing the DBus remote control. The current development tree for Firefox seems to go back and forth about whether DBus should be used under X, but I cover my bases to be sure.

For extremely historical reasons I change the Delete key to act like the Backspace key in HTML context. This is probably surplus now, because several years ago I stopped swapping Backspace and Delete so now the key I reflexively hit to scroll the page up generates a Backspace, not Delete. Anyway, these days I often use Control-Space instead, because that works even in stock Firefox setups.

(This is about:config's browser.backspace_action setting, and I don't think it's exposed in the Preferences UI any more. I don't think I'm quite up to abandoning Backspace entirely just yet, though.)

I modify Firefox's standard branding because on the one hand, I don't want my builds to be called 'Nightly' in window titles and so on, and on the other hand I don't want them to use the official icons or otherwise actually be official builds. I also turn out to have some small changes to the default preferences, in the all.js file. I could probably do most or all of these in my own prefs.js; they linger in all.js due to historical inertia. Finally, a few years ago I did a little about the mess that is Firefox's certificate manager UI by changing Firefox's name for 'private tokens' from 'Software Security Device' to the generally more accurate 'Locally Stored Token'. I'm not sure this genuinely improves things and perhaps I should drop this change just to be more standard.

(I used to manually modify my certdata.txt to remove various CAs that I didn't like, but these days I've concluded it's too much work and I use the stock one.)

Building Firefox from source, even from the development tree, does have some potentially useful side effects. For a start, custom-built versions appear not to report telemetry to Mozilla, which I consider useful given Mozilla's ongoing issues. However, it can also have some drawbacks (apart from those inherent in using the latest development tree), which is a matter for another entry.

As a side note, it's interesting to see that back in my 2012 entry, I'd switched from building from the development tree to building from the released source tree. I changed back to building from the development tree at some point, but I'm not sure exactly when I did that or why. Here in the Firefox Quantum era, my feeling is that using the development tree will be useful for a few years to come until the WebExts APIs get fully developed and stabilized (maybe we'll even get improvements to some irritating limitations).

(It's possible that I shifted to modifying and regularly updating the development tree because it made it easier to maintain my local changes. The drawback of modifying a release tree is that it only updates occasionally and the updates are large.)

WhyCustomFirefoxII written at 01:16:56; Add Comment

2019-01-02

You shouldn't allow Firefox to recommend things to you any more

The sad Firefox news of the time interval is Mozilla: Ad on Firefox’s new tab page was just another experiment, and also on Reddit. The important quote from the article is:

Some Firefox users yesterday started seeing an ad in the desktop version of the browser. It offers users a $20 Amazon gift card in return for booking your next hotel stay via Booking.com. We reached out to Mozilla, which confirmed the ad was a Firefox experiment and that no user data was being shared with its partners.

Mozilla of course claims that this was not an "ad"; to quote their spokesperson from the article:

“This snippet was an experiment to provide more value to Firefox users through offers provided by a partner,” a Mozilla spokesperson told VentureBeat. “It was not a paid placement or advertisement. [...]

This is horseshit, as the article notes. Regardless of whether Mozilla was getting paid for it, it was totally an ad, and that means that it is on the slippery slope towards all of the things that come with ads in general, including and especially ad-driven surveillance and data gathering. Mozilla even admitted that there was some degree of data gathering involved:

“About 25 percent of the U.S. audience who were using the latest edition of Firefox within the past five days were eligible to see it.”

In order to know who is in 'the US audience', Mozilla is collecting data on you and using it for ad targeting.

So, sadly, we've reached the point where you should go into your Firefox Preferences and disable every single thing that Mozilla would like to 'recommend' to you on your home page (or elsewhere). At the moment that is in the Home tab of Preferences, and is only 'Recommended by Pocket' and 'Snippets'; however, you should probably check back in every new version of Firefox to see if Mozilla has added anything new. This goes along with turning off Mozilla's ability to run Firefox studies and collect data from you and probably not running Firefox Nightly.

This may or may not prevent Mozilla from gathering data on you, but at least you've made your views clear to Mozilla and they can't honestly claim that they're acting innocently (as with SHIELD studies). They'll do so anyway, because that's how Mozilla is now, but we do what we can do. In fact, this specific issue is a manifestation of what I wrote in the aftermath of last year's explosion, where Mozilla promised to stop abusing the SHIELD system but that was mostly empty because they had other mechanisms available that would abuse people's trust in them. They have now demonstrated this by their use of the 'Snippets' system to push ads on people, and they're probably going to use every other technical mechanism that they have sooner or later.

The obvious end point is that Mozilla will resort to pushing this sort of thing as part of Firefox version updates, which means that you will have to inspect every new version carefully (at least all of the preferences) and perhaps stop upgrading or switch to custom builds of Firefox that have things stripped out, perhaps GNU IceCat.

(Possibly Debian will strip these things out of their version of Firefox should this come to pass. I wouldn't count on Ubuntu to do so. People on Windows or OS X are unfortunately on their own.)

PS: Chrome and Chromium are still probably worse from a privacy perspective, and they are certainly worse for addons safety, which you should definitely be worried about if you use addons at all.

FirefoxNoRecommendations written at 16:12:56; Add Comment

2018-12-14

Why our Grafana URLs always require HTTP Basic Authentication

As part of our new metrics and monitoring setup, we have a Grafana server for our dashboards that sits behind an Apache reverse proxy. The Apache server also acts as a reverse proxy for several other things, all of which live behind the same website under different URLs.

People here would like to be able to directly access our Grafana dashboards from the outside world without having to bring up a VPN or the like. We're not comfortable with exposing Grafana or our dashboards to the unrestricted Internet, so that external access needs to be limited and authenticated. As usual, we've used our standard approach of Apache HTTP Basic Authentication, restricting the list of users to system staff.

Now, having to authenticate all of the time to see dashboards is annoying, so it would be nice to offer basic anonymous access to Grafana for people who are on our inside networks (and Grafana itself supports anonymous access). Apache can support this in combination with HTTP Basic Authentication; you just use a RequireAny block. Here's an example:

<Location ...>
   AuthType Basic
   [...]

   <RequireAny>
      Require ip 127.0.0.0/8
      Require ip 128.100.3.0/24
      [...]
      Require valid-user
   </RequireAny>
</Location>

People outside the listed networks will be forced to use Basic Auth; people on them get anonymous access.

It's also useful for system staff to have accounts in Grafana itself, because having a Grafana account means you can build your own dashboards and maybe even edit our existing ones (or share your dashboards with other staff members and edit them and so on). Grafana supports a number of ways of doing this, including local in-Grafana accounts with separate passwords, LDAP authentication, and HTTP Basic Authentication. For obvious reasons, we don't want people to have to obtain and manage separate Grafana accounts (it would be a pain in the rear for everyone). Since we're already using HTTP Basic Authentication to control some access to Grafana, reusing that for Grafana accounts makes a lot of sense; for instance, if you're accessing the server from the outside, it means that you don't have to first authenticate to Apache and then log in to Grafana if you want non-anonymous access.

But this hypothetical setup leaves us with a problem: how do you log in to Grafana when you're on our inside networks, where you won't be required to use HTTP Basic Authentication? It would be a terrible experience if you could only use your Grafana account if you weren't at work.

Before I set the server up and started experimenting, what I was hoping was that HTTP Basic Authentication was treated somewhat like cookies, in that once a browser was challenged to authenticate, it would then send the relevant Authorization header on all further accesses to the entire website. There are other areas of our web server that always require HTTP Basic Authentication, even from our internal networks, so if Basic Auth worked like cookies, you could go to one of them to force Basic Auth on, then go to a Grafana URL and the browser would automatically send an Authorization header and Apache would pass it to Grafana and Grafana would have you logged in to your account.

Unfortunately browsers do not treat HTTP Basic Authentication this way, which is not really surprising since RFC 7617 recommends a different approach in section 2.2. What RFC 7617 recommends and what I believe browsers do is that HTTP Basic Authentication is scoped to a URL path on the server. Browsers will only preemptively send the Authorization header to things in the same directory or under it; they won't send it to other, unrelated directories.

(If a browser gets a '401 Unauthorized' reply that asks for a realm that the browser knows the authorization for, it will automatically retry with that authorization. But then you're requiring HTTP Basic Authentication in general.)

The simplest, least hacky way out of this for us is to give up on the idea of anonymous access to Grafana, so that's what we've done. And that is why access to our Grafana URLs always requires HTTP Basic Authentication, however inconvenient and annoying that is. We have to always require it so that people can have and readily use frictionless Grafana accounts.

(As I mentioned in my entry on why we like Apache HTTP Basic Authentication, we're not willing to trust the authentication of requests from the Internet to Grafana itself. There are too many things that could go wrong even if Grafana was using, say, a LDAP backend. Fundamentally Grafana is not security software; it's written by people for whom security and authentication is secondary to dashboards and graphs.)

Sidebar: The theoretical hack around this

In theory, if browsers behave as RFC 7617 suggests, we can get around this with a hack. The most straightforward way is to have a web page at the root of the web server that we've specifically configured to require HTTP Basic Authentication; call this page /login.html. When you visit this page and get challenged, in theory your browser will decide that the scope of the authentication is the entire web server and thus send the Authorization header on all further requests to the server, including to Grafana URLs.
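
In Apache configuration terms the hack itself is straightforward; a sketch looks something like this (the page name is the hypothetical /login.html from above, and the '[...]' is whatever authentication backend you already use):

<Location "/login.html">
   AuthType Basic
   AuthName "Our site"
   [...]
   Require valid-user
</Location>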

However I'm not sure this actually works in all common browsers (I haven't tested it) and it feels like a fragile and hard to explain thing. 'Go to this unrelated URL to log in to your Grafana account' just sounds wrong. 'You always have to use HTTP Basic Authentication' is at least a lot simpler.

GrafanaWhyAlwaysBasicAuth written at 01:00:46; Add Comment

2018-12-09

Firefox, WebExtensions, and Content Security Policies

Today, Grant Taylor left some comments on entries here (eg yesterday's). As I often do for people who leave comments here who include their home page, I went and visited Taylor's home page in my usual Firefox browser, then when I was done I made my usual and automatic Foxygestures gesture to close the page. Except that this time, nothing happened (well, Firefox's right mouse button popup menu eventually came up when I released the mouse button).

In the old days of Firefox's original XUL/XPCOM-based addons, while it was true that you could write addons mostly or entirely in Javascript, that was sort of a side effect of the fact that a significant amount of Firefox itself was implemented in Javascript. There was not really a special API or language environment that was designed just for addons, as far as I know; instead your addon was in large part a bit of Firefox that wasn't part of its main code. As a result, it was generally the case that anything Firefox could do, your addon could do, including to web pages.

Modern WebExtensions are completely different. They exist within a limited sandbox, and that sandbox is specifically Javascript based. It has its own limited set of APIs, and for various reasons those APIs don't give WebExts unrestricted access to web pages. Instead WebExts must inject (Javascript) code into the web page's environment and then communicate with it. Because WebExtensions operate by injecting Javascript code into the web page's environment, they (and their code) raise various questions about permissions.

(For instance, WebExtensions are not allowed to inject code into some web pages or otherwise touch them.)

Enter Content Security Policies. CSPs are a way for the web server to say what a web page is allowed to do, including where Javascript is allowed to be executed from and whether it's allowed to be executed from the page itself. In the old XUL/XPCOM world of addons, there was no question that addons had full access to a page even when the page had a CSP, because the browser itself had that access. In the new WebExtensions world, where addons operate in part by injecting Javascript into the page's environment, there is a potential conflict between WebExtensions and a page with a restrictive CSP. Chrome (or, if you prefer, Chromium) specifically allows WebExtension-injected Javascript to run, no matter what the page's CSP says. Firefox currently blocks at least some things, per Firefox bug 1267027.
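
For concreteness, a CSP is just a response header that the server sends. With Apache's mod_headers, a restrictive one might be set like this (a generic illustration, not the actual header from Taylor's site):

Header set Content-Security-Policy "default-src 'self'; script-src 'self'"

A page served with that policy tells the browser to only run Javascript loaded from the site itself, which is exactly the sort of thing that injected addon code can collide with.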

It's definitely clear that Foxygestures not working on Taylor's site is because of the site's Content Security Policy headers (I can make it work and not work by toggling security.csp.enable), but it's not clear why. Foxygestures is at least partly using content scripts, which are supposed to not be subject to this issue if I'm reading the Firefox bug correctly, but perhaps there's something peculiar going on in what Foxygestures does in them. Firefox objects to one part of the current Content-Security-Policy header, which perhaps switches it to some extra-paranoid mode.

(I filed Foxygestures issue 283, if only so perhaps similar cases in the future have something to search for. There is Foxygestures issue 230, but in that the gestures still worked, the UI just had limitations.)

PS: This is where I wish for a Firefox addon that allows me to set or modify the CSP of web page(s) for debugging purposes, which I believe is at least possible in the WebExtensions API. Laboratory will do part of this but it doesn't seem to start from any existing site CSP, so the 'modifying' bit of my desires is at least awkward. mHeaderControl would let me block the CSPs of selected sites, at least in theory (I haven't tried it). It's a little bit surprising to me that you don't seem to be able to do this from within the Firefox developer tools, but perhaps the Firefox people thought this was a little bit too dangerous to provide.

FirefoxWebExtsVsCSP written at 02:50:01; Add Comment

2018-12-07

Why we like HTTP Basic Authentication in Apache so much

Our web server of choice here is Apache, and when we need some sort of access control for it for people, our usual choice of method is HTTP Basic Authentication (also MDN). This is an unusual choice these days; most people use much more sophisticated and user-friendlier schemes, usually based on cookies and login forms and so on. We persist with HTTP Basic Authentication in Apache despite this because, from our perspective, it has three great advantages.

The first advantage is that it uses a username and a password that people already have, because we invariably reuse our existing Unix logins and passwords. This gets us out of more than making people remember (or note down) another login and password; it also means that we don't have to build and operate another account system (with creation, management, removal, tracking which Unix login has which web account, and so on). The follow on benefit from this is that it is very easy to put authentication restrictions on something, because we need basically no new infrastructure.

The second advantage is that because we use HTTP Basic Authentication in Apache itself, we can use it to protect anything. Apache is perfectly happy to impose authentication in front of static files, entire directory hierarchies, CGIs, or full scale web applications, whatever you want. For CGIs and full scale web applications, you can generally pass on the authenticated user name, which comes in handy for things that want that sort of information. This makes it quite easy to build a new service that needs authentication, since all of the work is done for you.
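
As an illustrative sketch (the directory, AuthName, and file extension are made up, and the '[...]' is whatever authentication backend you already use), putting authentication in front of a directory of CGIs looks like:

<Directory "/web/cgi/accountreq">
   Options +ExecCGI
   AddHandler cgi-script .cgi
   AuthType Basic
   AuthName "System staff only"
   [...]
   Require valid-user
</Directory>

The CGIs then get the authenticated login in the standard REMOTE_USER environment variable, with no extra work on their part.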

The third advantage is that when we put HTTP Basic Authentication in front of something, we don't have to trust that thing as much. This isn't just an issue of whether we trust its own authentication system (when it has one); it's also how much we want to have to trust the attack surface it exposes to unauthenticated people. When Apache requires HTTP Basic Authentication up front, there is no attack surface exposed to unauthenticated people; to even start talking to the real web app, you have to have valid login credentials. We have to trust Apache, but we were doing that already.

(Of course this does nothing to protect us from someone who can get the login credentials of a user who has access to whatever it is, but that exposure is always there.)

In an environment of sophisticated web services and web setups, there are probably ways to get all of this with something other than HTTP Basic Authentication. However, we don't have such an environment. We do not do a lot with web servers and web services, and our need for authentication is confined to things like our account request handling system, our self-serve DHCP registration portals, small CGI frontends to let people avoid the Unix command line, and various internal sysadmin services. At this modest level, the ease of Apache's Basic HTTP Authentication is very much appreciated.

ApacheBasicAuthWhy written at 23:56:35; Add Comment

2018-12-03

Wget is not welcome here any more (sort of)

Today, someone at a large chipmaker that will go unnamed decided (or apparently decided) that they would like their own archived copy of Wandering Thoughts. So they did what one does here; they got out wget, pointed it at the front page of the blog, and let it go. I was lucky in a way; they started this at 18:05 EST and I coincidentally looked at my logs around 19:25, at which point they had already made around 3,000 requests because that's what wget does when you turn it loose. This is not the first time that people have had the bright idea to just turn to wget to copy part or all of Wandering Thoughts (someone else did it in early October, for example), and it will not be the last time. However, it will be the last time they're going to be even partially successful, because I've now blocked wget's default User-Agent.
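
The mechanics of the block are simple. In Apache 2.4 it can be a sketch along these lines (the directory path and environment variable name are made up, and your own version might live in a .htaccess file instead):

BrowserMatchNoCase "^Wget" blocked_agent
<Directory "/web/wandering-thoughts">
   <RequireAll>
      Require all granted
      Require not env blocked_agent
   </RequireAll>
</Directory>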

I'm not doing this because I'm under any illusions that this will stop people from grabbing a copy of Wandering Thoughts, and in fact I don't care if people do that; if nothing else, there are plenty of alternatives to wget (starting with, say, curl). I'm doing this because wget's spidering options are dangerous by default. If you do the most simple, most obvious thing with wget, you flood your target site and perhaps even spill over from it to other sites. And, to be clear and in line with my general views, these unfortunate results aren't the fault of the people using wget. The people using wget to copy Wandering Thoughts are following the obvious path of least resistance, and it is not their fault that this is actually a bad idea.

(I could hope that someday wget will change its defaults so that they're not dangerous, but given the discussion in its manual about options like --random-wait, I am not going to hold my breath on that one.)

Wget is a power tool without adequate safeguards for today's web, so if you are going to use it on Wandering Thoughts, all I can do is force you to at least slow down, go out of your way a little bit, and perhaps think about what you're doing. This doesn't guarantee that people who want to use wget on Wandering Thoughts will actually set it up right so that it behaves well, but there is now at least a chance. And if they configure wget so that it works but don't make it behave well, I'm going to feel much less charitable about the situation; these people will have chosen to deliberately climb over a fence, even if it is a low fence.

As a side note, one reason that I'm willing to do this at all is that I've checked the logs here going back a reasonable amount of time and found basically no non-spidering use of wget. There is a trace amount of it and I am sorry for the people behind that trace amount, but. Please just switch to curl.

(I've considered making my wget block send a redirect to a page that explains the situation, but that would take more energy and more wrestling with Apache .htaccess than I currently have. Perhaps if it comes up a lot.)

PS: The people responsible for the October incident actually emailed me and were quite apologetic about how their wget usage had gotten away from them. That it did get away from them despite them trying to do a reasonable job shows just how sharp-edged a tool wget can be.

PPS: I'm somewhat goring my own ox with this, because I have a set of little wget-based tools and now I'm going to have to figure out what I want to do with them to keep them working on here.

WgetNoMoreHere written at 21:20:05; Add Comment

2018-11-25

Firefox's middle-click behavior on HTML links on Linux

When I wrote about my unusual use for Firefox's Private Browsing mode, I lamented in an aside that you couldn't attach custom behavior to middle-clicking links with modifier keys held down, at least on Linux. This raised an obvious question, namely what are the various behaviors of middle-clicking links on Linux with various modifier keys held down.

So here they are, for posterity, as of Firefox 63 or so:

Middle click or Shift + middle click: Your default 'open link in' behavior, either a new tab or a new window. For me, a new window.
Ctrl + middle click: The alternate to your plain middle click behavior (so opening a new tab in the background, for me).
Shift + Ctrl + middle click: Open the link in a new tab and then do the reverse of your 'when you open a link in a new tab, switch to it immediately' preference.

If you have Firefox in its default preferences, where opening links in a new tab doesn't switch to it immediately, shift + ctrl + middle click will immediately switch to the new tab. If you have Firefox set to switch to new tabs immediately, shift + ctrl + middle click opens new tabs in the background.

Firefox on Linux appears to entirely ignore both Alt and Meta (aka Super) when handling middle clicks. It probably ignores other modifiers too, but I don't have any way of generating either CapsLock or NumLock in my X setup for testing. Note that your window manager setup may attach special meaning to Alt + middle clicks in windows (or Alt + the middle mouse button in general) that preempt the click from getting to Firefox; this was the case for me until I realized and turned it off temporarily for testing.

You might also wonder about modifiers on left clicks on links. In general, it turns out that adding modifiers to a left click turns it into a middle click. There is one interesting exception, which is that Alt plus left click ignores the link and turns your click into a regular mouse click on text; this is convenient for double-clicking words in links, or single-clicking to select sub-word portions of things.

(Perhaps I knew this at one point but forgot it or demoted it to reflexive memory. There's a fair amount about basic Firefox usage that I don't really think about and don't know consciously any more.)

Sadly, I suspect that the Firefox people wouldn't be interested in letting extensions attach custom behavior to Alt + middle clicks on links (with or without other modifiers), or Meta + middle clicks. These are really the only two modifiers that could sensibly have their behavior altered or modified, but since they're already ignored, allowing extensions to interpret them might cause disruption to users who've gotten used to Firefox not caring about either when middle-clicking.

As a side note, Shift plus the scroll wheel buttons changes the scroll wheel from scrolling up and down to scrolling left and right. Ctrl plus the scroll wheel buttons is text zoom, which is probably well known (certainly I knew it). Alt plus the scroll wheel is 'go forward/back one page', which I didn't know. Shift or Meta plus any other modifiers reverts the scroll wheel to its default 'scroll up/down' behavior, and Meta plus the scroll wheel also gives you the default behavior.

PS: Modifiers don't appear to change the behavior of right clicking at all; I always get the popup menu. The same is true if your mouse has physical rocker buttons, which Firefox automatically interprets as 'go forward one page' and 'go back one page'.

Update: There's a bunch of great additional information in the comments from James, including a useful piece of information about Shift plus right click. If you're interested in this stuff, you want to read them too.

FirefoxMiddleClickOnLinux written at 19:21:32; Add Comment
