Wandering Thoughts

2015-03-28

All browsers need a (good) way to flush memorized HTTP redirects

As far as I know, basically all browsers cache HTTP redirects by default, especially permanent ones. If you send your redirects without cache-control headers (and why would you do that for a permanent redirect), they may well cache them for a very long time. In at least Firefox, these memorized redirects are extremely persistent and seem basically impossible to get rid of in any easy way (having Firefox clear your local (disk) cache certainly doesn't do it).

This is a bad mistake. The theory is that a permanent redirect is, well, permanent. The reality is that websites periodically send permanent redirects that are not in fact permanent, and they disappear after a while. Except, of course, that when your browser more or less permanently memorizes this temporary permanent redirect, it's nigh-permanent for you. So browsers should have a way to flush such an HTTP redirect just as they have shift-reload to force a real cache-bypassing page refresh.

(A normal shift-reload won't do it because HTTP redirections are attached to following links, not to pages (okay, technically they're attached to looking up URLs). You could make it so that a shift-reload on a page flushes any memorized HTTP redirections for any link on the page, but that would be both kind of weird and not sufficient in a world where JavaScript can materialize its own links as it feels like.)
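On the sending side, a website can at least put an explicit lifetime on a 'permanent' redirect. The following is a minimal sketch of what such a response looks like on the wire; the function name, the URL and the one hour lifetime are all made up for illustration, and how strictly any given browser honors the header is not something this sketch can promise.

```python
def permanent_redirect(location, max_age_seconds):
    """Build a raw HTTP 301 response that carries an explicit cache lifetime.

    The function name and its parameters are hypothetical, for illustration only.
    """
    lines = [
        "HTTP/1.1 301 Moved Permanently",
        f"Location: {location}",
        # Without a Cache-Control header, a browser may keep the redirect
        # for as long as it likes.
        f"Cache-Control: max-age={max_age_seconds}",
        "Content-Length: 0",
        "",  # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    # Hypothetical target URL; one hour of caching.
    print(permanent_redirect("http://example.org/new-home", 3600).decode("ascii"))
```

This does nothing for redirects that a browser has already memorized, which is the problem that needs a browser-side fix.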

Sidebar: Some notes about this and Firefox

I've read some suggestions that Firefox will do this if you either tell Firefox to remove the site's browsing history entirely or delete your entire cache directory by hand. Neither is what I consider an adequate solution; one has drastic side effects and the other requires quite obscure by-hand action. I want something within the browser that is no more effort and impact than 'Preferences / Advanced / Network / Clear cached web content'. It actually irritates me that telling Firefox to clear cached content does not also discard memorized HTTP redirections, but it clearly doesn't.

If you have some degree of control over the target website, you can force Firefox to drop the memorized HTTP redirection by redirecting back to the right version. This is generally only going to be useful in some situations, eg if you have the same site available under multiple names.

web/BrowsersAndMemorizedRedirects written at 00:28:58

2015-03-27

Looking more deeply into some SMTP authentication probes

Back when I wrote about how fast spammers showed up to probe our new authenticated SMTP service, I said:

[...] I can see that in fact the only people who have gotten far enough to actually try to authenticate are a few of our own users. Since our authenticated SMTP service is still in testing and hasn't been advertised, I suspect that some people are using MUAs (or other software) that simply try authenticated SMTP against their IMAP server just to see if it works.

Ha ha. Silly me. Nothing as innocent as this is what's actually going on. When I looked into this in more detail, I noticed that the IP addresses making these SMTP authentication attempts were completely implausible for the users involved (unless, for example, some of them had abruptly relocated to Eastern Europe). Further, some of the time the authentication attempts were against usernames that were their full email addresses ('<user>@ourdomain') instead of just their login name. But they were all real usernames, which is what I noticed first.

What I think has to be going on is that attackers are mining actual email addresses from domains in order to determine (or guess) the logins to try for SMTP authentication probes, rather than following the SSH pattern of brute force attacks against a long laundry list of login names. This makes a lot of sense for attackers; why waste time and potentially set off alerts when you can do a more targeted attack using resources that are already widely available (namely, lists of email addresses at target domains).

(Not all of the logins that have been tried so far are currently valid ones, but all but one of them have been valid in the past or at least appear in web searches.)
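The check described above (is the attempted name a bare login or a full email address, and is it a real account either way) can be sketched roughly as follows. The function name, the sample logins and the domain are all hypothetical; nothing here reflects how the actual analysis was done.

```python
def classify_login(attempted, known_logins, our_domain):
    """Return (kind, login) where kind is 'login', 'email' or 'unknown'.

    'email' means the attempt used '<user>@ourdomain' for a known user;
    'login' means it used the bare login name of a known user.
    """
    local, at, domain = attempted.partition("@")
    if not at:
        kind, login = "login", attempted
    elif domain.lower() == our_domain.lower():
        kind, login = "email", local
    else:
        # An address at some other domain is never one of our logins.
        return ("unknown", None)
    if login in known_logins:
        return (kind, login)
    return ("unknown", None)


if __name__ == "__main__":
    known = {"alice", "carol"}  # made-up accounts
    for name in ["alice", "carol@example.org", "mallory", "alice@elsewhere.example"]:
        print(name, classify_login(name, known, "example.org"))
```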

So far there's only been a small volume of these probes. It's quite possible that in the long run these will turn out to be in the minority and the only reason I'm noticing them right now is that other attackers have yet to discover us and start more noisy bulk attempts.

(I suspect that at least some of the attackers are doing this because of the presence of the DNS name 'smtp.cs.toronto.edu'.)

spam/SMTPAuthProbesFromEmail written at 01:04:44

2015-03-26

Why systemd should have ignored SysV init script LSB dependencies

In his (first) comment on my recent entry on program behavior and bugs, Ben Cotton asked:

Is it better [for systemd] to ignore the additional [LSB dependency] information for SysV init scripts even if that means scripts that have complete information can't take advantage of it?

My answer is that yes, systemd should have ignored the LSB dependency information for System V init scripts. By doing so it would have had (or maintained) the full System V init compatibility that it doesn't currently have.

Systemd has System V init compatibility at all because it is and was absolutely necessary for systemd to be adopted. Systemd very much wants you to do everything with native systemd unit files, but the systemd authors understood that if systemd only supported its own files, there would be a massive problem; any distribution and any person that wanted to switch to systemd would have to rewrite every SysV init script they had all at once. To take over from System V init at all, it was necessary for systemd to give people a gradual transition instead of a massive flag day exercise. However, the important thing is that this was always intended as a transition; the long run goal of systemd is to see all System V init scripts replaced by unit files. This is the expected path for distributions and systems that move to systemd (and has generally come to pass).

It was entirely foreseeable that some System V init scripts would have inaccurate LSB dependency information, especially in distributions that have previously made no use of it. Supporting LSB dependencies in existing SysV init scripts is not particularly important to systemd's long term goals because all of those scripts are supposed to turn into unit files (with real and always-used dependency information). In the short term, this support allows systemd to boot a system that uses a lot of correctly written LSB init scripts somewhat faster than it would otherwise have, at the cost of adding a certain amount of extra code to systemd (to parse the LSB comments et al) and foreseeably causing a certain amount of existing init scripts (and services) with inaccurate LSB comments to malfunction in various ways.

(Worse, the init scripts that are likely to stick around the longest are exactly the least well maintained, least attended, most crufty, and least likely to be correct init scripts. Well maintained packages will migrate to native systemd units relatively rapidly; it's the neglected ones or third-party ones that won't get updated.)

So, in short: by using LSB dependencies in SysV init script comments, systemd got no long term benefit and slightly faster booting in the short term on some systems, at the cost of extra code and breaking some systems. It's my view that this was (and is) a bad tradeoff. Had systemd ignored LSB dependencies, it would have less code and fewer broken setups at what I strongly believe is a small or trivial cost.

linux/SystemdLSBDependenciesMistake written at 00:21:46

2015-03-25

A significant amount of programming is done by superstition

Ben Cotton wrote in a comment here:

[...] Failure to adhere to a standard while on the surface making use of it is a bug. It's not a SySV init bug, but a bug in the particular init script. Why write the information at all if it's not going to be used, and especially if it could cause unexpected behavior? [...]

The uncomfortable answer to why this happens is that a significant amount of programming in the real world is done partly through what I'll call superstition and mythology.

In practice, very few people study the primary sources (or even authoritative secondary sources) when they're programming and then work forward from first principles; instead they find convenient references, copy and adapt code that they find lying around in various places (including the Internet), and repeat things that they've done before with whatever variations are necessary this time around. If it works, ship it. If it doesn't work, fiddle things until it does. What this creates is a body of superstition and imitation. You don't necessarily write things because they're what's necessary and minimal, or because you fully understand them; instead you write things because they're what people before you have done (including your past self) and the result works when you try it.

(Even if you learned your programming language from primary or high quality secondary sources, this deep knowledge fades over time in most people. It's easy for bits of it to get overwritten by things that are basically folk wisdom, especially because there can be little nuggets of important truth in programming folk wisdom.)

All of this is of course magnified when you're working on secondary artifacts for your program like Makefiles, install scripts, and yes, init scripts. These aren't the important focus of your work (that's the program code itself), they're just a necessary overhead to get everything to go, something you usually bang out more or less at the end of the project and probably without spending the time to do deep research on how to do them exactly right. You grab a starting point from somewhere, cut out the bits that you know don't apply to you, modify the bits you need, test it to see if it works, and then you ship it.

(If you say that you don't take this relatively fast road for Linux init scripts, I'll raise my eyebrows a lot. You've really read the LSB specification for init scripts and your distribution's distro-specific documentation? If so, you're almost certainly a Debian Developer or the equivalent specialist for other distributions.)

So in this case the answer to Ben Cotton's question is that people didn't deliberately write incorrect LSB dependency information. Instead they either copied an existing init script or thought (through superstition aka folk wisdom) that init scripts needed LSB headers that looked like this. When the results worked on a System V init system, people shipped them.

This isn't something that we like to think about as programmers, because we'd really rather believe that we're always working from scratch and only writing the completely correct stuff that really has to be there; 'cut and paste programming' is a pejorative most of the time. But the reality is that almost no one has the time to check authoritative sources every time; inevitably we wind up depending on our memory, and it's all too easy for our fallible memories to get 'contaminated' with code we've seen, folk wisdom we've heard, and so on.

(And that's the best case, without any looking around for examples that we can crib from when we're dealing with a somewhat complex area that we don't have the time to learn in depth. I don't always take code itself from examples, but I've certainly taken lots of 'this is how to do <X> with this package' structural advice from them. After all, that's what they're there for; good examples are explicitly there so you can see how things are supposed to be done. But that means bad examples or imperfectly understood ones add things that don't actually have to be there or that are subtly wrong (consider, for example, omitted error checks).)

programming/ProgrammingViaSuperstition written at 02:16:41

2015-03-24

What is and isn't a bug in software

In response to my entry on how systemd is not fully SysV init compatible because it pays attention to LSB dependency comments when SysV init does not, Ben Cotton wrote in a comment:

I'd argue that "But I was depending on that bug!" is generally a poor justification for not fixing a bug.

I strongly disagree with this view at two levels.

The first level is simple: this is not a bug in the first place. Specifically, it's not an omission or a bug that System V init doesn't pay attention to LSB comments; it's how SysV init behaves and has behaved from the start. SysV init runs things in the order they are in the rcN.d directory and that is it. In a SysV init world you are perfectly entitled to put whatever you want to into your script comments, make symlinks by hand, and expect SysV init to run them in the order of your symlinks. Anything that does not do this is not fully SysV init compatible. As a direct consequence of this, people who put incorrect information into the comments of their init scripts were not 'relying on a bug' (and their init scripts did not have a bug; at most they had a mistake in the form of an inaccurate comment).

(People make lots of mistakes and inaccuracies in comments, because the comments do not matter in SysV init (very little matters in SysV init).)

The second level is both more philosophical and more pragmatic and is about backwards compatibility. In practice, what is and is not a bug is defined by what your program accepts. The more that people do something and your program accepts it, the more that thing is not a bug. It is instead 'how your program works'. This is the imperative of actually using a program, because to use a program people must conform to what the program does and does not do. It does not matter whether or not you ever intended your program to behave that way; that it behaves the way it does creates a hard reality on the ground. That you left it alone over time increases the strength of that reality.

If you go back later and say 'well, this is a bug so I'm fixing it', you must live up to a fundamental fact: you are changing the behavior of your program in a way that will hurt people. It does not matter to people why you are doing this; you can say that you are doing it because the old behavior was a mistake, because the old behavior was a bug, because the new behavior is better, because the new behavior is needed for future improvements, or whatever. People do not care. You have broken backwards compatibility and you are making people do work, possibly pointless work (for them).

To say 'well, the old behavior was a bug and you should not have counted on it and it serves you right' is robot logic, not human logic.

This robot logic is of course extremely attractive to programmers, because we like fixing what are to us bugs. But regardless of how we feel about them, these are not necessarily bugs to the people who use our programs; they are instead how the program works today. When we change that, well, we change how our programs work. We should own up to that and we should make sure that the gain from that change is worth the pain it will cause people, not hide behind the excuse of 'well, we're fixing a bug here'.

(This shows up all over. See, for example, the increasingly aggressive optimizations of C compilers that periodically break code, sometimes in very dangerous ways, and how users of those compilers react to this. 'The standard allows us to do this, your code is a bug' is an excuse loved by compiler writers and basically no one else.)

programming/ProgramBehaviorAndBugs written at 01:46:45

2015-03-23

Systemd is not fully backwards compatible with System V init scripts

One of systemd's selling points is that it's backwards compatible with your existing System V init scripts, so that you can do a gradual transition instead of having to immediately convert all of your existing SysV init scripts to systemd .service files. For the most part this works as advertised. However, there are areas where systemd has chosen to be deliberately incompatible with SysV init scripts.

If you look at some System V init scripts, you will find comment blocks at the start that look something like this:

### BEGIN INIT INFO
# Provides:        something
# Required-Start:  $syslog otherthing
# Required-Stop:   $syslog
[....]
### END INIT INFO

These are an LSB standard for declaring various things about your init scripts, including start and stop dependencies; you can read about them here or here, no doubt among other places.
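To make the format concrete, here is a rough sketch of pulling the declarations out of such a comment block. It is a simplification under stated assumptions: the function name is made up, continuation lines (which the LSB format allows for long descriptions) are ignored, and this is not how systemd or any distribution tool actually parses these headers.

```python
def parse_lsb_header(script_text):
    """Return a dict mapping LSB keywords (e.g. 'Required-Start') to lists of values.

    Simplified: only 'Keyword: values' lines inside the INIT INFO block are
    handled; continuation lines are skipped.
    """
    info = {}
    in_block = False
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped == "### BEGIN INIT INFO":
            in_block = True
            continue
        if stripped == "### END INIT INFO":
            break
        if not in_block or not stripped.startswith("#"):
            continue
        key, sep, value = stripped.lstrip("#").partition(":")
        if sep:
            info[key.strip()] = value.split()
    return info


SAMPLE = """#!/bin/sh
### BEGIN INIT INFO
# Provides:        something
# Required-Start:  $syslog otherthing
# Required-Stop:   $syslog
### END INIT INFO
echo starting
"""

if __name__ == "__main__":
    print(parse_lsb_header(SAMPLE))
```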

Real System V init ignores all of these because all it does is run init scripts in strictly sequential ordering based on their numbering (and names, if you have two scripts at the same numerical ordering). By contrast, systemd explicitly uses this declared dependency information to run some SysV init scripts in parallel instead of in sequential order. If your init script has this LSB comment block and declares dependencies at all, at least some versions of systemd will start it immediately once those dependencies are met even if it has not yet come up in numerical order.

(CentOS 7 has such a version of systemd, which it labels as 'systemd 208' (undoubtedly plus patches).)

Based on one of my sysadmin aphorisms, you can probably guess what happened next: some System V init scripts have this LSB comment block but declare incomplete dependencies. On a real System V init system this does nothing and thus is easily missed; in fact these scripts may have worked perfectly for a decade or more. On a systemd system such as CentOS 7, systemd will start these init scripts out of order and they will start failing, even if what they depend on is other System V init scripts instead of things now provided directly by systemd .service files.
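The difference between the two behaviors can be shown with a toy model. This is only a sketch: the script names and their declared dependencies are invented, and the 'waves' function is a deliberately naive stand-in for dependency-based startup, not systemd's real algorithm.

```python
def sysv_start_order(links):
    """SysV init: run the S* links in plain sorted order of their names."""
    return sorted(links)


def dependency_waves(required):
    """Group scripts into waves, where a script starts as soon as everything
    it *declares* it needs has already been started.

    'required' maps a script name to the list of names it declares.
    """
    started = set()
    remaining = dict(required)
    waves = []
    while remaining:
        ready = sorted(n for n, deps in remaining.items() if set(deps) <= started)
        if not ready:
            raise ValueError("unsatisfiable or circular dependencies")
        waves.append(ready)
        started.update(ready)
        for name in ready:
            del remaining[name]
    return waves


if __name__ == "__main__":
    # Hypothetical scripts. 'webapp' really needs 'database', but its LSB
    # header only declares 'syslog'.
    print(sysv_start_order(["S30webapp", "S10syslog", "S20database"]))
    print(dependency_waves({
        "syslog": [],
        "database": ["syslog"],
        "webapp": ["syslog"],  # incomplete declaration
    }))
```

Under the sorted ordering the hypothetical webapp always starts after database; under the dependency model both land in the same wave, so nothing guarantees that database is up first.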

This is a deliberate and annoying choice on systemd's part, and I maintain that it is the wrong choice. Yes, sure, in an ideal world the LSB dependencies would be completely correct and could be used to parallelize System V init scripts. But this is not an ideal world, it is the real world, and given that there's been something like a decade of the LSB dependencies being essentially irrelevant it was completely guaranteed that there would be init scripts out there that mis-declared things and thus that would malfunction under systemd's dependency based reordering.

(I'd say that the systemd people should have known better, but I rather suspect that they considered the issue and decided that it was perfectly okay with them if such 'incorrect' scripts broke. 'We don't support that' is a time-honored systemd tradition, per, say, separate /var filesystems.)

linux/SystemdAndSysVInitScripts written at 01:04:53

2015-03-22

I now feel that Red Hat Enterprise 6 is okay (although not great)

Somewhat over a year ago I wrote about why I wasn't enthused about RHEL 6. Well, it's a year later and I've now installed and run a CentOS 6 machine for an important service that requires it, and as a result of that I have to take back some of my bad opinions from that entry. My new view is that overall RHEL 6 makes an okay Linux.

I haven't changed the details of my views from the first entry. The installer is still somewhat awkward and it remains an old-fashioned transitional system (although that has its benefits). But the whole thing is perfectly usable; both installing the machine and running it haven't run into any particular roadblocks and there's a decent amount to like.

I think that part of my shift is that all of our work on our CentOS 7 machines has left me a lot more familiar with both NetworkManager and how to get rid of it (and why you want to do that). These days I know to do things like tick the 'connect automatically' button when configuring the system's network connections during install, for example (even though it should be the default).

Apart from that, well, I don't have much to say. I do think that we made the right decision for our new fileserver backends when we delayed them in order to use CentOS 7, even if this was part of a substantial delay. CentOS 6 is merely okay; CentOS 7 is decently nice. And yes, I prefer systemd to upstart.

(I could write a medium sized rant about all of the annoyances in the installer, but there's no point given that CentOS 7 is out and the CentOS 7 one is much better. The state of the art in Linux installers is moving forward, even if it's moving slowly. And anyways I'm spoiled by our customized Ubuntu install images, which preseed all of the unimportant or constant answers. Probably there is some way to do this with CentOS 6/7, but we don't install enough CentOS machines for me to spend the time to work out the answers and build customized install images and so on.)

linux/RHEL6IsOkay written at 02:34:38

2015-03-21

Spammers show up fast when you open up port 25 (at least sometimes)

As part of adding authenticated SMTP to our environment, we recently opened up outside access to port 25 (and port 587) to a machine that hadn't had them exposed before. You can probably guess what happened next: it took less than five hours before spammers were trying to rattle the doorknobs to see if they could get in.

(Literally. I changed our firewall to allow outside access around 11:40 am and the first outside attack attempt showed up at 3:35 pm.)

While I don't have SMTP command logs, Exim does log enough information that I'm pretty sure that we got two sorts of spammers visiting. The first sort definitely tried to do either an outright spam run or a relay check, sending MAIL FROMs with various addresses (including things like 'postmaster@<our domain>'); all of these failed since they hadn't authenticated first. The other sort of spammer is a collection of machines that all EHLO as 'ylmf-pc', which is apparently a mass scanning system that attempts to brute force your SMTP authentication. So far there is no sign that they've succeeded on ours (or are even trying), and I don't know if they even manage to start up a TLS session (a necessary prerequisite to even being offered the chance to do SMTP authentication). These people showed up second, but not by much; their first attempt was at 4:04 pm.

(I have some indications that in fact they don't. On a machine that I do have SMTP command logs on, I see ylmf-pc people connect, EHLO, and then immediately disconnect without trying STARTTLS.)

It turns out that Exim has some logging for this (the magic log string is '... authenticator failed for ...') and using that I can see that in fact the only people who have gotten far enough to actually try to authenticate are a few of our own users. Since our authenticated SMTP service is still in testing and hasn't been advertised, I suspect that some people are using MUAs (or other software) that simply try authenticated SMTP against their IMAP server just to see if it works.
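A rough sketch of this kind of log digging follows. Only the '... authenticator failed for ...' marker comes from the logs discussed here; the rest of the line layout (the bracketed IP address and the 'set_id=' field) is an assumption about what typical Exim log lines look like, and the sample lines, addresses and logins are invented.

```python
import re
from collections import Counter

MARKER = " authenticator failed for "
IP_RE = re.compile(r"\[([0-9A-Fa-f.:]+)\]")      # assumed: IP address in brackets
SET_ID_RE = re.compile(r"\(set_id=([^)]*)\)")    # assumed: attempted login


def failed_auth_attempts(lines):
    """Yield (ip, login) for each log line that contains the marker string."""
    for line in lines:
        if MARKER not in line:
            continue
        tail = line.split(MARKER, 1)[1]
        ip = IP_RE.search(tail)
        login = SET_ID_RE.search(tail)
        yield (ip.group(1) if ip else None, login.group(1) if login else None)


# Invented sample lines; the addresses are from documentation-only ranges.
SAMPLE_LOG = [
    "2015-03-21 15:35:02 some unrelated log line [192.0.2.10]",
    "2015-03-21 16:10:44 plain_login authenticator failed for (client.example) "
    "[198.51.100.7]: 535 Incorrect authentication data (set_id=alice)",
    "2015-03-21 16:12:01 plain_login authenticator failed for (client.example) "
    "[198.51.100.7]: 535 Incorrect authentication data (set_id=alice@example.org)",
]

if __name__ == "__main__":
    attempts = list(failed_auth_attempts(SAMPLE_LOG))
    print(attempts)
    print(Counter(ip for ip, _ in attempts))
```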

There are two factors here that may mean this isn't what you'll see if you stand up just any server on a new IP: this server has existed for some time with IMAP exposed, and it has done so under a well known DNS name (one that people would naturally try if they were looking for people's IMAP servers). It's possible that existing IMAP servers get poked far more frequently and intently than other random IPs.

(Certainly I don't see anything like this level of activity on other machines where I have exposed SMTP ports.)

spam/SpammerFastArrival written at 01:03:37

2015-03-20

Unix's mistake with rm and directories

Welcome to Unix, land of:

; rm thing
rm: cannot remove 'thing': Is a directory
; rmdir thing
;

(And also rm may only be telling you half the story, because you can have your rmdir fail with 'rmdir: failed to remove 'thing': Directory not empty'. Gee thanks both of you.)

Let me be blunt here: this is Unix exercising robot logic. Unix knows perfectly well what you want to do, it's perfectly safe to do so, and yet Unix refuses to do it (or tell you the full problem) because you didn't use the right command. Rm will even remove directories if you just tell it 'rm -r thing', although this is more dangerous than rmdir.

Once upon a time rm had almost no choice but to do this because removing directories took special magic and special permissions (as '.' and '..' and the directory tree were maintained in user space). Those days are long over, and with them all of the logic that would have justified keeping this rm (mis)feature. It lingers on only as another piece of Unix fossilization.
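As an illustration of how little is involved now that the kernel does the work, here is a sketch of a 'remove this thing' helper that handles both files and empty directories. The helper and the demo are hypothetical and are not any real rm's implementation.

```python
import os
import stat
import tempfile


def remove(path):
    """Remove a file, or an empty directory, with one call.

    A non-empty directory still fails (os.rmdir raises OSError), so this is
    no more dangerous than running rm and rmdir by hand.
    """
    st = os.lstat(path)  # lstat: a symlink to a directory is removed as a link
    if stat.S_ISDIR(st.st_mode):
        os.rmdir(path)
    else:
        os.unlink(path)


def demo():
    """Exercise remove() in a scratch directory; returns what was left behind."""
    with tempfile.TemporaryDirectory() as scratch:
        empty_dir = os.path.join(scratch, "thing")
        full_dir = os.path.join(scratch, "full")
        plain_file = os.path.join(full_dir, "file")
        os.mkdir(empty_dir)
        os.mkdir(full_dir)
        with open(plain_file, "w") as f:
            f.write("data\n")

        remove(empty_dir)          # works: empty directory
        try:
            remove(full_dir)       # refused: directory not empty
            refused = False
        except OSError:
            refused = True
        remove(plain_file)         # works: ordinary file
        return sorted(os.listdir(scratch)), refused


if __name__ == "__main__":
    print(demo())
```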

(This restriction is not even truly Unixy; per Norman Wilson, Research Unix's 8th edition removed the restriction, so the very heart of Unix fixed this. Sadly very little from Research Unix V8, V9, and V10 ever made it out into the world.)

PS: Some people will now say that the Single Unix Specification (and POSIX) does not permit rm to behave this way. My view is 'nuts to the SUS on this'. Many parts of real Unixes are already not strictly POSIX compliant, so if you really have to have this you can add code to rm to behave in a strictly POSIX compliant mode if some environment variable is set. (This leads into another rant.)

(I will reluctantly concede that having unlink(2) still fail on directories instead of turning into rmdir(2) is probably safest, even if I don't entirely like it either. Some program is probably counting on the behavior and there's not too much reason to change it. Rm is different in part because it is used by people; unlink(2) is not directly. Yes, I'm waving my hands a bit.)

unix/RmDirectoryMistake written at 00:41:36

2015-03-19

A brief history of fiddling with Unix directories

In the beginning (say V7 Unix), Unix directories were remarkably non-special. They were basically files that the kernel knew a bit about. In particular, there was no mkdir(2) system call and the . and .. entries in each directory were real directory entries (and real hardlinks), created by hand by the mkdir program. Similarly there was no rmdir() system call and rmdir directly called unlink() on dir/.., dir/., and dir itself. To avoid the possibility of users accidentally damaging the directory tree in various ways, calling link(2) and unlink(2) on directories was restricted to the superuser.

(In part to save the superuser from themselves, commands like ln and rm then generally refused to operate on directories at all, explicitly checking for 'is this a directory' and erroring out if it was. V7 rm would remove directories with 'rm -r', but it deferred to rmdir to do the actual work. Only V7 mv had special handling for directories; it knew how to actually rename them by manipulating hardlinks to them, although this only worked when mv was run by the superuser.)

It took until 4.1 BSD or so for the kernel to take over the work of creating and deleting directories, with real mkdir() and rmdir() system calls. The kernel also picked up a rename() system call at the same time, instead of requiring mv to do the work with link(2) and unlink(2) calls; this rename() also worked on directories. This was the point, not coincidentally, where BSD directories themselves became more complicated. Interestingly, even in 4.2 BSD link(2) and unlink(2) would work on directories if you were root and mknod(2) could still be used to create them (again, if you were root), although I suspect no user level programs made use of this (and certainly rm still rejected directories as before).

(As a surprising bit of trivia, it appears that the 4.2 BSD ln lacked a specific 'is the source a directory' guard and so a superuser probably could accidentally use it to make extra hardlinks to a directory, thereby doing bad things to directory tree integrity.)

To my further surprise, raw link(2) and unlink(2) continued to work on directories as late as 4.4 BSD; it was left for other Unixes to reject this outright. Since the early Linux kernel source is relatively simple to read, I can say that Linux did from very early on. Other Unixes, I have no idea about. (I assume but don't know for sure that modern *BSD derived Unixes do reject this at the kernel level.)
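If you want to see what your own system does with a raw unlink(2) on a directory, a small probe like the following will show it. This is only a sketch with a made-up function name; the exact error differs between systems (Linux is known to report EISDIR, while POSIX specifies EPERM), so the probe simply reports whatever comes back.

```python
import errno
import os
import tempfile


def try_unlink_directory():
    """Call unlink() on a freshly made directory.

    Returns (error_name, directory_still_exists); error_name is None if the
    call unexpectedly succeeded.
    """
    with tempfile.TemporaryDirectory() as parent:
        target = os.path.join(parent, "subdir")
        os.mkdir(target)
        try:
            os.unlink(target)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return name, os.path.isdir(target)
        return None, os.path.isdir(target)


if __name__ == "__main__":
    print(try_unlink_directory())
```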

(I've written other entries on aspects of Unix directories and their history: 1, 2, 3, 4.)

PS: Yes, this does mean that V7 mkdir and rmdir were setuid root, as far as I know. They did do their own permission checking in a perfectly V7-appropriate way, but in general, well, you really don't want to think too hard about V7, directory creation and deletion, and concurrency races.

In general and despite what I say about it sometimes, V7 made decisions that were appropriate for its time and its job of being a minimal system on a relatively small machine that was being operated in what was ultimately a friendly environment. Delegating proper maintenance of a core filesystem property like directory tree integrity to user code may sound very wrong to us now but I'm sure it made sense at the time (and it did things like reduce the kernel size a bit).

unix/UnixDirectoryFiddlingHistory written at 00:28:49



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.