Wandering Thoughts


Why we're going to switch from SunSSH to OpenSSH on our fileservers

For a long time, the version of SSH in Illumos (and almost all derived versions) has been a version that Sun branched from OpenSSH years ago. This has various drawbacks, including that no one was working on it any more. Recently, Illumos removed SunSSH (in Illumos #7293) in favour of a more or less current version of OpenSSH that the community was already packaging and supporting. One of the pieces of fallout from this is that we are planning to switch our fileservers over from SunSSH to OpenSSH.

In theory this is unnecessary. We're running OmniOS r151014, which is just new enough to have support for switching to OpenSSH at all but which isn't (or wasn't) new enough to have OpenSSH as the default. SunSSH itself is only disappearing in the next version of OmniOS. And we really don't like changing the fileservers (we mostly consider them appliances), plus we have a mount authentication system built on top of ssh. Nominally we could keep running our systems just as they are.

In practice we consider this too risky. If there ever is any real security issue in SunSSH (assuming that there isn't already one), we expect the OmniOS response to be 'switch to OpenSSH, it's supported and basically a transparent shift'. This is perhaps against the spirit of a long term support release, but at the same time we can't expect miracles (especially as non-paying non-customers). It's clear that SunSSH is abandoned software and abandoned software just doesn't get maintenance. So we'd rather go through a planned and carefully tested shift now than be forced to make a sudden shift in an emergency, even if we'd sort of prefer to leave the whole situation alone.

(I admit that I'm looking forward to the various improvements we'll get with the shift. SunSSH is old enough that it can't talk to stock modern OpenSSH servers because OpenSSH has deprecated even the best key exchange algorithms SunSSH supports. I'm also hopeful that the OmniOS version of OpenSSH will have the significant performance improvements I saw in the Linux version the last time I tested the speeds here. And I plain like ED25519 keys, for various reasons.)

solaris/WhySwitchToOpenSSH written at 00:18:08


How I live without shell job control

In my comments on yesterday's entry, I mentioned that my shell doesn't support job control. At this point people who've only used modern Unix shells might wonder how you get along without such a core tool as job control. The answer, at least for me, is surprisingly easily (at least most of the time).

Job control is broadly useful for three things: forcing programs to pause (and then un-pausing them), pushing programs into the background to get your shell back, and calling backgrounded programs back into the foreground. In other words, job control is one part suspending and restarting programs and one part multiplexing a single session between multiple programs.

It's possible that I'm missing important uses of being able to easily pause and unpause programs. However, I'm not missing the ability in general, because you can usually use SIGSTOP and SIGCONT by hand. I sometimes wind up doing this, although it's not something I feel the need for very often.
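As a rough illustration of doing this by hand, here is a minimal Python sketch of sending the two signals to a process ID. The function names pause and resume are made up for this example; from a shell, the kill command does the same job.

```python
import os
import signal


def pause(pid):
    """Suspend a process. SIGSTOP cannot be caught or ignored by the target."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a previously stopped process continue running."""
    os.kill(pid, signal.SIGCONT)
```

Unlike job control, nothing here gives the terminal back to your shell; it only stops the program from running and consuming CPU until you resume it.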

(I do sometimes Ctrl-C large makes if I want to do something else with my machine; with job control it's possible that I'd suspend the make instead and then have it resume afterwards.)

My approach to the 'recover my shell' issue is to start another shell. That's what windows are for (and screen), and I have a pretty well developed set of tools to make new shells cheap and easy; in my opinion, multiple windows are the best and most flexible form of multiplexing. I do sometimes preemptively clone a new window before I run a command in the foreground, and I'll admit that there are occasions when I start something without backgrounding it when I really should have done otherwise. A classical case is running 'emacs file' (or some other GUI program) for what I initially think is going to be a quick use and then realizing that I want to keep that emacs running while getting my shell back.

(This is where my habit of using vim in a terminal is relevant, since that takes over the terminal anyways. I can't gracefully multiplex such a terminal between, say, vim and make; I really want two terminals no matter what.)

So far I can't think of any occasions where I've stuck a command into the background and then wanted it to be in the foreground instead. I tend not to put things in the background very much to start with, and when I do they're things like GNU Emacs or GUI programs that I can already interact with in other ways. Perhaps I'm missing something, but in general I feel that my environment is pretty good at multiplexing things outside of job control.

(At the same time, if someone added job control to my shell of choice, I wouldn't turn my nose up at it. It just seems rather unlikely at this point, and I'm not interested in switching shells to get job control.)

Sidebar: multiplexing and context

One of the things that I like about using separate windows instead of multiplexing several things through one shell is that separate windows clearly preserve and display the context for each separate thing I'm doing. I don't have to rebuild my memory of what a command is doing (and what I'm doing with it) when I foreground it again; that context is right there, and stays right there even if I wind up doing multiple commands instead of just one.

(Screen sessions are somewhat less good at this than terminal windows, because scrollback is generally more awkward. Context usually doesn't fit in a single screen.)

PS: the context is not necessarily just in what's displayed, it's also in things like my history of commands. With separate windows, each shell's command history is independent and so is for a single context; I don't have commands from multiple contexts mingled together. But I'm starting to get into waving my hands a lot, so I'll stop here.

unix/LivingWithoutJobControl written at 23:20:33

A surprising benefit of command/program completion in my shell

I've recently been experimenting with a variant of my usual shell that extends its general (filename) completion to also specifically complete program names from your $PATH. Of course this is nothing new in general in shells; most shells that have readline style completion at all have added command completion as well. But it's new to me, so the experience has been interesting.

Of course the obvious benefit of command completion is that it makes it less of a pain to deal with long command names. In the old days this wasn't an issue because Unix didn't have very many long command names, but those days are long over by now. There are still a few big new things that have short names, such as git and go, but many other programs and systems give themselves increasingly long and annoying binary names. Of course you can give regularly used programs short aliases via symlinks or cover scripts, but that's only really worth it in some cases. Program completion covers everything.

(An obvious offender here is Google Chrome, which has the bland name of google-chrome or even google-chrome-stable. I have an alias or two for that.)

But command completion turned out to have a much more surprising benefit for me: it's removed a lot of guesswork about what exactly a program is called, especially for my own little scripts and programs. If I use a program regularly I remember its full name, but if I don't I used to have to play a little game of 'did I call it decodehdr or decodehdrs or decode-hdr?'. Provided that I can remember the start of the command, and I usually can, the shell will now at least guide me to the rest of it and maybe just fill it in directly (it depends on whether the starting bit uniquely identifies the command).
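To make the 'guide me to the rest of it' behaviour concrete, here is a small Python sketch of the general idea. It is not how my shell (or any particular shell) implements completion; complete_command and fill_in are hypothetical names, and empty $PATH entries (which shells treat as the current directory) are simply skipped.

```python
import os


def complete_command(prefix, path):
    """Return the sorted names of executables on `path` that start with `prefix`."""
    matches = set()
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries for simplicity
        try:
            entries = os.listdir(directory)
        except OSError:
            continue  # unreadable or missing $PATH directory
        for name in entries:
            full = os.path.join(directory, name)
            if (name.startswith(prefix) and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                matches.add(name)
    return sorted(matches)


def fill_in(prefix, path):
    """Return the longest text that can be filled in without guessing."""
    matches = complete_command(prefix, path)
    if not matches:
        return prefix
    # With a unique match this is the whole command name; otherwise it is
    # only the part that every candidate shares.
    return os.path.commonprefix(matches)
```

Something like fill_in("decode", os.environ["PATH"]) stops at the shared part when several scripts start that way, which is the 'did I call it decodehdr or decode-hdr' case, and returns the full name once the prefix is unique.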

One of the interesting consequences of this is that I suspect I'm going to wind up changing how I name my own little scripts. I used to prioritize short names, because I had to type the whole thing and I don't like typing long names. But with command completion, it's probably better to prioritize a memorable, unique prefix that's not too long and then a tail that makes the command's purpose obvious. Calling something dch might have previously been a good name (although not for something I used infrequently), but now I suspect that names like 'decode-mail-header' are going to be more appealing.

(I'll have to see, and the experiment is a little bit precarious anyways so it may not last forever. But I'll be sad to be without command completion if it goes.)

unix/CommandCompletionBenefit written at 01:56:35


You probably want to start using the -w option with iptables

The other day, I got notified that my office workstation had an exposed portmapper service. That was frankly weird, because while I had rpcbind running for some NFS experiments, I'd carefully used iptables to block almost all access to it. Or at least I thought I had; when I looked at 'iptables -vnL INPUT', my blocks on tcp:111 were conspicuously missing (although it did have the explicit allow rules for the good traffic). So I went through systemd's logs from when my own service for installing all of my IP security rules was starting up and, well:

Sep 22 10:14:30 <host> blocklist[1834]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?

I have mixed feelings about this message. On the one hand, it's convenient when programs tell you exactly how they've made your life harder. On the other hand, it's nicer if they don't make your life harder in the first place.

So, the short version of what went wrong is that (modern) Linux iptables only allows one process to be playing around with iptables at any given time. If you run iptables while another process holds the lock, by default it just errors out, printing a helpful message about how it knows what you probably want to do but it's not going to do it because of reasons (I'm sure they're good reasons, honest).

(It also applies to ip6tables, and it appears that iptables and ip6tables share the same lock. The lock is global, not per-chain or per-table or anything.)
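The difference between the default behaviour and -w is essentially the difference between a non-blocking and a blocking attempt to take one exclusive lock. The following Python sketch shows that difference using flock() on an ordinary file. It is only an analogy, not iptables' actual code; try_lock is a hypothetical helper and the lock file is whatever path you hand it.

```python
import fcntl


def try_lock(path, wait):
    """Take an exclusive lock on `path`; return the open file, or None if busy.

    wait=False gives up at once if someone else holds the lock (like plain
    iptables); wait=True blocks until the lock is free (like 'iptables -w').
    """
    handle = open(path, "w")
    flags = fcntl.LOCK_EX if wait else fcntl.LOCK_EX | fcntl.LOCK_NB
    try:
        fcntl.flock(handle, flags)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # closing the file releases the lock
```

A caller that gets None back and carries on anyway has silently skipped whatever it was going to do, which is exactly how my tcp:111 blocks went missing.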

Now, you might think that I was foolishly running two sets of iptables commands at the same time. It turns out that I probably was, but it's not obvious, so let's follow along. According to the logs, the other thing happening at this point during boot was that my IKE daemon was starting. It was starting in parallel because this is a Fedora machine, which means systemd, and systemd likes to do things in parallel whenever it can (which in practice means whenever you don't prevent it from doing so). As part of starting up, the Fedora ipsec.service has:

# Check for nflog setup
ExecStartPre=/usr/sbin/ipsec --checknflog

This exists to either set up or disable 'iptables rules for the nflog devices', and it's implemented in the ipsec shell script by running various iptables commands. Even if you don't have any nflog settings in your ipsec.conf and there aren't any devices configured, ipsec runs at least one iptables command to verify this. This takes the lock, which collided with my own IP security setup scripts.

(If you guessed that the ipsec script does not use 'iptables -w', you win a no-prize. From casual inspection, the script just assumes that all iptables commands work all the time, so it isn't at all prepared for them to fail due to locking problems.)

This particular iptables change seems to have been added in 2013, in this commit (via). Either many projects haven't noticed or many projects have the problem that they need to be portable to iptables versions that don't have a -w argument and so will fail completely if you try to use 'iptables -w'. I suspect it's a bit of both, honestly.

(Of the supported Linux versions that we still use, Ubuntu 12.04 LTS and RHEL/CentOS 6 don't have 'iptables -w'. Ubuntu 14.04 and RHEL 7 have it.)

PS: My solution was to serialize IKE IPSec startup so that it was forced to happen after my IP security stuff had finished; this was straightforward with a systemd override via 'systemctl edit ipsec.service'. I also went through my own stuff to add '-w' to all of my iptables invocations, because it can't hurt and it somewhat protects me against any other instances of this.
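If your rules are installed from a script, one way to make sure -w never gets forgotten is to funnel every invocation through a single helper. This is a sketch in Python rather than my actual setup; iptables_argv and run_iptables are hypothetical names, and it assumes an iptables new enough to accept -w.

```python
import subprocess


def iptables_argv(rule_args, ipv6=False):
    """Build an iptables (or ip6tables) command line that always includes -w."""
    program = "ip6tables" if ipv6 else "iptables"
    return [program, "-w"] + list(rule_args)


def run_iptables(rule_args, ipv6=False):
    """Run the command; raise CalledProcessError instead of ignoring failures."""
    return subprocess.run(iptables_argv(rule_args, ipv6), check=True)
```

For example, run_iptables(["-A", "INPUT", "-p", "tcp", "--dport", "111", "-j", "DROP"]) would add a block on tcp:111, and check=True means a failed command stops the script instead of being quietly assumed to have worked.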

linux/IptablesUseWOption written at 23:55:02

Git's selective commits plus Magit are a killer feature for me

I'm sure there are some people who are meticulously organized in their programming work. They work on only one thing at a time, or if they're making multiple changes they carefully separate them on different topic branches. I'm not one of them. I'm working away on something, but then I can't resist improving something that I stumble over because I was in that area of the code, and I run into a bug that needs to be corrected, and by the time I turn around there's a bunch of unrelated changes all piled in together.

I think git has had 'git add -p' for as long as I've been using it, but in practice it was never usable enough for me. I couldn't stand the tedious grind needed and I made mistakes and sometimes the changes I wanted to separate were in what git considered one chunk. I made a couple of dutiful attempts to use it and make those proper neat commits but soon gave it up as too much of a pain. If I was lucky my unrelated changes were in separate files and I'd make multiple commits; otherwise, well, there were big commits with big change lists and sometimes casual admissions in asides that I'd also done some additional work.

When I first started with Magit I didn't really expect anything much to change with me and git. Sure, I was picking Magit up to make selective commits easier and it did that, but I didn't think that was all that big. After all, most of my separate changes were already to separate files; I at least remembered the really entangled changes as rare.

I was wrong. Easy selective commits have turned into a killer feature of the git plus Magit combination, and I've wound up making them all the time. I think part of it is that having them available is by itself liberating, and encourages me to make most changes freely. I know that I can sort everything out later, that I can let changes mature and evolve at different rates, and so on. So any time I see something I want to tweak or fix or correct, I can do it right then and there. So I do.

(This is especially useful for small single-file programs, such as this one I've been working on recently. Almost all of the commits to it are entangled ones that I sorted out with Magit.)

Sidebar: Magit's taking over making almost all of my commits

Why is pretty straightforward; it's a nicer environment than moving between multiple xterm windows to check diffs, stage changes, make a commit, call up an editor and a spell checker, and so on. Magit has conveniences like warning me if the first line of the commit is getting too long and it inherits all of the straightforward GNU Emacs features like on the fly spellchecking with flyspell mode (once I looked up and worked out how to add it to my Emacs setup in a way that worked). So I wind up writing better commit messages (or at least better spelled ones) with generally less work.

As someone who thinks of himself as a Unix person, I'm not entirely copacetic about GNU Emacs swallowing more of my command line work. But these days I'm a pragmatist too, and Magit and GNU Emacs sure are convenient here. I've even wound up using Magit for some commit amending just because it was easy enough (and easier to look up in the Magit manual than wrestle with figuring out the exact command line incantation I was going to need).

programming/GitSelectiveCommitWithMagit written at 01:24:33


Why we've wound up without ZFS ZILs or L2ARCs on our pools

Back when we designed the current generation of our ZFS fileservers, we expected to wind up putting in at least some ZILs (mirrored and in the backends) and L2ARCs (in the OmniOS fileservers). This has not wound up happening, as all of our plans for this have basically fallen through. There are a number of reasons for this (independent of my thoughts on why an L2ARC probably isn't a good fit for us, which sort of came later).

In one sense, the biggest reason is good news: we haven't felt the need to work on adding them because fileserver performance doesn't obviously suck. The fileservers work well and there's no clear bottleneck to their performance. But this is also kind of bad news. Performance could probably be better with a ZIL or L2ARC, at least for some pools, but there's no simple, easy, and especially always-available way of seeing how much improvement there might be. You can gather ZIL usage information with DTrace scripts, but you have to actually go out and do it (and you have to figure out what metrics are important). As far as I know there are no kstats that track things like ZIL commits, volume written to the ZIL, and so on; without using DTrace, you really don't have any idea how active your ZIL is for a pool.

The other big reason is that there are a lot of practical questions about what happens when things go wrong and ZFS doesn't currently document clear answers to them. Before we could add either a ZIL or an L2ARC to a production pool, we'd need to test all of these things, and there's a daunting list of failure scenarios to test (L2ARC goes away during operation, L2ARC not present on reboot, and so on and so forth). Building a test environment and grinding through all of these is a lot of work to undertake when we don't even have a clear need established. And of course we'd also have to test a ZIL (or an L2ARC) in normal usage, just to make sure it didn't have any adverse consequences and actually did deliver the benefits we expected.

(We'd also inevitably need to change and update our management tools and our monitoring systems, including our spares system.)

At a practical level, we've actually dealt with the most important, clearest, and easiest cases of 'we need high performance here' by building out a couple of all-SSD pools. These hold /var/mail and some other core system filesystems, and much of the disk space for the departmental administrative staff (who are the heartbeat of the department and do everything over Samba from managed Windows machines; historically they can really create load).

So after all the dust has settled, it's simply been easier to keep on going without either ZILs or L2ARCs. We don't obviously need them, they may not do us any real good if we actually deployed them in our environment, and they require work to investigate and to deploy. At this point it seems likely that we'll remain without them for the remaining lifetime of this fileserver generation (which I hope starts running out in 2018, but we'll see). It's also my deep hope that the next generation of fileservers will be built around all-SSD storage, which will render many of these issues moot.

solaris/ZFSWhyNoZILOrL2ARCForUse written at 00:04:37


Today I learned that you want to use strace -fp

There I was, with an Ubuntu 16.04 system where rsyslogd seemed to have stopped writing to an important syslog file for no clear reason. When in doubt (or in a hurry, or both), I reach for a big hammer:

# strace -p $(pidof rsyslogd)
strace: Process 21950 attached
select(1, NULL, NULL, NULL, {342, 910468}

And there rsyslogd sat even while I sent in syslog messages, not coming out of that select(). Since this was a 16.04 machine, clearly this was the perfidious work of systemd and its journal, right? Especially since I could see the syslog messages in 'journalctl -f'. I even wrote some angry tweets to that effect.

(People reading this may spot an obvious clue here that I missed at the time when I was rushing around and feeling under pressure, angry, and a bit panicked.)

Well, I was wrong. I was being fooled by a dubious strace default, namely that if you use strace -p PID on a multi-threaded program, it silently traces only one thread. How do you know whether or not a program is a multi-threaded one? Well, strace certainly won't tell you here. One way is to look in /proc/PID/task to see if there's more than one thing there.
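Checking /proc/PID/task is easy to script. Here is a minimal, Linux-only Python sketch; thread_ids and is_multithreaded are names I made up for it.

```python
import os


def thread_ids(pid):
    """Return the sorted thread IDs listed under /proc/<pid>/task (Linux only)."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))


def is_multithreaded(pid):
    """True if the process currently has more than one thread."""
    return len(thread_ids(pid)) > 1
```

The answer is only a snapshot; a process can start or finish threads right after you look.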

What you really want is 'strace -f -p PID'. As the manual page says:

[...] Note that -p PID -f will attach to all threads of process PID if it is multi-threaded, not only the thread with thread_id = PID.

Of course -f has other effects that you may not want if the process forks and execs other programs (or simply forks independent children), but hopefully you know this in advance. Also, hopefully threaded programs that fork and exec are not all that common. If you absolutely have to avoid using -f due to its other effects, I believe that you can strace -p each thread separately using the PIDs you'll find in /proc/PID/task. This isn't going to work if the process as a whole starts new threads, though, since you'll have to manually start strace'ing those too.

So that's the story of how I learned that I probably now want to use strace's -f argument along with -p, unless I have a strong reason not to. These days you can't necessarily predict what's going to be multi-threaded, and probably more and more things are going to be so.

PS: The clue in strace's output that I missed is that the select() is not actually paying attention to any file descriptors, and so is being done just for the timeout. Unless rsyslogd was terribly badly broken and outright not listening for incoming syslog messages, there had to be something else in a select() or read() or the like on rsyslogd's server socket. Ie, there had to be another thread. But I didn't pay attention to the actual select() arguments because I didn't expect them to be a clue to what was going on; I didn't think rsyslogd would be that broken and the failure I was interested in was 'why aren't messages getting written to this configured file'.

(There's a lesson here about debugging.)

linux/StraceUseFWithP written at 23:56:57


A surprise with switching to holding keys in ssh-agent

Every so often I want to transfer a root-only file from my office workstation off to another machine for analysis or the like (the reasons this is necessary are complex). So every so often I wind up doing this:

$ /bin/su
# scp /some/file cks@server:/tmp/foobar
cks@server's password: [...]

Except that I lied there. That password prompt is certainly what used to happen and it's what happens when I do this same operation from any of our servers, but on my office workstation the scp just works without any password challenge. The first time that this happened I was surprised for a bit, then I worked out what was happening.

What's happened is that I switched to holding my SSH keys in ssh-agent instead of having them sitting in $HOME/.ssh. Su'ing to root does not clear the environment variables that tell commands how to talk to my ssh-agent process and of course root has the permissions necessary to access the SSH agent authentication socket, so the root-run scp sees that it has an SSH agent available and uses it. Voila, passwordless access for root to my remote account. This doesn't happen on our servers because I don't forward my SSH agent to my account on our servers (I consider it too dangerous).

Of course root had just as much access to my keys back in the days of having them sitting unencrypted in $HOME/.ssh. The difference is that su'ing to root changes $HOME, so scp, ssh, and so on didn't look at ~cks/.ssh et al, they looked at ~root/.ssh and the latter didn't have my keys (or the SSH configuration that would have told SSH how to use the keys). It's the combination of using an SSH agent and su passing through the environment variables that make SSH programs talk to it that leads to this particular result.

Also, this is specific to habitually using su instead of sudo. By default, sudo preserves only relatively few environment variables and removes everything else, and the SSH agent environment variables aren't among the ones that make it through. Su is from an older era and so generally defaults to preserving almost everything (for good or bad, take your pick).
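The environment variables in question are SSH_AUTH_SOCK (which is what ssh and scp actually look at) and SSH_AGENT_PID. If you want a root-run command to behave as if no agent existed, a sketch of the idea in Python looks like this; agent_visible and without_agent are hypothetical helper names.

```python
AGENT_VARIABLES = ("SSH_AUTH_SOCK", "SSH_AGENT_PID")


def agent_visible(env):
    """True if programs started with this environment would find an SSH agent."""
    return bool(env.get("SSH_AUTH_SOCK"))


def without_agent(env):
    """Return a copy of the environment with the SSH agent variables removed."""
    return {key: value for key, value in env.items() if key not in AGENT_VARIABLES}
```

Passing env=without_agent(os.environ) to subprocess.run() for an scp would bring back the password prompt. It does not stop root from reaching the agent socket some other way, of course; it only stops programs from finding it by accident.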

(Since sudo passes through things like $XAUTHORITY and $DISPLAY, arguably it should also pass through the SSH agent environment variables. But it doesn't now and I expect that it's unlikely to ever change the default; regardless of any merits of a change, there are too many arguments that anti-change people could muster here.)

sysadmin/SSHAgentPermissionSurprise written at 23:28:40

My view on spam and potential denial of service attacks on anti-spam systems

In a comment on yesterday's entry on a shift in malware packaging, Jinks asked a very good question:

Since you're working with inspecting zip files, how does your setup handle denial of service attacks against the unzipping part?

Like, let's say I sent you a zipped 30TB sparse file. Or a neverending zip quine.

I've found that many commercial solutions can easily choke on maliciously crafted zip files. Do you have any special provisions in your scripts to prevent these attacks?

The straightforward answer is that we're protected against ZIP quine attacks by having very simple code that only goes one level deep in nested ZIP archives, in large part because Python's zipfile module makes this the relatively natural way to write this code. We're semi-protected against ballooning ZIP archives because we only try to expand .zip files.

But this is a copout. The real answer is that I haven't bothered to write the code to defend against such denial of service attacks, in large part because we don't need it now and I don't expect to ever need it in the future. If I was writing code that would defend a large, prominent organization like Google or Hotmail or something, I would say that I absolutely had to engineer in DOS protection from the start because sooner or later some joker would try sending us one just to see what would happen. But deliberate, focused DOSes against my particular little section of the Internet are unlikely (and if they happen we have plenty of softer targets lying around than the mail system; sometimes such DOSes even happen naturally).

But what about spam, malware, and ransomware, you may ask? Surely they may DOS us this way at some point. My view is that if they do, it will be by accident. If you think about it, spam in general has an extremely good motive to avoid DOS'ing people's email systems. To wit, a DOS'd email system is one that is not letting the spam, malware, or ransomware through (it's not letting anything else through either, but the spammer doesn't care about who else is being affected). If a spammer finds some sort of mail that makes a popular anti-spam system choke, I expect them to carefully avoid it for the same overall reason that they avoid sending things that are easily scored as spam.

In other words, what's valuable to a spammer isn't email that causes a DOS, it's email that bypasses filtering systems somehow. It's possible that such email will DOS our systems, but right now I rate that as relatively unlikely; the stuff that would give my filtering system heartburn is mostly things that would probably lock up an anti-spam system instead of bypassing it.

(Maybe I will add an inner-zip length restriction, though. It's probably not hard, and there's good reason to skip trying to check out an inner zip that's too large.)
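For illustration, here is roughly what such a restriction could look like with Python's zipfile module. This is a sketch and not my actual code; list_names and the 10 MB MAX_INNER_ZIP_BYTES limit are made up for the example. It goes only one level deep, and it checks the size the outer archive declares for an inner .zip before reading it.

```python
import io
import zipfile

MAX_INNER_ZIP_BYTES = 10 * 1024 * 1024  # hypothetical limit, pick your own


def list_names(data, max_inner=MAX_INNER_ZIP_BYTES):
    """List file names in a ZIP and in any inner .zip files, one level deep.

    Returns (names, skipped). Inner archives that declare an uncompressed
    size above max_inner, or that are not valid ZIPs, end up in skipped.
    """
    names, skipped = [], []
    with zipfile.ZipFile(io.BytesIO(data)) as outer:
        for info in outer.infolist():
            names.append(info.filename)
            if not info.filename.lower().endswith(".zip"):
                continue
            if info.file_size > max_inner:
                skipped.append(info.filename)  # too big; don't expand it
                continue
            try:
                with zipfile.ZipFile(io.BytesIO(outer.read(info))) as inner:
                    # Only names are collected here, so a ZIP nested inside
                    # the inner ZIP is never opened.
                    names.extend(inner.namelist())
            except zipfile.BadZipFile:
                skipped.append(info.filename)
    return names, skipped
```

The declared size comes from the archive's own headers, so this is a sanity check and not a guarantee. What to do with a skipped inner archive (reject, accept, or log) is a separate policy decision.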

PS: I suspect that commercial solutions are often not robust against these things for the same reason my code isn't; namely, it's just not something that comes up in the wild. Arguably they should do better, since they're general purpose commercial software.

spam/SpamVsDOSAttacks written at 01:02:45


A little shift in malware packaging that I got to watch

When we started rejecting email with certain sorts of malware in it, almost all of the malware (really ransomware) had a pretty consistent signature; it came as a ZIP archive (rarely a RAR archive) with a single bad file type in it. We could easily write a narrowly tailored rule that rejected an archive with a single .js, .jse, .wsf, and so on file in it. Even when we didn't have such a rule ourselves, it seems that our commercial anti-spam system probably had one itself and so rejected the message.

Of course, nothing stands still in the malware world. A bit later, we saw some ransomware send messages that had two .js files in them (or at least I assume it was ransomware). I extended our rejection rules to reject these too and didn't think much of it; at the time it just seemed like one of the random things that spam and malware and ransomware are always doing.

Fast forward to this past Thursday, when we got hit by a small blizzard of ransomware that was still a single bad file type in a ZIP but this time it was throwing in an extra file. What made the extra file stand out is that the ransomware wasn't giving it any sort of file extension. Based on some temporary additional logging (and a sample or two that I caught), the file names are basic, made up, and actually pretty obviously suspicious; I saw one that was a single letter and another that was entirely some number of spaces.
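To make the three patterns concrete, here is a hedged Python sketch of a rule that would match all of them, given the list of file names inside a ZIP. It is not our actual rejection rule; matches_pattern and the short BAD_EXTENSIONS set (just the types named above) are assumptions for illustration.

```python
import os

# Only the extensions mentioned in this entry; a real list would be longer.
BAD_EXTENSIONS = {".js", ".jse", ".wsf"}


def extension(name):
    """Return the lower-cased file extension, or '' if there is none."""
    return os.path.splitext(name)[1].lower()


def matches_pattern(names):
    """True for one or two bad files plus, at most, extension-less extras."""
    bad = [n for n in names if extension(n) in BAD_EXTENSIONS]
    extras = [n for n in names if extension(n) not in BAD_EXTENSIONS]
    return 1 <= len(bad) <= 2 and all(extension(n) == "" for n in extras)
```

A rule this narrow is deliberately easy to evade, which is more or less the story of this entry; the point of keeping it narrow is to avoid rejecting legitimate archives.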

I assume that this evolution is happening because malware authors have noticed that anti-spam software has latched on to the rather distinctive 'single bad file in ZIP' pattern they initially had. I'm not sure why they used such odd (and distinctive, and suspicious) additional filenames, but perhaps the ransomware authors wanted to make it as unlikely as possible that people would get distracted from clicking on the all-important .js or .jse or whatever file.

(I now expect things here to evolve again, although I have no idea where to. Files with more meaningful names? More files? Who knows.)

spam/MalwarePackagingShift written at 01:47:09

(Previous 10 or go back to September 2016 at 2016/09/17)


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.