Wandering Thoughts

2014-04-20

A heresy about memorable passwords

In the wake of Heartbleed, we've been writing some password guidelines at work. A large part of the discussion in them is about how to create memorable passwords. In the process of all of this, I realized that I have a heresy about memorable passwords. I'll put it this way:

Memorability is unimportant for any password you use all the time, because you're going to memorize it no matter what it is.

I will tell you a secret: I don't know what my Unix passwords are. Oh, I can type them and I do so often, but I don't know exactly what they are any more. If for some reason I had to recover what one of them was in order to write it out, the fastest way to do so would be to sit down in front of a computer and type it in. Give me just a pen and paper and I'm not sure I could actually do it. My fingers and reflexes know them far better than my conscious mind.

If you pick a new password based purely at random with absolutely no scheme involved, you'll probably have to write it down on a piece of paper and keep referring to that piece of paper for a while, perhaps a week or so. After the week I'm pretty confident that you'll be able to shred the piece of paper without any risk at all, except perhaps if you go on vacation for a month and have it fall out of your mind. Even then I wouldn't be surprised if you could type it by reflex when you come back. The truth is that people are very good at pushing repetitive things down into reflex actions, things that we do automatically without much conscious thought. My guess is that short, simple things can remain in conscious memory (this is at least my experience with some things I deal with); longer and more complex things, like a ten-character password that involves your hands flying all over the keyboard, go down into reflexes.
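For illustration, here is a minimal Python sketch of picking a password purely at random; the length of ten characters and the exact character set are arbitrary choices on my part, not recommendations:

  import random
  import string

  def random_password(length=10):
      # Characters chosen uniformly at random, with no scheme at all.
      rng = random.SystemRandom()   # backed by os.urandom()
      alphabet = string.ascii_letters + string.digits + string.punctuation
      return ''.join(rng.choice(alphabet) for _ in range(length))

  print(random_password())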

Thus, where memorable passwords really matter is not passwords you use frequently but passwords you use infrequently (and which you're not so worried about that you've seared them into your mind anyway).

(Of course, in the real world people may not type their important passwords very often. I try not to think about that very often.)

PS: This neglects threat models entirely, which is a giant morass. But for what it's worth I think we still need to worry about password guessing attacks and so reasonably complex passwords are worth it.

tech/PasswordMemorabilityHeresy written at 02:11:26; Add Comment

2014-04-19

Cross-system NFS locking and unlocking is not necessarily fast

If you're faced with a problem of coordinating reads and writes on an NFS filesystem between several machines, you may be tempted to use NFS locking to communicate between process A (on machine 1) and process B (on machine 2). The attraction of this is that all they have to do is contend for a write lock on a particular file; you don't have to write network communication code and then configure A and B to find each other.

The good news is that this works, in that cross-system NFS locking and unlocking actually works right (at least most of the time). The bad news is that this doesn't necessarily work fast. In practice, it can take a fairly significant amount of time for process B on machine 2 to find out that process A on machine 1 has unlocked the coordination file, time that can be measured in tens of seconds. In short, NFS locking works, but it can require patience, and this makes it not necessarily the best option in cases like this.
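To make the setup concrete, here is a minimal Python sketch of process B's side of this; the coordination file path, the two-minute patience value, and the one-second poll interval are all assumptions of mine, not recommendations:

  import fcntl
  import time

  def wait_for_unlock(path="/nfs/shared/coordination", patience=120):
      # Wait (with patience) until we can take the write lock that
      # process A on the other machine is holding.
      deadline = time.time() + patience
      with open(path, "r+") as f:
          while True:
              try:
                  # POSIX byte-range locks are what gets passed through
                  # to the NFS server's lock manager.
                  fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                  # A has unlocked; note that closing the file on return
                  # releases our lock again.
                  return True
              except (IOError, OSError):
                  if time.time() > deadline:
                      return False  # don't declare it 'stuck' too soon
                  time.sleep(1)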

(The corollary of this is that when you're testing this part of NFS locking to see if it actually works you need to wait for quite a while before declaring things a failure. Based on my experiences I'd wait at least a minute before declaring an NFS lock to be 'stuck'. Implications for impatient programs with lock timeouts are left as an exercise for the reader.)

I don't know if acquiring an NFS lock on a file after a delay normally causes your machine's kernel to flush cached information about the file. In an ideal world it would, but NFS implementations are often not ideal worlds and the NFS locking protocol is a sidecar thing that's not necessarily closely integrated with the NFS client. Certainly I wouldn't count on NFS locking to flush cached information on, say, the directory that the locked file is in.

In short: you want to test this stuff if you need it.

PS: Possibly this is obvious but when I started testing NFS locking to make sure it worked in our environment I was a little bit surprised by how slow it could be in cross-client cases.

unix/NFSLockingCanBeSlow written at 00:37:21; Add Comment

2014-04-18

What modern filesystems need from volume management

One of the things said about modern filesystems like btrfs and ZFS is that their volume management functionality is a layering violation; this view holds that filesystems should stick to filesystem stuff and volume managers should stick to volume management. For the moment let's not open that can of worms and just talk about what (theoretical) modern filesystems need from an underlying volume management layer.

Arguably the crucial defining aspect of modern filesystems like ZFS and btrfs is a focus on resilience against disk problems. A modern filesystem no longer trusts disks not to have silent errors; instead it checksums everything so that it can at least detect data faults and it often tries to create some internal resilience by duplicating metadata or at least spreading it around (copy on write is also common, partly because it gives resilience a boost).

In order to make checksums useful for healing data instead of simply detecting when it's been corrupted, a modern filesystem needs an additional operation from any underlying volume management layer. Since the filesystem can actually identify the correct block from a number of copies, it needs to be able to get all copies or variations of a set of data blocks from the underlying volume manager (and then be able to tell the volume manager which is the correct copy). In mirroring this is straightforward; in RAID 5 and RAID 6 it gets a little more complex. This 'all variants' operation will be used both during regular reads if a corrupt block is detected and during a full verification check where the filesystem will deliberately read every copy to check that they're all intact.

(I'm not sure what the right primitive operation here should be for RAID 5 and RAID 6. On RAID 5 you basically need the ability to try all possible reconstructions of a stripe in order to see which one generates the correct block checksum. Things get even more convoluted if the filesystem level block that you're checksumming spans multiple stripes.)
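To make this a bit more concrete, here is a purely hypothetical sketch in Python of how a filesystem might use such an operation in the simple mirroring case; none of the volume manager methods here are real APIs, they're just invented names for the primitives discussed above:

  def read_and_heal(volmgr, blockno, want_cksum, cksum):
      # Fast path: an ordinary read, checked against the block's checksum.
      data = volmgr.read(blockno)
      if cksum(data) == want_cksum:
          return data
      # The copy we got is corrupt, so ask for every copy (the hypothetical
      # 'all variants' operation) and pick the one that checksums correctly.
      for copyid, candidate in volmgr.read_all_copies(blockno):
          if cksum(candidate) == want_cksum:
              # Tell the volume manager which copy is good so it can
              # rewrite the damaged ones.
              volmgr.heal(blockno, good=copyid, data=candidate)
              return candidate
      raise IOError("block %d: no intact copy found" % blockno)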

Modern filesystems generally also want some way of saying 'put A and B on different devices or redundancy clusters' in situations where they're dealing with stripes of things. This enables them to create multiple copies of (important) metadata on different devices for even more protection against read errors. This is not as crucial if the volume manager is already providing redundancy.

This level of volume manager support is a minimum level, as it still leaves a modern filesystem with the RAID-5+ rewrite hole and a potentially inefficient resynchronization process. But it gets you the really important stuff, namely redundancy that will actually help you against disk corruption.

tech/ModernFSAndVolumeManagement written at 02:17:29; Add Comment

2014-04-17

Partly getting around NFS's concurrent write problem

In a comment on my entry about NFS's problem with concurrent writes, a commentator asked this very good question:

So if A writes a file to an NFS directory and B needs to read it "immediately" as the file appears, is the only workaround to use low values of actimeo? Or should A and B be communicating directly with some simple mechanism instead of setting, say, actimeo=1?

(Let's assume that we've got 'close to open' consistency to start with, where A fully writes the file before B processes it.)

If I was faced with this problem and I had a free hand with A and B, I would make A create the file with some non-repeating name and then send an explicit message to B with 'look at file <X>' (using eg a TCP connection between the two). A should probably fsync() the file before it sends this message to make sure that the file's on the server. The goal of this approach is to avoid B's kernel having any cached information about whether or not file <X> might exist (or what the contents of the directory are). With no cached information, B's kernel must go ask the NFS fileserver and thus get accurate information back. I'd want to test this with my actual NFS server and client just to be sure (actual NFS implementations can be endlessly crazy) but I'd expect it to work reliably.
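A minimal sketch of A's side of this in Python might look like the following; the function name, directory, and host/port details are invented for illustration:

  import os
  import socket
  import uuid

  def publish_file(nfsdir, data, b_host, b_port):
      # Use a non-repeating name so B's kernel can't possibly have
      # anything cached about this file.
      name = "msg-%s" % uuid.uuid4()
      with open(os.path.join(nfsdir, name), "wb") as f:
          f.write(data)
          f.flush()
          os.fsync(f.fileno())   # make sure the file is really on the server
      # Explicitly tell B which file to look at.
      conn = socket.create_connection((b_host, b_port))
      conn.sendall((name + "\n").encode())
      conn.close()
      return name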

Note that it's important to not reuse filenames. If A ever reuses a filename, B's kernel may have stale information about the old version of the file cached; at the best this will get B a stale filehandle error and at the worst B will read old information from the old version of the file.

If you can't communicate between A and B directly and B operates by scanning the directory to look for new files, you have a moderate caching problem. B's kernel will normally cache information about the contents of the directory for a while and this caching can delay B noticing that there is a new file in the directory. Your only option is to force B's kernel to cache as little as possible. Note that if B is scanning it will presumably only be scanning, say, once a second and so there's always going to be at least a little processing lag (and this processing lag would happen even if A and B were on the same machine); if you really want immediately, you need A to explicitly poke B in some way no matter what.

(I don't think it matters what A's kernel caches about the directory, unless there's communication that runs the other way such as B removing files when it's done with them and A needing to know about this.)

Disclaimer: this is partly theoretical because I've never been trapped in this situation myself. The closest I've come is safely updating files that are read over NFS. See also.

unix/NFSWritePlusReadProblemII written at 00:10:33; Add Comment

2014-04-16

Where I feel that btrfs went wrong

I recently finished reading this LWN series on btrfs, which was the most in-depth exposure to the details of using btrfs that I've had so far. While I'm sure that LWN intended the series to make people enthused about btrfs, I came away with a rather different reaction; I've wound up feeling that btrfs has made a significant misstep along its way that's resulted in a number of design mistakes. To explain why I feel this way I need to contrast it with ZFS.

Btrfs and ZFS are each a volume manager and a filesystem merged together. One of the fundamental interface differences between them is that ZFS has decided that it is a volume manager first and a filesystem second, while btrfs has decided that it is a filesystem first and a volume manager second. This is what I see as btrfs's core mistake.

(Overall I've been left with the strong impression that btrfs basically considers volume management to be icky and tries to have as little to do with it as possible. If correct, this is a terrible mistake.)

Since it's a volume manager first, ZFS places volume management front and center in operation. Before you do anything ZFS-related, you need to create a ZFS volume (which ZFS calls a pool); only once this is done do you really start dealing with ZFS filesystems. ZFS even puts the two jobs in two different commands (zpool for pool management, zfs for filesystem management). Because it's firmly made this split, ZFS is free to have filesystem level things such as df present a logical, filesystem based view of things like free space and device usage. If you want the actual physical details you go to the volume management commands.

Because btrfs puts the filesystem first it wedges volume creation in as a side effect of filesystem creation, not a separate activity, and then it carries a series of lies and uselessly physical details through to filesystem level operations like df. Consider the discussion of what df shows for a RAID1 btrfs filesystem here, which has both a lie (that the filesystem uses only a single physical device) and a needlessly physical view (of the physical block usage and space free on a RAID 1 mirror pair). That btrfs refuses to expose itself as a first class volume manager and pretends that you're dealing with real devices forces it into utterly awkward things like mounting a multi-device btrfs filesystem with 'mount /dev/adevice /mnt'.

I think that this also leads to the asinine design decision that subvolumes have magic flat numeric IDs instead of useful names. Something that's willing to admit it's a volume manager, such as LVM or ZFS, has a name for the volume and can then hang sub-names off that name in a sensible way, even if where those sub-objects appear in the filesystem hierarchy (and under what names) gets shuffled around. But btrfs has no name for the volume to start with and there you go (the filesystem-volume has a mount point, but that's a different thing).

All of this really matters for how easily you can manage and keep track of things. df on ZFS filesystems does not lie to me; it tells me where the filesystem comes from (what pool and what object path within the pool), how much logical space the filesystem is using (more or less), and roughly how much more I can write to it. Since they have full names, ZFS objects such as snapshots can be more or less self documenting if you name them well. With an object hierarchy, ZFS has a natural way to inherit various things from parent object to sub-objects. And so on.

Btrfs's 'I am not a volume manager' approach also leads it to drastically limit the physical shape of a btrfs RAID array in a way that is actually painfully limiting. In ZFS, a pool stripes its data over a number of vdevs and each vdev can be any RAID type with any number of devices. Because ZFS allows multi-way mirrors this creates a straightforward way to create a three-way or four-way RAID 10 array; you just make all of the vdevs be three or four way mirrors. You can also change the mirror count on the fly, which is handy for all sorts of operations. In btrfs, the shape 'raid10' is a top level property of the overall btrfs 'filesystem' and, well, that's all you get. There is no easy place to put in multi-way mirroring; because of btrfs's model of not being a volume manager it would require changes in any number of places.

(And while I'm here, that btrfs requires you to specify both your data and your metadata RAID levels is crazy and gives people a great way to accidentally blow their own foot off.)

As a side note, I believe that btrfs's lack of allocation guarantees in a raid10 setup makes it impossible to create a btrfs filesystem split evenly across two controllers that is guaranteed to survive the loss of one entire controller. In ZFS this is trivial because of the explicit structure of vdevs in the pool.

PS: ZFS is too permissive in how you can assemble vdevs, because there is almost no point to a pool with, say, a mirror vdev plus a RAID-6 vdev. That configuration is all but guaranteed to be a mistake in some way.

linux/BtrfsCoreMistake written at 01:27:57; Add Comment

2014-04-14

Chasing SSL certificate chains to build a chain file

Suppose that you have some shiny new SSL certificates for some reason. These new certificates need a chain of intermediate certificates in order to work with everything, but for some reason you don't have the right set. In ideal circumstances you'll be able to easily find the right intermediate certificates on your SSL CA's website and won't need the rest of this entry.

Okay, let's assume that your SSL CA's website is an unhelpful swamp pit. Fortunately all is not lost, because these days at least some SSL certificates come with the information needed to find the intermediate certificates. First we need to dump out our certificate, following my OpenSSL basics:

openssl x509 -text -noout -in WHAT.crt

This will print out a bunch of information. If you're in luck (or possibly always), down at the bottom there will be an 'Authority Information Access' section with a 'CA Issuers - URI' bit. That is the URL of the next certificate up the chain, so we fetch it:

wget <SOME-URL>.crt

(In case it's not obvious: for this purpose you don't have to worry if this URL is being fetched over HTTP instead of HTTPS. Either your certificate is signed by this public key or it isn't.)

Generally or perhaps always this will not be a plain text file like your certificate is, but instead a binary blob. The plain text format is called PEM; your fetched binary blob of a certificate is probably in the binary DER encoding. To convert from DER to PEM we do:

openssl x509 -inform DER -in <WGOT-FILE>.crt -outform PEM -out intermediate-01.crt

Now you can inspect intermediate-01.crt in the same way to see if it needs a further intermediate certificate; if it does, iterate this process. When you have a suitable collection of PEM format intermediate certificates, simply concatenate them together in order (from the first you fetched to the last, per here) to create your chain file.
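If you find yourself doing this repeatedly, the steps are easy to wrap up in a small script. Here's a rough Python sketch that just automates the openssl commands above; it assumes that the fetched blobs are DER (as discussed) and that the 'CA Issuers - URI:' line appears literally in the text output, so treat it as an illustration rather than a polished tool:

  #!/usr/bin/env python3
  import subprocess
  import urllib.request

  def issuer_url(certfile):
      # Scan 'openssl x509 -text' output for the 'CA Issuers - URI:' line.
      text = subprocess.check_output(
          ["openssl", "x509", "-text", "-noout", "-in", certfile]).decode()
      for line in text.splitlines():
          line = line.strip()
          if line.startswith("CA Issuers - URI:"):
              return line.split("URI:", 1)[1]
      return None

  cert, n, chain = "WHAT.crt", 0, []
  while True:
      url = issuer_url(cert)
      if not url:
          break
      n += 1
      der = urllib.request.urlopen(url).read()
      cert = "intermediate-%02d.crt" % n
      # DER -> PEM, exactly as with the openssl command above.
      pem = subprocess.run(["openssl", "x509", "-inform", "DER", "-outform", "PEM"],
                           input=der, stdout=subprocess.PIPE, check=True).stdout
      with open(cert, "wb") as f:
          f.write(pem)
      chain.append(cert)
  print("concatenate in this order:", " ".join(chain))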

PS: The Qualys SSL Server Test is a good way to see how correct your certificate chain is. If it reports that it had to download any certificates, your chain of intermediate certificates is not complete. Similarly it may report that some entries in your chain are not necessary, although in practice this rarely hurts.

Sidebar: Browsers and certificate chains

As you might guess, some but not all browsers appear to use this embedded intermediate certificate URL to automatically fetch any necessary intermediate certificates during certificate validation (as mentioned eg here). Relatedly, browsers will probably not tell you about unnecessary intermediate certificates they received from your website. The upshot of this can be an HTTPS website that works in some browsers but fails in others, and in the failing browser it may appear that you sent no additional certificates as part of a certificate chain. Always test with a tool that will tell you the low-level details.

(Doing otherwise can cause a great deal of head scratching and frustration. Don't ask how I came to know this.)

sysadmin/SSLChasingCertChains written at 22:02:04; Add Comment

My reactions to Python's warnings module

A commentator on my entry on the warnings problem pointed out the existence of the warnings module as a possible solution to my issue. I've now played around with it and I don't think it fits my needs here, for two somewhat related reasons.

The first reason is that it simply makes me nervous to use or even take over the same infrastructure that Python itself uses for things like deprecation warnings. Warnings produced about Python code and warnings that my code produces are completely separate things and I don't like mingling them together, partly because they have significantly different needs.

The second reason is that the default formatting that the warnings module uses is completely wrong for the 'warnings produced from my program' case. I want my program warnings to produce standard Unix format (warning) messages and to, for example, not include the Python code snippet that generated them. Based on playing around with the warnings module briefly it's fairly clear that I would have to significantly reformat standard warnings to do what I want. At that point I'm not getting much out of the warnings module itself.
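The reformatting itself isn't hard, which is rather the point; once you've done it you've bypassed most of what the module gives you. A minimal sketch, with a made-up program name:

  import sys
  import warnings

  def unix_showwarning(message, category, filename, lineno, file=None, line=None):
      # Standard Unix format: no code snippet, no file and line number.
      (file or sys.stderr).write("myprog: warning: %s\n" % message)

  # warnings.showwarning is the documented hook for replacing the output.
  warnings.showwarning = unix_showwarning
  warnings.warn("the frobnitz file is missing")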

All of this is a sign of a fundamental decision in the warnings module: the warnings module is only designed to produce warnings about Python code. This core design purpose is reflected in many ways throughout the module, such as in the various sorts of filtering it offers and how you can't actually change the output format as far as I can see. I think that this makes it a bad fit for anything except that core purpose.

In short, if I want to log warnings I'm better off using general logging and general log filtering to control what warnings get printed. What features I want there is a topic for another entry.

python/WarningsModuleReactions written at 01:19:06; Add Comment

2014-04-13

A problem: handling warnings generated at low levels in your code

Python has a well honed approach for handling errors that happen at a low level in your code; you raise a specific exception and let it bubble up through your program. There's even a pattern for adding more context as you go up through the call stack, where you catch the exception, add more context to it (through one of various ways), and then propagate the exception onwards.

(You can also use things like phase tracking to make error messages more specific. And you may want to catch and re-raise exceptions for other reasons, such as wrapping foreign exceptions.)

All of this is great when it's an error. But what about warnings? I recently ran into a case where I wanted to 'raise' (in the abstract) a warning at a very low level in my code, and that left me completely stymied about what the best way to do it was. The disconnect between errors and warnings is that in most cases errors immediately stop further processing while warnings don't, so you can't deal with warnings by raising an exception; you need to somehow both 'raise' the warning and continue further processing.

I can think of several ways of handling this, all of which I've sort of used in code in the past:

  • Explicitly return warnings as part of the function's output. This is the most straightforward but also sprays warnings through your APIs, which can be a problem if you realize that you've found a need to add warnings to existing code.

  • Have functions accumulate warnings on some global or relatively global object (perhaps hidden through 'record a warning' function calls). Then at the end of processing, high-level code will go through the accumulated warnings and do whatever is desired with them.

  • Log the warnings immediately through a general logging system that you're using for all program messages (ranging from simple to very complex). This has the benefit that both warnings and errors will be produced in the correct order.

The second and third approaches have the problem that it's hard for intermediate layers to add context to warning messages; they'll wind up wanting or needing to pass the context down to the low level routines that generate the warnings. The third approach can have the general options problem when it comes to controlling what warnings are and aren't produced, or you can try to control this by having the high level code configure the logging system to discard some messages.
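As an entirely invented illustration of the third approach (and of having to pass context down), here is a minimal sketch using Python's standard logging module; all of the names are made up:

  import logging

  logging.basicConfig(format="%(levelname)s: %(message)s")
  log = logging.getLogger("myprog")

  def parse_record(line, where):
      # Low level code reports the warning and then carries on.
      if "\t" not in line:
          log.warning("%s: no tab separator, taking whole line as one field", where)
      return line.split("\t")

  def parse_file(fname):
      with open(fname) as f:
          for n, line in enumerate(f, 1):
              # The intermediate layer has to pass its context (file name
              # and line number) down to where the warning is generated.
              yield parse_record(line.rstrip("\n"), "%s line %d" % (fname, n))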

I don't have any answers here, but I can't help thinking that I'm missing a way of doing this that would make it all easy. Probably logging is the best general approach for this and I should just give in, learn a Python logging system, and use it for everything in the future.

(In the incident that sparked this entry, I wound up punting and just printing out a message with sys.stderr.write() because I wasn't in a mood to significantly restructure the code just because I now wanted to emit a warning.)

python/WarningHandlingProblem written at 02:14:16; Add Comment

2014-04-11

The relationship between SSH, SSL, and the Heartbleed bug

I will lead with the summary: since the Heartbleed bug is a bug in OpenSSL's implementation of a part of the TLS protocol, no version or implementation of SSH is affected by Heartbleed because the SSH protocol is not built on top of TLS.

So, there are four things involved here:

  • SSL aka TLS is the underlying network encryption protocol used for HTTPS and a bunch of other SSL/TLS things. Heartbleed is an error in implementing the 'TLS heartbeat' protocol extension to the TLS protocol. A number of other secure protocols are built partially or completely on top of TLS, such as OpenVPN.

  • SSH is the protocol used for, well, SSH connections. It's completely separate from TLS and is not layered on top of it in any way. However, TLS and SSH both use a common set of cryptography primitives such as Diffie-Hellman key exchange, AES, and SHA1.

    (Anyone sane who's designing a secure protocol reuses these primitives instead of trying to invent their own.)

  • OpenSSL is an implementation of SSL/TLS in the form of a large cryptography library. It also exports a whole bunch of functions and so on that do various cryptography primitives and other lower-level operations that are useful for things doing cryptography in general.

  • OpenSSH is one implementation of the SSH protocol. It uses various functions exported by OpenSSL for a lot of cryptography related things such as generating randomness, but it doesn't use the SSL/TLS portions of OpenSSL because SSH (the protocol) doesn't involve TLS (the protocol).

Low level flaws in OpenSSL such as Debian breaking its randomness can affect OpenSSH when OpenSSH uses something that's affected by the low level flaw. In the case of the Debian issue, OpenSSH gets its random numbers from OpenSSL and so was affected in a number of ways.

High level flaws in OpenSSL's implementation of TLS itself will never affect OpenSSH because OpenSSH simply doesn't use those bits of OpenSSL. For instance, if OpenSSL turns out to have an SSL certificate verification bug (which happened recently with other SSL implementations) it won't affect OpenSSH's SSH user and host key verification.

As a corollary, OpenSSH (and all SSH implementations) aren't directly affected by TLS protocol attacks such as BEAST or Lucky Thirteen, although people may be able to develop similar attacks against SSH using the same general principles.

tech/SSHAndSSLAndHeartbleed written at 23:43:41; Add Comment

What sort of kernel command line arguments Fedora 20's dracut seems to want

Recently I upgraded the kernel on my Fedora 20 office workstation, rebooted the machine, and had it hang in early boot (the first two are routine, the last is not). Forcing a reboot back to the earlier kernel brought things back to life. After a bunch of investigation I discovered that this was not actually due to the new kernel, it was due to an earlier dracut update. So this is the first thing to learn: if a dracut update breaks something in the boot process, you'll probably only discover this the next time you upgrade the kernel and the (new) dracut builds a (new and not working) initramfs for it.

The second thing I discovered in the process of this is the Fedora boot process will wait for a really long time for your root filesystem to appear before giving up, printing messages about it, and giving you an emergency shell, where by a really long time I mean 'many minutes' (I think at least five). It turned out that my boot process had not locked up but instead it was sitting around waiting for my root filesystem to appear. Of course this wait was silent, with no warnings or status notes reported on the console, so I thought that things had hung. The reason the boot process couldn't find my root filesystem was that my root filesystem is on software RAID and the new dracut has stopped assembling such things for a bunch of people.

(Fedora apparently considers this new dracut state to be 'working as designed', based on bug reports I've skimmed.)

I don't know exactly what changed between the old dracut and the new dracut, but what I do know is that the new dracut really wants you to explicitly tell it what software RAID devices, LVM devices, or other things to bring up on boot through arguments added to the kernel command line. dracut.cmdline(7) will tell you all about the many options, but the really useful thing to know is that you can get dracut itself to tell you what it wants via 'dracut --print-cmdline'.

For me on my machine, this prints out (and booting wants):

  • three rd.md.uuid=<UUID> settings for the software RAID arrays of my root filesystem, the swap partition, and /boot. I'm not sure why dracut includes /boot but I left it in. The kernel command line is already absurdly over-long on a modern Fedora machine, so whatever.

    (There are similar options for LVM volumes, LUKS, and so on.)

  • a 'root=UUID=<UUID>' stanza to specify the UUID of the root filesystem. It's possible that my old 'root=/dev/mdXX' would have worked (the root's RAID array is assembled with the right name), but I didn't feel like finding out the hard way.

  • rootflags=... and rootfstype=ext4 for more information about mounting the root filesystem.

  • resume=UUID=<UUID>, which points to my swap area. I omitted this in the kernel command line I set in grub.cfg because I never suspend my workstation. Nothing has exploded yet.

The simplest approach to fixing up your machine in a situation like this is probably to just update grub.cfg to add everything dracut wants to the new kernel's command line (removing any existing conflicting options, eg an old root=/dev/XXX setting). I looked into just what the arguments were and omitted one for no particularly good reason.

(I won't say that Dracut is magic, because I'm sure it could all be read up on and decoded if I wanted to. I just think that doing so is not worth bothering with for most people. Modern Linux booting is functionally a black box, partly because it's so complex and partly because it almost always just works.)

linux/DracutNeededArguments written at 02:11:02; Add Comment
