Wandering Thoughts archives

2012-05-09

Using rsync to pull a directory tree to client machines

Suppose that you have a decent sized directory tree that you want some number of clients to mirror from a master server (with the clients pulling updates instead of the master pushing them), perhaps because you've just noticed undesired NFS dependencies. Things in the directory tree are potentially sensitive (so you want access control), it's updated at random, and it's not in a giant VCS tree or something; this is your typical medium-sized ball of local stuff. The straightforward brute force approach is to use rsync with SSH; give the clients special SSH identities, put them in the server's authorized_keys, and have them run 'rsync -a --delete' (or some close variant) to pull the directory tree over. However, this has the problem that normal rsync is symmetric; if you allow a client to pull from you, you also allow a client to push to you (assuming that the server side login has write access to the directory tree, and yes let's make that assumption for now).
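
(As a concrete illustration, the basic pull would look something like this; the login, host, and identity file names here are just placeholders.)

#!/bin/sh
# Plain rsync over SSH: simple, but the same SSH access that lets the
# client pull from the master also lets it push changes back.
rsync -a --delete --rsh="/usr/bin/ssh -i /client/identity" \
    LOGIN@MASTER-HOST:/some/path/ /some/path/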

(You also have to set the SSH access up so that the clients can't run arbitrary commands on the server.)

Rsync's solution to this is its daemon mode, which can be restricted to operate in read-only mode. Normally rsync wants to be run this way as an actual daemon (listening on a port and so on), but that requires us to use rsync's weaker and harder to manage authentication, access control, and other things. I would rather continue to run daemon mode rsync over plain SSH and take advantage of all of the existing, proven SSH features for various things.

(The rsync manpage suggests hacks like binding the rsync daemon to only listen on localhost on the server and then using SSH port forwarding to give clients access to it. But those are hacks and require making various assumptions.)
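
(For reference, that sort of hack would look roughly like this on the client side, assuming the server's rsync daemon has been told to listen only on localhost, eg with 'address = 127.0.0.1' in its rsyncd.conf; the local port number here is arbitrary.)

# forward a local port to the server's localhost-only rsync daemon:
ssh -N -f -L 8730:127.0.0.1:873 LOGIN@MASTER-HOST
# then pull through the forwarded port:
rsync -a --delete rsync://localhost:8730/somepath /some/path/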

How to do this is not obvious from the documentation, so here is the setup I have come up with for doing this on both the server and the clients. First, you need an rsyncd.conf configuration file on the server. Don't use the normal /etc/rsyncd.conf; it's much more controllable to use your own in a different place. It should look something like:

use chroot = no
[somepath]
comment = Replication module
path = /some/path
read only = true
# if necessary:
uid = 0
gid = 0

(The '[somepath]' bit is what rsync calls the module name and can be anything meaningful to you; you'll need it on the client later. The comment is optional but potentially useful. You need to explicitly specify uid and gid if the server login is UID 0 because it needs root access to the directory tree and you want to keep that access; otherwise the rsync daemon will drop privileges to a default UID.)

Next, you need a script on the server that will force an incoming SSH login to run rsync in daemon mode against this configuration file and do nothing else. We will set this as the command= value in the server login's authorized_keys to restrict what the incoming SSH connection from clients can do. This looks like:

#!/bin/sh
exec /usr/bin/rsync --server --daemon --config=/your/rsyncd.conf .

Note that this completely ignores any arguments that the client attempts to supply. However, this doesn't matter; as far as I can tell, the command line that the clients send will always be 'rsync --server --daemon .', regardless of what command line options and paths you use on the clients. (Certainly this is the only command line that clients seem to send for requests that you actually want to pay attention to.)

On the server, the login that you're using for this should have a .ssh/authorized_keys file with entries for the client SSH identities. These entries should all force incoming logins to run the command above and block various other activities (especially port forwarding, which could otherwise be done without command execution at all, as Dan Astoorian mentioned in a comment here):

command="/your/rsyncd-shell",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty [...]

A from="..." restriction is optional but worth adding; even a broad one may limit the fallout from problems.
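
(For example, a complete entry with a broad from="..." restriction might look something like this; the address pattern is purely a placeholder.)

from="192.0.2.*",command="/your/rsyncd-shell",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA[...]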

Finally, on the client you need to run rsync with all of the necessary arguments. You probably want to put this in a script:

#!/bin/sh
rsync -a --delete --rsh="/usr/bin/ssh -i /client/identity" LOGIN@MASTER-HOST::somepath /some/path/

Potentially useful additional arguments for rsync are -q and --timeout=<something>. In a production script you probably also want an option to mirror the directory tree to somewhere other than /some/path on the client.
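
(A slightly more production-oriented version of the client script might look something like this sketch; the timeout value and the default destination are assumptions, not recommendations.)

#!/bin/sh
# Mirror the master's tree, letting the destination be overridden on
# the command line.
DEST="${1:-/some/path/}"
exec rsync -a -q --delete --timeout=300 \
    --rsh="/usr/bin/ssh -i /client/identity" \
    LOGIN@MASTER-HOST::somepath "$DEST"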

If you run this from cron, remember to add some locking to prevent two copies from running at once. If the directory tree is large and you have enough clients, you may want to add some amount of randomization of the start times for the replication in order to keep load down on the master server.
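
(A minimal sketch of both, assuming flock(1) is available and that the client pull script lives somewhere like a hypothetical /usr/local/sbin/pull-mirror; the lock file path and the up-to-ten-minute random delay are arbitrary choices.)

#!/bin/sh
# Sleep a random amount (0 to 599 seconds) to spread client start times,
# then take a lock so that overlapping cron runs don't pile up.
sleep $(( $(od -An -N2 -tu2 /dev/urandom) % 600 ))
exec flock -n /var/run/pull-mirror.lock /usr/local/sbin/pull-mirror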

(There may be a better way to do this with rsync; if you know of one, let me know in the comments. For various reasons we're probably not interested in doing this with any other tool, partly because we already have rsync and not the other tools. Another tool would have to be very much better than rsync to really be worth switching to.)

sysadmin/RsyncReplicationSetup written at 23:54:57

Things I will do differently in the next building power shutdown (part 2)

Back at the start of last September, we had an overnight building-wide power shutdown in the building that houses our machine room, and I wrote a lessons-learned entry in the aftermath. Well, we just had another one and apparently I didn't learn all of the lessons that I needed to learn the first time around. So here's another set of things that I've now learned.

Next time around I will:

  • explicitly save the previous time's checklist. If nothing else, the 'power up' portion makes a handy guide for what to do if you abruptly lose building power some day.

    (I sort of did this last time, not through active planning but just because I reflexively don't delete basically any of this sort of stuff. But I should do it deliberately and put it somewhere where I can easily find it, instead of just leaving it lying around.)

    Having last time's list isn't the end of the work, because things have undoubtedly changed since then. But it's a starting point and a jog to the memory.

  • start preparing the checklist well in advance, like more than a day beforehand. Things worked out in the end but doing things at the last moment was a bit nerve-wracking.

    (There's always stuff to do around here and somehow it always felt like there was plenty of time right up until it was Friday and we had a Monday night shutdown.)

  • update and correct the checklist immediately afterwards to cover things that we missed. My entry from last time is kind of vague; I'm sure I knew the specifics I was thinking of at the time, but I didn't write them down so they slipped away. I was able to reconstruct a few things from notes and email in the wake of last time, but others I only realized in the aftermath of this one.

  • add explanatory notes about why things are being done in a certain order and what the dependencies are. Especially in the bustle of trying to get everything down or up as fast as possible, it's useful to have something to jog our minds about why something is the way it is and whether or not it's that important.

    (Our checklists for this sort of thing are not fixed; they're more guidelines than requirements. We deviate from them on the fly and thus it's really useful to have some indication of how flexible or rigid things are.)

  • if any machines are being brought down and then deliberately not brought back up, explicitly mention this so that people don't wind up confused about a 'missing' machine.

My entry from last time was very useful in several ways. I reread it when I was preparing our checklist for this time and it jogged my memory about several important issues; as a result our checklist for this time around was (I think) significantly better than for last time (and also noticeably longer and more verbose). This time I at least made new mistakes, which is progress that I can live with.

I will also probably try to put more explanation into the checklist the next time around. I'm sure it's possible to put too much of it in, but I don't think that's been our problem so far. In the heat of the moment we're going to skim anyways, so the thing to do is to break the checklist up into skimmable blocks with actions and things to check off and then chunks of additional explanation after them.

(In a sense a checklist like this serves two purposes at once. During the power down or power up it is mostly a catalog of actions and ordering, but beforehand it's a discussion and a rationale for what needs to be done and why. Without the logic behind it being written out explicitly, you can't have that discussion; once you have that logic written out, you might as well leave it in to jog people's memories on the spot.)

On a side note, a full power up is an interesting and useful way to find problematic dependencies that have quietly worked their way into your overall network, ones that are not so noticeable when your systems are in their normal steady state. For example, DHCP service for several of our networks now depends on our core fileserver, which means that it can only come up fairly late in the power up process. We're going to be fixing that.

(There is a chain of dependencies that made this make sense in a steady state environment.)

sysadmin/PowerdownLessonsLearnedII written at 00:37:34

