Wandering Thoughts archives

2016-07-28

A bit about what we use DTrace for (and when)

Earlier this year, Bryan Cantrill kind of issued a call for people to talk about their DTrace success stories. I do want to write up a blog entry about all of the times we've used DTrace to solve our problems, but it's clearly not happening soon, so for now I want to stop stalling and at least say a bit about the kind of situations we use DTrace for.

Unlike some people, we don't make routine use of DTrace; it's not a part of ongoing system monitoring, for example. Partly this is because our fileservers spend most of their time not having problems. When stuff sits there quietly working, we don't need to pay much attention to it. There's probably useful information that DTrace could gather for us on an ongoing basis, but we just don't use it that way at the moment.

What we do use DTrace for is deep system investigations during problems and crises. Some of this is having scripts available that can do detailed monitoring of areas of interest to us; when an NFS fileserver problem appears, we can start by firing up our existing information collection scripts. A lot of the time we have merely ordinary problems and the scripts will tell us what they are (a slow disk, a user pushing a huge volume of IO, etc). Some of the time we have extraordinary problems and the existing scripts just let us rule things out.
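(As an illustration only, not one of our actual scripts: the sort of quick DTrace that points a finger at a busy disk, or at who is generating the IO, can start from one-liners along these lines, using the standard io provider.)

    # rough sketch: count block IO by device, to spot one disk getting hammered
    dtrace -n 'io:::start { @[args[1]->dev_statname] = count(); }'

    # rough sketch: see which processes (and which UIDs) are issuing the IO
    dtrace -n 'io:::start { @[execname, uid] = count(); }'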

Some of the time we have a new and novel problem, or even a crisis. In these situations we use DTrace to dig deep into the depths of the live kernel and pull out information we probably couldn't get any other way. This tends to be done with ad hoc, hacked-together scripts instead of anything more carefully developed; as we explore the problem we find questions to ask, write DTrace snippets to give us answers, and iterate this process. Often the questions we're asking (and the answers we're getting) are so specific to the current problem and our suspicions that there's no point in cleaning the resulting scripts up; they're the equivalent of one-off shell scripts and we'll almost certainly never use them again. DTrace is only one of the tools we use in these situations, of course, but it's an extremely valuable one and has let us understand deep issues (although not always solve them).
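(A hypothetical example of the flavor of these throwaway snippets, using the standard nfsv3 provider; the real ones are specific to whatever question we're chasing at the time.)

    # one-off question: which clients are doing NFSv3 reads right now?
    dtrace -n 'nfsv3:::op-read-start { @[args[0]->ci_remote] = count(); }'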

(Some of the time an ad hoc tool seems useful enough to be turned into something more, even if it turns out that I basically never use it again.)

DTraceOurUsage written at 00:27:24

2016-07-08

Some notes on UID and GID remapping in the Illumos/OmniOS NFS server

As part of looking into this whole issue, I recently wound up reading the current manpages for the OmniOS NFS server and thus discovered that it can remap UIDs and GIDs for clients via the uidmap= and gidmap= NFS share options. Server side NFS ID remapping is not as neat or scalable as client side remapping, but it does solve my particular problem, so I've now played around with it a bit and have some notes.

The feature itself is an Illumos one and is about two years old now, so it's probably been integrated into most Illumos-based releases that you want to use (even though we mostly don't update, you really do want to do so every so often). It's certainly in OmniOS r151014, which is what we use on our fileservers.

The share_nfs manpage describes mappings as [clnt]:[srv]:access_list. Despite its name, the access_list bit is just for matching the client; it doesn't create or change any NFS mount access permissions, which are still set through rw= and so on. You can also use a different mechanism in each place for identifying clients, say a netgroup for filesystem access and then a hostname to identify the client for remapping (which might be handy if using a netgroup has side effects).
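(A made-up example of what this looks like, with the hostname, netgroup, and UIDs invented: requests from the client nfsclient with UID 1001 get remapped to UID 501 on the server, while mount access itself still comes from the rw= option. On a ZFS-based fileserver the same option string can go in a filesystem's sharenfs property.)

    # hypothetical: client UID 1001 becomes server UID 501 for requests from nfsclient
    share -F nfs -o rw=ourhosts,uidmap=1001:501:nfsclient /export/somefs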

The manpage also describes uidmap (and gidmap) as 'remapping the user ID (uid) in the incoming request to some other uid'. This description is exactly accurate: the server remaps IDs only in incoming requests, not in the replies it sends back, such as the results of stat() system calls. For example, if you remap your client UID to your server UID, 'ls -l' will show that your files are owned by the server-side UID, not you on the client. This is potentially confusing in general and will probably cause anything that does client-side UID or GID checking to incorrectly reject you.
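(A hypothetical illustration, continuing the invented uidmap=1001:501:nfsclient example from above: you are UID 1001 on the client and the filesystem is mounted on /mnt/somefs.)

    client$ id -u
    1001
    client$ touch /mnt/somefs/newfile
    client$ ls -l /mnt/somefs/newfile
    -rw-r--r--   1 501      staff    0 Jul  8 01:00 /mnt/somefs/newfile

The touch arrived at the server as UID 1001 and was remapped to 501, but the stat() data in the reply is not remapped back, so ls -l on the client reports the server-side owner of 501.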

(This design decision is probably due to the fact that the UID and GID mapping is not necessarily 1:1, either on the server or for that matter on the client. And yes, I can imagine merely somewhat perverse client side uses of mapping a second local UID to a server UID that also exists on the client.)

Note that in general UID remapping is probably more important than GID remapping, since you can always force a purely server-side group list (as far as I know, the server's group list lookup entirely overrides the group list sent by the client).

PS: I don't know how well this scales on any level. Since all of the mappings have to be specified as share options, I expect that this won't really be very pleasant to deal with if you're trying to do a lot of remapping (either many IDs for some clients or many clients with a few ID remappings).

NFSServerUIDRemapping written at 01:27:04
