Wandering Thoughts archives

2016-08-22

An interesting case of NFS traffic (probably) holding a ZFS snapshot busy

We have a few filesystems on our fileservers that are considered sufficiently important that we take hourly snapshots during the working day. We use a simple naming and expiry scheme for these snapshots, where they're called <Day>-<Hour> (eg Tue-15) and the script simply deletes any old version before creating the new one. Both because it's the default and because it enables self-serve restores, we NFS-export the ZFS snapshots as well as the main filesystem. Recently that script threw up an error:

cannot destroy snapshot POOL/h/NNN@Mon-16: dataset is busy
cannot create snapshot 'POOL/h/NNN@Mon-16': dataset already exists

We believe that this ultimately happened because an hour or two beforehand, a runaway IMAP process was traversing its way through that ZFS snapshot via the NFS export. The runaway IMAP process had been terminated well before this, but that might not have mattered; an NFS server doesn't know when an NFS client is done with the filehandles it has requested, so the server needs to guess and it may well guess conservatively (saying, for example, 'if I still have them in my server side cache, they're not old enough yet').

This was several weeks ago and the snapshot in question was quietly recycled a week later without any problems, so this did go away after a while. I can't even definitely say that past NFS activity in the snapshot was the problem; we haven't tried to reproduce it, and unfortunately as far as I know OmniOS lacks tools to give us visibility into this sort of thing (fuser reported nothing for the snapshot, for example, which is not surprising; there was no user-level activity on the fileserver that involved the snapshot).

This instance wasn't urgent and went away on its own. I'm not sure what we'd do if that weren't the case, because I don't know if there's any good way of pushing the kernel to give up things like old(er) NFS filehandles and so on. Shutting down NFS service or rebooting the fileserver would probably do it, but both are rather drastic steps.

(It may be possible to write some DTrace to give us more information about why a dataset is still busy. Or, since DTrace is not always the answer to everything, possibly mdb can give us results too.)

solaris/ZFSSnapshotsNFSBusyProblem written at 00:33:48; Add Comment

2016-08-21

My pragmatic decision on GNU Emacs versus vim for my programming

One of the reasons I've been thinking about vim lately and working on learning it more is that I've been flirting with the idea of switching to using vim for all of my programming in order to focus all of my attention on one editor instead of splitting it across vim and GNU Emacs as I nominally claim to do. The reality is that I already spend most of my editing time in vim, because these days I don't do much programming (especially in the languages I use GNU Emacs for). Given an increasing use of vim and thus increasing fluency in it, and low GNU Emacs use (with a slow loss of fluency), I thought it might make sense to just go all in on vim. Vim has all of the basic features that I need to be productive (like windows on multiple buffers aka files), and it also has its own well developed form of super-intelligence (and plenty of people who like it a lot).

I'll put the conclusion up front: I'm not going to do that. I've decided it makes more sense to stick with GNU Emacs as my super-intelligent editing environment (and maybe get a bit better at it every so often).

What ultimately changed my mind today was the experience of experimenting with GNU Emacs' flycheck (and go-flymake's addon for it). Specifically, that the whole exercise took me only a couple of minutes to get going and everything basically just worked. I'm sure that there's an equivalent plugin setup for vim and an experienced vim person could get it up and running in no time flat, but I'm not that person. For better or worse, GNU Emacs has worked out a whole complex ecology of ELPA and MELPA and then buried all of the complexity so that it pretty much just works for people like me. I'm a lazy and pragmatic person these days (eg), and for all my agonizing and contemplating, I still know enough GNU Emacs to be productive and GNU Emacs makes it easy for me to just get code written in a sophisticated environment with a lot of niceties that generally just work.

(I don't know enough about the world of vim plugins to know if super-intelligent stuff is more likely to appear for GNU Emacs than for vim, and of course my current impressions are biased by the fact that MELPA seems to have this massive list of everything)

This isn't to say that getting code written is hard in vim. With work I could probably assemble a vim environment full of equivalents of magit and company-mode and so on, I like vi's overall approach, and I'm going to reach the point where I'm better at editing in vim than in GNU Emacs. But since both GNU Emacs and vim are quite capable editors and I already have a good GNU Emacs environment that I find easy enough to do things in, it seems unlikely that switching exclusively to vim would make a huge difference, especially given that I don't write code in GNU Emacs all that often (cf Amdahl's Law). Instead it seems more likely that I'd spend a lot of time churning around and wind up more or less in the same spot, except using vim commands instead of GNU Emacs ones. That's not enough of a win to be tempting, not any more.

(To head off the obvious suggestion, for various reasons I'm not interested in trying to use vi keystrokes in GNU Emacs. If I'm going to be using GNU Emacs, I have a lot of experience and reflexes built around its native key bindings.)

There's a part of me that regrets this (the same part that likes the idea of Rust). It would quite like to embark on the grand (and periodically frustrating) adventure of (re)building a sophisticated editing environment in vim, learning all about vim plugins, and so on, and even now it's busy trying to convince me that I'm making a mistake and I'm only going to frustrate myself by continuing to go back and forth between vim and GNU Emacs instead of mastering vim (and finding cool plugins for it). The rest of me has other things to do.

(And I admit that I still like GNU Emacs, and not just because you can put the kitchen sink into it. I've edited a lot of code (and text) in GNU Emacs over the years and in the process I've gotten quite used to it. I didn't drift away from it because I dislike it, I drifted away because it doesn't make for a good sysadmin's editor.)

programming/CodeEditingVimVsEmacs written at 00:38:56; Add Comment

2016-08-19

My current Go autocompletion setup in GNU Emacs

A while back I wrote about getting gocode based autocompletion working in GNU Emacs, and then about things about the autocompletion that I was unsatisfied with. After a certain amount of struggling I actually wound up with a Go autocompletion setup that I'm happy with, and I've now realized that I never wrote it up. One change in particular wound up being quite important for me.

GNU Emacs has at least two auto-completion systems, auto-complete and company-mode. In my original approach I followed gocode's recommendation and used auto-complete. I have now switched to company-mode and I like this much better, in large part because I've been able to do much more customization of how it behaves than I managed with auto-complete.

My first set of customizations more or less follows gocode's recommended ones. I turn company-mode on only in Go code, and I copied many of the possible improvements it lists:

(setq company-tooltip-limit 15)
(setq company-idle-delay .15)
(setq company-echo-delay 0)
(setq company-begin-commands '(self-insert-command))

This still left my general issues with autocompletion, so I got out a big hammer and started changing company-mode's key bindings so that they didn't steal keys from me in a way that irritated me.

; complete selections with Control-Return, not
; normal Return.
(define-key company-active-map (kbd "RET") nil)
(define-key company-active-map (kbd "<return>") nil)
(define-key company-active-map (kbd "C-<return>") 'company-complete-selection)

; Use C-Up/Down to move through the selection list
(define-key company-active-map (kbd "<down>") nil)
(define-key company-active-map (kbd "<up>") nil)
(define-key company-active-map (kbd "C-<down>") 'company-select-next-or-abort)
(define-key company-active-map (kbd "C-<up>") 'company-select-previous-or-abort)

The result of these bindings is that company-mode is relatively inoffensive. In practice, often it's enough that it tells me what's available and I wind up just typing out the function or variable name myself. But at least some of the time I'll wind up using C-return once I've typed enough to make the name unambiguous. This means that I'm not making much use of its power, but at least it's inoffensive and helpful.

(Actively looking at company-mode in the process of writing this entry has suggested a number of things I should remember about it, like that TAB will complete as much as possible, up to and including everything.)

Overall my experience with company-mode in Go has been pleasant. It doesn't seem to hurt and it's non-distracting enough and potentially helpful enough that I keep it around. It's even reached the point where I'm a bit annoyed when it doesn't offer suggestions for various reasons, such as when I haven't yet run goimports to actually add an import for the new package that I'm using and that I want names from.

(This is a kind of chicken and egg situation. I need to write something that uses the package before goimports will auto-add it for me, but I need it added before company-mode will start conveniently completing names so that I can use the package. The real answer turns out to be that I should remember the go-mode C-c C-a binding to add an import by hand.)

One of the things that I have wound up quite liking about company-mode is simply that I can modify it like this. Possibly auto-complete can be modified in this way too, but it was easy enough for me to work out how to do it to company-mode even as a semi-outsider to GNU Emacs mode hacking.

(I have some additional bindings I consider experimental, which I've decided not to mention any specifics about. That it's easy enough to bind things that I do have experimental bindings is nice.)

programming/GoGocodeEmacsAutocompleteII written at 23:25:51; Add Comment

Localhost is (sometimes) a network

Everyone knows localhost, 127.0.0.1, the IP(v4) address of every machine's loopback network. If you're talking to 127.0.0.1, you're talking to yourself (machine-wise). Many of us (although not all) know that localhost is not just a single IP address; instead, all of 127.*.*.* (aka 127/8) is reserved as the loopback network. ifconfig will tell you about this on most machines:

lo0: flags=[...]
        inet 127.0.0.1 netmask ff000000 

You don't see that many 0's in a netmask all that often.

One of the tricks that you can play here is to give your loopback network more 127.* IP aliases. Why would you want that? Well, suppose that you have two things that both want to run localhost web servers. With 127.* IP aliases, one can be http://127.0.0.1/ and the other can be http://127.0.0.2/, and both can be happy. Localhost IP aliases can also be a convenient way to invent additional source addresses for testing. Need to connect to something on the same machine from ten different source IPs? Just add 127.0.100.1 through 127.0.100.10, then have your software rotate among them as the source addresses. It's all still loopback traffic, so all of the usual guarantees apply; it just has some different IPs involved.

Sometimes, though, we mean that loopback is a network in a more literal way. Specifically, on Linux:

$ ping 127.0.$(($RANDOM % 256)).$(($RANDOM % 256))
PING 127.0.251.170 (127.0.251.170) 56(84) bytes of data.
64 bytes from 127.0.251.170: icmp_seq=1 ttl=64 time=0.055 ms
[...]

On at least FreeBSD, OpenBSD, and OmniOS, this will fail in various ways. On Linux, it works. Through black magic, you don't have to add additional 127.* IP aliases in order to use those addresses; although not explicitly declared anywhere as existing, they are just there, all 16,777,215 or so of them. The various localhost IPs are all distinct from each other as usual, so a service listening on some port on 127.0.0.1 can't be reached at that port on 127.0.0.2. You just don't have to explicitly create 127.0.0.2 before you can start using it, either to have a service listen on it or to have a connection use that as its source address.

(And anything that listens generically can be reached on any 127.* address. You can ssh to these random localhost IPs, for example.)
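This is easy to see from a scripting language as well. Here's a minimal Python sketch (127.0.45.67 is just an arbitrary pick of mine); on Linux the bind succeeds with no setup at all, while on the other Unixes mentioned above I'd expect it to fail unless you had created the alias first:

```python
import socket

# On Linux, binding to an arbitrary 127.* address needs no prior
# configuration; on FreeBSD/OpenBSD/OmniOS this bind should fail
# unless the alias has been explicitly created beforehand.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.45.67", 0))
print(s.getsockname()[0])  # the bind really did take effect
s.close()
```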

PS: Before I tested it as part of writing this entry, I thought that the Linux behavior was how all Unixes worked. I was a bit surprised to find otherwise, partly because it makes the Linux behavior (much) more odd. It is kind of convenient, though.

sysadmin/LocalhostIsANetwork written at 01:48:52; Add Comment

2016-08-17

A surprising missing Unix command: waiting until a time is reached

I tweeted:

A surprising missing Unix command: 'sleep until it is time <X> or later'. Even GNU sleep only sleeps for a duration.

The obvious option here is at, but at has a number of drawbacks that make it far from an ideal experience. For now, I'll just note that it of course runs your commands non-interactively, and sometimes you need interactivity. Perhaps you're doing:

waituntil 17:50; rsync -a login@host:/some/thing .

(Which is in fact more or less one of the things that I wanted this for today.)

Original V7 Unix of course had a perfectly sensible reason not to bother implementing something like this, namely that you can easily put a reasonable version together by using some shell scripting to work out how many seconds to tell sleep to sleep. In the relatively minimal V7 environment, a dedicated command for waituntil would have been quite out of place and not Unixy. Indeed, I wouldn't expect traditionally minded Unixes like the *BSDs to pick up such a program or option to sleep.

On the other hand, GNU coreutils is a different matter. The GNU people are perfectly happy to add features to traditional Unix commands (often quite handy ones), much to the despair of Unix traditionalists. They have certainly added some features to GNU sleep, like multiple arguments in multiple formats, so it wouldn't have surprised me at all if they'd added a 'sleep until absolute time <X>' option as well. But they haven't (so far) and as far as I could tell on a casual perusal of my Linux systems, no one else has written something like this and gotten it packaged so it's commonly installed.

Because sometimes I'm a system programmer as well as a sysadmin, I wound up writing my own version of a waituntil program. It's in Go because that seemed the right language and I felt like it.

(I picked Go because the hard part is parsing and manipulating the time argument, and Go actually has a quite nice and flexible system for that. This turned out somewhat more complicated than I expected, but these things happen.)
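For illustration only (this is not my actual program), the core sleep-until logic is just a few lines in something like Python. The waituntil and seconds_until names here are my own invention, and as a sleep-based version it has the clock-change problem covered in the sidebar:

```python
import datetime
import time

def seconds_until(hhmm, now=None):
    # How long until the next time the wall clock reads HH:MM?
    hour, minute = map(int, hhmm.split(":"))
    now = now or datetime.datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # that time has already passed today, so aim for tomorrow
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()

def waituntil(hhmm):
    time.sleep(seconds_until(hhmm))
```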

Sidebar: The right way to do this in the modern Unix API

My version of waituntil and any version that is based on top of sleep have a little problem: they don't necessarily cope with the system clock changing. If it's 16:10 now, you say 'waituntil 16:20', and a minute later the system clock jumps to 16:20, you probably want to have the wait finish right then because you've reached the target time.

As far as I can see, the correct way to do this on a modern standards-compliant Unix system is to use clock_nanosleep() with TIMER_ABSTIME to wait to an absolute time, and use CLOCK_REALTIME as the clock you're doing this against. Given this, you should be woken the moment the system's clock hits (or passes) the absolute time you specified, even if the clock is adjusted forward or backward in the mean time.

(Of course, support for this isn't necessarily there in all Unixes, at least right now.)

Unfortunately for me, Go doesn't expose clock_nanosleep() (except as a raw Linux syscall, and I don't feel like going there). For my personal use this is okay, but it would be nice to do better.
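(To illustrate the API itself: on Linux with a sufficiently modern glibc you can reach clock_nanosleep() from Python via ctypes. This is a sketch, not portable code; the constant values are the Linux ones and sleep_until is my own name.)

```python
import ctypes
import ctypes.util
import errno

# Linux values for these constants; they are not portable.
CLOCK_REALTIME = 0
TIMER_ABSTIME = 1

class Timespec(ctypes.Structure):
    _fields_ = [("tv_sec", ctypes.c_long), ("tv_nsec", ctypes.c_long)]

libc = ctypes.CDLL(ctypes.util.find_library("c"))

def sleep_until(abstime):
    # Sleep until an absolute CLOCK_REALTIME time (a Unix timestamp).
    # Because this is an absolute wait, stepping the system clock past
    # the target should wake us up immediately.
    ts = Timespec(int(abstime), int((abstime - int(abstime)) * 1e9))
    # clock_nanosleep() returns the error number directly (0 on
    # success), so we only retry if a signal interrupted the sleep.
    while libc.clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                               ctypes.byref(ts), None) == errno.EINTR:
        pass
```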

unix/WaitUntilOmission written at 23:02:47; Add Comment

2016-08-16

How you tell what signals a Linux process is ignoring

Suppose that you want to know what processes on your system are ignoring certain signals, such as the SIGTERM that systemd uses to try to get lingering processes to quit. How do you find this out? As with many process-related things in Linux, the answer is that you look in /proc.

Specifically, you look in /proc/<pid>/status and get the SigIgn: field. This is in hex, and may look something like this:

SigIgn: 0000000000004007

(There is also SigCgt: for the signals that the process has installed signal handlers for.)

This is a bitmap of signals. You can see the mapping between signal names and numbers with 'kill -L' (which reports them in decimal), and then use your favorite decimal+hex calculator to work out what bit this corresponds to. Suppose we want SIGTERM, signal 15. I'm a Python person, so:

$ python -c 'print "%x" % (1<<(15-1))'
4000

(We subtract one because signals are numbered starting from 1 instead of 0.)

In this case, decoding our SigIgn: above is easy; the 0x4000 bit is set, so this process is ignoring SIGTERM (and the low 0x7 bits mean it is also ignoring SIGHUP, SIGINT, and SIGQUIT).

If we want to look at a mass of processes, we can abuse gawk:

cd /proc
gawk '/^SigIgn:/ && (and(strtonum("0x" $2), 0x4000) > 0) {print FILENAME}' */status

Somewhat to my surprise, it turns out that there are quite a few programs on my Linux machine that are ignoring SIGTERM. But how many are ignoring both SIGTERM and SIGHUP (signal 1, and thus it has the mask 0x01)?

cd /proc
gawk '/^SigIgn:/ && (and(strtonum("0x" $2), 0x4001) == 0x4001) {print FILENAME}' */status

Once again there were more than I expected. However, I think I don't notice most of them because they exit if their connection to the X server goes away. The exception is my non-friend kio_http_cache_cleaner, which is apparently so disconnected from everything that it doesn't notice my X session disappearing.

(Perhaps it would if I was running a proper KDE session, or even a Gnome session. Also, now I wonder if I am somehow indirectly starting it in such a way that it ignores more signals than usual, since a number of things in my X session turn out to have the same set of ignored signals as it does.)
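If the gawk incantations feel too cryptic, the same parsing is straightforward in Python. This is a sketch of my own devising (ignored_signals is not any standard API); point it at a /proc/<pid>/status file:

```python
import signal

def ignored_signals(status_path):
    # Parse the SigIgn: hex bitmap out of a /proc/<pid>/status file
    # and return the set of signals it covers. Bit N-1 of the mask
    # corresponds to signal number N.
    with open(status_path) as f:
        for line in f:
            if line.startswith("SigIgn:"):
                mask = int(line.split()[1], 16)
                return {s for s in signal.Signals
                        if mask & (1 << (s.value - 1))}
    return set()
```

Checking every process is then a matter of globbing /proc/*/status and testing whether signal.SIGTERM is in the result.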

Sidebar: seeing what signals a Linux process is ignoring

On some systems, such as Solaris and its descendants, you have a handy psig program that will report all sorts of information about what signals a process is ignoring or catching and what it's doing with them and so on. On Linux, as far as I know, even a basic version is not part of eg procps-ng or similar packages. If you need this information, see eg Erik Weathers' psig, which I found via this popular Stackexchange question and its answers.

I'm probably going to wind up keeping a copy of psig around, as using it on various processes here has already given me things to think about and look into.

linux/WhatSignalsIgnored written at 23:31:55; Add Comment

2016-08-15

My ambivalent view on Vim superintelligence, contrasted with GNU Emacs

I have historically had an extremely ambivalent attitude on vim as a superintelligent editor; my standard remark is that if I want such an editor, that's what I have GNU Emacs for. Part of this is just that I often want vim to not get in the way, and part of this is that I'm used to vi being vi. But a good part of this comes down to what I see as a fundamental philosophical difference between the editors.

To put it simply, vi has a powerful conceptual model at its heart, while GNU Emacs already starts out being basically a big random ball of stuff. Sure, GNU Emacs has a certain amount of concepts, but at its core it's a mechanism for wiring keystrokes to Lisp functions. It's nice when the collection of keystrokes have some conceptual unity behind them (it makes them easier to remember), but there is no fundamental model of 'how GNU Emacs edits text' that makes that necessary in the way that it does for vi. You can have commands in vi(m) that really feel like they are breaking the rules; in GNU Emacs, the rules are just conventions and may be freely violated without pain.

(However much I love it, magit clearly breaks the 'rules' of GNU Emacs in a flagrant way by binding ordinary alphabetical characters to all sorts of special actions in its status buffer. In GNU Emacs, of course, this is perfectly fine and has a long history.)

The consequence of this is that because GNU Emacs is fundamentally relatively arbitrary, it's much easier to make it do random superintelligent things without damaging what conceptual integrity it has. And you have a clear hook in the core of the editor for doing those superintelligent things, because 'running Lisp code in response to keystrokes' is what GNU Emacs is all about.

The other advantage that GNU Emacs has is that, to put it one way, it's Lisp all the way down (even if some of the Lisp is in C). At a conceptual level and often at a practical level, GNU Emacs is written in the same language as your superintelligent extensions to it, and your extensions do the same kind of things as everything except the very lowest level code (often through exactly the same interfaces). There is no privileged (language) core to GNU Emacs.

Neither of these are true for Vim. Instead, vim has a hard core of both a conceptual model and an implementation environment, and then it has a plugin model that is attached on the side instead of being a fundamental component. In Emacs, auto-indentation is conceptually simple; you rebind newline to run your Lisp code that analyzes the state of the text buffer and then inserts the right number of spaces (and maybe tabs) as well as that newline. The only difference from plain newline processing is that you're running a lot more Lisp code and you're not inserting just a single character. In vim, well, you need an entirely new conceptual model of hooking into keystrokes, and then what happens is a completely different path from if you didn't have smart auto-indentation active. And this change raises other relatively deep questions; for example, you can normally repeat insertions with .. What happens if you . an insertion with a newline in an environment with auto-indent? Does it insert the actual end result text, or does it basically re-run the user's typing and thus re-trigger auto-indentation in the new context?

You can answer these questions (and you have to), and presumably vim has. But that these questions get raised when you start adding various forms of superintelligence to vim is why I feel ambivalent about the whole endeavour. On the one hand, I see the appeal and the necessity. On the other hand, it feels messy in a way that base vi(m) doesn't.

(I sort of touched on this in Where vi runs into its limits, but the whole issue is on my mind again for reasons beyond the scope of this entry.)

(I optimistically think it might be possible to create a vi-like 'composable operations' editor with a strong conceptual model that gracefully allowed for extensions, auto-indentation intelligence, and so on. But I have no idea what the result would look like, and it might not look entirely like vi. Perhaps it shouldn't permit arbitrary extensions to its functionality and instead have a clear model for, say, parsing buffer contents and expressing auto-indentation rules based on the parse state.)

unix/VimSmartsVsGNUEmacs written at 22:42:50; Add Comment

Some options for reindenting (some of) my existing Python code

In my entry on how I'm probably going to change my style of indenting Python code, I said that it was at least a bit tempting to reformat existing code to be PEP-8 compatible (probably using the 'if I have to touch it at all, it gets reindented' approach). This leaves me with the question of how to do that.

In an ideal world I could simply load code into GNU Emacs and say 'okay, my indent level is going from 8 spaces to 4 spaces, lay out everything again with this in mind'. Unfortunately I don't think Emacs's Python mode offers this feature (and it's actually slightly harder than it looks). The one time I tried to sort of fake this with Emacs features that I could find, it blew up in my face in a painful way.

(I'm not familiar with vim's Python smarts, but maybe it has something to do this.)

The great big hammer is yapf. This will do the reindentation, but with the side effect of reformatting all of my code according to its idea of proper style. Having looked at its output, I don't entirely agree with it and I feel that it shuffles code and comments around more than I'm happy with. I could give in (for the same reason I gave in with gofmt), but I'm also not convinced that yapf is going to become the one true style that people write to.

Yapf compares itself to autopep8 and pep8ify. It looks like at least pep8ify can be told to only change indentation and not touch anything else, so perhaps that is the easiest approach. On the other hand, I might find that one or the other (or both) of these make some additional formatting changes that I'm okay with. I'd have to try them.

There's also a reindent Python package, and a version of this appears to be part of the tools that are packaged with Python 3. This says it's specifically about reindenting code and almost nothing more, which is just what I want. If the PyPi version still works in Python 2, this is probably a good thing to check out first.

Finally, there is the yak shaving option of writing my own version of this. On the one hand, it doesn't seem too complicated and in theory it would do exactly what I want. On the other hand, there are all sorts of corner cases with continued lines and so on, and if there are existing tools that do it right (or right enough) I should probably just use one of them instead of indulging my vague instinct to write some code.

(If I was all fired up to write this sort of code it might be different, but I'm not. 'Write my own' is just a vaguely grumpy reflex.)
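To show why the corner cases matter, here's about the most naive Python sketch possible of the 'write my own' option (reindent is a hypothetical name, and this deliberately ignores the hard parts):

```python
def reindent(source, old=8, new=4):
    # Naive sketch: rescale leading-space indentation from old-space
    # levels to new-space levels. It deliberately ignores continued
    # lines, multi-line strings, and tabs, which is exactly why the
    # existing tools are probably the better answer.
    out = []
    for line in source.splitlines(keepends=True):
        stripped = line.lstrip(" ")
        spaces = len(line) - len(stripped)
        out.append(" " * (spaces // old * new + spaces % old) + stripped)
    return "".join(out)
```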

(As usual, writing this entry has caused me to do enough research that I now know that I have a lot more options than I thought I did.)

python/ReindentationOptions written at 00:39:41; Add Comment

2016-08-14

Code alone can tell you the what but it cannot tell you why

It all started when John Arundel tweeted:

@hlship: I completely don't buy into "code is self documenting". Code needs docs to explain the why; code by itself is only how.

I think code should be so clear, simple, and straightforward as to need the fewest possible comments, ideally zero.

and then, in a followup tweet:

I'm not saying you shouldn't comment your code. I'm saying you should code so that you don't need to.

I very strongly believe that this is impossible. As @hlship says, code can say what it is doing but it cannot by itself tell you why it is necessary to do that thing, or why you don't want to do that thing in another way, and so on. To write code that does communicate that information, you must effectively embed documentation in the form of names for things (and then you must hope that everything is sufficiently clear to convey your meaning).

Here, let me give a concrete example. My Go SMTP server package contains the following little snippet at the start of the function that parses SMTP commands:

if !isall7bit([]byte(line)) {
   res.Err = "command contains non 7-bit ASCII"
   return res
}

I think that the what and how of this code is reasonably clear and doesn't need any comments on what it is doing. But the why is completely opaque. Are we rejecting lines with non-ASCII characters because we are being RFC-picky? Could we take this code out in order to have a SMTPUTF8 compatible server? If we wanted a SMTPUTF8 compatible server, what other changes would be required, if any?

As it happens this snippet has an important 'why' attached to it. My comment in the actual source is not clear, but the reason for this check is that I later convert the entire line to upper case in order to make matching SMTP commands easier, and then use indexes into the upper-case version of the line to extract things from the original version of the line. Go considers all strings to be UTF-8 by default, so case conversion is done in Unicode, and Unicode case conversion can change how many Unicode characters a string has. When my code uses indexes from the upper-case string on the original string, it implicitly assumes that this doesn't happen.

(I also care about RFC compliance, which is a secondary reason.)
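The Unicode issue here is concrete and easy to demonstrate; in Python rather than Go (the behavior is the same in both):

```python
# Unicode case conversion can change the number of characters in a
# string; the classic example is the German sharp s, which upper-cases
# to two characters ("SS").
s = "straße"
print(len(s), len(s.upper()))  # the upper-cased string is longer
```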

Could you write code that did something similar to this check and was clear about the why? Perhaps. But I think it would require either weird function names or structuring the code differently, for example by upper-casing the line and then insisting that it had the same length as the original version.

(The other option is to completely restructure the command matching code so that it works in a different way and doesn't care about this. Would that be better? Maybe. You might still want to be RFC picky here, instead of implicitly supporting SMTPUTF8.)

Would such code be 'simple' and 'straightforward'? I suspect not, although simplicity is at least partly in the eyes of the beholder. It would certainly have taken me longer to write than the current approach.

(None of this is new, and it's quite similar to what I've written about writing comments in your configuration settings, and there's documenting why you don't do things, and how procedures are not documentation and undoubtedly others.)

programming/CodeCommentsWhy written at 00:26:34; Add Comment

2016-08-13

What I did to set up a wireless network and what I have left to do

I tweeted:

I didn't entirely plan on setting up a home wireless network tonight, but sometimes these things happen.

Up until now, what I have had at home is my home Linux machine and a DSL modem. I run PPPoE on my home machine and it sits directly on the Internet. This is an unusual setup, but it's all that I needed and I don't really like being behind NAT gateways and so on. In the beginning my DSL modem was just a DSL modem, but when I upgraded to VDSL I needed a new VDSL 'modem'. Well, you can't get things that are just VDSL modems (or not easily); instead you get multi-port VDSL routers with wireless capability that you can force to dumb themselves down to just being (V)DSL modems.

(I have this one, on the recommendation of my ISP, and it works. I can even SSH in to it to get interesting stats, which is occasionally useful.)

Recently (as in today) I somewhat impulsively picked up a gadget that not merely likes to use your wifi network but absolutely insists on doing so in order to get some important-to-me functionality. So I needed a home wireless network of some sort, ideally with as little disturbance to my current home machine's setup as possible. In the end this turned out to be less involved than I was worried about, although my current setup is an incomplete hack.

What I did:

  • Turned on the wireless access point functionality of my VDSL modem. I was worried that the modem might insist on the wireless being a captive network or running DHCP or some such thing, but it's happy to be just a basic access point that bridges wireless and wired traffic together. I changed the SSID to something nondescript and set a very non-default password (although I carefully picked one that would be reasonably easy to enter on constrained devices).

  • My wireless devices were going to need DNS, so I made my Unbound setup listen on the local network interface as well as localhost. I also added a local. zone with DNS entries for everything I expected.

  • I installed the Fedora dhcp-server package and configured it to do DHCP on my local network, where by 'configured it' I mean 'I copied bits and pieces out of the DHCP configurations we use at work'. As a cautious sysadmin I don't believe in letting just anyone use my wireless network, so I mostly plan to stick to static IP address assignment for known wireless MACs. To make life simpler for myself, I set up a very small dynamic pool with a low lease time; this way I can get a device's MAC by watching for new things in this pool, and the device itself will be pacified about 'having wifi' (since its DHCP requests will get answered, even if it can't do anything).
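For concreteness, the DNS and DHCP pieces might look roughly like the following sketch. All of the interface names, addresses, MACs, and hostnames here are made-up placeholders, not my actual configuration:

```
# unbound.conf fragment: listen on the LAN interface as well as
# localhost, and serve a private 'local.' zone.
server:
    interface: 127.0.0.1
    interface: 192.168.1.1
    access-control: 192.168.1.0/24 allow
    local-zone: "local." static
    local-data: "gadget.local. IN A 192.168.1.20"

# dhcpd.conf fragment: static assignments for known MACs plus a
# tiny, short-lease dynamic pool for spotting new devices.
subnet 192.168.1.0 netmask 255.255.255.0 {
    option routers 192.168.1.1;
    option domain-name-servers 192.168.1.1;
    default-lease-time 300;                # short leases for the pool
    max-lease-time 600;
    range 192.168.1.240 192.168.1.243;     # the small dynamic pool
}

host gadget {
    hardware ethernet 00:11:22:33:44:55;   # example MAC
    fixed-address 192.168.1.20;
}
```

A new device first shows up in the 192.168.1.240-243 pool; once its MAC is captured from the leases file, it gets a `host` block and a static address.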

This left the small detail of outside connectivity, which means NAT. There are lots of guides for this on the Internet and I cribbed bits from several of them:

  • enable IP forwarding on my local network interface and my external PPPoE interface. Forgetting the latter caused my initial setup to fail; it took a while before the penny dropped. Some people enable IP forwarding globally, but I want to be more selective for hand-waving reasons. See my entry on IP forwarding settings for an explanation of why I needed both.

  • enable general iptables forwarding for established connections from my PPPoE interface to my local one:

    iptables -A FORWARD -i ppp0 -o enp7s0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    

    Since my current FORWARD policy is default-accept, I'm not sure this rule is strictly required yet, but I probably should block other traffic from making this hop.

    (I think it would require something perverse that shouldn't happen to send traffic for a RFC 1918 IP address down my PPPoE link, but the Internet environment is full of perverse things.)

  • enable forwarding from the local interface to my PPPoE interface for the specific IPs that DHCP assigns to the wireless devices that I know about:

    iptables -A FORWARD -i enp7s0 -o ppp0 -s <IP> -j ACCEPT
    

    Again, this is probably surplus at the moment because of the default-accept FORWARD policy.

  • enable SNAT for the specific IPs of known wireless devices:

    iptables -t nat -A POSTROUTING -s <internal-IP> -j SNAT --to-source <my-public-IP>
    

    Since there is no broad default SNAT, this stops any other device on the local network from being able to do anything much on the Internet.

So what do I have left to do? Quite a lot, actually. First, I need to set things up so that this persists over a reboot. I was in a big hurry this evening so I just did all the commands by hand (since my goal was to get the gadget going and make sure it worked; building a home wireless network was just a prerequisite).
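One plausible shape for the persistent version is an iptables-restore rules file plus a sysctl drop-in, both loaded at boot (via a systemd unit or whatever mechanism is handy). A sketch, with placeholder addresses (198.51.100.7 standing in for my public IP):

```
# /etc/sysctl.d/50-wifi-forwarding.conf -- the enp7s0 setting can be
# applied at boot; the ppp0 one has to wait until the PPPoE link
# actually exists (eg it can be set from a PPP up script instead).
net.ipv4.conf.enp7s0.forwarding = 1
net.ipv4.conf.ppp0.forwarding = 1

# Rules file in iptables-restore format, loaded at boot.
*filter
:FORWARD ACCEPT [0:0]
-A FORWARD -i ppp0 -o enp7s0 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i enp7s0 -o ppp0 -s 192.168.1.20 -j ACCEPT
COMMIT
*nat
:POSTROUTING ACCEPT [0:0]
-A POSTROUTING -s 192.168.1.20 -j SNAT --to-source 198.51.100.7
COMMIT
```

If the PPPoE-assigned address isn't stable, MASQUERADE on ppp0 is the usual alternative to a hard-coded SNAT --to-source.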

Next, I need to make it so that other IPs on the internal network are genuinely blocked from doing anything, as opposed to simply not working by side effect. This will need at least some catch-all entries to explicitly block all other local traffic going to the PPPoE link and probably from the PPPoE link to the local network. I should definitely write the iptables rules necessary to make sure that I don't leak un-NAT'd local network addresses onto the Internet.
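The catch-all entries could be as simple as default drops appended after the specific accepts; a sketch, using the interface names from above:

```
# Appended after the per-device ACCEPT rules: refuse to forward
# anything else between the local network and the PPPoE link,
# in either direction.
iptables -A FORWARD -i enp7s0 -o ppp0 -j DROP
iptables -A FORWARD -i ppp0 -o enp7s0 -j DROP
```

Note that FORWARD drops alone don't close the un-NAT'd leak: a device with a FORWARD accept but no matching SNAT rule would still have its private source address forwarded out, so the accept list and the SNAT list have to stay in lockstep.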

More subtly, I was lazy when I picked the local network I was using. My VDSL modem's administrative IP lives on 192.168.1.0/24 by default, so I had enp7s0 set up with this network. However, there is no reason for the wireless network to be in the same subnet as the VDSL's administrative IP (even if they are sort of on the same wire). I should move the whole network to another subnet, say 192.168.2.0/24.

Right now I'm writing separate rules for each known good IP, but this is an obvious case for ipsets, especially if I expect to have more and more wireless devices show up. Maybe someday I'll take my work laptop home temporarily, for example. I think the interaction between ipsets and the other rules should be straightforward, but I'll get to find out once I have the energy.
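A sketch of what the ipset version might look like (the set name and all addresses are made up). One nice property is that a single set can drive both the forwarding accept and the SNAT, so the two lists can't drift out of step:

```
# Create a set of known-good wireless IPs and populate it.
ipset create wifi-ok hash:ip
ipset add wifi-ok 192.168.1.20        # example device IP

# The per-IP rules then collapse into one rule apiece.
iptables -A FORWARD -i enp7s0 -o ppp0 -m set --match-set wifi-ok src -j ACCEPT
iptables -t nat -A POSTROUTING -m set --match-set wifi-ok src -j SNAT --to-source 198.51.100.7   # example public IP
```

Adding a new device then becomes a single `ipset add`, with no iptables changes at all.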

My home machine has a whole complex policy based routing setup. At the moment this doesn't interact with the wireless NAT'ing at all, with the effect that the wireless devices can't talk with anything reached over my IPSec tunnel because their traffic goes out un-NAT'd and then the other end just throws it away because there's no route back to 192.168.1.* addresses. What I probably want here is something akin to isolated interfaces, where my IPSec tunnel to work just doesn't exist as far as wireless clients are concerned and all their traffic goes out my home machine's external IP regardless of where it's going.
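Until I work out the policy routing interaction properly, one stopgap (sketched here with a made-up work subnet) is to refuse to forward wireless traffic toward the tunnel at all, so it fails cleanly instead of leaking un-NAT'd packets into it:

```
# Inserted ahead of the per-device accepts: wireless clients simply
# can't reach the work networks over the IPSec tunnel.
# 172.16.0.0/12 is a placeholder for the real work subnets.
iptables -I FORWARD -i enp7s0 -d 172.16.0.0/12 -j DROP
```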

(I don't feel confident and trusting enough in the security of my wireless network to let clever people on it have privileged access to work. Sure, I have a WPA2 key set, but my understanding is that those can be cracked.)

And of course all of this currently only supports IPv4. My home machine has IPv6 connectivity, so in theory I could offer IPv6 connectivity through to wireless clients. This probably shouldn't need any sort of NAT, but it would require me to figure out IPv6 DHCP and routing (and firewall rules, since I have no intention of allowing my wireless devices to be fully exposed to the Internet).

linux/QuickWirelessNetworkSetup written at 01:10:23

