Wandering Thoughts: Recent Entries

Categories: links, linux, programming, python, snark, solaris, spam, sysadmin, tech, unix, web.

2013-05-20

A serious potential danger with Exim host lists in ACLs

Suppose that you have an Exim installation and you want to support some sort of source host based blocking (selective or otherwise) of incoming connections. The obvious way is to create an ACL section that looks something like this:

deny
    domains = +local_domains
    hosts   = ${if exists {UBLOCKDIR/hosts} {UBLOCKDIR/hosts}}
    message = mail from host $sender_host_address not accepted by <$local_part@$domain>.
    log_message = blocked by personal hosts blacklist.

(This one is a selective, per destination address host block list, hence the fun and games with UBLOCKDIR.)

This looks great and generally works but you've just armed a ticking time bomb, one that can blow your incoming email up with permanent temporary deferrals. The first problem is that Exim has no way in a host list to say 'this domain and any of its subdomains', in the way that the TCP wrappers '.host.com' will match both 'host.com' and 'fred.host.com'. If you want to match this case, the obvious way is to write two entries:

*.host.com
host.com

The first matches any subdomains of the domain; the second matches the domain itself. But you've just put the fuse in the bomb, because of just how plain host and domain names work in host lists. From the Exim specification with the emphasis being mine:

  • If the pattern is a plain domain name [...] Exim calls the operating system function to find the associated IP address(es). [...]

    If there is a temporary problem (such as a DNS timeout) with the host name lookup, a temporary error occurs. For example, if the list is being used in an ACL condition, the ACL gives a "defer" response, usually leading to a temporary SMTP error code.

So here's what happens. You list '*.spammer.com' and 'spammer.com' in your blocklist. Spammer.com turns off their DNS (or their DNS server turns it off because hey, they're a spammer) but doesn't de-register their domain, so DNS queries to their nominal authoritative DNS servers either don't get answers or get non-authoritative 'look elsewhere' results. Although this is a permanent condition, it's considered a temporary failure in DNS resolution. Exim now defers all SMTP connections that consult this host blocklist, regardless of where they are from. For ever, or at least until you notice.

Now that I've read the Exim documentation in detail, I've found that it spells out how to turn this behavior off: use the special option +ignore_defer. You probably want to do this. Certainly we do.

My feeling is that you want to do this for every host list anywhere except ones used for real, strong access control (which probably don't want to be using DNS names anyways). Consider, for example, a host list used for exceptions to greylisting; you probably don't want that ACL to defer the connection if you can't resolve a domain in it.
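
To make the failure mode concrete, here is a toy Python model of walking a list of plain host names in order. It is not Exim's code and only sketches the behavior described above; check_plain_names, fake_resolve, TemporaryDNSFailure and the example names and addresses are all made up for illustration, and the ignore_defer flag merely stands in for what +ignore_defer is described as doing.

```python
class TemporaryDNSFailure(Exception):
    """Stand-in for a DNS timeout or other temporary lookup error."""


def check_plain_names(client_ip, names, resolve, ignore_defer=False):
    """Toy model: check client_ip against plain host names, in order.

    Returns "match", "no match", or "defer". resolve(name) is a
    hypothetical lookup function returning a set of IP address strings.
    """
    for name in names:
        try:
            addresses = resolve(name)
        except TemporaryDNSFailure:
            if ignore_defer:
                continue  # treat this entry as not matching and move on
            return "defer"  # the whole condition defers, whoever the client is
        if client_ip in addresses:
            return "match"
    return "no match"


def fake_resolve(name):
    """Pretend DNS in which spammer.com's servers no longer answer."""
    if name == "spammer.com":
        raise TemporaryDNSFailure(name)
    return {"other.example": {"203.0.113.7"}}.get(name, set())


innocent = "198.51.100.25"  # a client unrelated to any listed name
blocklist = ["other.example", "spammer.com"]

without_option = check_plain_names(innocent, blocklist, fake_resolve)
with_option = check_plain_names(innocent, blocklist, fake_resolve, ignore_defer=True)
print(without_option, "/", with_option)
```

In this model the unrelated client is deferred purely because one listed name cannot be resolved; with the flag set, the unresolvable entry is skipped and the client simply does not match.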

Sidebar: the other surprise in Exim host lists

Suppose that you have a host list like this:

*.spammer.com
192.168.0.0/16

Surprise: any connection from a host in 192.168/16 that does not have valid reverse DNS will not match the list. The moment you list a hostname wildcard in a host list, any IP address without a hostname automatically fails to match that entry or anything later in the list (or file if the list is in a file). It will match IP address patterns that are earlier in the list, though, so you get to remember to list all IP address patterns first. This behavior is documented if you read the documentation carefully.

Per the fine documentation this behavior can be turned off with +ignore_unknown. Now that I've found this, I need to make some configuration changes.

This is generally less dangerous than the host list defer time bomb, but it depends on what you're using the host list for. If you have a locked down configuration where you're using the host list for strong access control, well, you have potential issues here.
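
The ordering surprise can be sketched with another toy Python model. Again this is not Exim's implementation; host_in_list and its ignore_unknown flag are hypothetical names that only mimic the behavior described in this sidebar, and a client_name of None stands for a connecting IP with no usable reverse DNS.

```python
import fnmatch
import ipaddress


def host_in_list(client_ip, client_name, patterns, ignore_unknown=False):
    """Toy model of in-order host list matching.

    Patterns containing "/" are treated as CIDR networks; everything
    else is treated as a hostname wildcard that needs the client's name.
    """
    address = ipaddress.ip_address(client_ip)
    for pattern in patterns:
        if "/" in pattern:
            if address in ipaddress.ip_network(pattern):
                return True
        elif client_name is None:
            if ignore_unknown:
                continue  # skip this entry and keep checking later ones
            return False  # no hostname: this entry and everything after it fails
        elif fnmatch.fnmatch(client_name, pattern):
            return True
    return False


names_first = ["*.spammer.com", "192.168.0.0/16"]
ips_first = ["192.168.0.0/16", "*.spammer.com"]

surprise = host_in_list("192.168.1.10", None, names_first)
reordered = host_in_list("192.168.1.10", None, ips_first)
with_option = host_in_list("192.168.1.10", None, names_first, ignore_unknown=True)
print(surprise, reordered, with_option)
```

The same client fails to match when the hostname wildcard comes first and matches once the IP pattern is listed first or the flag is set.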

sysadmin/EximHostsListDanger written at 23:43:44

2013-05-19

Today's comment spammer trick: regurgitated comments

I log the contents of some attempted spam comments here on Wandering Thoughts (in short, I do this when the spammer seems to be trying hard). Usually this doesn't get anything, but today my trawl through the logs turned up a succession of bizarre comment attempts. The text had misspellings and typos but it generally made sense and most of the comment attempts were even about technical things that are vaguely on topic for here. But they were invariably attempts to comment on very inapplicable entries.

When I looked at the logs in detail, one of the most striking was a series of comment attempts that looked very much like a conversation between two or more people about using git on home directories. This was very odd since none of the comments were being posted, yet the people were pretty clearly replying to each other; I began to develop all sorts of theories about disturbingly intelligent content auto-generation. Finally I noticed something in one of the comment texts and the penny dropped:

[...] Possibly related posts: (automatically generated)Heroku, the Rails app.

There is a really simple way to get this text into a spam comment: you can be scraping content from existing blog posts and/or blog comments. So my new theory is that the would-be comment spammer is scraping comment text from other blogs, mangling it somewhat, and then spam-posting it on other blogs (including mine).

The mangled text doesn't seem to have any links or other spam-relevant text so I'm not sure why the spammers are doing this. Maybe they're fishing to see what blogs will allow their comments through moderation and will follow up with more active content on blogs where this works.

Sidebar: source details and other things

So far 30 different IP addresses have tried this here today; most IP addresses have made only one attempt each. The IP addresses cover a large range of source networks. A few of them are CBL listed but that's pretty much it as far as DNBLs are concerned. Four of the IP addresses actually belong to Microsoft (168.63.43.185, 168.63.62.182, 168.63.76.184, and 168.63.84.217; all four are currently listed on the CBL). I'm assuming that these are compromised machines, VPS servers, or both.

Many of the IP addresses also made a burst of GET requests for various other URLs here. Maybe they're scraping text from Wandering Thoughts for use in their corpus for their next spam run somewhere else.

spam/RegurgitatedCommentSpam written at 22:45:29

The technical effects of being an out of tree Linux kernel module

Suppose that you have a kernel module that is not in the mainstream kernel source for one reason or another. Perhaps it is license compatible but just not integrated for various reasons (as is the case with IET) or perhaps it is license incompatible (as is the case with ZFS on Linux). This non-inclusion has a number of cultural effects, but it also has real technical effects. Although I've mentioned them before, today I want to talk about them in some detail.

The first thing to know is that the Linux kernel does not have a stable kernel API for modules; how a module interacts with the rest of the kernel can and will change without notice. When your module is part of the kernel source, changing it to cope with the API change is generally the responsibility of the kernel developer who wants to make the API change. When your module is not in the kernel tree, not only is changing its code your job but so is even knowing about the API change. And API changes are not always obvious because sometimes they're things like changes in locking requirements or how you are supposed to use existing functions.

(Sometimes they are semi-obvious, like changing just what arguments a function takes. You do pay attention to all warning messages that show up when building your kernel module, right?)

Any number of people would like this to change but it isn't going to. The Linux kernel development process is optimized for in-tree code and not for out of tree code. If your out of tree code cannot be included in the kernel for various reasons, that's tough luck but the kernel developers really don't care that much (as a general rule). Locking themselves down to any stable module API would reduce their ability to improve and evolve the kernel code.

The next effect is pragmatic: if your code is not in the kernel tree, almost no one will look at it (and this includes automated scans over the kernel source code that look for various things) or do things to it. This is great if you're possessive about your code but it means that you're missing out on the quality checking that this creates and on all of the little janitorial cleanups that people do, and that if there is a bug, your module's developers are the only people looking at it.

(In some quarters it's fashionable to think that the Linux kernel developers are all clowns and cannot possibly contribute anything worthwhile to your code. This is a major mistake. Among other things they're basically certain to know the overall Linux kernel environment better than you do.)

A related issue is that the kernel developers try not to create bugs and regressions in in-tree code, especially if it's considered important (which, say, a commonly used filesystem will be); if one is created anyways a bunch of people will go looking to try to fix it. It's almost certain that no official kernel release would go out that broke a significant filesystem; the change that created the breakage would be identified and then reverted, with the change's developer told to try again. If your module is not in the tree, well, you're on your own. Performance regressions or actual breakages are your problem to diagnose and then either fix or try to argue the kernel developers into changing their side of the problem.

(And they may not, especially if your code is license-incompatible with the kernel and most especially if their change actually improves in-tree code and performance and so on.)

All of this means an out of tree kernel module requires more ongoing development work than an in-tree kernel module. In-tree kernel modules generally get somewhat of a free ride from general kernel developers; out of tree modules do not and have to make up for it with time from their own developers. One predictable result is that many out of tree modules don't necessarily support all kernel versions, including kernel versions that sysadmins may want to use. A worst case situation with out of tree modules is that the developers simply stop updating the module for new kernels; any users of the module are then orphaned on old kernels.

linux/TechnicalNonGPLKernelModules written at 01:19:52

2013-05-18

A little habit of our documentation: how we write logins

Over the years, we've developed a number of local conventions for our local documentation. One of them is that we always write Unix logins with < and > around them, as if they were local email addresses, so that we'll talk about how <cks>'s processes had to be terminated or whatever. When I started here this struck me as vaguely goofy; over time it has rather grown on me and I now think it's a quite clever idea.

Writing logins this way does two things. The first is that they become completely unambiguous. This is not much of an issue with a login like 'cks', but we have any number of logins that are (or could be) people's first or last names, and vice versa. Consistently writing the login with <> around it removes that ambiguity and uncertainty. The second thing it does is that it makes it much easier to search for a particular login in old messages and documentation. Searching for 'chris' may get all sorts of hits that are not actually talking about the login chris; searching for '<chris>' narrows that down a lot.

(Well, sort of. The reality is that we sometimes wind up quoting various sorts of system messages and system logs in our messages and of course these messages generally don't use the '<login>' form. However, often excluding these messages from a later search is good enough because we're mostly interested in the record of active things we did to an account.)
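
As a small illustration of the search benefit, here is a Python sketch over a few invented note lines (the notes themselves are made up; only the <login> convention comes from the text above). It compares a loose search for the name with a search for the bracketed login.

```python
import re

# Hypothetical lines from old messages and documentation.
notes = [
    "Terminated runaway processes belonging to <chris>.",
    "Chris asked us to raise the quota for <cks>.",
    "Talked to chris about the printer; no account changes.",
]

# A loose search hits every mention of the name, login or not.
loose = [n for n in notes if re.search(r"chris", n, re.IGNORECASE)]

# Searching for the bracketed form finds only lines about the login itself.
exact = [n for n in notes if "<chris>" in n]

print(len(loose), len(exact))
```

Here the loose search returns all three lines, while the bracketed search returns only the one that records an action on the account.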

There's a corollary to the convenience of <login>: right now we have no similar notation convention for Unix groups. We write less about Unix groups than about Unix logins (and groups generally have more distinct names), but it would still be nice to have some convention so we could do unambiguous searches and so on.

sysadmin/UsernamesInDocumentation written at 01:13:38

2013-05-17

Why I'm not considering btrfs for our future fileservers just yet

In a comment on yesterday's entry I was asked:

Could you elaborate on the "btrfs does not qualify" part?

What's missing? How likely do you think this to change in the near future?

I will give a simple looking answer that conceals big depths: what's missing is a btrfs webpage that doesn't say 'run the latest kernel.org kernel' and a Fedora release that doesn't say 'btrfs is still experimental and is included as a technology preview' (which is what Fedora 18 says). It's possible that btrfs is more mature and ready than I think it is, but if so the btrfs people are doing a terrible job of publicizing this. Fundamentally I want to be using something that the developers consider 'mature' or at least 'ready' and I don't want us to be among the first pioneers with a production deployment of decent size in a challenging environment.

Pragmatically there is nothing that btrfs can do to make us consider it in the near future, for reasons I wrote about two years ago in an entry on the timing of production btrfs deployments. If btrfs magically became perfect tomorrow, it would only appear in an Ubuntu LTS release in 2014 and a Red Hat Enterprise release in, well, who knows but probably not this year.

(The current Ubuntu 12.04 LTS has btrfs v3.2, whereas btrfs is up to v3.9 already. The btrfs changelog shows the scope of a year's evolution.)

As far as what in specific is missing, well, I have to confess that I haven't looked at the current state of btrfs in much detail and so I don't have specific answers. I poke at btrfs vaguely every so often; generally I discover something that strikes me as alarming and then I go away again. Since btrfs is never going to be exactly like ZFS, I can't just directly translate our ZFS fileserver design to btrfs and then complain about what's missing or different. To have a really informed opinion on what btrfs needed and what was wrong with it, I'd have to do a btrfs-based fileserver design from scratch, trying to harmonize what we think we want (which has been shaped by what ZFS gives us) with what btrfs gives us. So far there seems to be no real point to doing that before btrfs stabilizes.

(I'm starting to think that btrfs and ZFS have fundamentally different visions about some things, but that needs some more reading and another entry.)

Sidebar: ZFS on Linux maturity versus btrfs maturity

You might ask why I'm willing to consider ZFS on Linux even though it's a relatively young project, just like btrfs. The answer is that the two are fundamentally different. The ZFS part of ZoL is generally a mature and well proven codebase; most of the uncertain new bits are just for fitting it into Linux.

linux/BtrfsWhyNotYet written at 01:29:56

2013-05-16

Why ZFS's CDDL license matters for ZFS on Linux

In a G+ conversation about ZFS I read the following:

[...] so, why use BTRFS at all? :-) Just the fact that it's GPL (and so able to be embedded into the kernel source tree) doesn't seem enough, specially considering that CDDL (the ZFS license) is a bona fide open source license, [...]

On the whole I like ZFS on Linux, but let's not mince words here: this licensing issue is a big one. Were btrfs and ZFS close to general parity, licensing would be a very strong push towards btrfs.

That ZFS is CDDL licensed means that it can never be included in the Linux kernel source. It may mean that it can't be prepackaged in binary form by distributions, or at least by distributions that care strongly about licensing issues. The CDDL is part of what makes it extremely unlikely that Red Hat Enterprise or Ubuntu LTS will ever officially support ZoL, making it always be a 'batteries not included, you get to integrate it' portion of the system.

That ZFS will not be included in the Linux kernel source (because of the CDDL among other reasons) means that you are more at risk of developers ceasing to update ZFS for newer kernels (among other less important effects).

(Being in the Linux kernel source is no guarantee that code will be maintained, but it increases the chances a fair bit.)

These are risks that we'd be willing and able to take on, so they aren't real obstacles for us using ZoL if that turns out to be the best option for new fileservers. But they still weigh on my mind and there are any number of places where they are going to be real issues, sometimes killer ones.

(I've written about this before.)

(Given the current situation with 4k disks, we're already looking at recreating pools when we move them to a new fileserver infrastructure. At that point we could just as easily migrate from ZFS to something else, if the something else was good enough. Btrfs currently does not qualify.)

linux/ZFSWhyCDDLMatters written at 01:16:45

2013-05-15

Why I've so far been neglecting functional programming languages

Functional programming languages are in many ways the latest hotness and so for years I've been making off and on runs at things like yet another explanation of monads (which I think I sort of understand by now) and similar topics. Despite this, so far I've been almost completely uninterested in actually trying to write a functional program or exploring a FP language.

The big problem for me is that as far as I can tell, the kind of programs I usually work with are exactly the kind of programs that functional programming is stereotypically a bad fit with. The stereotype I've absorbed is that functional programming is quite a good fit for computation but not a good fit for IO, because IO intrinsically has side effects. Unfortunately most of what I write is all about IO and has little or no computation. Bashing a squarish peg into a roundish hole is unlikely to tell me anything particularly meaningful about how nice the language is to work in; what I really need is a roundish peg, a computational problem, and those are relatively scarce around here.

(It's possible that I'm not looking hard enough. For example, I do periodically want to do things like log analysis or event reassembly, where the original data could just as well be a predefined data structure in the program instead of processed from logfiles on disk. I suspect that a functional language would handle these fine, maybe better than ad-hoc hackery in awk, Python, or whatever. If I was really crazy I would try rewriting the logic in our ZFS spares handling system in an FP language to see if it got clearer; it's fundamentally a series of transformations of a tree and then some analysis of the result. The result might even be more testable.)
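
For a rough idea of what "event reassembly over a predefined data structure" might look like in a functional style, here is a small Python sketch. The event tuples, their fields and the reassemble function are all invented for illustration; it only shows pure transformations (sort, group, reduce) with no IO and no mutation of shared state.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical, already-parsed log events: (session id, timestamp, message).
EVENTS = (
    ("s2", 3, "connect"),
    ("s1", 1, "connect"),
    ("s1", 4, "disconnect"),
    ("s2", 9, "disconnect"),
)


def reassemble(events):
    """Group events by session and return {session: duration}."""
    # Sort by session id, then timestamp, so groupby sees each session once.
    ordered = sorted(events, key=itemgetter(0, 1))
    sessions = groupby(ordered, key=itemgetter(0))
    per_session = ((sid, [e[1] for e in group]) for sid, group in sessions)
    return {sid: max(times) - min(times) for sid, times in per_session}


durations = reassemble(EVENTS)
print(durations)
```

Because reassemble takes plain data and returns plain data, it can be tested without any logfiles on disk.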

programming/WhyNotFunctional written at 00:56:36

2013-05-13

My language irritations with Go (so far) and why I'm wrong about them

The great thing about an evolving language is that if you're slow enough about writing up your irritations with it, some of them can wind up fixed (or part fixed). So this list is somewhat shorter than it was when I originally wrote my first Go program, and none of the irritations are major. Also, I will reluctantly concede that Go has good engineering reasons for all of them.

My largest single irritation is that break acts on switch and select; I expected it to act only on enclosing loops, so that you could write something like:

for {
   select {
   case <-mchan:
      // message silently swallowed
   case <-schan:
      break // in Go this only leaves the select, not the for loop
   }
}

Instead you have to invent a boolean loop condition or put a label on the for loop and use a labeled break. I understand why Go does this; it enables you to exit early out of a switch or select case instead of having to wrap everything in ever increasing levels of nesting. This is likely especially important because Go uses explicit error checking (which would otherwise force those nested if blocks).

The issue that got partially fixed is Go's return requirements. When I wrote the original version of my program the natural form of one function was a big switch with a number of specific cases and then a default: to catch the rest; however, the original rules required a surplus return at the end of the function, which irritated me by forcing me to move the default case to the end of the function, obscuring the logic. The Go 1.1 changes make my particular case okay but I believe there remain cases where you need an unreachable ending return (or panic) to make the compiler happy.

You can make an argument that the original and current state of affairs are good software engineering. If the compiler did true reachability analysis it'd increase the number of cases where an innocent looking change to some part of the code would suddenly make the return coverage not be complete and thus produce potentially odd messages about missing returns. The current brute force rules protect against this and lead Go programmers to write in a certain sort of consistent style.

My final issue is my perennial one of being unable to cleanly cancel IO being done by goroutines, breaking them out of things so that they can see a death signal from outside. You can argue that this is a bug in the runtime, but the problem with this is that everything that calls an IO operation then needs to be aware of this particular error case (and catch it, and propagate it up the call stack in whatever way is appropriate). A good start to making it a bug in the runtime would be for the runtime to define a specific error for 'IO attempted on closed connection' and for absolutely everything to use it.

(As it stands, the net package doesn't even define a publicly visible error instance for this case, although it does define one internally. It's my personal view that this beautifully illustrates why this is a general language problem; while you can 'solve' it in code, it requires absolutely everyone to get it right and, well, they clearly don't.)

Again this is a software engineering tradeoff. Both the semantics and the runtime implementation of goroutines are undoubtedly vastly simplified because you don't have to worry about being able to signal or cancel a goroutine from outside itself. Outside of the program exiting, all of the interactions that a goroutine has with the outside world are initiated by itself, on its own terms. This makes it much easier to reason about the effects of a goroutine, especially if it's careful not to use global state.

programming/GoLanguageIrritations written at 23:39:13

The Unix philosophy is not an end in itself

Today I feel like opening a can of worms that I've alluded to before.

Here is something very important about the Unix philosophy (regardless of what exactly that is): the Unix philosophy was not conceived as an empty philosophy that was an end in itself. Instead it is above all a theory about how to make computers easy, powerful, and useful. This philosophy (or at least the things built by people following it at Bell Labs and elsewhere) has been extraordinarily successful, and I'm not just talking about Unix; concepts first pioneered in Unix and C now form core pieces of pretty much every computer system in the world.

But it's possible to take this too far. To put it one way, it's my strong view that the core goal of Unix is to be useful, not to be philosophically pure. The underlying purpose comes first and fitting how to be useful into 'the Unix way of doing things' comes second. If Unix has to be non-Unixy for a while (or even permanently) in order to be useful, then, well, I pick usefulness. Excessive minimalism and 'Unixness' for the sake of minimalism and Unixness is a kind of masochism.

(Of course the devil is in the details, as it always is. It's certainly possible to ruin Unix without getting anything worth it in exchange.)

What this biases me towards is an environment where one solves the problem first and then tries to make it fit into the traditional 'Unix way' second. Which is why part of me thinks that GNU sort's -h option is perfectly fine because it solves a real problem (and solves it now).

(The counterargument is that Unix cannot be all things to all people. As with all systems, at some point you have to draw a line and say 'this doesn't fit, you need to go elsewhere'. I don't know how to balance this. I do know that a certain amount of griping about 'the one true Unix way' and how (some) modern Unixes are ruining it reminds me an awful lot of the griping of Lisp adherents at the rise of Unix, and for that matter the griping of Unix people (myself sometimes included) at the rise of Windows and Macs.)

unix/UnixPhilosophyPurpose written at 00:29:34

2013-05-11

The consequences of importing a module twice

Back when I wrote about Python's relative import problem, I mentioned that only actually importing a module once can be important due to Python's semantics. Today I feel like discussing what these are and how much they can matter.

The straightforward thing that goes wrong if you manage to import a module twice (under two different names) is that any code in the module gets run twice, not once. Modules that run active code on import assume that this code is only going to be run once; running it again may result in various sorts of malfunctions.

At one level, modules that run code on import are relatively rare because people understand it's bad form for a simple import to have big side effects. At another level, various frameworks like Django effectively run code on module import in order to handle things like setting up models and view forms and so on; it's just that this code isn't directly visible in your module because it's hiding in framework metaclasses. But this issue is a signpost to the really big thing: function and class definitions are executable statements that are run at import time. The net effect is that when you import a module a second time the new import has a completely distinct set of functions, classes, exceptions, sentinel objects, and so on. They look identical to the versions from the first import but as far as Python is concerned they are completely distinct; fred.MyCls is not the same thing as mymod.fred.MyCls.

(This is the same effect that you get when you use reload() on a module.)
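
Here is a minimal Python 3 sketch of the effect. It builds a throwaway package in a temporary directory and then arranges for the same file to be importable under two names; the names mymod, fred and MyCls follow the text above, while the temporary directory layout and the sys.path manipulation are assumptions made purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: mymod/__init__.py and mymod/fred.py.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "mymod")
os.mkdir(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg, "fred.py"), "w") as f:
    f.write("class MyCls(object):\n    pass\n")

# Having both the parent directory and the package directory on sys.path
# is what lets one file be imported as both 'fred' and 'mymod.fred'.
sys.path[:0] = [root, pkg]
importlib.invalidate_caches()

import fred
import mymod.fred

same_file = os.path.samefile(fred.__file__, mymod.fred.__file__)
distinct_modules = fred is not mymod.fred
distinct_classes = fred.MyCls is not mymod.fred.MyCls
# An instance made through one name is not an instance of the other class.
cross_isinstance = isinstance(fred.MyCls(), mymod.fred.MyCls)

print(same_file, distinct_modules, distinct_classes, cross_isinstance)
```

Both module objects come from the same source file, yet the two MyCls classes are unrelated as far as isinstance() (and, by extension, except clauses) are concerned.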

However, my guess is that this generally won't matter. Most Python code uses duck typing and the two distinct classes are identical as far as that goes. Use of things like specific exceptions, sentinel values, and imported classes is probably going to be confined to the modules that directly imported the dual-imported module and thus mostly hidden from the outside world (for example, it's usually considered bad manners to leak exceptions from a module that you imported into the outside world). In many cases even the objects from the imported module are going to be significantly confined to the importing module.

(One potentially bad thing is that if the module has an internal cache of some sort, you will get two copies of the cache and thus perhaps twice the memory use.)

python/DualImportProblems written at 22:16:08

This is part of CSpace, and is written by ChrisSiebenmann.
Twitter: @thatcks
