narf: Usenet spam filtering for CNews

Narf despams incoming Usenet newsfeeds under CNews, rejecting spam before it reaches relaynews and is filed into the news spool (this is the right place to do it). Narf runs as a standalone daemon and requires no changes to the core of CNews. It is designed to let you use any of the various perl spam filters for INN, although it works best with our version of cleanfeed-inn. Narf itself is thus almost entirely policy free: it is a driver for passing articles through the filter of your choice.

Narf works only on uncompressed CNews batches, typically written by NNTP daemons and local postings. It should not be too difficult to make it work on the compressed batches normally created by UUCP feeds, but we don't have any of those left.

The current version of narf is 0.93.01 (about recent changes). Our filter was last updated December 9th 1997.
We consider both to be beta releases. Although they work for us and we think we have made narf reasonably generic, they have never been run elsewhere.

Although we cannot guarantee that narf will not cause your computer to take over the world, we use the software ourselves and would appreciate bug reports or improvements.

Basic requirements

Narf is written in perl 5.004. It runs as a daemon and needs a fair amount of real memory on your news server; on our NetBSD Alpha a generously configured narf usually takes up around 22 megabytes of virtual memory. Installation and operation require some familiarity with CNews and perl, and a minor configuration change to your NNTP daemon software.

Despite the memory (and some CPU) requirements, narf will probably make your news server perform better. This apparent paradox is explained here.

How it works

Narf operates by reading uncompressed text batches in one directory, passing each article through your antispam filter, and writing out accepted articles to new batches in another directory. Because it keeps various context information in memory, it does not exit after a batch run but instead goes to sleep, waiting for more batches to arrive. In normal use, narf's destination batch directory is your normal CNews incoming news directory and its source batch directory is a new directory you have reconfigured your NNTP daemon to write batches to. Narf runs independently of both your NNTP daemon and your periodic CNews processing such as newsrun.

Because this copying is invisible to both the NNTP daemon and to the core of CNews itself, narf requires no changes to either beyond changing where your NNTP daemon writes incoming batches.
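
The batch copying described above can be sketched roughly as follows. This is an illustration in Python, not narf's own perl code. It assumes batches use the usual "#! rnews <byte count>" header before each article; the function names, the accept callback standing in for your filter, and the temporary file naming are all hypothetical. A real daemon would call process_once in a loop and sleep between passes.

```python
import re
from pathlib import Path
from typing import Callable, Iterator

# Each article in an uncompressed batch is preceded by "#! rnews <bytes>".
HEADER = re.compile(rb"#! rnews (\d+)\n")


def split_batch(data: bytes) -> Iterator[bytes]:
    """Yield each article from an uncompressed batch."""
    pos = 0
    while pos < len(data):
        m = HEADER.match(data, pos)
        if m is None:
            raise ValueError(f"bad batch header at offset {pos}")
        start = m.end()
        end = start + int(m.group(1))
        if end > len(data):
            raise ValueError(f"truncated article at offset {start}")
        yield data[start:end]
        pos = end


def filter_batch(data: bytes, accept: Callable[[bytes], bool]) -> bytes:
    """Return a new batch holding only the articles that accept() passes."""
    kept = [a for a in split_batch(data) if accept(a)]
    return b"".join(b"#! rnews %d\n" % len(a) + a for a in kept)


def process_once(src: Path, dst: Path, accept: Callable[[bytes], bool]) -> int:
    """Filter every batch currently in src into dst; return batches handled."""
    count = 0
    for batch in sorted(p for p in src.iterdir() if p.is_file()):
        kept = filter_batch(batch.read_bytes(), accept)
        if kept:
            # Hypothetical choice: write under a temporary name, then rename,
            # so a reader of dst never sees a half-written batch.
            tmp = dst / f".{batch.name}.tmp"
            tmp.write_bytes(kept)
            tmp.replace(dst / batch.name)
        batch.unlink()
        count += 1
    return count


if __name__ == "__main__":
    ham = b"Subject: hello\n\nbody\n"
    spam = b"Subject: MAKE MONEY FAST\n\nbody\n"
    batch = b"".join(b"#! rnews %d\n" % len(a) + a for a in (ham, spam))
    out = filter_batch(batch, lambda a: b"MAKE MONEY FAST" not in a)
    print(len(list(split_batch(out))), "article kept")
```

Because the only coupling is the pair of directories, neither the NNTP daemon nor newsrun needs to know the filter exists.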

Using narf

In order to use narf on your systems you should first read the installation instructions and then the operation instructions. If you use our filter you should read our discussion on our filter's decisions. A copy of narf itself is here and a copy of the filter we use is here, both in plaintext, so you can read them now if desired.

Performance

In short, narf is more than fast enough. It easily outpaces the rest of CNews, processing news batches far faster than the NNTP daemon can generate them or relaynews can file articles into the spool. Narf is usually CPU bound unless there is not enough real memory and it has to start swapping.

On our machine (a 274MHz 21064A EB64+ PCI-based Alpha system with 64 megabytes of memory running NetBSD) narf can process roughly 21,100 articles (136 megabytes total in 298 batches) through our expensive filter in ten minutes. That works out to about 35 articles and 233 kilobytes a second, which is about eight times as fast as our relaynews can process articles. As you can see, narf is unlikely to be a bottleneck for us any time soon.
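
The rates quoted above can be rechecked from the rounded totals with a few lines of Python. The variable names are ours, and the relaynews figure is only what the "eight times" comparison implies, not a separate measurement. With the rounded inputs the throughput comes out at roughly 232 kilobytes a second, slightly under the quoted 233, presumably because that figure was computed from unrounded byte counts.

```python
articles = 21_100   # articles processed in the run
megabytes = 136     # total batch size, as rounded in the text
seconds = 10 * 60   # ten minutes

articles_per_sec = articles / seconds
kilobytes_per_sec = megabytes * 1024 / seconds
# Implied by "about eight times as fast as our relaynews".
relaynews_per_sec = articles_per_sec / 8

print(f"{articles_per_sec:.1f} articles/s")            # 35.2 articles/s
print(f"{kilobytes_per_sec:.1f} KB/s")                 # 232.1 KB/s
print(f"{relaynews_per_sec:.1f} relaynews articles/s") # 4.4 relaynews articles/s
```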

Caution: cancels

Narf is not entirely policy free. As shipped it contains the policy of rejecting as many cancel messages as possible; in particular, it is configured to reject cancels for spam messages it has already rejected where this is safe to do. The determination of safety is designed for our filter and for cleanfeed-inn and may itself not be safe for other filters. This can be adjusted by making a small change to narf's internal configuration variables.

Narf considers it safe to reject the cancel for a rejected article when it can be sure that the rejected article would still be rejected if presented to it again after a cold start. This means that it does not reject cancels for new spam that the filter has learned, because a cold restart might lose this recognition. A further discussion of the issue is in comments in the narf source code.
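
The safety rule above can be illustrated with a small sketch. This is Python for illustration only and is not taken from the narf source; the class, method, and parameter names are hypothetical. It assumes each rejection can be labelled as coming from a fixed rule (which survives a cold start) or from something the filter learned at run time (which might not), and that cancels are recognised by a "Control: cancel <message-id>" header.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


def cancel_target(headers: str) -> Optional[str]:
    """Return the message-ID a cancel control message refers to, if any."""
    for line in headers.splitlines():
        if line == "":
            break  # end of headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "control":
            parts = value.split()
            if len(parts) == 2 and parts[0] == "cancel":
                return parts[1]
    return None


@dataclass
class CancelPolicy:
    # message-ID -> True if the rejection would repeat after a cold start
    rejected: Dict[str, bool] = field(default_factory=dict)

    def note_rejection(self, message_id: str, survives_restart: bool) -> None:
        """Record a rejected article and whether the reason is a fixed rule."""
        self.rejected[message_id] = survives_restart

    def should_reject_cancel(self, target_id: str) -> bool:
        """Drop a cancel only if its target is certain to be rejected again."""
        return self.rejected.get(target_id, False)


if __name__ == "__main__":
    policy = CancelPolicy()
    policy.note_rejection("<fixed@example.com>", survives_restart=True)
    policy.note_rejection("<learned@example.com>", survives_restart=False)
    target = cancel_target("Control: cancel <fixed@example.com>\n\n")
    print(target, policy.should_reject_cancel(target))
```

Cancels for unknown articles and for articles rejected only through learned state are passed through, which is the conservative direction: an unneeded cancel is harmless, while a dropped cancel for an article that later gets accepted is not.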

Further information

Narf is part of our Usenet despamming software, itself a part of our policy of making our anti-spam software and configuration files available. You might also be interested in our daily narf reports.



This page and many of our precautions are maintained by Chris Siebenmann, who hates junk email and other spam.