PORTNANNY CONFIGURATION

 Portnanny is configured through three files: the configuration file,
the rules file, and the actions file. The configuration file is read
once on startup to define things such as what the names of the rules
and actions files are; the rules and actions files are reloaded every
time they change.

 All files allow comment lines, blank lines, and continued lines.  A
comment line is any line that has '#' as the first non-whitespace
character; eg
	# this is a comment
is a comment line. Comment lines are always ignored completely.
Lines in all three files may be continued simply by indenting the
second and subsequent lines with whitespace; eg, ignoring the
leading tab:
	this will
		be
	 one logical line.
turns into 'this will be one logical line.'. It is illegal to start
the first non-comment line in a file with whitespace, as it would be a
continuation to a nonexistent first line. Comments (with any
indentation) may occur in the middle of continued lines and are duly
ignored.
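
 As an illustration only, the comment and continuation rules above can
be sketched in Python. This is not portnanny's actual code: the name
'logical_lines' is made up for the example, and joining the pieces with
a single space is an assumption based on the sample shown above.

```python
def logical_lines(text):
    """Join physical lines into logical lines (illustrative sketch)."""
    result = []
    for raw in text.splitlines():
        stripped = raw.strip()
        # Blank lines and comment lines are skipped wherever they occur,
        # including in the middle of a continued line.
        if not stripped or stripped.startswith("#"):
            continue
        if raw[0].isspace():
            # An indented line continues the previous logical line.
            if not result:
                raise ValueError("continuation with no line to continue")
            result[-1] += " " + stripped
        else:
            result.append(stripped)
    return result


sample = "first\n\tsecond\n# a comment\n   third\nnext line\n"
print(logical_lines(sample))
```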

THE PROCESS

 Portnanny operates by sorting each new connection into one or more
classes, and then checking any limits and doing any actions specified
for the classes that the connection is a member of. A new connection
can be a member of one class, many classes, or of none; if it is a
member of more than one class, it must satisfy the limits of all of
the classes it is a member of.

 Portnanny only performs one action for new connections. If a new
connection is a member of more than one class that wants to do an
action (usually starting a program and handing the connection to it to
deal with), the class whose action is carried out is the one that was
matched earliest.

 It is possible for a new connection to be a member of no classes, or
to be a member of no classes that specify any action to do. If this
happens, portnanny closes the connection. In general, portnanny must
be actively told to do something; if it is told nothing, it does
nothing, and if it does nothing it drops the connection.

 Portnanny's two major files each control one half of this process.

 The rules file specifies how new connections are sorted (aka
classified) into classes. Each line is a rule; if a new connection
satisfies the rule, it is a member of the rule's class. A single class
can have more than one rule in the rules file.

 The actions file specifies what limits apply to members of classes
and what should be done for new connections that fall into each
particular class. Because it makes no sense to specify several
versions of what to do or what limits to apply for the same class
(which set should portnanny pick?), classes can only be specified in
the actions file once.

 Not all classes that are mentioned in the rules file have to be in
the actions file. A class that isn't in the actions file imposes no
limits and causes no actions to happen for new connections that are
members.

 Portnanny maintains a count of how many active connections are
members of each class as well as how many active connections there are
from a given IP (regardless of what classes they are members of;
connections from the same IP address are not necessarily all members
of the same class). These counters are used to enforce connection
limits and are maintained dynamically as new connections are made and
old connections die off.


THE CONFIGURATION FILE

 Portnanny's configuration file is named on the command line. It
contains a series of directives, each of which has exactly one
argument.

 The directives, and their arguments (given in uppercase) are:

	rulefile FILE		Gives the filename of the rules file
	actionfile FILE		Gives the filename of the actions file

	listen PORT[@IP]	Portnanny will listen for new connections
				on the given port and IP address. If
				no IP address is given, portnanny will
				listen for new connections on the given
				port on all IP addresses of the machine.

				Portnanny accepts 'PORT@' and 'PORT@*'
				as synonyms for a bare 'PORT'. PORT must
				be a number.

				You may specify multiple listen
				directives. 

	user USERNAME		Portnanny will change to the user ID
				and groups of this user right after
				it has set up the TCP/IP ports to
				listen on. Failure to successfully
				do so immediately aborts the program.
				The process after 'user USERNAME' has
				the same UID and set of groups that the
				user would normally get through logging
				in or using 'su'.

	dropipafter TIMESPEC	If an IP address has not made a new
				connection to us for TIMESPEC, discard any
				information we have been maintaining about
				the times of its previous IP connections.
				The default is that time information
				about such connections will never be
				expired if portnanny is maintaining
				it. (See later for a discussion of when
				portnanny keeps such information.)
	expireevery TIMESPEC	Check for things to expire no more often
				than once every TIMESPEC. The default
				is 60 seconds (if there is any point,
				ie if dropipafter is specified). A value
				of '0' means 'at every connection'.

				TIMESPEC is a number with 's', 'm',
				'h', or 'd' after it, to indicate that
				many seconds, minutes, hours, or days.

	onfileerror WHAT	What to do if the actions file or the
				rules file contain an error when they
				are reloaded. May be 'use-old' or 'drop';
				'use-old' is the default.

	substitutions [on|off]	If 'on' (the default), string
				substitutions in the actions file are
				enabled. If 'off', they are disabled.
				See the discussion of the actions file
				for details. 'off' is for cautious people
				who want safety.

	maxthreads NUMBER	Portnanny can use up to NUMBER threads
				to evaluate rules for new connections in
				parallel, instead of having to evaluate
				each new connection one after the other
				in a main process. The default is zero
				(threading is disabled).
	aftermaxthreads CLASS	When portnanny is over the thread maximum,
				new connections will be immediately made
				members of CLASS without evaluating any
				rules. If unset, portnanny evaluates
				rules for the new connection in the
				main process, as if threading was disabled.

 When portnanny reloads files, it does so as a unit. If a new version of
a file contains an error, no information from the new version will be
used -- even if the error was at the very end of the file.

 With 'onfileerror use-old', portnanny does not discard the old version
of a file until it has successfully loaded the new version. This means
that portnanny will continue dealing with new connections under the old
rules until you fix the error.

 With 'onfileerror drop', portnanny immediately discards the old version
of a file when it tries to reload a new version of it. If the new file
fails to reload, portnanny acts as if the file was empty and all further
connections will be dropped until the file is corrected and loads
successfully.

 Note that the rules and the actions files are loaded separately. It is
possible to update both and have one load and the other contain errors;
in that case, under 'use-old' portnanny will try to use the new version
of one with the old version of the other, which may not work too well.

 'use-old' is more resilient in the face of a stream of minor updates; a
stupid typo will not kill the service entirely. 'drop' is more paranoid;
you are guaranteed that either your update took or portnanny is not
serving anything to anyone.

 The configuration file must name the rules file and the actions file,
and have at least one 'listen' directive. Everything else is optional.

 You cannot have both 'listen PORT@*' and 'listen PORT@IP' for the
same port; the operating system won't accept it. (The reasons are
complex but good.) If you do this in a portnanny configuration you
will get an error at startup time.

 If portnanny is not listening at a specific PORT@IP combination,
people connecting to there will get 'connection refused' errors.
This both consumes fewer resources and is somewhat more friendly
to users than having portnanny listen to 'PORT@' and immediately
drop connections that are made to 'PORT@undesired-IP'; it also
immunizes you against rules mistakes allowing undesired access.
'listen PORT@127.0.0.1' is often used to create a TCP service
that can only be connected to from the machine itself.

 Threading is used to make portnanny respond faster to multiple new
connections that are made closely together. It is discussed in more
detail later. By default, portnanny does not use threading; this is
appropriate for many relatively simple configurations or low-load
services.


THE RULES FILE

 Lines in the rules file have the general form:

	CLASSNAME[/NOTES]: EXPRESSION

 There are four valid notes: /always, /nt, /nonterminal (equivalent to
/nt), and /label[=LABEL]. If /label is given with no LABEL, the LABEL
is taken to be EXPRESSION.

 Portnanny decides what classes each new connection is in by evaluating
each line's expression; if it is true, the new connection is a member
of the class CLASSNAME and the rule is said to have matched.

 Portnanny usually stops evaluating rules once one has matched. There
are two exceptions: rules marked /nt do not stop the evaluation
process, and rules marked /always are always evaluated. The exception
to the exception is that once a connection is a member of a class, no
further rules for that class will be evaluated.

 Thus the result of rules evaluation is a list of classes that the
connection is a member of, in the order that they matched; this list
can be empty. If the connection is a member of at least one class,
portnanny also makes it a member of the pseudo-class 'GLOBAL', putting
this class at the end of the list.
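
 The evaluation order described above can be sketched in Python. This
is an illustration under assumptions, not portnanny's implementation:
rules are modelled as hypothetical (classname, notes, predicate) tuples
in file order, where 'notes' is a set such as {'nt'} or {'always'} and
'predicate' stands in for the rule's expression.

```python
def classify(rules, conn):
    """Return the ordered list of classes that conn is a member of."""
    matched = []
    stopped = False
    for classname, notes, predicate in rules:
        # Once a class has matched, no further rules for it are evaluated.
        if classname in matched:
            continue
        # After a terminal match, only /always rules are still evaluated.
        if stopped and "always" not in notes:
            continue
        if predicate(conn):
            matched.append(classname)
            if "nt" not in notes:
                stopped = True
    # Any match at all also makes the connection a member of GLOBAL.
    if matched:
        matched.append("GLOBAL")
    return matched


rules = [
    ("a", {"nt"}, lambda conn: True),
    ("b", set(), lambda conn: True),
    ("c", set(), lambda conn: True),
    ("d", {"always"}, lambda conn: True),
]
print(classify(rules, None))
```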

 Portnanny maintains some information about which specific rule
matched to make the connection a member of a given class: the line
number of the rule and the label associated with the rule (if it has
one). How these pieces of information can be used is covered later.

 Before we get into the specifics of rule expressions, we should say
that in general portnanny expressions attempt to look a lot like
tcpwrappers expressions. For normal basic cases involving IP addresses
and hostnames (and not using the additional expression power of
portnanny) you should be able to write expressions which are pretty
much the same. Possibly completely identical.

 Rules file expressions are made up of operands joined together with
operators. In order of precedence (high to low), the operators are:
	( .. ), !/NOT, or-list, AND/&&, EXCEPT
Operators are parsed left to right and EXCEPT nests to the right (as in
tcpwrappers), so 'a EXCEPT b EXCEPT c' is '(a) EXCEPT ((b) EXCEPT (c))'.
An 'or-list' is simply a
list of operands, which is true if any of them are; thus 'a b c d
EXCEPT e f g' is '(a b c d) EXCEPT (e f g)', and is true if any of a,
b, c, or d is true and none of e, f, or g is.

 Portnanny only evaluates as much of the expression (and of bits of
the expression) as it needs to do in order to determine whether it is
true or false. This is usually unimportant but becomes relevant in
some cases.

 Rules are normally broken into words (the operators and operands) by
spaces, except that !, (, ), and && are special and are recognized
inside normal words; this allows '(a&&b)' to be what you expect. Things
may be quoted with single quotes; to put a single quote inside a quoted
object, use '' (two quotes). Quotes do not break words, so: a'&&'b,
a'foobar'b, a' 'b, and 'a b c d e f' are each one word. To use an
operator as a normal word, you must quote it.

 Operands evaluate to either true or false based on some
characteristic of the new connection, and come in three forms:
	MATCHER: ARGUMENT
	MATCHER
or	ARGUMENT
If the operand is a bare argument, portnanny tries to see if the
matchers for IP addresses or hostnames think it is a valid IP address
match or hostname match. (It is an error if neither of them like it.)

 Some matchers take no argument; they are 'ALL', 'UNKNOWN',
'PARANOID', 'KNOWN', and 'IDENTD' (by convention, inherited from
tcpwrappers, no-argument matchers are in upper case).

 ALL is always true.
 IDENTD is true if identd (aka auth) data is available for this
connection.

 UNKNOWN, KNOWN, and PARANOID are the same as in tcpwrappers. KNOWN is
true if there is a validated hostname for the remote IP address;
UNKNOWN is true if there is no hostname for the IP address; PARANOID
is true if either the claimed hostname doesn't exist or the claimed
hostname lists IP addresses that do not include the remote IP
address. Note that 'NOT KNOWN' is '(UNKNOWN PARANOID)'.

 The remaining matchers all take an argument and match something
against it.

	identd:		- matches the username returned from an identd
			  query. (Implies IDENTD.)

	class:		- matches against the classnames of already
			  matched rules. Clearly, either that rule was
			  /nt or this one is /always.

	forwhn:		- matches if the remote IP address is one of
			  the IP addresses returned when we look up
			  the argument as a hostname.

	answerson: PORT	- a TCP connection can be established to
			  port PORT on the remote machine (PORT must
			  be a number).

	ip: ADDRSPEC

 This matches the remote IP address. ADDRSPEC can be any of the
following:
- an IP address.
- a partial IP address that ends in a '.', such as '128.100.'; this
  matches any IP address that starts with that, in the style of
  tcpwrappers.
- a CIDR netblock, such as '127.0.0.0/24'.
- an explicit range, specified with a dash:
  '127.100.0.0-127.100.1.53'.

 A CIDR netblock must be 'proper', which is to say that the listed IP
address must be the start of the appropriately-sized CIDR netblock
containing it. '127.0.0.1/24' is not a proper CIDR netblock; a /24
covering 127.0.0.1 would start at 127.0.0.0. Portnanny rejects
improper CIDR netblocks because there are at least two possible
interpretations for what they mean.
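
 The four ADDRSPEC forms and the 'proper netblock' check can be
sketched with Python's standard ipaddress module. This is only a sketch
assuming IPv4; 'ip_matches' is a hypothetical name and is not
portnanny's own matcher. The strict=True argument is what makes an
improper netblock such as '127.0.0.1/24' raise an error.

```python
import ipaddress


def ip_matches(addrspec, ip):
    """Return True if the IPv4 address string ip matches addrspec."""
    addr = ipaddress.IPv4Address(ip)
    if addrspec.endswith("."):
        # Partial address, tcpwrappers style: a plain prefix match.
        return ip.startswith(addrspec)
    if "/" in addrspec:
        # strict=True raises ValueError if host bits are set, that is,
        # if the netblock is not 'proper'.
        return addr in ipaddress.IPv4Network(addrspec, strict=True)
    if "-" in addrspec:
        # Explicit inclusive range, low-high.
        low, high = addrspec.split("-", 1)
        return ipaddress.IPv4Address(low) <= addr <= ipaddress.IPv4Address(high)
    # Otherwise a single full IP address.
    return addr == ipaddress.IPv4Address(addrspec)
```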

	localip: ADDRSPEC

 This is like ip: except for the local IP address instead of the
remote IP address.

	hnstatus: STATUS

 This is true if the hostname lookup status is the given
status. Possible statuses are
- 'good': equivalent to KNOWN
- 'unknown': equivalent to UNKNOWN
- 'noforward': the IP address claims a hostname, but the hostname
  doesn't exist.
- 'addrmismatch': the IP address claims a hostname, which exists, but
  the hostname does not list the IP address as one of its IP
  addresses.

 '(hnstatus: noforward hnstatus: addrmismatch)' is the same as
PARANOID.

	hostname: HOSTARG

 This matches the hostname of the remote IP address. If KNOWN is not
true, this automatically fails. Otherwise it may be a full hostname or
it may be a domain postfix starting with a '.', tcpwrappers style. An
argument starting with '.' matches if the hostname ends with that (so
'.foo.com' matches 'a.foo.com') or if the hostname equals the argument
minus the dot (so '.foo.com' also matches 'foo.com').

 All hostname comparisons are case-independent.
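
 A minimal Python sketch of the hostname: comparison described above
follows. The name 'hostname_matches' is hypothetical, and the sketch
assumes the hostname has already been validated (KNOWN is true).

```python
def hostname_matches(hostarg, hostname):
    """Compare a validated hostname against a hostname: argument."""
    # All comparisons are case-independent.
    hostarg = hostarg.lower()
    hostname = hostname.lower()
    if hostarg.startswith("."):
        # Domain form: matches anything under the domain, or the
        # domain itself (the argument minus its leading dot).
        return hostname.endswith(hostarg) or hostname == hostarg[1:]
    return hostname == hostarg
```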

 If the operand specifies no matcher, ip: and hostname: are tried in
that order. This allows one to write tcpwrappers-like expressions
like '128.100. 127. .utcc.utoronto.ca'.

	re: REGEXP

 This matches the regexp against the hostname of the remote IP
address; like hostname:, this always fails if KNOWN is not true.
The match is case-independent and unanchored (if you want to match the
beginning or end of the hostname, you must use '^' and '$'). The
regexp is a Python regexp, essentially equivalent to Perl regexps.

	claimedhn:
	claimedre:

 THESE ARE DANGEROUS.
 These are just like hostname: and re:, but they apply to the
'claimed' hostname. The claimed hostname is the unverified result of
the IP to name lookup of the remote IP address, and thus these differ
from the standard hostname:/re: lookups only if PARANOID is true.

 Since the claimed hostname may be totally under the control of the
person connecting to you, it should not be trusted to authorize
connections. However, the author finds it convenient to have it
available to *reject* some connections, as some nameless large ISPs
have useful IP to name information but have not bothered setting up
the corresponding name to IP mappings.

 Plus, you can use it to blow off anyone who thought it was a clever
idea to have their IP to name mapping resolve to an IP address.

	local:	[PORT][@][IP]

 This is true if the local IP and port match those specified. Both IP
and PORT must be numeric. Either (but not both) can be missing or '*',
in which case they match all values; thus 'local: 119@' matches all
connections to local port 119.

 The 'local:' matcher is how you tell apart which local port the
connection is for, if portnanny is listening on multiple local ports.

	dnsbl: DNSBLDOM[/IPADDR]

 This matches if the remote IP address is found at the DNS blocklist
rooted at DNSBLDOM, such as 'sbl.spamhaus.org', using the standard
query form of name to IP address lookup on the reversed IP address. If
the optional IP address is specified, this only matches if the DNSBL
returns the IP address among the results of the query.

 Some DNSBLs use returned IP addresses to distinguish sub-categories
of their DNSBL. /IPADDR allows you to make use of this information,
but is otherwise not needed.
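
 For readers unfamiliar with the standard query form, here is a Python
sketch of it, assuming IPv4. The function names are made up for
illustration and the lookup uses the standard library resolver, which
may not be how portnanny itself performs the query.

```python
import socket


def dnsbl_query_name(ip, dnsbldom):
    """Build the name to look up: reversed IP octets under the DNSBL."""
    return ".".join(reversed(ip.split("."))) + "." + dnsbldom


def dnsbl_listed(ip, dnsbldom, want=None):
    """True if ip is listed; if want is given, it must be in the answer."""
    try:
        _, _, addrs = socket.gethostbyname_ex(dnsbl_query_name(ip, dnsbldom))
    except OSError:
        # No such name (or a lookup failure): treated as not listed.
        return False
    return want is None or want in addrs
```

 With this sketch, 'dnsbl: sbl.spamhaus.org' for a connection from
192.0.2.99 corresponds to looking up '99.2.0.192.sbl.spamhaus.org'.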

	firsttime
	seenwithin: TIMESPEC
	notseenfor: TIMESPEC
	stallfor: TIMESPEC
	waited: TIMESPEC

 TIMESPEC is a number suffixed by 's', 'm', 'h', or 'd', meaning that many
seconds, minutes, hours, or days ago.
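
 As a sketch, a TIMESPEC can be converted to seconds as follows. The
name 'parse_timespec' is hypothetical, and the sketch assumes whole
numbers and a mandatory unit suffix; it does not handle the bare '0'
that the 'expireevery' configuration directive accepts.

```python
import re

# Seconds per unit suffix.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_timespec(spec):
    """Convert a TIMESPEC such as '90s' or '2h' to a number of seconds."""
    m = re.fullmatch(r"(\d+)([smhd])", spec)
    if m is None:
        raise ValueError("bad TIMESPEC: %r" % spec)
    return int(m.group(1)) * _UNITS[m.group(2)]
```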

 firsttime is true if this IP address is connecting to portnanny for
the first time on record.

 seenwithin: and notseenfor: are true if the remote IP address last
connected to us within TIMESPEC or not within TIMESPEC, respectively.

 stallfor: and waited: are true if the remote IP address first
connected to us within TIMESPEC or not within TIMESPEC, respectively.

 Portnanny only records connection time information for connections
that have one of these matchers applied to them. Keeping this
information for all connections would cause portnanny to grow its
memory use based on the number of distinct IP addresses that connected
to it, regardless of whether the information on their connection time
was of any use.

 If the configuration file specifies a 'dropipafter' directive,
portnanny will periodically discard all cached information for IP
addresses that have not connected to us within that amount of time (ie,
IP addresses for whom 'notseenfor: EXPIRETIME' would be true).  If one
of those IP addresses connects again, it will be seen as a first time
connection and so on.

 In addition, portnanny makes no attempt to preserve this information
if it is stopped and restarted.


HOST INFORMATION LOOKUP

 Some information can be expensive to look up. Determining the
hostname can take DNS queries; determining the identd information
requires making a connection to the target machine; and we've just
discussed the overheads of first and last connection time.

 In order to run as fast as possible, portnanny only looks up such
expensive information if it is explicitly asked for. For example, if
all of your rules match only IP addresses, portnanny will never
perform hostname lookups.

 While this is not important for rules (portnanny materializes any
information necessary), it affects the information available for
logging. In the above example, logging would never record a hostname
(even if a valid one was available for some of the IPs).

 Recall that portnanny evaluates only as much of the rules as
necessary in order to determine whether or not they're true. This
means that if a particular lookup is in a rule but not necessary to
determine the rule's result, it will not be performed. For example:
	example:	128.100. identd: bad-idea
will not perform an identd lookup for a connection
from a 128.100.*.* IP address. This means that later logging of
information about connections that are members of 'example' may or may
not have identd data available for them. 

 At the same time, this can be exploited (even within one rule) to
avoid more expensive lookups if they're not needed. For example,
listing all IP matches before hostname matches will avoid potential
lookup delays on hostname lookups. As another example, there is a
potential big speed difference between
	example1:	identd: cks AND 128.100.
and	example2:	128.100. AND identd: cks
for connections that are not from 128.100.*.*. On the other hand, if
you want identd data to be available for all connections if possible,
the example1 version is obviously better.

 As a corollary, recall that after a class matches once, no further
rules for it are evaluated (no matter how they're noted). This is
importantly different from 'no duplicate class matches are added to
the list of matches', because it means that their expressions are not
evaluated at all, which means that they do not cause (or force) any
additional information lookups.

 If you want to force certain information to be available for logging
purposes, include /nt or /always rules for otherwise unused 'dummy'
classes that force it to be looked up.

THREADING

 Evaluating rules to determine what classes a new connection should be
sorted into can require portnanny to pause to look up information about
the remote machine, such as its hostname. These pauses can potentially be
quite long.

 By default, portnanny is single-threaded: it deals with one connection
at a time, no matter how long that takes. This is simple and reliable
and works on all Unix systems, but means that if there are several new
connections at nearly the same time, the unlucky ones may have to wait
quite a while to get dealt with.

 In order to support high-load, high-demand services with multiple,
time-consuming lookups (such as checking several DNS blocklists),
portnanny supports evaluating rules in multiple threads. However, this
may not work on all Unix systems (it is known to work on Linux).

 In order to not overwhelm systems, portnanny limits the maximum
number of threads that can be evaluating rules at any one time via
the 'maxthreads' configuration parameter (and a command-line switch).

 When portnanny has reached this limit its behavior depends on whether
or not 'aftermaxthreads' was set. If it was not set, portnanny
temporarily behaves as if it was single-threaded: the main process
evaluates the new connection, and all further ones have to wait until
that's done.

 When 'aftermaxthreads' is set and portnanny has hit the threads limit,
portnanny immediately sorts the new connection into the named class, as
if it had been matched by a rule in the rule file (thus, it actually
matches the class and 'GLOBAL'). It then continues on normally to look
for actions to do and so on. This can be used for such things as giving
an error message during overload conditions. It can also be used to
immediately drop such connections, either explicitly by having the class
in the action file drop them or implicitly by not having that class
defined in the actions file.

 Threaded rules evaluations have several unobvious consequences; all are
differences from the single-threaded case.

 First, while the version of the rules file that a new connection uses
is frozen at the time it starts rules evaluation, no matter how long
that takes and how many times the rules file changes in the meantime,
the actions file is *not* similarly frozen. This can significantly
lengthen the amount of time where it's possible to have portnanny use
mismatched rules and actions files.

 Second, because portnanny freezes the version of the rules file when
rules evaluation starts, it cannot immediately release that memory when
a new version of the rules file is loaded. Each version of the rules
file can only be freed up when the last thread evaluating a connection
using it exits. If you are unlucky and change the rules file repeatedly
very rapidly, portnanny's memory usage may temporarily increase quite
significantly.

 Because of the above, the effects of 'onfileerror drop' and a bad
rules file may take a bit of time to take full effect. On a very active
service, you may see a number of connections logged as successfully
processed even after the reload error message. However, because it
isn't frozen, an erroneous actions file will take immediate effect.


THE ACTIONS FILE

 The actions file is used to specify the limits that apply to
classes and what action should be performed when a new connection
is a member of the class. Each entry is said to be the 'action rule'
for a particular class.

 Lines in the actions file have the general form

	CLASSNAME: DIRECTIVE [ARGS] [ : DIRECTIVE ARGS ...]

 Because it specifies per-class information, no class can be named
more than once in the actions file, and most directives can only be
given once per class.

 Directives are separated from each other by a colon with whitespace
on both sides; 'a : b' is valid, but 'a: b' does not separate two
directives. This is necessary to allow simple use of embedded :'s in
action rules.

 An action rule must have at least one directive (otherwise it's
pointless and is assumed to be an error).

 The available directives are:

	run CMDSTRING
	msg MESSAGE
	failrun CMDSTRING
	failmsg MESSAGE
	drop
	reject
	quiet
	log [MESSAGE]
	faillog MESSAGE
	norepeatlog
	record MESSAGE
	ipmax NUM
	connmax NUM
	setenv VARNAME STRING
	subst NEWNAME STRING
	see CLASS

 'ipmax' and 'connmax' specify maximum connection limits. The first
limits the total number of connections from this IP; the second limits
the total number of connections that are members of this class (or
were at the time they were made and classified, as the classification
rules may have changed since then). If unspecified on a class's action
rule, no limit is imposed by this class.

 If a class imposes a connection limit and a new connection would go
over it, the class is said to refuse the connection. A connection
limit of 0 (or below) causes all new connections to be refused.

 If no class that a new connection is a member of refuses it, the
connection is said to have been accepted.

 'reject' causes this class to refuse all connections. The same thing
can be accomplished with 'ipmax 0' or 'connmax 0', but portnanny uses
different default log messages when 'reject' is specified, and you can
specify 'reject' as well as 'ipmax' and 'connmax' (this is handy for
temporarily disabling connections for classes).
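
 The refusal conditions can be sketched in Python as below. This is an
illustration, not portnanny's code: an action rule is modelled as a
hypothetical dict of directives, the counts are assumed to be the
already-active connections (not including the new one), and the order
in which the three conditions are checked is an assumption.

```python
def refusal_reason(rule, ip_count, class_count):
    """Return 'REJECT', 'IPMAX', 'CONNMAX', or None if not refused."""
    if "reject" in rule:
        return "REJECT"
    # The new connection would take the per-IP count over the limit.
    if "ipmax" in rule and ip_count + 1 > rule["ipmax"]:
        return "IPMAX"
    # The new connection would take the per-class count over the limit.
    if "connmax" in rule and class_count + 1 > rule["connmax"]:
        return "CONNMAX"
    # No limit specified means no limit imposed by this class.
    return None
```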

 'failrun' or 'failmsg' describe what this class wants to happen to new
connections if this class refuses the connection. 'failrun' causes a
program to be started and the connection passed to it; 'failmsg' causes
the message to be sent to the new connection and then the connection
closed. If neither is specified when the class refuses a connection,
the connection will just be closed silently (this is the most efficient
thing for portnanny to do; 'failmsg' takes somewhat more work and causes
somewhat more load on the server).

 Similarly, 'run' or 'msg' describe what this class wants to happen
to accepted connections: run a program or send out a message.

 No class can have both 'run' and 'msg' directives specified, or
'failrun' and 'failmsg', because portnanny has no way of deciding
which action would win.

 'drop' means that this class wants accepted connections to be quietly
closed. 'drop' can be specified alongside 'run' or 'msg' and takes
precedence over them.

 If the class specifies none of 'run', 'msg', or 'drop' it has no
particular action it wants taken on new connections.

 'setenv' creates new environment variables and values that will be
passed to any program started via 'run' or 'failrun'. A class can
have multiple 'setenv' directives, but you can't setenv the same
environment variable more than once.

 'subst' creates additional string substitutions that will be available
in any string substitutions done for this class, *excluding* 'record'
directives. A class can have multiple 'subst' directives, but (like
setenv) you can't subst the same name more than once. Built-in string
substitutions override 'subst' definitions of the same names, causing
the 'subst' provided values to be silently ignored.

 'see' names another class to look at for any information that this
class does not supply itself. The effect is as if this class included
the text of the 'see''d class's directives (and in turn, any classes
that that class 'see''s). If several classes in a chain of 'see's
have the same directive, the first one that occurs is the one used;
in 'A: see B : ipmax 10' and 'B: ipmax 5', the 'ipmax' value used
for connections that are members of A is 10, not 5.

 Obviously there cannot be loops of 'see' references ('A: see B' and 'B:
see A'). Portnanny also insists that all classes named as the target
of 'see' directives exist. However, they can be in any order in the
actions file; there is no requirement to define classes that will be
'see' targets earlier in the file than the classes that 'see' to them.
(Indeed, it may make sense to define them later.)

 'see' combines 'subst' and 'setenv' directives in multiple classes
by using all of them, merged together. If multiple classes try to
set a value for the same environment variable or the same new string
substitution, only the first one is used: 'A: see B : setenv a b'
and 'B: setenv a c' will result in the environment variable 'a'
having the value 'b'. 
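
 The 'first one wins' behaviour of 'see' can be sketched in Python.
The data layout here is hypothetical (each class is a dict of its
directives, with 'setenv' as a nested dict), and the sketch assumes
the actions file has already been checked for 'see' loops and missing
targets.

```python
def resolve(actions, classname, directive):
    """Return the first value of directive found along the 'see' chain."""
    while classname is not None:
        rule = actions[classname]
        if directive in rule:
            return rule[directive]
        classname = rule.get("see")
    return None


def merged_setenv(actions, classname):
    """Merge setenv values along the 'see' chain; first definition wins."""
    env = {}
    while classname is not None:
        rule = actions[classname]
        for name, value in rule.get("setenv", {}).items():
            env.setdefault(name, value)
        classname = rule.get("see")
    return env


actions = {
    "A": {"see": "B", "ipmax": 10, "setenv": {"a": "b"}},
    "B": {"ipmax": 5, "connmax": 20, "setenv": {"a": "c", "x": "y"}},
}
print(resolve(actions, "A", "ipmax"), merged_setenv(actions, "A"))
```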

 'faillog' sets the message to be logged if this class is the one that
refused a new connection. If 'faillog' is not set and this class
refuses a new connection, portnanny will look for a default message to
log in a number of places (described later).

 'quiet' suppresses looking for default 'faillog' messages. 'quiet'
does not suppress following 'see' directives.

 'log' or 'log MESSAGE' sets the message to be logged if this class
specifies the action to be taken for a new connection. If no message
is specified, portnanny uses a default message. In the absence of
'log', portnanny logs no message in this situation; this is unlike
'faillog', because success is considered to be an ordinary situation
not worth cluttering up logs with. (The author runs reasonably active
systems and so does not want to get a message per connection in his
logs.)

 'record' sets a message to be logged any time a new connection is a
member of this class.

 'norepeatlog' means that this class doesn't want to bother logging a
log message if it's the same as the last log message that portnanny
logged. A 'log message' here is one of 'log' or 'faillog'; which one
doesn't matter, only that the message is the same. In particular,
messages from 'record' are completely unaffected by 'norepeatlog'. The
net effect is that portnanny will log only the first of a stream of
'log'/'faillog' messages from a stream of connections, discarding all
of the rest; this is especially handy with 'faillog' if you are dealing
with some people who retry their connections very rapidly.

 When portnanny starts commands from 'run' (or 'failrun') it does so
by taking the argument string, splitting it apart on whitespace, and
directly executing the result; the program will get each word as an
argument. If you need to run a program that requires more sophisticated
arguments, or something such as output redirection, wrap it in a shell
script and have portnanny run the shell script. For security reasons,
this split is done before string substitution.

 Programs started by portnanny are started in the fashion of inetd,
xinetd, and so on: the new connection is their standard input, standard
output, and standard error. Portnanny tries to make sure that they're
not passed other stray file descriptors; in particular, it guarantees
that the network sockets that portnanny itself is listening on are not
passed on to children.

 If a class refuses a connection and does not itself have 'faillog' or
'failmsg' settings, portnanny looks for defaults in several places and
uses the first one found. The first place is the class
'DEFAULT-<rejtype>', if it exists, where <rejtype> is an upper-case
version of why the class refused the connection: 'IPMAX', 'CONNMAX', or
'REJECT' (the class specified 'reject'). The second is the class
'DEFAULTMSGS', again if it exists.

 If it still doesn't have a 'faillog' message, portnanny falls back to
internal defaults (using one for 'reject' and another for connection
limits). There is no internal default for 'failmsg', and a default
'failmsg' is not looked for if the class has specified a 'failrun'.
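
 The lookup order can be sketched as follows. This is a hedged reading
of the description, not portnanny's implementation: it assumes that
'faillog' and 'failmsg' are each looked up independently, it models
the actions file as a plain dictionary, and the internal default texts
are placeholders. find_fail_settings() is a hypothetical name.

```python
# Placeholder texts only; the real internal defaults are not shown here.
INTERNAL_FAILLOG = {
    "REJECT": "<internal reject message>",
    "LIMIT": "<internal connection-limit message>",
}

def find_fail_settings(actions, cls, rejtype):
    """actions: class name -> dict of directives.
    rejtype: 'IPMAX', 'CONNMAX', or 'REJECT'.
    Returns (faillog, failmsg); failmsg may be None."""
    own = actions.get(cls, {})
    sources = [own,
               actions.get("DEFAULT-" + rejtype, {}),
               actions.get("DEFAULTMSGS", {})]

    def first(key):
        for settings in sources:
            if key in settings:
                return settings[key]
        return None

    faillog = first("faillog")
    if faillog is None:
        kind = "REJECT" if rejtype == "REJECT" else "LIMIT"
        faillog = INTERNAL_FAILLOG[kind]
    # No default failmsg is looked for if the class has its own failrun.
    failmsg = None if "failrun" in own else first("failmsg")
    return faillog, failmsg
```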

 These default message source classes are not special in any other way,
although the author does suggest that you avoid using them as normal
classes in your rules files.


DECIDING ON THE ACTION FOR A NEW CONNECTION

 As previously mentioned, the result of evaluating the rules file for a
new connection is a list of classes that the new connection is a member
of (plus 'GLOBAL'), in the order that they were first matched in the
rules file.

 Portnanny takes the list and looks up each class in the actions file.
Some of them may not be specified in the actions file (for example, you
may have no GLOBAL class defined), in which case they are ignored during
further processing.

 For each class that it finds, portnanny checks to see if the class
refuses the connection (whether because of 'reject' or because of
connection limits). If it does, the class that refuses the connection
is said to be the 'action class'.

 If no member class refuses the connection, the connection is successful
and portnanny looks the classes up again to find the first class that
specifies an action to take for successful connections (one of 'drop',
'run', or 'msg'). If there is such a class, it is said to be the 'action
class'.

 If there is an 'action class' it is used to determine what action to
perform and what messages to log about it. Only the 'log' or 'faillog'
values of the *action class* are used; if a class is not the action
class, only a 'record' directive will cause a message to be logged.
Similarly, only the 'failrun' or (possibly defaulted) 'failmsg'
values of the class that refused the connection are used on refused
connections; if it has neither, portnanny takes its default 'nothing to
do' action and closes the connection.

 A connection will be counted against connection maximum limits if it
is performing *either* a 'run' or a 'failrun' action. If you use
'failrun', this implies that the connection maximums are not an upper
bound on the number of processes that portnanny starts; see the
discussion on load limiting, later.

 If there is no action class, because no class refused the connection
but no class specified an affirmative action for new connections,
portnanny falls back to the default behavior of 'well, do nothing' and
closes the connection.
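
 The selection process described in this section can be summarized
with a short Python sketch. It is an approximation under stated
assumptions: the actions file is modelled as a dictionary, the
'refuses' callable stands in for the 'reject' and connection-limit
checks, and the first refusing class in match order is taken to be the
action class. pick_action_class() is a hypothetical name.

```python
SUCCESS_ACTIONS = ("drop", "run", "msg")

def pick_action_class(matched, actions, refuses):
    """matched: class names in first-match order (GLOBAL included).
    actions: class name -> dict of directives from the actions file.
    refuses: callable(name, directives) -> bool.
    Returns (action_class, refused); action_class is None when there
    is nothing to do and the connection is simply closed."""
    # Classes missing from the actions file are ignored from here on.
    known = [name for name in matched if name in actions]
    # First pass: any refusal makes the refusing class the action class.
    for name in known:
        if refuses(name, actions[name]):
            return name, True
    # Second pass: first class with an action for successful connections.
    for name in known:
        if any(action in actions[name] for action in SUCCESS_ACTIONS):
            return name, False
    return None, False
```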

 The GLOBAL class is generally intended to be used to implement global
connection limits. Because it is treated as a normal class that
the connection is a member of, it is possible to use it to create
(implicitly default) actions for successful connections; however, this
is somewhat dangerous -- effectively you are defaulting to portnanny
performing an action instead of doing nothing. (It is not quite
this, because new connections only become a member of GLOBAL if they
are already members of at least one other class through rules file
evaluation.)

LOAD LIMITING AND A CAUTION ON 'FAILRUN'

 Obviously, 'failrun' runs a program. Because it runs a program when the
connection is *refused*, there is very little load limiting possible.
All you can do is ensure there is always an earlier matching class with
a higher load limit and no 'failrun' that will at least cut in when you
are being overwhelmed. If you do not do this, an attacker can cause
you to start a potentially unbounded number of copies of your 'failrun'
program.

 The most efficient thing to do to unwanted connections is to have
portnanny just close them, either with 'drop' or with the implicit
do-nothing action that results when the refusing class supplies
neither a 'failmsg' nor a 'failrun'.

 Using 'msg' or 'failmsg' requires portnanny to fork to ensure that
a slow network to the new connection does not stall other people as
portnanny tries to write out the message. Portnanny takes some steps to
limit the amount of time that message-writing children will stick around
(possibly not enough).

 'run' / 'failrun' are the most resource-consuming, since they require
starting an entire separate program.


STRING SUBSTITUTION & EXPANSION

 The arguments to log, faillog, failmsg, msg, run, record, failrun,
and the values of setenv's environment variables (but not their names)
and subst's values (but not its names) are all run through an internal
process of string substitution and expansion before use.

 This process is designed to allow the inclusion of connection specific
information in these places; log messages, arguments to programs
being started, and so on. Note that for security reasons, the command
string for 'run' and 'failrun' is split into arguments before string
substitution, not after it, so spaces in a substitution will never
create additional command arguments.

 String substitution is based on a printf-like expansion of string
format operators starting with %. The fully general form is
'%(THING)<format spec>', but normally you are expected to use
'%(THING)s' where THING is one of the following:

	ip		- remote IP address
	remport		- remote TCP port number
	localip		- local IP address
	port		- local TCP port number
	hnstatus	- what 'hnstatus:' matches against.
	claimedhn	- the claimed remote hostname
	hostname	- the remote hostname, or the same as ip.
	identd		- the identd information
	seensince	- seconds since the first connection from this
			  IP address.
	lastseen	- seconds since the last connection from this
			  IP address.

	connsum		- a pretty-printed summary of information
			  about this connection, roughly
				%(identd)s@%(hostname)s
			  except that the '%(identd)s@' is only
			  present if identd information is valid.
	connipsum	- as connsum, but instead of the hostname the
			  IP address is always used.

	class		- the name of the action class.
	lineno		- the (starting) line number in the rules file
			  of the rule that matched this class.
	label		- if the rule in the rules file that matched
			  this class specified a /label=LABEL note,
			  this is 'LABEL' with underscores ('_')
			  changed to spaces (' ').

	cr		- a \r
	nl		- a \n
	eol		- \r\n

	limit		- either 'ipmax' or 'connmax' depending on
			  which connection limit rejected the
			  connection.

[Plus any additional string substitutions defined by 'subst' directives
 in the action class.]

 Not all of these substitutions are defined all of the time.

 hnstatus and claimedhn are only present if hostname lookup has been
done. (hostname is always present; if hostname lookup has not been
done or has failed, its value is the IP address.)

 identd is only present if identd lookup has been done.

 seensince and lastseen are only present if time-based lookups have
been done in the process of evaluating rules.

 label is only present if the relevant rule specified a /label= note.

 limit is only present if the connection was rejected by connection
limits.

 Use of any of these when they are not defined will cause an error to
be logged and the connection will be dropped, as will use of any
unrecognized substitution string.
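
 The '%(THING)s' form is the same notation that Python's '%' operator
accepts with a dictionary on the right-hand side, so the behaviour can
be tried out there. The sketch below only illustrates the notation and
the 'undefined name is an error' rule; the expand() helper and the
sample values are invented for the example.

```python
def expand(template, values):
    """Expand %(NAME)s operators; an undefined NAME is an error."""
    try:
        return template % values
    except KeyError as err:
        raise ValueError("undefined substitution: %s" % err.args[0]) from None

# Hypothetical connection with no hostname or identd lookup done,
# so 'hostname' is the same as 'ip' and 'identd' is not defined.
conn = {"ip": "192.0.2.7", "hostname": "192.0.2.7", "port": 25}
line = expand("connection from %(hostname)s to port %(port)s", conn)
# line == 'connection from 192.0.2.7 to port 25'
# expand("identd is %(identd)s", conn) raises ValueError.
```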

 'subst' directives have their value string-substituted only once, when
they are initially processed.

 Note that 'subst'-provided substitutions are only overridden by
built-in substitutions *if the built-in substitution is defined*. Thus:
	A: subst identd UNKNOWN : run whatever :
	   log Got a connection from %(ip)s with identd %(identd)s
will *not* error out if identd information was not available for the
connection; it will instead log it as 'UNKNOWN'. If identd information
is available, the built-in 'identd' is defined and overrides the subst
directive.
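
 One way to picture this override rule is as two dictionaries merged
in a fixed order, as in the Python sketch below. The function name and
the sample values are hypothetical; the only behaviour taken from the
text is that a built-in wins when, and only when, it is defined.

```python
def substitution_table(subst, builtins):
    """Defined built-ins win; 'subst' values fill in the rest."""
    table = dict(subst)
    table.update(builtins)   # 'builtins' holds only the defined ones
    return table

subst = {"identd": "UNKNOWN"}
template = "Got a connection from %(ip)s with identd %(identd)s"

no_identd = template % substitution_table(subst, {"ip": "192.0.2.7"})
with_identd = template % substitution_table(
    subst, {"ip": "192.0.2.7", "identd": "alice"})
# no_identd ends in 'identd UNKNOWN'; with_identd ends in 'identd alice'.
```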

USE OF SUBST AND SEE

 Consider the following actions file definitions for an example of
use of 'subst' and 'see':
	bl-a:	subst blmsg On DNSBL http://foo.bar : see DNSBL
	bl-b:	subst blmsg http://baz.orp lists you as an open proxy
		: see DNSBL
	bl-c:	subst blmsg Please read http://dnsbl/FAQ : see DNSBL
	bl-d:	subst blmsg Please see http://dnsbl2/query?ip=%(ip)s
		: see DNSBL
	DNSBL:	faillog refused %(connsum)s: DNS blocklist class %(class)s :
		failmsg 500 Sorry, %(ip)s. Connection refused. %(blmsg)s
		: reject : norepeatlog

 This condenses the redundant information in class action rules down
to a minimum, and uses 'subst' to define a per-DNS-blocklist block of
explanatory text that will be inserted into the failure message that
appropriate connections will receive. Note that the DNSBL meta-class
has been used to define not just messages, but also the 'reject' and
'norepeatlog' clauses that apply to all DNS blocklist entries.

 In addition, this shows (in bl-d) the new 'subst' text itself being
expanded at the time it is defined, and why you might want to use
something like that.

 Despite the 'faillog' directive being on the DNSBL class, the log
entries will refer to 'DNS blocklist class bl-a' and the like, because
portnanny considers the log message to have been generated by the 'bl-a'
class, not by the DNSBL class. (This is the same thing as happens with
'DEFAULTMSGS' et al, of course.)

 As a bonus wizard feature, the following actually works correctly:
	A: see C : subst extra we are coming from %(hostname)s in A
	B: see C : subst extra we are coming from B
	C: see E : subst info %(extra)s and from C too
	D: see E : subst info because we came from %(ip)s
	E: reject : faillog Failed in E: %(info)s

The process of merging multiple 'subst' values across 'see's defines the
earlier substitutions (such as A's) before the added substitutions in
later classes (such as C) are expanded, so C's 'info' can use the new
'extra' substitution defined in A and B. Of course, if C is ever used
directly there will be explosions because the '%(extra)s' expansion
it is counting on is not present. (Note that in this example 'D' can
be used directly, as its own definition of 'info' doesn't expect
anything.)
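
 The merging order can be modelled with a short Python sketch. This is
a simplified guess at the mechanics, not portnanny's code: it assumes
each class has at most one 'see', that a name already defined (by a
built-in or an earlier class) is kept, and that each 'subst' value is
expanded once against everything defined so far. resolve() and the
data layout are hypothetical.

```python
def resolve(start, classes, builtins):
    """Walk the 'see' chain from 'start', defining substs in order."""
    table = dict(builtins)
    name = start
    while name is not None:
        for key, value in classes[name].get("subst", []):
            if key not in table:            # assumption: first definition wins
                table[key] = value % table  # expanded once, when defined
        name = classes[name].get("see")
    return table

classes = {
    "A": {"see": "C", "subst": [("extra", "we are coming from %(hostname)s in A")]},
    "B": {"see": "C", "subst": [("extra", "we are coming from B")]},
    "C": {"see": "E", "subst": [("info", "%(extra)s and from C too")]},
    "D": {"see": "E", "subst": [("info", "because we came from %(ip)s")]},
    "E": {"see": None},
}
builtins = {"ip": "192.0.2.7", "hostname": "mail.example.org"}
message = "Failed in E: %(info)s" % resolve("B", classes, builtins)
# message == 'Failed in E: we are coming from B and from C too'
```

 Starting from 'C' directly fails in this model with a KeyError for
'extra', which corresponds to the 'explosions' mentioned above.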

USE OF /LABEL AND MULTI-RULE CLASSES

 The best documentation is an example. Imagine that you are interested
in identifying and rejecting connections to some service from various
people's dialup pools. For example:
[rules sample]:
	dialups/label=dipDE:	.dip.t-dialin.net
	dialups/label=AOL:	.ipt.aol.com
	dialups/label=liwest.at:  re: ^cm[0-9]+-[0-9]+\.liwest\.at$
	dialups/label=SORBSDUL:	dnsbl: dul.dnsbl.sorbs.net
[actions sample]:
	dialups:	reject :
			failmsg 500 Dialups not accepted, %(ip)s. :
			faillog rejected %(connsum)s as dialup,
				decided by %(label)s

 You can then use the logs to not only see what is rejected, but see
what sources of dialups are connecting to you.

 While you could use %(lineno)s for this, %(label)s both has more
mnemonic value in logs and summaries and stays much more stable as the
rules file is edited over time.