accept(2)'s problem of trying to return two different sorts of errors
A long time ago, I wrote about the dangers of being overly specific
in the errno values you looked for, with
the specific case being a daemon that exited because an
accept() system call got an
ECONNRESET that it didn't expect. Recently,
John Wiersba left a comment on that entry asking what else the
original programmer should have done, given an unexpected error
from accept(). In thinking about the issues, I realized that part
of the problem is that
accept() is actually returning two different
sorts of errors and the Unix API doesn't provide it any good way
to let people tell the two different sorts apart.
(accept() is standardized to return ECONNABORTED instead of
ECONNRESET in these circumstances, although this may
not be universal.)
The two sorts of errors that
accept() is trying to return are
errors in the
accept() call itself, such as a bad file descriptor (EBADF or
ENOTSOCK) or a bad parameter (
EFAULT), and errors in the new connection that
accept() may or may not be returning (
ECONNABORTED, etc). One of the differences between the two is that
the first sort of errors are probably permanent unless fixed by the
program somehow and generally indicate an internal program error,
while the second sort of errors will go away if you correctly loop
around and run through the accept() sequence again.
A sensibly behaving network daemon should definitely not exit when it gets the second sort of error; it should instead just continue on with its processing loop. However, it's perfectly sensible and probably broadly correct to exit if you get the first sort of error, especially if it's an unknown error and you have no idea how to correct it in your code. If someone has closed a file descriptor on you or it's become a non-socket somehow, continuing will generally just get you an un-ending stream of the same error over and over (and burn CPU, and perhaps flood logs). Exiting is a perfectly sensible way out and often really the only thing you can do.
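As a rough illustration of that policy, here is a minimal Python sketch of an accept loop that sorts errors into the two sorts plus an 'unknown' bucket. The function names and the contents of the two errno sets are my own illustrative assumptions, not a complete or authoritative list; which errno values belong where is exactly the thing that is hard to pin down.

```python
import errno
import socket

# Assumption: illustrative, not exhaustive. Errors in the accept() call
# itself, which usually mean the listening socket or the program is broken.
LISTENER_ERRORS = {errno.EBADF, errno.ENOTSOCK, errno.EFAULT, errno.EINVAL}

# Assumption: illustrative, not exhaustive. Errors that belong to the new
# connection and should go away if we simply call accept() again.
CONNECTION_ERRORS = {
    errno.ECONNABORTED, errno.ECONNRESET, errno.EPROTO,
    errno.ENETDOWN, errno.ENETUNREACH, errno.EHOSTDOWN, errno.EHOSTUNREACH,
}


def classify_accept_error(err: OSError) -> str:
    """Return 'connection', 'listener', or 'unknown' for an accept() error."""
    if err.errno in CONNECTION_ERRORS:
        return "connection"
    if err.errno in LISTENER_ERRORS:
        return "listener"
    return "unknown"


def accept_next(listener: socket.socket, unknown_is_fatal: bool = True):
    """Return the next (conn, addr), skipping per-connection errors.

    Listener errors are always re-raised. Unknown errors are re-raised
    unless unknown_is_fatal is False, in which case we retry (and risk
    spinning forever if the error turns out to be permanent).
    """
    while True:
        try:
            return listener.accept()
        except OSError as err:
            kind = classify_accept_error(err)
            if kind == "connection":
                continue
            if kind == "unknown" and not unknown_is_fatal:
                continue
            raise
```

The unknown_is_fatal switch is where the daemon author has to pick a side: the sketch can be told to exit or to retry on an errno it has never heard of, but it has no way of knowing which is right.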
However, you can't reliably distinguish between these two types of
errors unless you believe you can know all of the possible errno values
for one or the other of them. Given the general habit of Unixes of
adding errno returns for system calls over time, the practical
reality is that you can't. This unfortunately leaves authors of
Unix network daemons sort of up in the air; they have to pick one
way or the other, and either way might give the wrong answer in
some situations.
Ideally, accept() should never have returned the second sort of
errors, leaving them all to be discovered on a subsequent use of
the file descriptor it returned. But that ship sailed a very long
time ago, and accept() returning these sorts of errors is even in
the Single UNIX Specification for accept().
I suspect that
accept() is not the only system call with
this sort of split in types of errors (although I can't think of
any others off the top of my head). But thankfully I don't think
there are too many others, because
accept()'s pattern of operation
is an unusual one.
PS: The Linux
accept() manpage actually has a
warning about Linux's behavior here, in the RETURN VALUE section.
Linux opts to immediately return a lot of errors detected on the
new socket, while other Unixes generally postpone some of them. But
note that any Unix can return ECONNABORTED.
Linux network-scripts being deprecated is a problem for my home PPPoE link
The other day, I ran
ifdown on my home machine for the first time
since I upgraded it to Fedora 29 and got an unpleasant surprise:
WARN : [ifdown] You are using 'ifdown' script provided by 'network-scripts', which are now deprecated.
WARN : [ifdown] 'network-scripts' will be removed from distribution in near future.
WARN : [ifdown] It is advised to switch to 'NetworkManager' instead - it provides 'ifup/ifdown' scripts as well.
As they say, this is my unhappy face.
On both my work and my home machines, most of my network configuration
is done through systemd's networkd. However,
at home I also have a PPPoE DSL
link. Systemd (still) doesn't handle PPPoE and I have no interest
in using NetworkManager on my desktop machines,
which means that currently my PPPoE link setup is still done through
the good old fashioned Fedora network-scripts
system. Since this now seems to be on a deprecation schedule of some
sort (although who knows what 'near future' is here, for Fedora or in
general), I'm going to need to find some sort of a replacement for my
use of it.
In theory this shouldn't be too hard, because after all ifup and
ifdown are just shell scripts, and for a DSL link it appears that
most of what they do is delegate things to rp-pppoe's adsl-start and adsl-stop.
In practice, these are gnarled and tangled shell scripts, with who
knows what side effects and environment variable settings that
adsl-start and things downstream of it are counting on, and I'm
not looking forward to first reverse engineering all of the setup
and then building an equivalent replacement system, just because
people want to remove network-scripts.
For even more potential fun for me in the future, ifup and ifdown
are provided both by the network-scripts package and by NetworkManager,
with this managed by Fedora's alternatives system. I suspect that
this means I won't even notice that network-scripts has been removed
until my system's
ifdown invocations start quietly
running NetworkManager and things explode for reasons that I expect
to boil down to 'because NetworkManager'.
(I don't have much optimism about NetworkManager's ability to cooperate with other parties or be modest about what it will do with your network setup; instead my impression is that NetworkManager expects to run all of your networking however it sees fit. So I expect it to try to read random bits of my very historical network-scripts configuration files, interpret them in various ways, and then probably cause my networking to explode. NetworkManager has an ifcfg-rh plugin for this, but I have no idea how well it works and it doesn't seem to support DSL PPPoE at all based on the documentation.)
Sidebar: How I currently have my PPPoE networking wired up
I have a system cron.d file that runs '
ifup ppp0' on boot (via a
@reboot action), and then re-runs it every fifteen minutes if
there's no default route, because sometimes it falls over. In a
more proper systemd world, I guess I should write a service unit
that runs it after my home machine's Ethernet is up and then perhaps
try out a timer unit to handle the 'try again every fifteen minutes' part.
(I normally strongly prefer crontab entries over systemd timer units, but I would be interacting with other systemd units and with the overall systemd state here so timer units are probably better.)
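For illustration, the 'retry if there's no default route' check could be sketched in Python along these lines. This is not my actual cron job; it assumes a Linux system, only looks at IPv4 routes in /proc/net/route, and the function names are hypothetical.

```python
import subprocess

RTF_UP = 0x0001  # Linux route flag: the route is usable


def has_default_route(route_table: str) -> bool:
    """Report whether /proc/net/route text contains an active IPv4 default route."""
    for line in route_table.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue
        dest, flags, mask = fields[1], int(fields[3], 16), fields[7]
        # A default route has an all-zero destination and an all-zero mask.
        if dest == "00000000" and mask == "00000000" and flags & RTF_UP:
            return True
    return False


def ensure_link_up(route_path="/proc/net/route", command=("ifup", "ppp0")) -> bool:
    """Run the bring-up command if no default route exists.

    Returns True if the command was run, False if the link looked fine.
    """
    with open(route_path) as f:
        table = f.read()
    if has_default_route(table):
        return False
    subprocess.run(command, check=False)
    return True
```

Either a crontab entry or a systemd timer unit could call ensure_link_up() every fifteen minutes. Since the command is a parameter, swapping 'ifup ppp0' for whatever eventually replaces it would not require touching the route check.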