2015-06-30
My early impressions of Fedora 22, especially of DNF
I recently updated first my office laptop (which runs a relatively stock Cinnamon environment) and then my office workstation (which runs my custom setup) to Fedora 22, both via my usual means of a yum-based upgrade instead of the officially supported FedUp mechanism. I feel kind of ambivalent about the results of this.
On the one hand, the upgrade was smooth in both cases and everything in both of my environments basically worked from the start. This is not always the case, especially in my custom setup; I'm used to having to fiddle things around after Fedora version upgrades in order to get audio or automatic removable media mounting or whatever working again. Instead everything pretty much just went, and nothing changed in my Cinnamon environment.
On the other hand, how can I say this gently: I have not been really impressed with Fedora 22's quality control. A number of things happened to me in and after the upgrade:
- Fedora 22 appears to have shuffled around the /dev/disk/by-id names
for disks in a way that broke automatic boot time importing of my
ZFS pools until I imported them once by hand. I'm not entirely happy
with this, but I am running an unsupported configuration.
- systemd's networkd exhibited systemd's usual regard for its users by
changing which of several IP addresses would be an interface's
primary IP address. Apparently my hope was naive.
(This is where some systemd person smugly observes that the order is
not documented and so I deserve whatever I get here, including random
decisions from boot to boot. See 'usual regard for its users', above.)
- Fedora 22 has a broken rsyslog(d) that will casually and trivially
dump core.
Fortunately the code flaw is trivially fixable; it's a good thing
that I know how to patch RPMs by hand.
An update is coming out sometime, but apparently Fedora does not
consider 'the syslog daemon dumps core' to be a high priority issue.
(This feels like a facet of the great systemd thing about syslog.)
- Fedora 22 now spews audit system messages all over your kernel
logs by default, which is especially fun if you just got rsyslog
working and would like to watch your kernel logs for important
anomalies. I have so far disabled most of this with 'auditctl -e 0';
my long term fix is going to be adding 'audit=0' to the kernel
command line. I wish I knew what changed in Fedora 22 to cause these
messages to start showing up in my configuration, but, well, who knows.
(I also modified my rsyslog configuration to divert those messages to
another file, using a very brute force method because I was angry and
in a hurry.)
- I ran into a gcc 5.1.1 bug.
And then there's DNF, the Fedora 22 replacement for yum. Oh, DNF, what can I say about you.
I believe the Fedora and DNF people when they say that the internals of DNF are better than the internals of Yum. But it's equally clear to me that DNF is nowhere near as usable and polished as Yum, and so it has a ton of irritations in day to day usage. In my experience DNF is slow and balky and erratic compared to the smooth and working Yum I'm used to, and I've been neither impressed nor enthused about the forced switch. From a user perspective, this is not an improvement; it's a whole bunch of regressions.
On top of that, it's pretty clear that no one has ever seriously used or tested the dnf 'local' plugin, which lets you keep a copy of all packages you install through DNF. I've used the equivalent Yum plugin for years so that I could roll back to older versions of packages if I needed to (ie, when a package 'improvement' has broken the new current version for me). The DNF version has a truly impressive collection of 'this thing doesn't work' bugs. I managed to get it sort of working by dint of being both fairly familiar with how this stuff works under the hood and willing to edit the DNF Python source, and even then it sometimes explodes.
(Many people may not care about this but I actually use yum quite frequently, so a balky, stalling, uninformative, and frustrating version of it is really irritating. Everything I do with DNF seems to take twice as long and be twice as irritating as it was with Yum.)
At this point some people will reasonably ask if upgrading to Fedora 22 was worth it. My current answer is 'yes, sort of, and it's not as if I have a choice here'. To run Fedora is to be on an upgrade treadmill, like it or not, and Fedora 22 does improve and modernize some things. All of this annoyance is just the price I periodically pay for running Fedora instead of any of the alternatives.
(And yes, I still prefer Fedora to Debian, Ubuntu, or FreeBSD.)
The probable and prosaic explanation for a socket() API choice
It started on Twitter:
@mjdominus: Annoyed today that the BSD people had socket(2) return a single FD instead of a pair the way pipe(2) does. That necessitated shutdown(2).
@thatcks: I suspect they might have felt forced to single-FD returns by per-process and total kernel-wide FD limits back then.
I came up with this idea off the cuff and it felt convincing at the
moment that I tweeted it; after all, if you have a socket server
or the like, such as inetd, moving to a two-FD model for sockets
means that you've just more or less doubled the number of file
descriptors your process needs. Today we're used to systems that
let processes have a lot of open file descriptors at once, but
historically Unix had much lower limits and it's not hard to imagine
inetd running into them.
It's a wonderful theory but it immediately runs aground on the
practical reality that socket() and accept() were introduced no
later than 4.1c BSD, while inetd only came in with 4.3 BSD (which
was years later). Thus it seems very unlikely that the BSD developers
were thinking ahead to processes that would open a lot of sockets
at the time that the socket() API was designed. Instead I think that
there are much simpler and more likely explanations for why the API
isn't the way Mark Jason Dominus would like.
The first is that it seems clear that the BSD people were not
particularly concerned about minimizing new system calls; instead
BSD was already adding a ton of new system features and system
calls. Between 4.0 BSD and 4.1c BSD, they went from 64 syscall table
entries (not all of them real syscalls) to 149 entries. In this
atmosphere, avoiding adding one more system call is not likely to
have been a big motivator or in fact even very much on people's
minds. Nor was networking the only source of additions; 4.1c BSD
added rename(), mkdir(), and rmdir(), for example.
The second is that C makes multi-return APIs more awkward than
single-return APIs. Contrast the pipe() API, where you must construct
a memory area for the two file descriptors and pass a pointer to it,
with the socket() API, where you simply assign the return value.
Given a choice, I think a lot of people are going to design a
socket()-style API rather than a pipe()-style API.
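
As a concrete illustration, here's a minimal C sketch of my own (not
anything from BSD) of the difference in calling style, using only the
standard POSIX pipe() and socket() calls:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>

    int main(void)
    {
        int pfd[2];

        /* pipe(): you must set aside storage for the two file
           descriptors and pass a pointer to it. */
        if (pipe(pfd) < 0) {
            perror("pipe");
            return 1;
        }

        /* socket(): the new file descriptor is simply the return
           value, usable directly in an assignment or expression. */
        int sfd = socket(AF_INET, SOCK_STREAM, 0);
        if (sfd < 0) {
            perror("socket");
            return 1;
        }

        printf("pipe fds %d and %d, socket fd %d\n", pfd[0], pfd[1], sfd);
        close(pfd[0]); close(pfd[1]); close(sfd);
        return 0;
    }

Nothing deep is going on here, but the pipe() half needs a declared
array and a pointer where the socket() half is just an assignment.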
There's also the related issue that one reason the pipe() API works
well returning two file descriptors is because the file descriptors
involved almost immediately go in different 'directions' (often one
goes to a sub-process); there aren't very many situations where you
want to pass both file descriptors around to functions in your
program. This is very much not the case in network related programs,
especially programs that use select(); if socket() et al returned
two file descriptors, one for read and one for write, I think that
you'd find they were often passed around together. Often you'd
prefer them to be one descriptor that you could use either for
reading or writing depending on what you were doing at the time.
Many classical network programs (and protocols) alternate reading
and writing from the network, after all.
(Without processes that open multiple sockets, you might wonder what
select() is there for. The answer is programs like telnet and rlogin
(and their servers), which talk to both the network and the tty at
the same time. These were already present in 4.1c BSD, at the dawn
of the socket() API.)
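
To make that concrete, here's a rough sketch of my own (not drawn
from any of those programs) of the telnet/rlogin pattern: one process
using select() to watch both the tty and a single network descriptor,
reading and writing that same socket fd in both directions. It
assumes sockfd is an already-connected stream socket:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/select.h>

    /* Relay data between stdin/stdout (the 'tty') and sockfd (the
       network) until either side hits EOF or an error. */
    int relay(int sockfd)
    {
        char buf[4096];

        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(STDIN_FILENO, &rfds);
            FD_SET(sockfd, &rfds);
            int maxfd = (sockfd > STDIN_FILENO ? sockfd : STDIN_FILENO) + 1;

            if (select(maxfd, &rfds, NULL, NULL, NULL) < 0) {
                perror("select");
                return -1;
            }
            if (FD_ISSET(STDIN_FILENO, &rfds)) {
                /* tty -> network: we write to the same fd that we
                   also read from below. */
                ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
                if (n <= 0)
                    return (int)n;
                if (write(sockfd, buf, (size_t)n) < 0)
                    return -1;
            }
            if (FD_ISSET(sockfd, &rfds)) {
                /* network -> tty */
                ssize_t n = read(sockfd, buf, sizeof buf);
                if (n <= 0)
                    return (int)n;
                if (write(STDOUT_FILENO, buf, (size_t)n) < 0)
                    return -1;
            }
        }
    }

With a single descriptor per connection this stays compact; in a
hypothetical two-FD world, both descriptors would have to travel
together into a function like this.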
Sidebar: The pipe() user API versus the kernel API
Before I actually looked at the 4.1c BSD kernel source code, I was
also going to say that the kernel to user API makes returning more
than one value awkward because your kernel code has to explicitly
fish through the pointer that userland has supplied it in things
like the pipe() system call. It turns out that this is false.
Instead, as far back as V7 and probably further, the kernel to user
API could return multiple values; specifically, it could return two
values. pipe() used this to return both file descriptors without
having to fish around in your user process memory, and it was up to
the C library to write these two return values to your pipefd array.
I really should have expected this; in a kernel, no one wants to have to look at user process memory if they can help it. Returning two values instead of one just needs an extra register in the general assembly level syscall API and there you are.
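
If it helps to picture this, here's a hand-waving C rendition of the
division of labour, entirely my own (the real stub was a few assembly
instructions): a raw trap that hands back two values, and a library
wrapper that writes them into your array. raw_pipe() here is a
made-up stand-in for the actual syscall stub, not a real interface.

    #include <errno.h>

    /* Hypothetical stand-in for the assembly-level trap: the kernel
       hands back two values (historically in two registers) plus an
       error indication. */
    struct raw_pipe_result {
        int fd0;        /* first returned value: the read end */
        int fd1;        /* second returned value: the write end */
        int err;        /* 0 on success, an errno value on failure */
    };
    extern struct raw_pipe_result raw_pipe(void);

    /* The C library wrapper, not the kernel, is the code that
       touches the caller's pipefd array. */
    int my_pipe(int pipefd[2])
    {
        struct raw_pipe_result r = raw_pipe();
        if (r.err != 0) {
            errno = r.err;
            return -1;
        }
        pipefd[0] = r.fd0;
        pipefd[1] = r.fd1;
        return 0;
    }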