Modern Linux can require a link signal before it configures IP addresses

April 6, 2021

I recently had an interesting troubleshooting experience when an Ubuntu 18.04 Dell server would boot but not talk to the network, or in fact even configure its IP address and other networking. I was putting it into production in place of a 16.04 server, which meant I had changed its netplan configuration and recabled it (to reuse the 16.04 server's network wire). I spent some time examining the netplan configuration and trawling logs before I took a second look at the rear of the machine and realized that when I had shuffled the network cable around I had accidentally plugged it into the server's second network port instead of the first one.

What fooled me about where the problem was is that when I logged in to the machine on the console, ifconfig and ip both reported that the machine didn't have its IP address or other networking set up. Because I come from an era where networking was configured as long as the network device existed at all, I assumed that something was wrong with netplan or with the underlying networkd configuration it generated. In fact, what was going on is that these days nothing may get configured if a port doesn't have link signal. The inverse is also true; your full IP and network configuration may appear the moment you plug in a network cable and give the port link signal.
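Since link signal and IP configuration are now tied together, it can help to check link state directly rather than inferring it from missing addresses. The following is a minimal sketch, assuming a Linux system that exposes the usual /sys/class/net/<interface>/carrier file; the has_link name and the sysfs_root parameter are my own inventions for illustration.

```python
from pathlib import Path
from typing import Optional


def has_link(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[bool]:
    """Return True or False for link signal, or None if it can't be determined."""
    carrier = Path(sysfs_root) / iface / "carrier"
    try:
        value = carrier.read_text().strip()
    except OSError:
        # Either the interface doesn't exist, or it is administratively
        # down, in which case the kernel refuses to report carrier.
        return None
    # The kernel reports "1" when the port has link and "0" when it doesn't.
    return value == "1"
```

Calling has_link() with the name of the port you think the cable is in (whatever 'ip link' calls it on your machine) and getting back False is a strong hint to go look at the cabling before digging through netplan.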

(I think this is due to netplan using systemd's networkd to actually handle network setup, rather than netplan doing the setup itself.)

People using NetworkManager have been experiencing this for a long time, but I'm more used to static server network configurations that are there from the moment the server boots up and finds its network devices. This behavior is definitely something I'm going to have to remember for future troubleshooting, along with making sure that the network cable is plugged into the port it should be.
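For the 'is the cable in the right port' question, a quick survey of every port's link state from the console could save a trip to the back of the rack. This is a rough sketch under the same assumption about the sysfs layout; link_report and sysfs_root are hypothetical names, and it skips the loopback interface since that never has a cable.

```python
import os
from typing import Dict, Optional


def link_report(sysfs_root: str = "/sys/class/net") -> Dict[str, Optional[bool]]:
    """Map each non-loopback interface to True/False link state, or None if unknown."""
    report: Dict[str, Optional[bool]] = {}
    for iface in sorted(os.listdir(sysfs_root)):
        path = os.path.join(sysfs_root, iface)
        # Skip loopback and any plain files that live alongside the
        # per-interface directories.
        if iface == "lo" or not os.path.isdir(path):
            continue
        try:
            with open(os.path.join(path, "carrier")) as f:
                report[iface] = f.read().strip() == "1"
        except OSError:
            # Carrier can't be read, for example because the interface
            # is administratively down.
            report[iface] = None
    return report


if __name__ == "__main__":
    labels = {True: "link", False: "no link", None: "unknown"}
    for name, state in link_report().items():
        print(f"{name}: {labels[state]}")
```

If the interface named in your netplan configuration shows 'no link' while a different port shows 'link', the cable is probably in the wrong place.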

This has some implications for what happens if your servers ever boot while the switch they're connected to is powered off. In the past they would boot with IP networking fully configured but just not be able to talk to anything; now they may boot without IP networking at all (and some things may wait for some time before they start, although not forever, since systemd's wait for the network to be online has a 120-second timeout by default).

(There may be some other implications if networkd also withdraws configured IP addresses when an interface loses link signal, for various reasons including someone unplugging the wrong switch port. I haven't tested this.)

Written on 06 April 2021.
