2019-06-23
What it takes to run a 32-bit x86 program on a 64-bit x86 Linux system
Suppose that you have a modern 64-bit x86 Linux system (often called an x86_64 environment) and that you want to run an old 32-bit x86 program on it (a plain x86 program). What does this require from the overall system, both the kernel and the rest of the environment?
(I am restricting this to ELF programs, not very old a.out ones.)
At a minimum, this requires that the (64-bit) kernel support
programs running in 32-bit mode ('IA32') and making 32-bit kernel
calls. Supporting this is a configuration option in the kernel
(or actually a whole collection of them, but they mostly depend
on one called IA32_EMULATION). Supporting 32-bit calls on a
64-bit kernel is not entirely easy because many kernel calls involve
structures; those structures must be translated back and forth
between the kernel's native 64-bit version and the emulated 32-bit
version. This can raise questions of how to handle native values
that exceed what can fit in the fields of the 32-bit structures. The
kernel also has a barrel full of fun in the form of ioctl(), which
smuggles a lot of structures in and out of the kernel in relatively
opaque ways. A
64-bit kernel does want to support at least some 32-bit ioctls, such as
the ones that deal with (pseudo-)terminals.
(I suspect that there are people in the Linux kernel community who hope that all of this emulation and compatibility code can someday be removed.)
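To make the structure translation problem concrete, here is a minimal sketch in Python. It assumes a hypothetical timeval-like structure with two fields that are 32-bit on plain x86 and 64-bit on x86_64; the format strings and function names are mine for illustration, not anything taken from the kernel.

```python
import struct

# Hypothetical layouts for a timeval-like structure (seconds, microseconds).
# Assumption: both fields are 32-bit in the 32-bit view and 64-bit natively.
COMPAT_FMT = "<ii"   # 32-bit view, 8 bytes
NATIVE_FMT = "<qq"   # 64-bit view, 16 bytes


def compat_to_native(compat: bytes) -> bytes:
    """Widen a 32-bit structure to the native layout; this always fits."""
    return struct.pack(NATIVE_FMT, *struct.unpack(COMPAT_FMT, compat))


def native_to_compat(native: bytes) -> bytes:
    """Narrow a native structure to the 32-bit layout.

    Raises OverflowError when a native value cannot be represented in
    the 32-bit fields, which is the awkward case described above.
    """
    sec, usec = struct.unpack(NATIVE_FMT, native)
    try:
        return struct.pack(COMPAT_FMT, sec, usec)
    except struct.error as exc:
        raise OverflowError(f"value does not fit in 32-bit field: {exc}") from exc
```

Widening always works; narrowing is where a policy decision is needed (return an error, truncate, or something else), and this sketch simply picks 'return an error'.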
A modern kernel dealing with modern 32-bit programs also needs to provide a 32-bit vDSO, and the necessary information to let the program find it. This requires the kernel to carry around a 32-bit ELF image, which has to be generated somehow (at some point). The vDSO is mapped into the memory space of even statically compiled 32-bit programs, although they may or may not use it.
(In ldd output on dynamically linked 32-bit programs, I believe
this often shows up as a magic 'linux-gate.so.1'.)
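As far as I know, the way a program finds the vDSO is through the ELF auxiliary vector that the kernel hands to every new process, where the AT_SYSINFO_EHDR entry holds the address of the vDSO's ELF header. The following is a rough sketch of decoding a raw auxiliary vector; parse_auxv is a hypothetical helper name, and it assumes little-endian data whose length is a whole number of entries.

```python
import struct

AT_NULL = 0            # marks the end of the auxiliary vector
AT_SYSINFO_EHDR = 33   # address of the vDSO's ELF header


def parse_auxv(data: bytes, bits: int) -> dict:
    """Decode a raw little-endian auxiliary vector into {type: value}.

    Each entry is a (type, value) pair of machine words, so entries are
    8 bytes for a 32-bit process and 16 bytes for a 64-bit one.
    """
    fmt = "<II" if bits == 32 else "<QQ"
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, data):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries
```

On a Linux machine you could feed this the contents of /proc/self/auxv (read in binary mode) with the word size of the Python interpreter itself. The same bytes decoded with the wrong word size produce nonsense, which is a small illustration of why the kernel has to build a separate 32-bit flavour of this data for 32-bit programs.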
This is enough for statically compiled programs, but of course very few programs are statically compiled. Instead, almost all 32-bit programs that you're likely to encounter are dynamically linked and so require a collection of additional compiled things. Running a dynamically linked program requires at least a 32-bit version of its dynamic linker (the 'ELF interpreter'), which is usually 'ld-linux.so.2'. Generally the 32-bit program will then go on to require additional 32-bit shared libraries, starting with the 32-bit C library ('libc.so.6' and 'libdl.so.2' for glibc) and expanding from there. The basic shared libraries usually come from glibc, but you can easily need additional ones from other packages for things like curses or the collection of X11 shared libraries. C++ programs will need libstdc++, which comes from GCC instead of glibc.
(The basic dynamic linker, ld-linux.so.2, is also from glibc.)
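If you want to check which of these cases a given binary falls into, the ELF header says whether it is 32-bit or 64-bit, and the PT_INTERP program header (if present) names the dynamic linker it wants. Here is a minimal sketch of reading both; elf_class_and_interp is a hypothetical name, and the code does no validation beyond checking the magic number.

```python
import struct

PT_INTERP = 3  # program header type that names the ELF interpreter


def elf_class_and_interp(data: bytes):
    """Return (bits, interpreter) for an ELF image held in memory.

    bits is 32 or 64. interpreter is None when there is no PT_INTERP
    header, as in a statically linked program.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]        # EI_CLASS
    endian = {1: "<", 2: ">"}[data[5]]    # EI_DATA
    if bits == 32:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 42)
    else:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, off)
        if p_type != PT_INTERP:
            continue
        # Field order differs between 32-bit and 64-bit program headers.
        if bits == 32:
            (p_offset,) = struct.unpack_from(endian + "I", data, off + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", data, off + 16)
        else:
            (p_offset,) = struct.unpack_from(endian + "Q", data, off + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", data, off + 32)
        raw = data[p_offset:p_offset + p_filesz]
        return bits, raw.rstrip(b"\0").decode()
    return bits, None
```

For real use you would read the file's bytes with open(path, 'rb').read() and pass them in. A 32-bit result with an interpreter that does not exist on your system is the usual reason a 32-bit program fails to start at all.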
In order to do things like hostname lookups correctly, a 32-bit
program will also need 32-bit versions of any NSS modules that are used
in your /etc/nsswitch.conf, since all of these are shared libraries
that are loaded dynamically by glibc. Some of these modules come
from glibc itself, but others are provided by a variety of additional
software packages. I'm not certain what happens to your program's
name lookups if a relevant NSS module is not available, but at a
minimum you won't be able to correctly resolve names that come from
that module.
(You may not get any result for the name, or you might get an incorrect or incomplete result if another configured NSS module also has an answer for you. Multiple NSS modules are common for things like hostname resolution.)
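To get an idea of which 32-bit NSS modules you would need, you can work from nsswitch.conf itself. This sketch assumes the conventional glibc naming scheme, where a source called 'foo' is loaded from a shared library called libnss_foo.so.2; the function name is made up for illustration, and the code only understands the basic file syntax.

```python
def nss_libraries(conf_text: str) -> dict:
    """Map each nsswitch.conf database to the NSS library names it implies.

    Assumes the conventional libnss_<source>.so.2 naming. Action items
    such as [NOTFOUND=return] are skipped, since they are not sources.
    """
    needed = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line or ":" not in line:
            continue
        database, _, sources = line.partition(":")
        libs = []
        in_action = False
        for token in sources.split():
            if token.startswith("["):
                in_action = True
            if in_action:
                # An action item may span several whitespace-separated tokens.
                if token.endswith("]"):
                    in_action = False
                continue
            libs.append(f"libnss_{token}.so.2")
        needed[database.strip()] = libs
    return needed
```

Each library name in the result would then need a 32-bit build somewhere that the 32-bit dynamic linker searches.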
I believe that generally all of these 32-bit shared libraries will have to be built with a 32-bit compiler toolchain in an environment that itself looks and behaves as 32-bit as possible. Building 32-bit binaries from a 64-bit environment is theoretically possible and nominally simple, but in practice there have been problems, and on top of that many build systems don't support this sort of cross-building.
(Of course, many people and distributions already have a collection of 32-bit shared libraries that have been built. But if they need to be rebuilt or updated for some reason, this might be an issue. And of course the relevant shared library needs to (still) support being built as 32-bit instead of 64-bit, as does the compiler toolchain.)
2019-06-12
My weird problem with the Fedora 29 version of Firefox 67
On Twitter, I said:
So the Fedora version of Firefox 67 (or perhaps all versions of Firefox 67) have a little issue where starting Firefox with a URL, as 'firefox SOME-URL', will sometimes start Firefox without loading the web page properly. This is very irritating for me, so back to Firefox 66.
While the effect here is reproducible from the command line (well, for me), it comes up for me because I have a bunch of tools, dmenu setups and window manager automation that end up doing the equivalent of this. One case is transferring a URL to my Javascript-enabled Firefox profile; the JS-enabled Firefox is not usually running, so the transfer process runs into this some of the time. As you can imagine, trying to open a URL and getting a blank page or some other failure is kind of annoying. I want things to work the first time around, not have to be repeated (occasionally more than once).
I also did some testing of the Fedora Firefox 67 with a completely
new $HOME (by doing 'export HOME=/tmp/fox-scratch; mkdir $HOME')
and I got some really weird results by repeatedly starting it with
a URL, quitting, and repeating it. For a while at home, I could
get the Fedora version of Firefox 67 to report that Environment
Canada's website had an invalid TLS certificate because Entrust was
an unknown CA.
Initially I wasn't sure if this was Firefox 67 in general or not, but yesterday I got sufficiently interested and irritated to fetch the official Mozilla build of Firefox 67 and try it instead ('installed', ie unpacked, into a non-system location). This official version of 67.0.2 works completely fine for me both at home and at work, with all of my normal profiles and extensions, so the problem seems to be in some way specific to Fedora's build of Firefox 67 (I saw it with both 67.0-2 and 67.0-4 on Fedora 29). On the other hand, the Fedora 29 Firefox 67 seems to work fine in my Cinnamon session on my laptop.
(I haven't tried the latest unreleased version of Fedora's Firefox that's available through the Bodhi page for Firefox or via the steps in Fetching really new Fedora packages with Bodhi. As I write this it's only 19 hours old, so I'll let the dust settle on those packages a bit.)
PS: I've not yet upgraded to Fedora 30 for various reasons, but certainly one of them is this bug. I suspect it may be July before I make the leap.
PPS: For interested parties, the bug I filed with Fedora is Fedora Bugzilla #1713924.
An interesting Fedora 29 DNF update loop with the createrepo package
For a while, my Fedora 29 home and work machines have been complaining
during 'dnf update' with a very peculiar complaint:
# dnf update
Last metadata expiration check: 0:09:16 ago [...]
Dependencies resolved.

 Problem: cannot install both createrepo_c-0.11.1-1.fc29.x86_64 and createrepo_c-0.13.2-2.fc29.x86_64
  - cannot install the best update candidate for package createrepo_c-0.13.2-2.fc29.x86_64
  - cannot install the best update candidate for package createrepo-0.10.3-15.fc28.noarch
========= [....]
 Package        Architecture   Version         Repository   Size
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 createrepo_c   x86_64         0.11.1-1.fc29   fedora       59 k
(Using '--best --allowerasing' did not in fact force the upgrade.)
For a while I've been ignoring this or taking very light stabs
at trying to fix it, but tonight I got irritated enough to finally
do a full investigation. To start with, the initial situation is
that I have both createrepo 0.10.3-15 and createrepo_c 0.13.2-2
installed. DNF is trying to upgrade createrepo to createrepo_c
0.11.1-1, but this is naturally conflicting with the more recent
version of createrepo_c that I already have installed.
I don't know quite how I got into the underlying situation, but I
believe that interested parties can reproduce it on a current Fedora
29 system (and possibly on a Fedora 30 one as well) by first
installing mock, which pulls in createrepo_c 0.13.2-2, and
then the older and now apparently obsolete mach, which will pull in
createrepo. At this point, a 'dnf update' will likely produce
what you see here. To get out of the situation, you must DNF remove
mach and createrepo (conveniently, 'dnf remove createrepo'
will ripple through to remove mach as well).
To start understanding the situation, let's do an additional DNF command:
# dnf provides createrepo
createrepo-0.10.3-15.fc28.noarch : Creates a common metadata repository
[...]
Provide    : createrepo = 0.10.3-15.fc28

createrepo_c-0.11.1-1.fc29.x86_64 : Creates a common metadata repository
[...]
Provide    : createrepo = 0.11.1-1.fc29
In the beginning, there was createrepo, written
in Python, and it was used by various programs and packages that
wanted to create local RPM repositories, including both Mach and Mock. As a result of
this, the Fedora packages for various things explicitly required
'createrepo'. Eventually the RPM people decided that they needed
a version of createrepo written in C, so they created createrepo_c. In Fedora
29, Fedora appears to have switched which createrepo implementation
they used to the C version. Likely to ease the transition, they
made the initial version or versions of their createrepo_c RPM
also pretend that it was createrepo, by explicitly doing an RPM
provides of that name. This made createrepo_c 0.11.1-1 both a
substitute for the createrepo RPM and an upgrade candidate for
it, since it has a more recent version (this is the surprise of
'Provides' in RPM).
(The RPM changelog says this was introduced in 0.10.0-20, for which the only change is 'Obsolete and provide createrepo'.)
Over time, most RPMs were updated to require createrepo_c instead
of createrepo, including the mock RPM. However, the mach
RPM was not updated, probably because Mach itself is neglected and
likely considered obsolete or abandoned. Then at some point the
Fedora people stopped having their createrepo_c RPM fill in for
createrepo this way. Based on the RPM changelog for createrepo_c,
this happened in 0.13.2-1, which includes a cryptic changelog line of:
- Do not obsolete createrepo on Fedora < 31
Presumably the Fedora people have their reasons, and if I wanted
to trawl the Fedora Bugzilla I might even find them. However, the
effect of this change is that older createrepo_c RPMs in Fedora
29 are updates for createrepo but newer ones aren't.
So, if you 'dnf install mock', you will get mock and the current
version of createrepo_c, which doesn't provide createrepo.
If you then 'dnf install mach', it requires createrepo and
the best version it can actually install is the actual createrepo
0.10.3-15 RPM that was built on Fedora 28. However, once that is
installed, DNF will see the 0.11.1-1 version of createrepo_c
from the Fedora 29 release package set as an update candidate for it,
but that can't be installed because you already have a more recent
version of createrepo_c.
(I suspect that if you install mach first and mock second, you
will get only createrepo_c but will be unable to upgrade it
past 0.11.1-1 without erasing mach.)
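The shape of this update loop can be captured in a small toy model. To be clear, this is only a sketch of the situation as I've described it, not how DNF's actual dependency solver works; the Pkg class and both functions are invented for illustration, and the version comparison is a drastic simplification of RPM's real rules (it ignores release numbers entirely).

```python
from dataclasses import dataclass


def vkey(version: str) -> tuple:
    """Simplified key for dotted numeric versions (not RPM's real comparison)."""
    return tuple(int(part) for part in version.split("."))


@dataclass(frozen=True)
class Pkg:
    name: str
    version: str
    provides: tuple = ()  # extra names this package claims to be


def update_candidates(pkg, repo):
    """Repo packages that this toy model treats as updates for pkg.

    A package counts if it has the same name or provides that name,
    and has a newer version.
    """
    return [
        cand for cand in repo
        if (cand.name == pkg.name or pkg.name in cand.provides)
        and vkey(cand.version) > vkey(pkg.version)
    ]


def is_blocked(cand, installed):
    """True if a newer package with the candidate's own name is installed."""
    return any(
        inst.name == cand.name and vkey(inst.version) > vkey(cand.version)
        for inst in installed
    )


installed = [Pkg("createrepo", "0.10.3"), Pkg("createrepo_c", "0.13.2")]
repo = [
    Pkg("createrepo_c", "0.11.1", provides=("createrepo",)),
    Pkg("createrepo_c", "0.13.2"),
]
```

In this model, update_candidates(installed[0], repo) finds only the old createrepo_c 0.11.1 (the newer one no longer provides createrepo), and is_blocked() reports that it can't go in because createrepo_c 0.13.2 is already installed. Nothing ever changes that, so the complaint comes back on every update.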