2016-12-05
My RPM build setup no longer works on Fedora for some packages
A decade ago I wrote up how I set up to build RPMs, with sources for
each RPM segregated into their own subdirectory under your RPM SOURCES
directory instead of everything piled into SOURCES the way the default
setup wants you to do it. I've used that RPM building setup ever since
and it's worked for me over all of these years. The short version of
what this looks like is:
    %_topdir /some/where
    %_sourcedir %{_topdir}/SOURCES/%{name}-%{version}
    %_specdir %{_sourcedir}
This results in source paths like /some/where/SOURCES/openssh-7.2p2.
If you do this and you try to build the current Fedora 24 version of OpenSSH, what you will get is:
    $ rpmbuild -bp openssh.spec
    error: File /u/cks/src/RPM/SOURCES/openssh-0.10.2/pam_ssh_agent_auth-0.10.2.tar.bz2: No such file or directory
The direct problem is that the Fedora OpenSSH RPM essentially
contains two different packages (OpenSSH itself and pam_ssh_agent_auth)
and they have two completely separate versions. When rpmbuild goes
to unpack the source for the latter, it uses the latter's version
number of 0.10.2 as %{version} in the expansion of %{_sourcedir}
instead of the version number of the main OpenSSH source package
and everything goes off the rails. When I asked on the Fedora
developers' IRC channel, no one had any suggestions for how to fix
this and no one seemed to expect that there were any (for instance,
no one knew of a magic equivalent of %{version} that meant 'the
version of the source RPM, no really I mean it').
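To make the failure mode concrete, here is a toy model of lazily expanded macros in Python. This is only a sketch of the behaviour described above, not how rpmbuild is actually implemented: the expand() helper and the MACROS table are made up for illustration, and the assumption that %{version} has simply been overwritten with the subpackage's version by the time the second source is looked up is my reading of the error message, not something I have verified in RPM's source.

```python
import re

# Toy macro table. Values are stored unexpanded, the way they are
# written in an rpmmacros file, and only expanded when used.
MACROS = {
    "_topdir": "/some/where",
    "_sourcedir": "%{_topdir}/SOURCES/%{name}-%{version}",
    "_specdir": "%{_sourcedir}",
}


def expand(text, macros):
    """Repeatedly substitute %{macro} references using the table as it is right now."""
    pattern = re.compile(r"%\{(\w+)\}")
    while True:
        expanded = pattern.sub(lambda m: macros[m.group(1)], text)
        if expanded == text:
            return expanded
        text = expanded


# While the main package's tags are in effect.
macros = dict(MACROS, name="openssh", version="7.2p2")
main_dir = expand("%{_sourcedir}", macros)

# Assumption: when the second source is located, %{version} holds the
# subpackage's version, so the same macro now names a different directory.
macros["version"] = "0.10.2"
sub_dir = expand("%{_sourcedir}", macros)

print(main_dir)
print(sub_dir)
```

The point of the sketch is that nothing in the definition of %{_sourcedir} pins down which version it means; it gets whatever %{version} happens to be at the moment of expansion.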
(At this point I will pause to mutter things about RPM's documentation or lack thereof, part of which is RPM's fault and part of which is the fault of distributions, who don't seem to document their distro-specific macros and so on very much. If RPM has good, current, accessible documentation it is not easy to find.)
The bigger problem is three-fold. First, I have no idea how to
actually fix this, although I can probably work around it in various
hack ways to work on this specific OpenSSH package (for example, I
can temporarily take the %{version} out of my definition). That
assumes that there's even a way to really fix this, which there may
not be. Second, clearly my long-standing RPM build configuration
doesn't work completely reliably any more. This is the first RPM
I've run across with problems (and I don't have a strong need to
change the OpenSSH RPM at the moment), but there are probably others
that are going to blow up on me in the future in some way.
Third and largest, I've clearly wound up in a situation where I'm basically setting up my RPM build environment based on superstition instead of actual knowledge. My settings may have had a sound, well researched basis ten years ago, but that was ten years ago, I haven't kept up with the changes in RPM build practices, and my RPM specfile knowledge has decayed. Using a superstition is sustainable when it's at least a common one, but this one seems very much not to be, and I'm probably well outside how RPM building and package modification are now done.
Does it matter? I don't know. I could likely stumble on for quite a while before things fell totally apart, because these days I only build or rebuild a small handful of RPMs and they'll probably keep on working (and if it really matters, I have hack workarounds). I don't like knowing that my environment is superstition and partly broken, but I'm also lazy; learning enough to do it the reasonably right way would almost certainly be a bunch of work.
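As an illustration of the sort of hack workaround I mean, here is a toy Python model of taking %{version} out of the source directory definition. It is a sketch under assumptions, not a tested RPM configuration: expand() and source_dir() are hypothetical helpers, and it assumes that %{name} stays 'openssh' throughout the build, which the error message suggests but which I haven't confirmed in general.

```python
import re


def expand(text, macros):
    """Repeatedly substitute %{macro} references until nothing changes."""
    pattern = re.compile(r"%\{(\w+)\}")
    while True:
        expanded = pattern.sub(lambda m: macros[m.group(1)], text)
        if expanded == text:
            return expanded
        text = expanded


def source_dir(sourcedir_template, version):
    """Expand %{_sourcedir} for a given definition and current %{version}."""
    macros = {
        "_topdir": "/some/where",
        "name": "openssh",  # assumption: %{name} does not change mid-build
        "version": version,
        "_sourcedir": sourcedir_template,
    }
    return expand("%{_sourcedir}", macros)


WITH_VERSION = "%{_topdir}/SOURCES/%{name}-%{version}"
WITHOUT_VERSION = "%{_topdir}/SOURCES/%{name}"

# The two values of %{version} that show up in the OpenSSH build.
versions = ["7.2p2", "0.10.2"]

with_version_dirs = {source_dir(WITH_VERSION, v) for v in versions}
without_version_dirs = {source_dir(WITHOUT_VERSION, v) for v in versions}

print(sorted(with_version_dirs))
print(sorted(without_version_dirs))
```

In this model the versioned definition yields two different directories and the unversioned one yields a single stable directory. The cost is that different versions of the same package would then share one source directory, which is part of why I consider this a hack and not a fix.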
(Modern Fedora RPM building appears to involve special build systems and build environments. This is great if you're submitting a finished source RPM and want a binary package, but it doesn't really make it clear how to do the equivalent of modifying RPMs with quilt, for which you want the unpacked, already patched source in some area where you can look at the source, play around, and so on. I assume that there is some way to do this, and probably some documentation about it somewhere that I'm either overlooking or not understanding.)
PS: The pam_ssh_agent_auth (sub)package seems to be integrated into the OpenSSH source RPM because it uses OpenSSH source code as part of building itself. I think including it in this situation makes sense.
One advantage of 'self-hosted' languages
One of the things that people like to do with languages (and language runtime environments) is to make them 'self-hosted'. A self-hosted language is one where the compiler (or interpreter) is almost entirely written in the language itself, instead of being written in another language such as C.
I don't know all of the reasons that people have for self-hosting languages, since I've never participated in language development. But from an outsider's perspective, I can think of one fairly obvious reason to want to self-host your language, which is that it probably increases the number of people who can work on your compiler by reducing what they need to know.
To work on a language (or its runtime), you generally need to know the language itself, and obviously you need to know the language that the compiler or interpreter or runtime is written in. When language X is written in language Y, this means that you need to know both X and Y. When language X is written in itself, you only need to know X. And if you're interested in working on something involving language X you probably know the language.
(In theory you could imagine situations where people who know only language Y could improve the compiler for language X by working on internals with well-defined semantics, like symbol table handling or the like. In practice I think that the people who will be interested in doing such work in the first place are people who are interested in language X.)
Sidebar: The case of LLVM as the exception that proves the rule
LLVM is an increasingly popular compiler backend written in C++ that is used by a number of languages, for example Rust. Obviously this means that these languages aren't fully self-hosted and probably will never be; fully self-hosting would require them to duplicate on their own a significant amount of the time and effort that have been poured into LLVM.
That statement right there is, I think, a big reason why people use LLVM. When you use LLVM as your compiler backend, you get to tap into all of the work that other people have done on it (and will continue to do in the future). You get a free ride on a high quality compiler backend, and for some projects this free ride is definitely worth some narrowing of the pool of contributors to your language's front end.
What makes the LLVM situation work in your favour is that it's a shared backend and so attracts people to work on it who don't care about your language (and who probably don't know anything about it; they can work on the LLVM backend despite this because it has well defined interfaces and APIs). An un-shared compiler or interpreter backend doesn't get this advantage.