The problem with self-contained 'application bundle' style packaging

August 10, 2014

In a comment on my entry on FreeBSD vs Linux for me, Matt Campbell asked (quoting me in an earlier comment):

I also fundamentally disagree with an approach of distributing applications as giant self contained bundles.

Why? Mac OS X, iOS, and Android all use self-contained app bundles, and so do the smarter third-party developers on Windows. It's a proven approach for packaged applications.

To answer this I need to add an important bit of context that may not have been clear in my initial comment and certainly isn't in this extract: I was talking specifically about PC-BSD and, more generally, about the idea of the OS provider distributing its own packages this way.

Let's start with a question. Suppose that you start with a competently done .deb or RPM of Firefox and then convert it into one of these 'application bundles' instead. What's the difference between the contents of the two packagings of Firefox? Clearly it is that some of Firefox's dependencies are going to be included in the application bundle, not just Firefox itself. So what dependencies are included, or to put it another way, how far down the library stack do you go? GTK and FreeType? SQLite? The C++ ABI support libraries? The core C library?
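One way to see how deep that stack actually goes is to ask the dynamic linker. This is an illustrative sketch, assuming a Linux/ELF system with ldd available; /bin/ls is used as a stand-in for a large binary like firefox-bin, which would additionally show GTK, FreeType, SQLite, and friends:

```python
# Sketch: list the shared libraries a dynamically linked binary pulls in.
# Assumes ldd exists on the system (Linux/ELF); /bin/ls is a stand-in
# for something like firefox-bin.
import subprocess

def shared_libs(path):
    out = subprocess.run(["ldd", path], capture_output=True, text=True)
    # Lines look like "libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)";
    # the first token is the library name (or the loader path).
    return [line.split()[0] for line in out.stdout.splitlines() if line.strip()]

libs = shared_libs("/bin/ls")
print(libs)
# However shallow the binary, the list bottoms out at the C library and
# the dynamic loader -- the question is where a bundle stops including things.
```

Even a trivial utility drags in the bottom of the stack; an application bundler has to draw the "included versus assumed from the OS" line somewhere in that list.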

The first problem with including some or all of these dependencies is that they are shared ones; plenty of other packages use them too. If you include separate copies in every package that uses them, you're going to have a lot of duplicate copies floating around your system (both on disk and in memory). I know disk and RAM are both theoretically cheap these days, but yes this still matters. In addition, packaging copies of things like GTK runs into problems with stuff that was designed to be shared, like themes.
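The duplication cost is easy to quantify in principle: every copy of an identical library beyond the first is pure waste. The following sketch uses entirely made-up bundle contents (the app names, library names, and byte sizes are hypothetical) to show the accounting:

```python
# Sketch: estimate disk space wasted by duplicate bundled libraries.
# All bundle contents below are fabricated for illustration; a real
# scan would walk the filesystem and hash actual library files.
import hashlib
from collections import defaultdict

def wasted_bytes(bundles):
    """bundles: {app_name: {lib_name: file_contents_as_bytes}}.
    Returns the bytes that sharing identical libraries would save."""
    by_hash = defaultdict(list)
    for app, libs in bundles.items():
        for lib, data in libs.items():
            by_hash[hashlib.sha256(data).hexdigest()].append(len(data))
    # Every copy beyond the first of identical content is waste.
    return sum(sum(sizes) - sizes[0] for sizes in by_hash.values())

gtk = b"\x7fELF" + b"gtk" * 1000   # stand-in for one particular libgtk build
demo = {
    "firefox":     {"libgtk.so": gtk, "libxul.so": b"xul" * 500},
    "thunderbird": {"libgtk.so": gtk},
    "liferea":     {"libgtk.so": gtk},
}
print(wasted_bytes(demo))  # two redundant copies of the same libgtk
```

The same accounting applies in RAM, except worse: shared libraries mapped from one file on disk share page cache and text pages, while per-bundle copies do not.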

(A sufficiently clever system can get around the duplication issue, but it has to be really clever behind the backs of these apparently self-contained application bundles. Really clever systems are complex and often fragile.)

The bigger problem is that the capabilities enabled by bundling dependencies will in practice essentially never be used for packages supported by the OS vendor. Sure, in theory you could ship a different minor version of GTK or FreeType with Firefox than with Thunderbird, but in practice no sane release engineering team or security team will let things go out the door that way because if they do they're on the hook for supporting and patching both minor versions. In practice every OS-built application bundle will use exactly the same minor version of GTK, FreeType, the C++ ABI support libraries, SQLite, and so on. And if a dependency has to get patched because of one application, expect new revisions of all applications.

(In fact pretty much every source of variation in dependencies is a bad idea at the OS vendor level. Different compile options for different applications? Custom per-application patches? No, no, no, because all of them drive up the support load.)

So why is this approach so popular in Mac OS X, iOS, Windows, and so on? Because it's not being used by the OS vendor. Creators of individual applications have a completely different perspective, since they're only on the hook to support their own application. If all you support is Firefox, there is no extra cost to you if Thunderbird or Liferea is using a different GTK minor version because updating it is not your responsibility. In fact having your own version of GTK is an advantage because you can't have support costs imposed on you because someone else decided to update GTK.


Comments on this page:

By Ewen McNeill at 2014-08-11 01:48:58:

In addition, in the case of (binary) "application bundles" provided by third party OEMs, the "application bundle" may be the only sane way that they can ensure that a given instance has the versions of the libraries that they have tested (quality controlled) their application with. They may well be targeting a variety of base operating systems which have had a variety of (OS vendor) patches installed. So it's actually done to reduce support costs (by minimising the "dependent ABI" surface down to just the "common across all OS versions supported" requirements). Much as I wish it weren't the case, for third party OEMs with no ability to change or dictate the base OS, distributing binary-only software that needs to work seamlessly across multiple base OS versions, it may well be the only sane choice. Especially since none of these platforms has a sane globally-installed third-party-usable method for installing packages and specifying their dependencies on particular versions of other packages.

Anywhere else, like you it appears, I'd much rather see a minimally bundled application package combined with a set of machine-readable dependencies on other minimally bundled packages -- on down to the OS vendor base set. Such as is done in Debian, etc. Because it ensures that everything is built against a common set of base dependencies, which have been verified to work together. And it ensures that, e.g., a security update only needs to be applied in a few obvious places. This is the "system integration" part of an OS distribution. Anything else feels more like a "Docker"/"container" approach, rather than an application package prepared by the OS distributor.

Ewen

FWIW, the duplication you speak of reminds me of the arguments surrounding static vs. dynamic libraries (including libc!).



Last modified: Sun Aug 10 23:32:03 2014