You probably need to think about how to handle core dumps on modern Linux servers
Once upon a time, life was simple. If and when your programs hit
fatal problems, they generally dumped core in their current directory
under the name 'core' (sometimes you could make them be
'core.<PID>'). You might or might not ever notice these core files,
and some of the time they might not get written at all because of
various permission issues (see the core(5) manpage).
Then complications ensued due to things like Apport, ABRT, and
systemd-coredump, where an increasing number of Linux distributions
have decided to take advantage of the full power of the
kernel.core_pattern sysctl to capture core dumps themselves.
(The Ubuntu Apport documentation claims that it's disabled by default on 'stable' releases. This does not appear to be true any more.)
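You can see which regime a particular machine is in by looking at the sysctl; a minimal sketch (the Apport command line shown in the comment is just a typical example and varies by version):

```shell
#!/bin/sh
# Report how this machine disposes of core dumps. A core_pattern that
# starts with '|' means the kernel pipes each dump to a helper program
# (Apport, ABRT, systemd-coredump) instead of writing a core file.
pattern=$(cat /proc/sys/kernel/core_pattern)
case "$pattern" in
    \|*) echo "core dumps are piped to a handler: ${pattern#|}" ;;
    # On Ubuntu with Apport active this is typically something like
    # '|/usr/share/apport/apport %p %s %c ...'
    *)   echo "core dumps are written using the pattern: $pattern" ;;
esac
```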
In a perfect world, systems like Apport would capture core dumps
from system programs for themselves and arrange that everything
else was handled in the traditional way, by writing a core file in
the process's current directory.
Unfortunately this is not a perfect world. In this world, systems
like Apport almost always either discard your core files entirely
or hide them away where you need special expertise to find them.
Under many situations this may not be what you want, in which case
you need to think about what you do want and what's the best way
to get it.
I think that your options break down like this:
- If you're only running distribution-provided programs, you can
opt to leave Apport and its kin intact. Intercepting and magically
handling core dumps from standard programs is their bread and butter,
and the result will probably give you the smoothest way to file bug
reports with your distribution. Since you're not running your own
programs, you don't care about how Apport (doesn't) handle core dumps
for non-system programs.
- Disable any such system and set kernel.core_pattern to something
useful; I like 'core.%u.%p'. If the system only runs your services,
with no users having access to it, you might want to have all core
dumps written to some central directory that you monitor; otherwise,
you probably want to set it so that core dumps go in the process's
current directory.
The drawback of this straightforward approach is that you'll fail to capture core dumps from some processes, for example daemons whose current directory is somewhere they can't write to.
- Set up your own program to capture core dumps and save them
somewhere. The advantage of such a program is that you can capture
core dumps under more circumstances and also that you can immediately
trigger alerting and other things if particular programs or
processes die. You could even identify when you have a core dump
for a system program and pass the core dump on to Apport,
systemd-coredump, or whatever the distribution's native system is.
One drawback of this is that if you're not careful, your core dump handler can hang your system.
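As a sketch of how this pipe-based approach works: if you point kernel.core_pattern at a program with a leading '|', the kernel runs that program (as root) with the core dump on its standard input and the expanded specifiers as arguments. Everything specific here is an assumption for illustration: the handler path, the /var/cores directory, and the particular %p/%u/%e specifiers chosen.

```shell
#!/bin/sh
# Hypothetical core dump handler, wired up with:
#   kernel.core_pattern = |/usr/local/sbin/core-handler %p %u %e
# The kernel invokes it as root with the core dump on stdin and
# $1 = PID, $2 = real UID, $3 = executable name.
DIR=/var/cores              # assumed collection directory, root-owned
umask 077
cat >"$DIR/core.$3.$2.$1"   # must consume all of stdin
# This is the natural place to hook in alerting, e.g.:
logger -t core-handler "core dump from $3 (pid $1, uid $2)"
```

Note that the handler has to drain its standard input promptly; a handler that blocks here is one of the ways such a setup can wind up hanging things.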
If you have general people running things on your servers and those things may run into segfaults and otherwise dump core, it's my view that you probably want to do the middle option of just having them write traditional core files to the current directory. People doing development tend to like having core files for debugging, and this option is likely to be a lot easier than trying to educate everyone on how to extract core dumps from the depths of the system (if this is even possible; it's theoretically possible with systemd at least).
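Concretely, reverting to the traditional setup might look something like this (the sysctl.d filename is arbitrary, 'core.%u.%p' is just the pattern I like, and the systemctl line assumes an Ubuntu machine where Apport runs as a systemd service):

```shell
# Stop the distribution's interception of core dumps (on Ubuntu, Apport):
systemctl disable --now apport
# Set the pattern now and persist it across reboots; a relative
# pattern means core dumps land in the process's current directory:
sysctl -w kernel.core_pattern='core.%u.%p'
echo 'kernel.core_pattern = core.%u.%p' >/etc/sysctl.d/60-core-pattern.conf
# People still need a nonzero core file size limit in their shells:
ulimit -c unlimited
```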
Up until now we've just passively accepted the Apport default on our Ubuntu 16.04 systems, but we're now considering what we want to change for Ubuntu 18.04, and I've been reminded of this whole issue by Julia Evans' How to get a core dump for a segfault on Linux (where she ran into the Apport issue). I think we want to change things to the traditional 'write a core file' setup (which is how it was in Ubuntu 14.04).
PS: Since systemd now wants to handle core dumps, I suspect that this is going to be an issue in more and more Linux distributions. Or maybe everyone is going to make sure that that part of systemd doesn't get turned on.