Some notes on searching the systemd journal with journalctl
Yesterday I wrote about how I like using Grafana Loki for narrow
searches of our logs. In the process of writing that entry, it
occurred to me that systemd's journalctl might have some search
features too that I could use as an alternative to Loki. The answer
is that yes, a modern journalctl does (or at least probably does,
since some of this depends on its build options).
By now, hopefully everyone knows about using 'journalctl -u <what>'
to show logs from only a single service, and also 'journalctl
--since ..', which takes both absolute and relative times in a
convenient syntax (there's also '--until', to restrict to a time
range, but generally I only use one and just stop looking after a
certain point). If you're fishing for the systemd unit associated
with a log message, you can use 'journalctl -o with-unit', although
this won't always show you the answer. If you're using 'with-unit',
you may also want '--no-hostname' so the output is less cluttered.
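As a small sketch of how these options combine, here is a shell function wrapping them. The name recent_unit_logs and the default of one hour are made up for illustration; only the journalctl options themselves come from the text above.

```shell
# Hypothetical helper: recent_unit_logs UNIT [SINCE]
# Shows one unit's logs since SINCE (default: the last hour),
# labelled with unit names and without the hostname column.
recent_unit_logs() {
    unit=$1
    since=${2:--1h}   # relative time, in the same style as '--since -31d'
    journalctl -u "$unit" --since "$since" -o with-unit --no-hostname
}
```

For example, 'recent_unit_logs crond.service -2d' would show the last two days of cron logs.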
The big additional option is 'journalctl -g', aka --grep, which
does what you'd expect; it takes a (Perl-compatible) regular
expression and shows you logs where the message matches the regular
expression. This match can be case sensitive or case insensitive.
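On the case sensitivity point: according to the journalctl manual page, an all-lowercase pattern is matched case insensitively and any other pattern case sensitively, with '--case-sensitive' available to override that. A minimal sketch that always forces an exact-case match (the function name grep_journal_exact is hypothetical, and this assumes a journalctl built with pattern matching support):

```shell
# Hypothetical helper: grep_journal_exact PATTERN [other journalctl options]
# Forces a case-sensitive match regardless of how PATTERN is written.
grep_journal_exact() {
    pattern=$1
    shift
    journalctl --grep "$pattern" --case-sensitive=true "$@"
}
```

Something like "grep_journal_exact 'Failed password' -u sshd.service" would then only match that exact capitalization.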
Among the other selection options besides -u, you can get the kernel
messages with 'journalctl -k', logs for a particular syslog
identifier with 'journalctl -t', logs for a particular syslog
priority (or message priority) with 'journalctl -p', and logs for a
syslog facility with 'journalctl --facility'. Conveniently, if you
specify a single priority (aka log level), you get that priority or
more important (which is called 'lower' for reasons to do with how
syslog priorities are represented). These can be combined, so you
can write:
journalctl -r --facility daemon -p notice
Journalctl can also match against specific message fields, although
it looks like there's little or no wild card support. If you match
on multiple fields, all fields must match. There are two ways to
find out what fields and field values you have available. First,
you can use 'journalctl -N' to get the names of all fields (which
are returned in a random, unsorted order), and then 'journalctl -F
...' to see all of the values of a particular field (again,
unsorted). The well known fields and their meanings are covered in
systemd.journal-fields.
Once you have the field and field value of interest, you can then do
eg:
journalctl -r _TRANSPORT=syslog
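Since both listings come back unsorted, it can help to wrap them. A minimal sketch (field_names and field_values are made-up names, not journalctl features):

```shell
# Hypothetical helpers around 'journalctl -N' and 'journalctl -F'.

# field_names: every field name present in the journal, sorted.
field_names() {
    journalctl -N | sort
}

# field_values FIELD: every recorded value of FIELD, sorted.
field_values() {
    journalctl -F "$1" | sort
}
```

With a value picked from 'field_values _TRANSPORT', a match on several fields at once looks like 'journalctl -r _TRANSPORT=syslog _SYSTEMD_UNIT=crond.service', which only shows entries where both fields match.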
The other way is to dump some journal entries in JSON format, run them through jq, and see what you get. You can optionally restrict this to certain fields:
journalctl -o json -r | jq . | less
journalctl -o json -r | jq '[._CMDLINE, .MESSAGE]' | less
You can use any of the filtering options to cut down how many messages
you have to pick through in JSON format. Unfortunately, 'journalctl
-F ...' doesn't accept any options to narrow things down, so you can't
do handy things like see what executables are recorded for a particular
service. If you want that, you can do something like:
journalctl -u crond.service --since -31d -o json | jq -r '._EXE' | sort -u
I don't know if the indexing information necessary to determine this is part of the systemd journal index; if it's not, this sort of thing may be the best you can do within systemd.
PS: It's possible to forward the systemd journal using things like systemd-journal-remote and systemd-journal-gatewayd, but it seems much more work to set up, especially if you want some security. We may experiment with this someday (it would go well with our central syslog server), but probably not any time soon.
Sidebar: (re)formatting journalctl output a bit
In theory journalctl allows you to control what fields are printed,
with 'journalctl --output-fields ...'. In practice this is not
particularly useful for two reasons. First, you have to use this
with a special output format, generally either 'verbose' or 'cat'.
If you use 'verbose', you get a conveniently formatted timestamp but
also a chunk of forced contents because of the extra fields that
are always included (in an encoded form). If you use 'cat', you get
nothing for free, including formatted timestamps; you need to include
the 'SYSLOG_TIMESTAMP' field and sort of hope. Second, regardless
of what output format you choose you get each field on a line by
itself, with no option to format them all on one line.
As a result, in most situations I think you're probably better
off using JSON output and then reaching for jq to reformat things
into a useful text format (see my notes about formatting text with
jq). You can probably use jq to reformat journalctl's raw timestamps
into useful time formats, too. If you're doing very much with this
you're probably going to wind up putting the whole thing in a script,
unless you're much better at on the fly jq command lines than I am.
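As a rough sketch of what such a script might look like, the function below turns JSON journal entries into one line each. It assumes jq is installed and that entries carry the '__REALTIME_TIMESTAMP' field, which systemd.journal-fields describes as microseconds since the epoch. The name journal_oneline and the choice of fields are made up for illustration, and the times come out in UTC.

```shell
# Hypothetical helper: reads 'journalctl -o json' output on stdin and
# prints "UTC-timestamp unit message", one entry per line.
journal_oneline() {
    jq -r '
        (.__REALTIME_TIMESTAMP | tonumber / 1000000 | floor
            | strftime("%Y-%m-%d %H:%M:%S")) as $ts
        | (._SYSTEMD_UNIT // "-") as $unit      # "-" if there is no unit
        | "\($ts) \($unit) \(.MESSAGE | tostring)"
    '
}
```

This would be used as 'journalctl -o json -u crond.service --since -1d | journal_oneline'. The 'tostring' is there because MESSAGE is not always a plain JSON string.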
PPS: I can't blame journalctl too much for not providing a general
facility for formatting its output lines. Formatting output is a
potentially complex subject, and journalctl exists in a world with
tools like jq. In the Unix tradition, it's fine to defer extensive
reformatting to other programs.
(This is where I wish awk would read JSON.)