When Promtail seems to make position checkpoints (as of v2.6.1)

October 11, 2022

Promtail is the normal log-shipping client for the Grafana Loki log aggregation system. Like all log shipping programs, Promtail needs to keep track of which logs it has and hasn't sent to Loki, which it does by keeping track of its position in each log file or log source. In order to handle being stopped and restarted, and also system crashes (or Promtail crashes), it normally saves these positions in a file. Exactly what a position is depends on the specific log source that Promtail is using. When the log source is a file, Promtail uses and saves only a byte offset, but when Promtail is reading from the systemd journal, it uses the journal cursor. Every entry in the systemd journal has a unique identifier, the cursor, and so if the journal hasn't been truncated you can always resume reading from that entry by using its cursor.
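To make the idea of a saved position concrete, here is a minimal sketch of a positions store. It is not Promtail's code or its on-disk format; the function names, the JSON encoding, and the example keys and cursor string are all made up for illustration. The only point is that a position is an opaque string per log source (a byte offset for a file, a cursor for the journal) and that it has to be written in a way that survives a crash.

```python
import json
import os
import tempfile


def save_positions(path, positions):
    """Write the positions atomically so a crash never leaves half a file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_name = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(positions, handle)
    # os.replace() swaps the new file into place in a single step.
    os.replace(tmp_name, path)


def load_positions(path):
    """Return the saved positions, or an empty dict on first start."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


if __name__ == "__main__":
    # Hypothetical sources: a file tracked by byte offset, and the journal
    # tracked by an (invented) cursor string.
    example = {"/var/log/syslog": "10240", "journal": "s=abc;i=1f"}
    target = os.path.join(tempfile.mkdtemp(), "positions.json")
    save_positions(target, example)
    print(load_positions(target))
```

On restart, a shipper built this way would look up each source's string and either seek to the byte offset or ask the journal to resume from the cursor.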

Promtail pushes logs to Loki (instead of Loki pulling logs from it). If your Loki is down for some reason, ranging from a crash to scheduled maintenance to a large scheduled power outage, Promtail will buffer log entries and retry for some amount of time, as set up in the 'backoff_config' subsection of the clients: section of the configuration file. If the retries run out, the fine configuration file documentation says 'logs are lost'. However, at this point you might wonder how Promtail's positions interact with this, especially for the systemd journal. In theory, Promtail could record only the position (ie, the systemd journal cursor) of the last successfully shipped log line, and then resume more or less seamlessly even after a long Loki downtime. The code for this sort of resuming already exists, because it's what's used if you shut Promtail down for a few hours and then start it again.
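How long 'some amount of time' is depends on your backoff settings. As a rough sketch, assuming a simple doubling schedule with no jitter (real retry logic may add some), you can add up the waits yourself. The function name is mine, and the numbers are what I believe the documented defaults to be (a 500 ms minimum period, a 5 minute maximum period, and 10 retries); treat them as assumptions and check your own configuration.

```python
def retry_schedule(min_period, max_period, max_retries):
    """Return the wait in seconds before each retry, doubling up to a cap."""
    waits = []
    wait = min_period
    for _ in range(max_retries):
        waits.append(min(wait, max_period))
        wait *= 2
    return waits


if __name__ == "__main__":
    # Assumed values; substitute whatever your own backoff_config says.
    waits = retry_schedule(min_period=0.5, max_period=300.0, max_retries=10)
    print(f"gives up after about {sum(waits) / 60:.1f} minutes")
```

With those assumed values the waits sum to 511.5 seconds, or about eight and a half minutes, which is not much cover for a planned outage of any real length.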

The unfortunate current answer is that Promtail checkpoints the position when it reads log lines from the log source, not when it has successfully sent them to Loki. As far as I can tell from some experiments, this happens even if Promtail has existing lines buffered that it has failed to ship; Promtail will keep reading from the systemd journal and keep updating the cursor position it will 'resume' from. If your Loki is down long enough, Promtail will then unnecessarily lose systemd journal log entries.
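A toy model shows why the choice of checkpoint time matters. This is a deliberately simplified sketch, not a description of Promtail's internals: the function names are invented, a failed send stands in for 'retries ran out and the entry was dropped', and the checkpoint-on-acknowledgement variant simply stops reading at the first entry it can't ship.

```python
def ship(journal, start, loki_up_at, checkpoint_on_read):
    """Read journal[start:] and try to ship each entry.

    Returns (saved_position, shipped_entries). A failed send models an
    entry that was dropped after its retries ran out.
    """
    saved = start
    shipped = []
    for index in range(start, len(journal)):
        if checkpoint_on_read:
            saved = index + 1  # position advances as soon as we read
        if loki_up_at(index):
            shipped.append(journal[index])
            saved = index + 1  # acknowledged, so safe in either model
        elif not checkpoint_on_read:
            break  # hypothetical: hold position at the first unshipped entry
    return saved, shipped


def lost_entries(journal, outage, checkpoint_on_read):
    """Entries never delivered after an outage followed by a restart."""
    # First run: Loki is unreachable for the indexes in `outage`.
    saved, first = ship(journal, 0, lambda i: i not in outage, checkpoint_on_read)
    # Second run: Loki is back, and we resume from the saved position.
    _, second = ship(journal, saved, lambda i: True, checkpoint_on_read)
    delivered = set(first) | set(second)
    return [entry for entry in journal if entry not in delivered]


if __name__ == "__main__":
    journal = list(range(10))
    outage = {3, 4, 5}
    print("checkpoint on read:", lost_entries(journal, outage, True))
    print("checkpoint on ack: ", lost_entries(journal, outage, False))
```

In this model, checkpointing on read loses exactly the entries read during the outage, because the saved position has already moved past them. Checkpointing on acknowledgement loses nothing, since the restart resumes from the first entry that was never shipped.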

The consequence of this is that if you're deliberately shutting down your Loki server for longer than your Promtail retry interval, you want to first shut down Promtail on all clients. When you shut down Promtail it will freeze its systemd journal position at what it actually managed to ship, and then later when your Loki is back up, you can restart all your Promtails and collect all of the missed journal entries.

(This was an issue for us not too long ago because we had a planned machine room AC outage that we wanted to keep only a minimum number of machines up through, and the Loki server did not make the cut.)

In an ideal world Promtail would fix this sometime. In this world, I suspect that Promtail's internal architecture makes this complex, and that this particular situation is a rare corner case.

(The systemd journal is the best case for saving a position that will be reliable over the long term. This is harder with actual log files, which may be rotated or truncated before Promtail has a chance to resume reading them.)

Written on 11 October 2022.

Last modified: Tue Oct 11 21:47:45 2022