A gotcha when making partial copies of Prometheus's database with rsync
A while back I wrote about how you can sensibly move or copy Prometheus's time series database (TSDB) with rsync. This is how we moved our TSDB, with metrics data back to late 2018, from a mirrored pair of 4 TB HDDs on one server to a mirrored pair of 20 TB HDDs on another one. In that entry I also mentioned that we were hoping to use this technique to keep a partial backup of our TSDB, one that covered the last year or two. It turns out that there is a little gotcha in doing this that makes it trickier than it looks.
The idea way to do such a partial backup was if rsync could exclude
or include files based on their timestamp. Unfortunately, as far as
I know it can't do that. Instead the simple brute force way is to
find to generate a list of what you want to copy and feed that
cd /data/prometheus/metrics2 rsync -a \ $(find * -maxdepth 0 -mtime -365 -type d -print) \ backupserv:/data/prometheus/metrics2/
As covered (more or less) in the Prometheus documentation on local
block directories in your TSDB are frozen after a final 31-day
compaction, and conveniently their final modification time is when
that last 31-day compaction happened. The find with '-maxdepth 0'
filters the command line arguments down to only things a year or
less old; this catches the frozen block directories for the past
year (and a bit), plus the
chunks_head directory of the live
block and the
wal directory of the write-ahead log.
However, it also captures other block directories. Blocks initially cover two hours, but are then compacted down repeatedly until they eventually reach their final 31-day compaction. During this compaction process you'll have a series of intermediate blocks, each of which is a (sub)directory in your TSDB top level directory. Most of these intermediate block directories will be removed over time. Well, they'll be removed over time in your live TSDB; if you replicate your TSDB over to your backupserv the way I have, there's nothing that's going to remove them on your backup. These directories for intermediate blocks will continue to be there in your backup, taking up space and containing duplicate data (which may cause Prometheus to be unhappy with your overall TSDB if you ever have to use this backup copy).
This can also affect you if you repeatedly rsync your entire TSDB without using '--delete'. Fortunately I believe I used 'rsync -a --delete' when moving our TSDB over.
The somewhat simple and relatively obviously correct approach to dealing with this is to send over a list of the directories that should exist to the backup server, and have something on the backup server remove any directories not listed. You'd want to make very sure that you've sent and received the entire list, so that you don't accidentally remove actually desired bits of your backups.
The more tricky approach would be to have rsync do the deletion as part of the transfer. Instead of selectively transferring named directories on the command line, you'd build an rsync filter file that only included directories that were the right age to be transferred, and then use that filter as you transferred the entire TSDB directory with rsync's --delete-excluded argument. This would automatically clean up both 31-day block directories that were now too old and young block directories that had been compacted away.
(You'd still determine the directories to be included with find, but you'd have to do more processing with the result. You could also look for directories that were too old, and set up an rsync filter that excluded them.)
I'm not sure what approach we'll use. I may want to prototype both and see which feels more dangerous. The non-rsync approach feels safer, because I can at least have the remote end audit what it's going to delete for things that are clearly wrong, like deleting a directory that's old enough that it should be a frozen, permanent one.
(Possibly this makes rsync the wrong replication tool for what I'm trying to do here. I don't have much exposure to alternates, though; rsync is so dominant in this niche.)