The practical (Unix) problems with .cache and its friends

February 4, 2025

Over on the Fediverse, I said:

Dear everyone writing Unix programs that cache things in dot-directories (.cache, .local, etc): please don't. Create a non-dot directory for it. Because all of your giant cache (sub)directories are functionally invisible to many people using your programs, who wind up not understanding where their disk space has gone because almost nothing tells them about .cache, .local, and so on.

A corollary: if you're making a disk space usage tool, it should explicitly show ~/.cache, ~/.local, etc.

If you haven't noticed, there are an ever-increasing number of programs that will cache a bunch of data, sometimes a very large amount of it, in various dot-directories in people's home directories. If you're lucky, these programs put their cache somewhere under ~/.cache; if you're semi-lucky, they use ~/.local; and if you're not lucky, they invent their own directory, like ~/.cargo (used by Rust's standard build tool because it wants to be special). It's my view that this is a mistake and that everyone should put their big caches in a clearly visible directory or directory hierarchy, one that people can actually find in practice.

I will freely admit that we are in a somewhat unusual environment where we have shared fileservers, a now very atypical general multi-user environment, a compute cluster, and a bunch of people who are doing various sorts of modern GPU-based 'AI' research and learning (both AI datasets and AI software packages can get very big). In our environment, with our graduate students, it's routine for people to wind up with tens or even hundreds of GBytes of disk space used up for caches that they don't even realize are there because they don't show up in conventional ways to look for space usage.

As noted by Haelwenn /элвэн/, a plain 'du' will find such dotfiles. The problem is that plain 'du' is more or less useless for most people; to really take advantage of it, you have to know the right trick (not just the -h argument but feeding it to sort to find things). How I think most people use 'du' to find space hogs is they start in their home directory with 'du -s *' (or maybe 'du -hs *') and then they look at whatever big things show up. This will completely miss things in dot-directories in normal usage. And on Linux desktops, I believe that common GUI file browsers will omit dot-directories by default and may not even have a particularly accessible option to change that (this is certainly the behavior of Cinnamon's 'Files' application and I can't imagine that GNOME is different, considering their attitude).
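As a minimal sketch of that 'right trick', assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do, but they are not in POSIX), something like the following would summarize every top-level entry of a directory, dot-directories included, with the largest first. The function name du_all is made up for illustration.

```shell
# Summarize disk usage of every top-level entry in a directory,
# including dot-files and dot-directories, largest first.
# 'du_all' is a hypothetical helper name, not a standard tool.
du_all() {
    dir="${1:-$HOME}"
    # -mindepth 1 -maxdepth 1 selects each immediate child, hidden or not
    # (these two options are extensions, assumed to be available here).
    # 'du -sk' prints one total per child in KiB; 'sort -rn' puts the
    # biggest totals at the top.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn
}
```

Running 'du_all ~ | head' would then show the ten largest entries in a home directory whether or not their names start with a dot, which 'du -s *' cannot do because the shell's '*' does not match dot-names.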

(I'm not sure what our graduate students use to try to explore their disk usage, but I know that multiple graduate students have been unable to find space being eaten up in dot-directories and have been surprised that their home directory was using so much.)


Comments on this page:

Dear everyone writing Unix programs that cache things in dot-directories (.cache, .local, etc): please don't.

Those defaults are required by the XDG base directory specification, so your suggestion will produce complaints. But you could put something like this in your shell startup script (and the default startup script for your students):

export XDG_CACHE_HOME="$HOME/cache"

I don't know if there's an equivalent for stuff like .cargo, though it should probably be using $XDG_CACHE_HOME. It'd be easy for them to check for and use .cargo if present, or use the XDG location if not.
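For programs that do follow the XDG specification, the lookup that makes the export above take effect can be sketched roughly as follows. The spec says to use $XDG_CACHE_HOME when it is set and non-empty, to fall back to $HOME/.cache otherwise, and to ignore relative paths; the function name cache_dir is hypothetical, and any given program may well implement this differently.

```shell
# Print the cache directory a spec-following program would use.
# 'cache_dir' is a hypothetical helper name for illustration only.
cache_dir() {
    case "${XDG_CACHE_HOME:-}" in
        # An absolute path is honored as-is.
        /*) printf '%s\n' "$XDG_CACHE_HOME" ;;
        # Unset, empty, or relative values fall back to the default.
        *)  printf '%s\n' "$HOME/.cache" ;;
    esac
}
```

With the export in a shell startup file, cache_dir would print the visible "$HOME/cache"; without it, the hidden default.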

For annoying software, symbolic links to more visible locations are a possibility. For really annoying software, I use bwrap to manipulate its view of the filesystem. Often, the caching is so pointless or undesirable (I generally don't want my system keeping records of what I do) that I'll use bwrap's --tmpfs option for the cache. The XDG rules don't require anyone to ever clear their cache, or to be prepared for an external process to do it.
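The symbolic link approach could look something like this sketch. The helper name relocate_dotdir and the example paths in the comments are invented for illustration, and it assumes the program in question is not running while its directory is moved.

```shell
# Move a hidden directory to a visible location and leave a symlink
# behind so the program still finds it at the old path.
# 'relocate_dotdir' is a hypothetical helper, not an existing tool.
relocate_dotdir() {
    src="$1"   # e.g. "$HOME/.someprog" (illustrative path)
    dst="$2"   # e.g. "$HOME/someprog-cache" (illustrative path)
    # Only act on a real directory, never on an existing symlink.
    [ -d "$src" ] && [ ! -L "$src" ] || return 1
    # Never overwrite something already at the destination.
    [ ! -e "$dst" ] || return 1
    mv "$src" "$dst" && ln -s "$dst" "$src"
}
```

The space then shows up under the visible name in 'du -s *', while the program keeps using its usual dot-path.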

By deltragon at 2025-02-05 04:38:49:

For Cargo, there are currently discussions about also following the XDG spec, i.e. using ~/.cache. However, progress is being slowed by backwards compatibility and the need to coordinate between multiple tools.

By M S at 2025-02-08 12:29:46:

One of my favorite cli utilities for this is ncdu:

https://dev.yorhel.nl/ncdu

By Aram Akhavan at 2025-02-09 23:58:30:

+1 for ncdu. We rolled it out, and not only does it seem to run faster than vanilla du (I would imagine they use the same number of syscalls, so maybe it's just the perception from having a progress bar), but it also makes it much easier to find and delete files.

Written on 04 February 2025.

Last modified: Tue Feb 4 22:53:04 2025