The practical (Unix) problems with .cache and its friends

February 4, 2025

Over on the Fediverse, I said:

Dear everyone writing Unix programs that cache things in dot-directories (.cache, .local, etc): please don't. Create a non-dot directory for it. Because all of your giant cache (sub)directories are functionally invisible to many people using your programs, who wind up not understanding where their disk space has gone because almost nothing tells them about .cache, .local, and so on.

A corollary: if you're making a disk space usage tool, it should explicitly show ~/.cache, ~/.local, etc.

If you haven't noticed, there are an ever-increasing number of programs that will cache a bunch of data, sometimes a very large amount of it, in various dot-directories in people's home directories. If you're lucky, these programs put their cache somewhere under ~/.cache; if you're semi-lucky, they use ~/.local; and if you're not lucky, they invent their own directory, like ~/.cargo (used by Rust's standard build tool because it wants to be special). It's my view that this is a mistake and that everyone should put their big caches in a clearly visible directory or directory hierarchy, one that people can actually find in practice.

I will freely admit that we are in a somewhat unusual environment where we have shared fileservers, a now very atypical general multi-user environment, a compute cluster, and a bunch of people who are doing various sorts of modern GPU-based 'AI' research and learning (both AI datasets and AI software packages can get very big). In our environment, with our graduate students, it's routine for people to wind up with tens or even hundreds of GBytes of disk space used up for caches that they don't even realize are there because they don't show up in conventional ways to look for space usage.

As noted by Haelwenn /элвэн/, a plain 'du' will find such dotfiles. The problem is that plain 'du' is more or less useless for most people; to really take advantage of it, you have to know the right trick (not just the -h argument but feeding it to sort to find things). I think that most people use 'du' to find space hogs by starting in their home directory with 'du -s *' (or maybe 'du -hs *') and then looking at whatever big things show up. This will completely miss things in dot-directories in normal usage, because the shell's '*' doesn't match names that start with a dot. And on Linux desktops, I believe that common GUI file browsers will omit dot-directories by default and may not even have a particularly accessible option to change that (this is certainly the behavior of Cinnamon's 'Files' application and I can't imagine that GNOME is different, considering their attitude).
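
For illustration, here is a minimal sketch of a 'du' invocation that doesn't skip dot-directories. The function name top_usage is made up for this example, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do, but they aren't in POSIX). It lets find enumerate the entries instead of relying on the shell's '*', and reports sizes in KiB so that a plain numeric sort works.

```shell
# Hypothetical helper: per-entry disk usage of a directory's immediate
# children, hidden ones included, smallest first and biggest last.
# Assumes a find with -mindepth/-maxdepth (GNU or BSD find).
top_usage() {
    dir=${1:-$HOME}
    # find lists every immediate child, dot-directories and all;
    # du -sk prints one total per entry in KiB; sort -n orders by that total.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + 2>/dev/null | sort -n
}
```

Run as 'top_usage' with no argument, it looks at your home directory, and a large ~/.cache or ~/.local winds up at the bottom of the output next to any big visible directories.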

(I'm not sure what our graduate students use to try to explore their disk usage, but I know that multiple graduate students have been unable to find space being eaten up in dot-directories and have been surprised that their home directory was using so much.)

Written on 04 February 2025.

Last modified: Tue Feb 4 22:53:04 2025