== How _/proc/slabinfo_ is not quite telling you what it looks like

The Linux kernel does a lot (although not all) of its interesting internal memory allocations through a [[slab allocator http://en.wikipedia.org/wiki/Slab_allocation]]. For quite a while it's exposed per-type details of this process in _/proc/slabinfo_; this is very handy to get an idea of just what in your kernel is using up a bunch of memory. Today I was exploring this because I wanted to look into [[ZFS on Linux's memory usage ZFSonLinuxWeakAreas]] and wound up finding out that on modern Linuxes it's [[a little bit misleading https://twitter.com/thatcks/status/519969525601935361]].

(By 'found out' I mean that DeHackEd on the #zfsonlinux IRC channel explained it to me.)

Specifically, on modern Linux the names shown in _slabinfo_ are basically a hint because [[the current slab allocator in the kernel http://lwn.net/Articles/229984/]] merges multiple slab types together if they are sufficiently similar. If five different subsystems all want to allocate (different) 128-byte objects with no special properties, they don't each get separate slab types with separate _slabinfo_ entries; instead they are all merged into one slab type and thus one _slabinfo_ entry. That _slabinfo_ entry normally shows the name of one of them, probably the first to be set up, with no direct hint that it also includes the usage of all the others. (The others don't appear in _slabinfo_ at all.)

Most of the time this is a perfectly good optimization that cuts down on the number of slab types and enables better memory sharing and reduced fragmentation. But it does mean that you can't tell the memory used by, say, ((btree_node)) apart from ((ip_mrt_cache)) (on my machine, both are one of a lot of slab types that are actually all mapped to the generic 128-byte object).
It can also leave you wondering where your slab types actually went, if you're inspecting code that creates a certain slab type but you can't find it in _slabinfo_ (which is what happened to me).

The easiest way to see this mapping is to look at _/sys/kernel/slab_; all the symlinks there are slab types that may have been merged, and symlinks that point at the same target are really the same slab. You can decode what is what by hand, but if you're going to do this regularly you should get a copy of _tools/vm/slabinfo.c_ from the kernel source and compile it; see [[the kernel SLUB documentation https://www.kernel.org/doc/Documentation/vm/slub.txt]] for details. You want '_slabinfo -a_' to report the mappings.

(Sadly _slabinfo_ is underdocumented. I wish it had a manpage or at least a README.)

If you need to track the memory usage of specific slab types, perhaps because you really want to know the memory usage of one subsystem, the easiest way is apparently to boot with the ((slub_nomerge)) [[kernel command line argument KernelCmdlineProcessing]]. Per [[the kernel parameter documentation https://www.kernel.org/doc/Documentation/kernel-parameters.txt]] this turns off all slab merging, which may result in you having a lot more slabs than usual.

(On my workstation, slab merging condenses 110 different slabs into 14 actual slabs. On a random server, 170 slabs turn into 35 and a bunch of the pre-merger slabs are probably completely unused.)

=== Sidebar: disabling this merging in kernel code

The SLUB allocator does not directly expose a way of disabling this merging when you call _``kmem_cache_create()''_, in that there's no 'do not merge, really' flag to the call. However, it turns out that supplying at least one of a number of SLUB debugging flags will disable this merging. On a kernel built without ((CONFIG_DEBUG_KMEMLEAK)), using ((SLAB_NOLEAKTRACE)) appears to have absolutely no other effects from what I can tell. Both Fedora 20 and Ubuntu 14.04 build their kernels without this option.
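If you want to check this for some other kernel, here is a minimal Python sketch that looks for an option in the text of a kernel build config. It assumes the usual config file format, where an enabled option appears as a line like ((CONFIG_FOO=y)) and a disabled one as a '# CONFIG_FOO is not set' comment; the function name is mine, not anything standard.

```python
import re


def config_option_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel build config text.

    Assumes the conventional format: enabled options are 'CONFIG_FOO=y'
    (or '=m'), disabled ones are '# CONFIG_FOO is not set' comments.
    """
    # Anchor on the full option name followed by '=' so that longer names
    # sharing the same prefix do not match by accident.
    pattern = re.compile(
        r"^" + re.escape(option) + r"=(y|m)[ \t]*$", re.MULTILINE
    )
    return pattern.search(config_text) is not None


# A small made-up config fragment, purely for illustration.
sample = (
    "# CONFIG_DEBUG_KMEMLEAK is not set\n"
    "CONFIG_SLUB=y\n"
)
kmemleak_on = config_option_enabled(sample, "CONFIG_DEBUG_KMEMLEAK")
```

On a real system you would feed it the contents of whatever build config file your distribution ships for the running kernel.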
(I believe that most Linux distributions put a copy of the kernel build config in _/boot_ when they install kernels.)

This may be handy if you have [[some additional kernel modules http://zfsonlinux.org/]] that you want to be able to track memory use for specifically even though a number of their slabs would normally get merged away, and you're compiling from source and willing to make some little modifications to it.

You can see the full set of flags that force never merging in the _#define_ for ((SLUB_NEVER_MERGE)) in _mm/slub.c_. On a quick look, none of the others are either harmless or always defined as a non-zero value. It's possible that ((SLAB_DEBUG_FREE)) also does nothing these days; if used it will make your slabs only mergeable with other slabs that also specify it (which no slabs in the main kernel source do). That would cause slabs from your code to potentially be merged together but they wouldn't merge with anyone else's slabs, so at least you could track your subsystem's memory usage.

Disclaimer: these ideas have been at most compile-tested, not run live.
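Going back to decoding the _/sys/kernel/slab_ symlinks by hand: the following is a rough Python sketch of one way to do it, not a replacement for '_slabinfo -a_'. It assumes that merged slab types show up as symlinks in that directory and that symlinks sharing a target have been merged into the same underlying slab; the function names are made up for illustration.

```python
import os
from collections import defaultdict


def merged_slab_groups(slab_dir="/sys/kernel/slab"):
    """Group slab names by the symlink target they point at.

    Returns a dict mapping each target's basename to the sorted list of
    slab names that are symlinks to it. Entries in slab_dir that are not
    symlinks are skipped.
    """
    groups = defaultdict(list)
    for name in os.listdir(slab_dir):
        path = os.path.join(slab_dir, name)
        if os.path.islink(path):
            # Only the last path component of the target is kept as the key.
            target = os.path.basename(os.readlink(path))
            groups[target].append(name)
    return {target: sorted(names) for target, names in groups.items()}


def print_merged_slabs(slab_dir="/sys/kernel/slab"):
    """Print one line per target that more than one slab name points at."""
    for target, names in sorted(merged_slab_groups(slab_dir).items()):
        if len(names) > 1:
            print(target, "<-", " ".join(names))
```

Calling ((print_merged_slabs())) on a machine using SLUB should list each shared target followed by the names that were folded into it, which is enough to find out where a slab type you expected to see in _slabinfo_ has gone.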