Some ways to get (or not get) information about system memory ranges on Linux

July 14, 2021

I recently learned about lsmem, which is described as "list[ing] the ranges of available memory [...]". The source I learned it from was curious why lsmem on a modern 64-bit machine didn't list all of the low 4 GB as a single block (they were exploring kernel memory zones, where the low 4 GB of RAM are still a special 'DMA32' zone). To start with, I'll show typical lsmem default output from a machine with 32 GB of RAM:

; lsmem
RANGE                                  SIZE  STATE REMOVABLE  BLOCK
0x0000000000000000-0x00000000dfffffff  3.5G online       yes   0-27
0x0000000100000000-0x000000081fffffff 28.5G online       yes 32-259

Memory block size:       128M
Total online memory:      32G
Total offline memory:      0B

Lsmem is reporting information from /sys/devices/system/memory (see also memory-hotplug.txt). Both the sysfs hierarchy and lsmem itself apparently come originally from the IBM S390x architecture. Today this sysfs hierarchy apparently only exists for memory hotplug, and there are some signs that kernel developers aren't fond of it.

(Update: I'm wrong about where the sysfs memory hierarchy comes from; see this tweet from Dave Hansen.)

On the machines I've looked at, the hole reported by lsmem is real, in that /sys/devices/system/memory also doesn't have any nodes for that range (on the machine above, for blocks 28, 29, 30, and 31). The specific gap varies from machine to machine. However, all of the information from lsmem may well be a simplification of a more complex reality.
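As a rough illustration of how the block numbers relate to the address ranges lsmem prints, here is a minimal Python sketch. It assumes the usual sysfs layout of one memoryN directory per block plus a block_size_bytes file holding a hexadecimal size; the function names (read_block_size, list_blocks, group_blocks, block_range_to_addresses, report) are my own and not part of any tool.

```python
import os
import re

# Assumed location of the memory block hierarchy discussed above.
SYSFS_MEMORY = "/sys/devices/system/memory"


def read_block_size(root=SYSFS_MEMORY):
    """Return the memory block size in bytes (the file holds a hex number)."""
    with open(os.path.join(root, "block_size_bytes")) as f:
        return int(f.read().strip(), 16)


def list_blocks(root=SYSFS_MEMORY):
    """Return the sorted block numbers that have a memoryN directory."""
    blocks = []
    for name in os.listdir(root):
        m = re.fullmatch(r"memory(\d+)", name)
        if m:
            blocks.append(int(m.group(1)))
    return sorted(blocks)


def group_blocks(blocks):
    """Collapse sorted block numbers into inclusive (first, last) runs."""
    runs = []
    for b in blocks:
        if runs and b == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], b)  # extend the current run
        else:
            runs.append((b, b))  # a gap in numbering starts a new run
    return runs


def block_range_to_addresses(first, last, block_size):
    """Physical address range covered by blocks first..last, inclusive."""
    return first * block_size, (last + 1) * block_size - 1


def report(root=SYSFS_MEMORY):
    """Print one line per contiguous run of blocks, roughly like lsmem."""
    size = read_block_size(root)
    for first, last in group_blocks(list_blocks(root)):
        start, end = block_range_to_addresses(first, last, size)
        print(f"0x{start:016x}-0x{end:016x}  blocks {first}-{last}")
```

Calling report() on a machine with this hierarchy should print one line per contiguous run of blocks. With 128M blocks, missing blocks 28 through 31 correspond to the address gap between 0xdfffffff and 0x100000000 in the lsmem output above.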

The kernel also exposes physical memory range information in /proc/iomem (on modern kernels you'll probably have to read this as root to get real address ranges). This gives a much more complicated view of actual RAM, with many more holes than lsmem and /sys/devices/system/memory show. This is especially the case in the low 4 GB of memory, where for example the system above reports a whole series of chunks of reserved memory, PCI bus address space, ACPI tables and storage, and more. The high memory range is simpler, but still not quite the same:

100000000-81f37ffff : System RAM
81f380000-81fffffff : RAM buffer

/proc/iomem has a lot of information about PCI(e) windows and other things, so you may want to narrow down what you look at. On the system above, /proc/iomem has 107 lines but only nine of them are for 'System RAM', and all but one of those are in the physical address range that lsmem lumps into the 'low' 3.5 GB:

00001000-0009d3ff : System RAM
00100000-09e0ffff : System RAM
0a000000-0a1fffff : System RAM
0a20b000-0affffff : System RAM
0b020000-d17bafff : System RAM
d17da000-da66ffff : System RAM
da7e5000-da8eefff : System RAM
dbac7000-ddffffff : System RAM

(I don't have the energy to work out how much actual RAM this represents.)
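For anyone who does want to add these ranges up, a short script can do the arithmetic. This is a minimal sketch, assuming the 'start-end : name' line format shown above with inclusive end addresses and nested entries indented by spaces; the helper names (system_ram_ranges, total_bytes, report) are made up for illustration.

```python
import re

# Matches lines such as "0a000000-0a1fffff : System RAM".
IOMEM_LINE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.+?)\s*$")


def system_ram_ranges(lines):
    """Yield (start, end) for top-level 'System RAM' entries, end inclusive."""
    for line in lines:
        if line.startswith(" "):
            # Indented entries are nested inside a parent range; skip them
            # so that nothing is counted twice.
            continue
        m = IOMEM_LINE.match(line)
        if m and m.group(3) == "System RAM":
            yield int(m.group(1), 16), int(m.group(2), 16)


def total_bytes(ranges):
    """Sum the sizes of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


def report(path="/proc/iomem"):
    """Print each System RAM range with its size, then the total."""
    with open(path) as f:
        ranges = list(system_ram_ranges(f))
    for start, end in ranges:
        print(f"{start:012x}-{end:012x}  {(end - start + 1) / 2**20:10.1f} MiB")
    print(f"total: {total_bytes(ranges) / 2**30:.2f} GiB")
```

Run report() as root; without real addresses every range reads as zeros and the total is meaningless.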

Another view of physical memory range information is the kernel's report of the BIOS 'e820' memory map, printed during boot. On the system above, this says that the top of memory is actually 0x81f37ffff:

BIOS-e820: [mem 0x0000000100000000-0x000000081f37ffff] usable
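To pull these entries out of the kernel log mechanically, something like the following would work. It is a minimal sketch that assumes the 'BIOS-e820: [mem 0xSTART-0xEND] type' line format shown above; parse_e820 and top_of_usable are hypothetical helper names, and you would feed it the lines of the boot messages from wherever you normally read them.

```python
import re

# Matches e.g. "BIOS-e820: [mem 0x0000000100000000-0x000000081f37ffff] usable",
# with or without a leading kernel timestamp.
E820_LINE = re.compile(r"BIOS-e820: \[mem 0x([0-9a-f]+)-0x([0-9a-f]+)\] (.+)")


def parse_e820(lines):
    """Return a list of (start, end, kind) tuples, with end inclusive."""
    entries = []
    for line in lines:
        m = E820_LINE.search(line)
        if m:
            entries.append(
                (int(m.group(1), 16), int(m.group(2), 16), m.group(3).strip())
            )
    return entries


def top_of_usable(entries):
    """Highest address covered by any 'usable' entry."""
    return max(end for _, end, kind in entries if kind == "usable")
```

On the line above this gives a top of usable memory of 0x81f37ffff, which is 0xc80000 bytes (12.5 MiB) below the 0x81fffffff that lsmem reports. That difference is exactly the 'RAM buffer' range in /proc/iomem.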

I don't know if the Linux kernel exposes this information in /sys. You can also find various other things about physical memory ranges in the kernel's boot messages, but I don't know enough to analyze them.
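One possible place to look is /sys/firmware/memmap, which kernels built with CONFIG_FIRMWARE_MEMMAP provide as a view of the firmware-supplied memory map. As far as I know it has one numbered directory per entry, each holding start, end, and type files. The sketch below assumes that layout, assumes the addresses are written in hex, and probably needs root to read the files; read_firmware_memmap is a name I made up.

```python
import os

# Assumed location; only present on kernels built with CONFIG_FIRMWARE_MEMMAP.
FIRMWARE_MEMMAP = "/sys/firmware/memmap"


def _read(path):
    """Return the stripped contents of a small sysfs file."""
    with open(path) as f:
        return f.read().strip()


def read_firmware_memmap(root=FIRMWARE_MEMMAP):
    """Return (start, end, type) tuples sorted by start address.

    Each numbered subdirectory is assumed to hold 'start', 'end' and 'type'
    files, with the addresses in hex (a leading 0x is accepted).
    """
    entries = []
    for name in os.listdir(root):
        entry_dir = os.path.join(root, name)
        if not (name.isdigit() and os.path.isdir(entry_dir)):
            continue  # ignore anything that is not a numbered entry
        start = int(_read(os.path.join(entry_dir, "start")), 16)
        end = int(_read(os.path.join(entry_dir, "end")), 16)
        kind = _read(os.path.join(entry_dir, "type"))
        entries.append((start, end, kind))
    return sorted(entries)
```

If the directory does not exist on your system, os.listdir() will raise FileNotFoundError, which is a reasonable sign that the kernel was built without this feature.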

What's clear is that in general, a modern x86 machine's physical memory ranges are quite complicated. There are historical bits and pieces, ACPI and other data that is in RAM but must be preserved, PCI(e) windows, and other things.

(I assume that there is low level chipset magic to direct reads and writes for RAM to the appropriate bits of RAM, including remapping parts of the DIMMs around so that they can be more or less fully used.)

Comments on this page:

By Dave Hansen at 2021-07-16 00:51:19:

/sys/devices/system/memory wasn't ever really meant to be a canonical listing of all of the RAM on the system. It's really more of an enumeration of the memory regions of which the kernel is managing a part.

Written on 14 July 2021.

Last modified: Wed Jul 14 01:00:13 2021