Why it matters that map values are unaddressable in Go
A while ago, I wrote Addressable values in Go (and unaddressable ones too) as an attempt to get straight this tricky concept in Go, which I hadn't fully understood. To refresh, the Go specification's core description of this is covered in Address operators:
For an operand x of type T, the address operation &x generates a pointer of type *T to x. The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array. As an exception to the addressability requirement, x may also be a (possibly parenthesized) composite literal. [...]
One of the things that is explicitly not addressable is a value in a map. As I mentioned in the original entry, taking the address of a map value, such as '&m["key"]', is an error.
On the surface this looks relatively unimportant. There aren't many situations where you might naturally explicitly take the address of a map value. But there turns out to be an important consequence of this, brought to my attention recently by this article.
One important thing in Go that addressability affects is Assignments.
Suppose that you have map values that are structs with fields. Because map values are not addressable and field selectors can only be applied to addressable struct operands, you cannot directly assign values to the fields of map values. The following is an error:
m["key"].field = 10
This will give you the clear error of 'cannot assign to struct field m["key"].field in map'. To make this work, you must assign the map value to a temporary variable, modify the temporary, and put it back in the map:
t := m["key"]
t.field = 10
m["key"] = t
One reason I can think of for this restriction is that otherwise, Go might be required to silently materialize struct values in maps as a consequence of what looks like a simple field assignment. Consider:
m["nosuchkey"].field = 10
If this was to work, it would have to have the side effect of creating an entire m["nosuchkey"] value and setting it in the map for the key. Instead Go refuses to allow it, at compile time.
In the usual way of addressable values in Go, this all works if the map values are pointers to structs, and the syntax is exactly the same. This implies that in some cases you can convert map values from pointers to structs to the structs themselves without any code changes or errors, and in some cases you can't. (However, with pointer map values, 'm["nosuchkey"].field = 10' would be a runtime panic instead of a compile time error. When you deal with explicit pointers, Go makes you accept this possibility.)
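To make the difference concrete, here is a small runnable sketch; the 'item' type, its 'field', and the keys are made up for illustration:

```go
package main

import "fmt"

// item is a made-up struct type for illustration.
type item struct{ field int }

// setField updates one field of a struct value stored in a map. Because
// map values are unaddressable, we must copy the value out, modify the
// copy, and store it back.
func setField(m map[string]item, key string, v int) {
	t := m[key] // copy the value out
	t.field = v // modify the copy
	m[key] = t  // store it back
}

func main() {
	m := map[string]item{"key": {}}
	// m["key"].field = 10 // does not compile: cannot assign to struct field
	setField(m, "key", 10)
	fmt.Println(m["key"].field) // 10

	// With pointer map values the direct field assignment compiles and
	// works, but only because the entry for "key" already exists.
	pm := map[string]*item{"key": {}}
	pm["key"].field = 10
	fmt.Println(pm["key"].field) // 10

	// With pointers, a missing key gives a nil pointer and a runtime
	// panic instead of a compile time error:
	//   pm["nosuchkey"].field = 10 // panics
}
```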
This also affects method calls (and method values) in some situations, because of this special case:
If x is addressable and &x's method set contains m, x.m() is shorthand for (&x).m(). [...]
If you have a type T and there is a pointer receiver method *T.Mp(), you can normally call .Mp() even on a non-pointer value of type T:

var v T
v.Mp()
However, this requires that the value be addressable. Since map values are not addressable, the following is an error (when the type of the map values is T):

m["key"].Mp()
Currently, you get two errors for this (reported on the same location):
cannot call pointer method on m["key"]
cannot take the address of m["key"]
This is the same error message as we saw for function return values in my original entry, just about a different thing. As before, converting the map value type from T to *T will make this not an error, and all of the syntax is exactly the same.
As with the field access case, Go not allowing this means that it doesn't have to consider what to do if you write:

m["nosuchkey"].Mp()
While there are various plausible options for what could happen here if Go accepted it, I think the one that most people would expect is that it would work the same as:
t := m["nosuchkey"]
t.Mp()
m["nosuchkey"] = t
Which is to say, Go would have to materialize a value and then add it to the map. As a subtle issue, the working version makes it clear that afterward, m["nosuchkey"] actually exists. This also makes it explicit that the method call isn't manipulating the value that is in the map.
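Here is a runnable sketch of that explicit version, with a made-up 'counter' type and 'Inc' pointer receiver method standing in for T and Mp():

```go
package main

import "fmt"

type counter struct{ n int }

// Inc has a pointer receiver, so calling it requires an addressable value.
func (c *counter) Inc() { c.n++ }

// incInMap is the explicit version of what 'm[key].Inc()' would have to
// mean if Go allowed it: materialize a value, modify it, store it back.
func incInMap(m map[string]counter, key string) {
	t := m[key] // the zero value if the key doesn't exist yet
	t.Inc()     // t is a variable, so it's addressable and this is fine
	m[key] = t  // this visibly (re)creates the map entry
}

func main() {
	m := map[string]counter{}
	// m["nosuchkey"].Inc() // does not compile: cannot call pointer method
	incInMap(m, "nosuchkey")
	fmt.Println(m["nosuchkey"].n) // 1
}
```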
(My original entry was sparked by a Dave Cheney pop quiz involving the type of a function return, so I was thinking more about function return values than other sorts of values.)
PS: I think this lack of map value addressability means that there's no way today in Go to directly modify a map value or its fields. Instead you must copy the map value into a temporary, manipulate the temporary, and then put it back in the map. This is probably a feature.
Apache's mod_wsgi and the Python 2 issue it creates
If you use Apache (as we do) and have relatively casual WSGI-based applications (again, as we do), then Apache's mod_wsgi is often the easiest way to deploy your WSGI application. Speaking as a system administrator, it's quite appealing to not have to manage a separate configuration and a separate daemon (and I still get process separation and different UIDs). But at the moment there is a little problem, at least for people (like us) who use their Unix distribution's provided version of Apache and mod_wsgi rather than building their own. The problem is that any given build of mod_wsgi only supports one version of (C)Python.
(Mod_wsgi contains an embedded CPython interpreter, although generally it's not literally embedded; instead mod_wsgi is linked to the appropriate libpython shared library.)
In the glorious future there will only be (some version of) Python 3, and this will not be an issue. All of your WSGI programs will be Python 3, mod_wsgi will use some version of Python 3, and everything will be relatively harmonious. In the current world, there is still a mixture of Python 2 and Python 3, and if you want to run a WSGI based program written in a different version of Python than your mod_wsgi supports, you will be sad. As a corollary of this, you just can't run both Python 2 and Python 3 WSGI applications under mod_wsgi in a single Apache.
Some distributions have both Python 2 and Python 3 versions of mod_wsgi available; this is the case for Ubuntu 20.04 (which answers something I wondered about last January). This at least lets you pick whether you're going to run Python 2 or Python 3 WSGI applications on any given system. Hopefully no current Unix restricts itself to only a Python 2 mod_wsgi, since there's an increasing number of WSGI frameworks that only run under Python 3.
(For example, Django last supported Python 2 in 1.11 LTS, which is no longer supported; support stopped some time last year.)
PS: Since I just looked it up, CentOS 7 has a Python 3 version of mod_wsgi in EPEL, and Ubuntu 18.04 has a Python 3 version in the standard repositories.
Improving my web reading with Martin Tournoij's "readable" Firefox bookmarklet
Not that long ago, I set up Martin Tournoij's "fixed" bookmarklet
to deal with CSS 'position: fixed' (and 'sticky') page elements.
When I did this, I also decided to install Tournoij's "readable"
because it was right there and it felt potentially useful. With it
sitting in my tab bar, I started trying it out on sites that I found
not so readable, or even vaguely marginally non-readable, and to my
surprise it's been a major quality of life improvement on many sites.
I've become quite glad that I made it conveniently available.
What the "readable" bookmarklet does is go through every <p>, <li>, and <div> to force the text colour, size, weight, line spacing, and font family to reasonable values. It doesn't try to set the background colour, but it turns out that a lot of sites use a basically white background, so forcing the text colour is sufficient. All of this sounds very basic, but the result can be really marvelous. It's especially impressive on sites that don't feel as if they have obviously terrible text, just text that's a bit annoying. It turns out that what feels 'a bit annoying' to me is often harder to read than I was consciously aware of.
Why such simple restyling works so well in practice is somewhat sad. It turns out that a lot of sites make terrible text styling choices for clear readability. The obvious case is too-small text, but beyond that a lot of sites turn out to set a lower-contrast text colour, such as some shade of grey, unusually thin text through either weight or font choice, or both at once. Undoubtedly they think that the result looks good and is perfectly readable, but increasingly my eyes disagree with them.
The 'n.style.color' setting is simple; '#000' is black. The font setting is a little bit more complex, because it's using the shorthand 'font' property in a specific format. This format sets the font-weight to '500', which is just a little bit bolder than normal ('400' is normal), the font-size to 16px (which these days is a device-independent thing), the line-height to 1.7em for a pretty generous spacing between lines, and the font-family to 'sans-serif', your general sans-serif font. People who prefer serif fonts may want to change that to 'serif', and in general now that I look at it you might want to tinker with the 16px and the line spacing as well, depending on your preferences.
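Put together, the core of such a bookmarklet can be sketched like this (my illustration of the behaviour described above, not Tournoij's actual code):

```javascript
javascript:(function () {
  // Force readable text styling on the elements that usually hold text.
  document.querySelectorAll("p, li, div").forEach(function (n) {
    n.style.color = "#000";                     // plain black text
    n.style.font = "500 16px/1.7em sans-serif"; // weight/size/line-height/family
  });
})();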
(My standard Firefox font is set to the default Fedora 'serif' font, currently DejaVu Serif according to Firefox, at size '15'. I could probably reasonably change the '16px/1.7em sans-serif' in the bookmarklet to '15px/1.5em serif' or so, but at the moment I don't feel inclined to do so; if I'm irritated enough to poke the bookmarklet, I might as well make the page really readable.)
It's nice when programs switch to being launched from systemd user units
I recently upgraded my home machine from Fedora 33 to Fedora 34. One of the changes in Fedora 34 is that the audio system switched from PulseAudio to PipeWire (the Fedora change proposal, an article on the switch). Part of this switch is that you need to run different daemons in your user session. For normal people, this is transparently handled by whichever standard desktop environment they're using. Unfortunately I use a completely custom desktop, so I have to sort this out myself (this is one way Fedora upgrades are complicated for me). Except this time I didn't need to do anything; PipeWire just worked after the switch.
One significant reason for this is that PipeWire arranges to be
started in your user session not through old mechanisms like
/etc/xdg/autostart but through a systemd
user unit (actually
two, one for the daemon and one for the socket). Systemd user units
are independent of your desktop and get started automatically, which
means that they just work even in non-standard desktop environments
(well, so far).
(As covered in the Arch Wiki, there are some things you need to do in an X session.)
One of the things that's quietly making my life easier in my custom desktop environment is that more things are switching to being started through systemd user units instead of the various other methods. It's probably a bit more work for some of the programs involved (since they can't assume direct access to your display any more and so on), but it's handy for me, so I'm glad that they're investing in the change.
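For illustration, a minimal systemd user unit is quite small (this is a hypothetical 'mydaemon' service, not PipeWire's actual unit files):

```
[Unit]
Description=My per-user daemon

[Service]
ExecStart=/usr/bin/mydaemon

[Install]
WantedBy=default.target
```

Dropped into ~/.config/systemd/user/mydaemon.service and enabled with 'systemctl --user enable --now mydaemon.service', it will be started for your session regardless of which desktop environment (if any) you run.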
PS: It turns out that the basic PulseAudio daemon was also being
set up through systemd user units on Fedora 33. But PulseAudio
did want special setup under X, with an /etc/xdg/autostart
file that ran /usr/bin/start-pulseaudio-x11. It's possible that
PipeWire is less integrated with the X server than PulseAudio is.
(See the PulseAudio X11 modules for more details.)
PPS: Apparently I now need to find a replacement for running 'amixer -q set Master ...' to control my volume from the keyboard. This apparently still works for some people but not for me; for now 'pactl' does, and it may be the more or less official tool for doing this with PipeWire for the moment, even though it's from PulseAudio.
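For the record, the pactl versions of the usual volume key operations run along these lines ('@DEFAULT_SINK@' is pactl's shorthand for the current default output; the 5% step is my choice):

```
pactl set-sink-volume @DEFAULT_SINK@ +5%
pactl set-sink-volume @DEFAULT_SINK@ -5%
pactl set-sink-mute @DEFAULT_SINK@ toggle
```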
Making a Go program build with Go modules can be not a small change
In theory, at some point in the future Go will stop supporting
the traditional GOPATH mode. When this happens,
if you want to still build old Go programs that you have sitting
around in checked out version control repositories, you will need
to modularize them. Once upon a time, I thought that this would be
as simple as going to the root of your copy of the repo, then running
'go mod init ...' and 'go mod tidy'. Unfortunately, life is not
this simple and there can be at least two complications.
The first complication is moved and renamed repositories for modules,
if the moved module has a
go.mod that declares its new name. For
example what is now github.com/hexops/vecty was once github.com/gopherjs/vecty. In a non-modular Go build, you
can still import it under the old path and it will work. However,
the moment you attempt to modularize the program, 'go mod tidy'
will complain and stop:
github.com/gopherjs/vecty: github.com/gopherjs/vecty@<version>: parsing go.mod:
	module declares its path as: github.com/hexops/vecty
	        but was required as: github.com/gopherjs/vecty
In theory you may be able to get this to work with a go.mod 'replace' directive. In practice my attempts to do this resulted in 'go mod tidy' errors like:
go: github.com/gopherjs/vecty@<version> used for two different module paths (github.com/gopherjs/vecty and github.com/hexops/vecty)
(You also need to get the version number or other version identifier of the moved repository.)
The general fix is to edit every import of packages from the module to use the new location. Then you can run 'go mod tidy' without it complaining.
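One blunt way to do that edit is a grep-and-sed pass over the tree; this is my own quick hack, which assumes GNU grep and sed (ie, a Linux machine):

```shell
# Rewrite one import path to another in every .go file under the
# current directory. The vecty paths in the usage example are the
# ones from above.
rewrite_import() {
    old=$1
    new=$2
    grep -rl --include '*.go' "$old" . |
        xargs -r sed -i "s|$old|$new|g"
}

# usage:
#   rewrite_import github.com/gopherjs/vecty github.com/hexops/vecty
#   go mod tidy
```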
The second complication is modules that have moved to versions above
v1, possibly very far past v1; for example, github.com/google/go-github is up to v37, and modularized
at v18 (it doesn't even have a tagged v1). A GOPATH build of the
program you're trying to modularize will use whatever version of
the repository you have checked out, which may well be the current
one, and the code will import it as a version without a version
suffix (as 'github.com/google/go-github'). When you run 'go mod
tidy', Go will attempt to find the most recent tag (or version of
the repository) that doesn't have a
go.mod file, and specify that
version in your
go.mod with a '+incompatible' tag. Depending on
how far Go had to rewind, this may be a version of the package that
is far older than the program expects.
(If a go.mod existed for a v1 version, I suspect that 'go mod
tidy' will pick that in this case. But I haven't tried to test
it, partly for lack of a suitable module to test against. With
github.com/google/go-github, I get 'v17.0.0+incompatible',
the last tagged version before it was modularized.)
Again the fix is to edit the program's source code to change every import of the package to use the proper versioned package. Instead of importing, say, 'github.com/google/go-github/github', you would import 'github.com/google/go-github/v37/github'.
(There may be other tools to do this package import renaming, but this is the one I could find.)
The unfortunate part of all of this is that it requires you to make changes to files that will be under version control in the repo. If the upstream updates things in the future, this will probably make your life more complicated.
(In some cases, 'go mod tidy' may insist that you clean up imports
in code that's in sub-packages in the repository that aren't actually
imported and used in the program itself.)
On sending all syslog messages to one file
Over on Twitter, I had a view on where syslog messages should go:
Tired sysadmin take: Different sorts of syslog messages going to different places are a mistake. Throw it all into /var/log/allmessages and I'll sort it out myself.
Like many Twitter takes of mine, in retrospect this one is heartfelt but a little bit too extreme as presented. Specifically, I think you should log all syslog messages to one place, but also log some sorts of messages to their own additional places so you can look through them more easily.
In the old days, I used to carefully curate my syslog.conf so that
every different syslog facility had its own different file. Often,
the net result of this is that I would end up using
grep on every
current syslog file in
/var/log because I'd forgotten (or never
knew) what facility a given program logged under. Trying to predict
what facility a program will use is often almost as futile as
predicting what priority level messages will be logged under.
(This is worse if you rely on the Unix vendor stock syslog.conf instead of customizing it. Unix vendors are inevitably different from each other, and some of them have rather strange ideas of what should go where.)
All of this leads to the tired sysadmin take of putting everything
into one file (
/var/log/allmessages is what I prefer) and then
searching it. An
allmessages file is the brute force solution to
unpredictable programs and Unix vendor variability, and it also
makes sure everything gets logged.
But sending all syslog messages to only a single place is a little
bit of overkill. Despite my tired take, there are often syslog
facilities that it's sensible to also log to separate files, so you
can look at just them.
The obvious case is kernel messages, and it's so obvious that
journalctl has a dedicated flag to show you only kernel
messages. If I was starting a syslog configuration from scratch, I
would also have a log file dedicated to "auth" and "authpriv"
messages, one dedicated to "mail" messages, and on my own systems,
one dedicated to "daemon" messages. Everything would still go to
allmessages; these files are in addition to it.
(And on some systems you might opt to have specific programs log to specific facilities, like "user" or "local0", and have specific files so you can monitor and see the activities of just those programs.)
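Put together, a syslog.conf along the lines I'm describing might look like this (a sketch in traditional syslogd/rsyslog selector syntax; '*.info' means info priority and above, so it skips debug, the leading '-' means asynchronous writes, and the file names are just my preferences):

```
# (nearly) everything goes to the catch-all file
*.info                          -/var/log/allmessages
# plus dedicated copies of facilities I want to scan separately
kern.*                          -/var/log/kern.log
auth.*;authpriv.*                /var/log/auth.log
mail.*                          -/var/log/mail.log
daemon.*                        -/var/log/daemon.log
```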
Sending all syslog messages to an
allmessages file is a blunt
hammer, and like all blunt hammers it's possible to overuse it.
Being able to scan through a single file that has everything has a
lot of positive features, but not everything is best served by
searching for it through a giant file. Sometimes you want both.
The minimum for syslog configurations should be to log (nearly) everything
I have some opinions on how the venerable Unix syslog should be set
up, but a very strong one of them is that (nearly) every syslog
message should be logged somewhere. I consider this a minimum
standard for vendor and distribution supplied syslog configurations.
The 'nearly' is that although syslog priorities don't mean much
these days, I think a Unix is reasonably
justified in not syslog'ing the
debug priority for most facilities.
However, a stock
syslog.conf should definitely log each of the
syslog facilities supported by its syslog to somewhere.
This should also be something you preserve in any local versions or modifications to the standard syslog configuration. Unless you're extremely sure that a syslog facility will never ever be used, you should keep logging it somewhere. And if you're sure it will never be used, well, what's the harm in having it sent to a file that will always wind up being empty? This is especially the case if you're running third party software (whether commercial or open source), because programmers can have all sorts of clever ideas about what syslog facilities to use for what.
If you're extremely sure that you don't need to syslog a particular facility and so you leave it out, please put a comment in your syslog configuration file to explain this. A good goal to strive for in syslog configuration files (for you and for vendors) is to create one that convinces any sysadmin reading it (including your future self) that it covers everything that will ever be logged.
(My other syslog configuration opinions are for another entry.)
PS: Out of the Unixes we use, Ubuntu has a default configuration that clearly logs everything to either /var/log/syslog or /var/log/auth.log, while the stock OpenBSD configuration only covers a limited number of facilities. It's possible that OpenBSD covers every use of syslog in the base system (you'd certainly hope so), but if so I doubt it covers all uses of syslog in the packages collection.
The WireGuard VPN challenge of provisioning clients
I mentioned in yesterday's entry that at work I'm building a VPN server that will support WireGuard. I'm quite happy with WireGuard in general and I think it has some important attractive features (such as the lack of 'sessions'), but we won't be offering WireGuard for general use. I would like to, but every time I even consider the idea, I run headlong into the problem of provisioning, specifically of provisioning WireGuard clients in some way that ordinary people can successfully set them up.
Right now, to set up a WireGuard client you need the server's name
and port (which every VPN needs), the server's public key, the IP
the server expects you to have inside the WireGuard connection (its
AllowedIPs setting for you), and a private key that the server
has the public key for. We also
need you to set your DNS server(s) to correctly point to us, and
for general VPN usage you have to set your
AllowedIPs to 0.0.0.0/0.
This is a lot more things for you to set up than other VPN servers
need, partly because other VPN servers will push your internal IP,
the DNS servers to use, and often other information to you. Much
of this is also sensitive to typos or, in the case of keys, must
be cut and pasted to start with (no one is typing a base64 WireGuard
key). If you get your client IP wrong, for example, things just
quietly don't work (the server will discard your traffic).
The client keypair is an especially touchy problem. The ideal would
be to securely generate it on the client and upload the public key.
In practice this is asking a lot of people to do more or less by
hand, so in a realistic setup we would probably want to generate
your client keypair on the server and then somehow give you access
to the private key for you to configure along side the server's
public key. Given this, possibly the most generally usable way of
provisioning WireGuard client connections would be to generate the
wg.conf that a client would use with the normal WireGuard command
line tools, then provide it to people and hope that any WireGuard
client will be able to import it.
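Such a generated client wg.conf is small; a sketch with placeholder values (the keys, IPs, and server name here are obviously not real) looks like:

```
[Interface]
PrivateKey = <client private key, base64>
Address = 172.16.1.10/24
DNS = <internal DNS server IP>

[Peer]
PublicKey = <server public key, base64>
Endpoint = vpn.example.org:51820
AllowedIPs = 0.0.0.0/0
```

(Note that 'Address' and 'DNS' are wg-quick settings, not things the kernel WireGuard configuration itself knows about, which is part of why importing wg-quick format files is the convenient path.)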
(The official WireGuard client for iOS and Android will apparently
do this, including decoding the configuration from a QR code. I
believe the official Windows client does as well. On Unix, you can
use a wg.conf directly or import it into NetworkManager.)
An additional complication is that you need a separate WireGuard configuration on each device that you want to use WireGuard on at the same time. So we wouldn't have to just provision one WireGuard setup per person, we're looking at one for your laptop, one for your phone, one for your tablet, and so on. This also complicates naming them and keeping track of them (for people and for us), and likely would tempt people into reusing configurations across devices, which leads to fun problems if both devices are in use at the same time.
I don't blame the WireGuard project for this state of affairs. Provisioning is both a hard problem and a high level concern that is sort of out of scope for a project that's deliberately low level and simple. I'm honestly impressed (and happy) that there are official WireGuard clients on as many platforms as there are. I do wish there was some officially supported way to push configuration information to clients, although I understand why there isn't.
(Tailscale is not a solution for us for various reasons, including price. I do admire them for solving the provisioning problem, though.)
Setting up a WireGuard client with NetworkManager (using nmcli)
For reasons beyond the scope of this entry, I've been building a VPN server that will support WireGuard (along with OpenVPN and L2TP). A server needs a client, so I spent part of today setting up my work laptop as a WireGuard client in a 'VPN' configuration, under NetworkManager because that's what my laptop uses. I was hoping to do this through the Cinnamon GUIs for NetworkManager, but unfortunately while NetworkManager itself has supported WireGuard for some time, this support hasn't propagated into GUIs such as the GNOME Control Center (cf) or the NetworkManager applet that Cinnamon uses.
I'm already quite familiar with WireGuard in general, so I found that the easiest way to start was to set up a basic WireGuard configuration file for the connection in /etc/wireguard/wg0.conf, including both the main configuration (with the laptop's key and my local port) and a [Peer] section for the server. Since I'm using WireGuard here in a VPN configuration, instead of to reach just some internal IPs, I set AllowedIPs to 0.0.0.0/0. After writing wg0.conf, I then imported it into NetworkManager:
nmcli connection import type wireguard file /etc/wireguard/wg0.conf
(For what can go in the configuration file, start with
wg-quick(8). I suspect
that NetworkManager doesn't support some of the more advanced keys.
I stuck to the basics. The import process definitely ignores the
various script settings supported by wg-quick; see
nm_vpn_wireguard_import() in nm-vpn-helpers.c.)
Imported connections are apparently set to auto-connect, which isn't what I wanted, plus there were some other things to adjust (following the guide of Thomas Haller's WireGuard in NetworkManager):
nmcli con modify wg0 \
    autoconnect no \
    ipv4.method manual \
    ipv4.address 172.29.50.10/24 \
    ipv4.dns <...>
At this point you might be tempted to set
ipv4.gateway, and indeed
that's what I did the first time around. It turns out that this is
a mistake, because these days NetworkManager will do the right thing
based on the 'accept everything'
AllowedIPs I set, right down to
setting up policy based routing with a fwmark so that encrypted
traffic to the WireGuard VPN server doesn't try to go over WireGuard.
If you set
ipv4.gateway as well, you wind up with two default
routes and then your encrypted WireGuard traffic may try to go over
your WireGuard connection again, which doesn't work.
(See the description of 'ip4-auto-default-route' in the WireGuard
settings documentation.)
The full index of available NetworkManager settings in various
sections is currently here; the
ones most useful to me are probably the WireGuard-specific ones.
Getting DNS to work correctly requires a little extra step, or at
least did for me. While the
wg0 connection is active, I want all
of my DNS queries to go to our internal resolving DNS server and
also to have a search path of our university subdomain. This
apparently requires explicitly including '
~' in the NetworkManager
DNS search path:
nmcli con modify wg0 \
    ipv4.dns-search "cs.toronto.edu,~"
You (I) can see a lot of settings for the WireGuard setup with 'nmcli connection show wg0', including active ones, but this seems to omit NetworkManager's view of the WireGuard peers. To see that, I needed to look directly at the configuration file that NetworkManager wrote for the connection. I'm someday going to need to edit this directly to modify the WireGuard VPN server's endpoint from my test machine to the production machine.
(The NetworkManager RFE for configuring WireGuard peers in nmcli
is issue #358.)
With no GUI support for WireGuard connections, I have to bring this
WireGuard VPN up and down with 'nmcli con up wg0' and 'nmcli con
down wg0'. Once I have the new VPN server in production, I'll be
writing little scripts to do this for me. Hopefully this will be
improved some day, so that the NetworkManager applet allows you to
activate and deactivate WireGuard connections and shows you that
one is active.
If I wanted a limited VPN that only sent traffic to our internal
networks over my WireGuard link, I would configure the server's
AllowedIPs to the list of networks and then I believe that
NetworkManager would automatically set up routes for them. However,
I don't know how to make this work (in NetworkManager) if the
WireGuard VPN server itself was on one of the subnets I wanted to
reach over WireGuard. For my laptop, routing all traffic over
WireGuard to work is no worse than using our OpenVPN or L2TP VPN
servers, which also do the same thing by default.
(On my home desktop, I use hand built fwmark-based policy rules to deal with my WireGuard endpoint being on a subnet I want to normally reach over WireGuard. NetworkManager will build the equivalents for me when I'm routing 0.0.0.0/0 over the WireGuard link, but I believe not in other situations.)
Making two Unix permissions mistakes in one
Today's state of work-brain:
mkdir /tmp/fred
umask 077 /tmp/fred
Immediately after these two commands, I hit cursor-up to change the
'umask' to 'chmod', so that I then ran 'chmod 077 /tmp/fred'.
Fortunately I was doing this as a regular user, so my next action
exposed my error.
This whole sequence of commands is a set of mistakes jumbled together
in a very Unix way. My goal was to create a new /tmp/fred directory
that was only accessible to me. My second command is not just wrong
because I wanted
chmod instead of
umask (I should have run
umask before the
mkdir, not after), but because I had the wrong
set of permissions for
chmod. It was as if my brain wanted Unix
to apply a '
umask 077' to the creation of
/tmp/fred after the
fact. Since the numeric permissions you give to
umask are the
inverse of the permissions you give to
chmod (you tell umask
what you don't want instead of what you do), my change of
umask to chmod then left
/tmp/fred with completely wrong permissions;
instead of being only accessible to me, it was fully accessible to
everyone except me.
(Had I been doing this as root, I would then have been able to cd
into the directory, put files in it, access files in it, and so on,
and might not have noticed that the permissions were reversed from
what I actually wanted.)
The traditional Unix umask itself is a very Unix command (well, shell built-in), in that it more or less directly calls the umask() system call with your value. This allows a very simple implementation, which was a priority in
early Unixes like V7. A more sensible implementation would be that
you specify effectively the maximum permissions that you want (for
example, that things can be '755') and then
umask would invert
this to get the value it uses for
umask(). But early Unixes took
the direct approach, counting on people to remember the inversion
and perform it in their heads.
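The inversion is easy to see in a scratch directory (this little demonstration uses GNU stat's '-c %a' to print the octal mode):

```shell
# umask takes the permission bits you DON'T want newly created things
# to have, while chmod takes the bits you DO want. Done in the right
# order, both of these leave the directory as mode 0700.
scratch=$(mktemp -d)   # a scratch area instead of literally /tmp/fred

umask 077              # deny group and other on everything created below
mkdir "$scratch/fred"  # so this directory is created as 0700
mode_from_umask=$(stat -c '%a' "$scratch/fred")

chmod 755 "$scratch/fred"   # loosen it again...
chmod 700 "$scratch/fred"   # ...then the equivalent explicit chmod
mode_from_chmod=$(stat -c '%a' "$scratch/fred")

echo "$mode_from_umask $mode_from_chmod"   # prints: 700 700
rm -rf "$scratch"
```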
In the process of writing this entry I learned that the POSIX
umask command supports symbolic modes, and that they work this way.
You get and set umask modes like 'u=rwx,g=rx,o=rx' (aka '022', the
traditional friendly Unix umask), and they're the same permissions
as you would give to chmod. I believe that this symbolic mode is supported
by any modern Bourne compatible shell (including
zsh), but it
isn't necessarily supported by non-Bourne shells such as
rc (which is my shell).
Some ways to get (or not get) information about system memory ranges on Linux
I recently learned about
lsmem, which is
described as "list[ing] the ranges of available memory [...]". The
source I learned it from was curious why
lsmem on a modern 64-bit
machine didn't list all of the low 4 GB as a single block (they
were exploring kernel memory zones, where the
low 4 GB of RAM are still a special 'DMA32' zone). To start with,
I'll show typical
lsmem default output from a machine with 32 GB
; lsmem
RANGE                                  SIZE  STATE REMOVABLE  BLOCK
0x0000000000000000-0x00000000dfffffff  3.5G online       yes   0-27
0x0000000100000000-0x000000081fffffff 28.5G online       yes 32-259

Memory block size:       128M
Total online memory:      32G
Total offline memory:      0B
Lsmem is reporting information from /sys/devices/system/memory.
Both the sysfs hierarchy and lsmem itself apparently come originally
from the IBM S390x architecture. Today this sysfs hierarchy
apparently only exists for memory hotplug, and there
are some signs that kernel developers aren't fond of it.
On the machines I've looked at, the hole reported by lsmem is
authentic, in that
/sys/devices/system/memory also doesn't have
any nodes for that range (on the machine above, for blocks 28, 29,
30, and 31). The specific gap varies from machine to machine.
However, all of the information from
lsmem may well be a
simplification of a more complex reality.
The kernel also exposes physical memory range information through
/proc/iomem (on modern kernels you'll probably have to read this
as root to get real address ranges). This has a much more complicated
view of actual RAM, one with many more holes than what
/sys/devices/system/memory shows. This is especially the case in
the low 4G of memory, where for example the system above reports a
whole series of chunks of reserved memory, PCI bus address space,
ACPI tables and storage, and more. The high memory range is simpler,
but still not quite the same:
100000000-81f37ffff : System RAM
81f380000-81fffffff : RAM buffer
The information from
/proc/iomem has a lot of information about
PCI(e) windows and other things, so you may want to narrow down
what you look at. On the system above,
/proc/iomem has 107 lines
but only nine of them are for 'System RAM', and all but one of them
are in the physical memory address range that
lsmem lumps into
the 'low' 3.5 GB:
00001000-0009d3ff : System RAM
00100000-09e0ffff : System RAM
0a000000-0a1fffff : System RAM
0a20b000-0affffff : System RAM
0b020000-d17bafff : System RAM
d17da000-da66ffff : System RAM
da7e5000-da8eefff : System RAM
dbac7000-ddffffff : System RAM
(I don't have the energy to work out how much actual RAM this represents.)
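(Since each range is an inclusive 'start-end' pair in hex, adding them up is mechanical; here's a small shell function that does it, my own quick hack rather than any standard tool:)

```shell
# Total up the 'System RAM' ranges in /proc/iomem style input, where
# each range line looks like 'start-end : description' with inclusive
# hex addresses. Run it as root against the real file to get non-zero
# addresses:  sum_ram < /proc/iomem
sum_ram() {
    total=0
    while read -r range sep rest; do
        case "$rest" in
        "System RAM")
            start=${range%-*}
            end=${range#*-}
            total=$((total + 0x$end - 0x$start + 1))
            ;;
        esac
    done
    echo "$total"
}
```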
Another view of physical memory range information is the kernel's report of the BIOS 'e820' memory map, printed during boot. On the system above, this says that the top of memory is actually 0x81f37ffff:
BIOS-e820: [mem 0x0000000100000000-0x000000081f37ffff] usable
I don't know if the Linux kernel exposes this information in /proc or /sys.
You can also find various other things about physical memory ranges
in the kernel's boot messages, but I don't know enough to analyze them.
What's clear is that in general, a modern x86 machine's physical memory ranges are quite complicated. There are historical bits and pieces, ACPI and other data that is in RAM but must be preserved, PCI(e) windows, and other things.
(I assume that there is low level chipset magic to direct reads and writes for RAM to the appropriate bits of RAM, including remapping parts of the DIMMs around so that they can be more or less fully used.)