Wandering Thoughts


A revealing Vim addressing mistake that I made today

Today I wound up typing and trying to use the following Vim command (more or less) with a straight face:

:.,$/<regexp>/d

Vim warned and prompted me by asking 'Backwards range given, OK to swap (y/n)?'. In an insufficiency of something (maybe coffee), I told it yes, and was then confused by the results.

What I had was a file that intermixed a bunch of copies of a header line (although each instance had a prefix that differed) and a bunch of data lines. I wanted to get rid of all but the first instance of the header line and keep all of the data lines, so I moved the cursor down a few lines below the first header line and typed the above.

What I was thinking was 'from here to the end' ('.,$'), 'match the header lines' (the /<regexp>/), 'delete the matching lines'. What had slipped out of my mind is that Vim doesn't match multiple lines when you use a bare regular expression this way; instead it goes to the first line that matches the regular expression. And instead of '.,$' and the regular expression being two separate steps, what I'd created was a compound single address, that went from '.' (current position) to '$/<regexp>/', the first line matching the regular expression (starting at the end of the file, which I believe defaults to rolling around to the start of the file). Since the first instance of the regular expression was at the start of the file, this meant that the range was backward, hence Vim's prompt.

What I actually was thinking of was the :g[lobal] command, and after I realized what I'd done wrong, that's what I used. I've used ':g' before, but this may have been the first time I've used it with a range (at least recently), and evidently my mind remembered the idea but forgot the important bit of the actual 'g'. So the real version I used, after I realized it, was only different by the addition of that one crucial character that made all of the difference:

:.,$g/<regexp>/d

(I'd blame my sam reflexes but even in sam this would have required a filter operation. It also would have been somewhat more annoying because sam isn't line based so I'd have had to make sure that the regexp covered the entire line or something of that order.)

unix/VimAddressingMistake written at 22:01:54


Unix's (technical) history is mostly old now

Yesterday I wrote about how Unix swap configuration used to be simple and brute force, covering a number of cases from V7 Unix through Linux 0.96c. As I wrote that entry, it became increasingly striking to me that the most recent time I mentioned was 1992. This isn't something unique to swap handling, and it's not new in my entries about the (technical) origins and evolution of Unix. Instead, it's because a lot of Unix's technical history is at least thirty years old now.

It's not quite the case that nothing has happened in Unix history since the early 1990s. Very obviously, quite a lot of important social things happened around 'Unix', such that by the end of the 1990s what Unixes people used had changed significantly (and then in the 00s the change became drastic). Less obviously, a bunch of internal kernel technology changed over that time, so that today every remaining common Unix has good SMP support and is in a far better place for performance.

To some degree, technical evolution has also continued in filesystems. The problem is that this evolution is very unevenly distributed, with the most advanced filesystems the least widely used. Unix has made valuable strides in commonly used filesystems, but they aren't drastic ones. And the filesystem related features visible to people using Unix haven't really changed since the early 1990s, especially in common use (there has been no large move to adopt ACLs or file attributes, for example, although file capabilities have snuck into common use on Linux systems).

Some things that were known in the early 1990s but not widely adopted have become pervasive, like having a /proc or interacting with your kernel for status information and tuning through a structured API instead of ad-hoc reading (and sometimes writing) of kernel memory. However, these changes at least don't feel as big as previous evolutions. It's better that ps operates by reading /proc, but it's still ps.

I think that if you took a Unix user from the early 1990s and dropped them into a 2022 Unix system via SSH, they wouldn't find much that was majorly different in the experience. Admittedly, a system administrator would have a different experience; practices and tools have shifted drastically (for the better).

(It's possible that my perspective leaves me blinded to important things in Unix's technical history and evolution in 2010s, 2000s, and 1990s.)

unix/UnixHistoryMostlyOldNow written at 22:49:37


Unix swap configuration used to be rather simple and brute force

Modern Unixes generally support rather elaborate configuration of what swap space is available. FreeBSD supports multiple swap devices and can enable and disable them at runtime (cf swapon(8)), including paging things back in so that you can disable a swap device that's in use. Linux can go even further, allowing you to swap to files as well as devices (which takes a bunch of work inside the kernel). It will probably not surprise you to hear that early Unixes were not so sophisticated and featureful, and in fact were rather simple and brute force about things.

(It seems that under the right conditions, FreeBSD will also swap to files, cf the handbook section on creating a swap file.)

In V7, there was a single swap device and that device was hard-coded into the kernel at kernel compilation time, as swapdev in sys/conf/c.c (see sys/conf/mkconf.c). V7 similarly hard-coded the root filesystem device. It doesn't look like V7 had anything to control turning swap on and off; it was on all the time.

In the BSD line, 4.2 BSD had swapon(2) to (selectively) enable swapping, but had no way of turning swapping off on a device once you'd turned it on. It now supported swapping on multiple devices, but the (potential) swap devices were hard-coded when you built the kernel, somewhat like V7 (cf eg GENERIC/swaphkvmunix.c and the various configuration files in sys/conf). As in V7, the root filesystem was also hard-coded. This relatively fixed set of swapping options continued through at least 4.3 Tahoe (based on reading manual pages).

Interestingly, swapping to a file goes a long way back in Linux; it's supported in 0.96c (from 1992), according to tuhs.org's copy of mm/swap.c. However, Linux 0.96c only supported a single swap area (whether it was a file or a device), and it doesn't look like you could turn swapping off once you turned it on.

I'm not sure how swap configuration worked in System V, especially before System V Release 4. It turns out that archive.org has System V source code available, and in SVR4 the kernel source code suggests that you can add and delete multiple swap devices and swap files, without any need to configure them into the kernel in advance. This may have been what inspired Linux to support swapping to a file so early on in its life, since System V Release 4 dates from the late 1980s.

(Writing all of this down has gotten me to realize just how long ago all of it was. Unix has had pretty capable swap support for more than 30 years now, if you start from System V Release 4 or Linux.)

unix/SwapSetupWasSimple written at 23:12:25


Twitter's 'quoted tweets' feature and how design affects behavior

Twitter's 'quote tweets' feature is back in the news in my circles because the Fediverse's Mastodon software famously and deliberately doesn't have them. I find 'quote tweets' to be a fascinating case of how what look like relatively neutral technical or design 'solutions' can drastically change the social side of a service. But to understand this, I need to cover the path that Twitter took to having quote tweets, because they didn't spring out of nowhere.

In the beginning, Twitter had no quoted tweets at all. However, people still wanted to do things like discuss things that other people had said or point their followers to something with additional commentary. So people did the obvious thing; they wrote a new tweet and linked to the original. Like this:

I find this argument for abandoning UTC leap seconds to be interesting but ultimately wrong-headed. http://twitter.com/auser/....

If you were sufficiently interested in whatever the tweet was talking about, you could follow the link and read it, but otherwise you were probably flying pretty blind. My vague memory is that people did this every so often but not very much.

Later, the web version of Twitter got a general link preview feature that I believe is (still) called 'cards'. If a tweet has a link, Twitter will put a little snippet, preview, or whatever of the link target below the tweet itself, so you can see something of where you'll be going before you click (and maybe you won't click at all, especially as Twitter will do things like play a YouTube video inline). Naturally, if your link was to a tweet, Twitter would basically inline the tweet in the card. This pretty much created the visual presentation of a quoted tweet even if you still created them by hand, and my memory is that this made them rather more popular. When you linked to a tweet this way, people could see what you were reacting to or commenting on right away, which made the whole thing far less opaque and more interesting.

Then finally Twitter decided that enough people were quoting tweets this way that they would make it an actual feature, 'Quote Tweet'. Although the visual appearance of the result didn't change much (or maybe at all), the feature made quoting much easier to actually do, especially on smartphones (I'm not sure if it changed what notifications the original tweet author got, but it may have). Naturally this significantly increased use of the general feature and led to a situation where many people consider it to contribute to negative behavior (cf Mastodon's reasons for not having an equivalent). What had once been a relatively esoteric and little used thing suddenly became a common thing, even the source of memes (eg various 'quote tweet this with ...').

(I suspect that making it much easier to quote tweets on smartphones was a big part of increasing their usage, since my understanding is that a significant amount of Twitter usage is from smartphones.)

Each step in this evolution is reasonable and appealing to people using Twitter in isolation, and is probably not large in either technology or design (if you accept the general idea of cards). But the end result is a quite different social experience.

(I'm sure that this has happened in other systems. But Twitter's step by step evolution from extremely minimal beginnings makes it visible and fascinating this way, especially as the early people using it came up with many of the core ideas that were later implemented as features.)

tech/TwitterQuoteTweetsPath written at 21:22:08


Using curl to test alternate (test) servers for a web site

One of the perpetual issues in system administration is that we have a new version of some web site of ours to test, for example because we're upgrading the server's operating system from Ubuntu 18.04 to 22.04. In many cases this is a problem because the web server's configuration wants to be 'example.org' but you've installed it on 'test.internal' because 'example.org' points to your current production server. Two traditional approaches to this are to modify your local /etc/hosts (or equivalent) to claim that 'example.org' has the IP address of 'test.internal', or to change the web server's Apache (or nginx or etc) configuration so that it believes it's 'test.internal' (or some suitable name) instead of 'example.org'.

As I learned today, curl has an option to support this sort of mismatch between the server's official name and where it actually is. Actually it has more than one of them, but let's start with --resolve:

curl --resolve example.org:443:<IP of server> https://example.org/

As covered in the curl manual page, the --resolve option overrides the IP address that a given host and port combination resolves to, while curl still uses the original host name from the URL for the HTTP Host header and the TLS SNI (for HTTPS connections), which is exactly what you want here. You can give multiple --resolve options if you want to, and it takes wildcards so you can do slightly crazy things like:

curl -L --resolve *:443:<server-IP> https://example.org/redir

As mentioned by Eric Nygren, this can also be used to test an IPv6 IP before you publish it in DNS (or an alternate IPv4 IP).

Curl also has the --connect-to option, which is potentially more powerful although somewhat more verbose. It has two advantages that I can see: it will take a host name instead of an IP address, and it lets you change the port (which you might need to do in order to talk to some backend server). You can wildcard everything, although with a different syntax than --resolve, so our two examples become:

curl --connect-to example.org:443:test.internal: https://example.org/
curl -L --connect-to :443:test.internal: https://example.org/redir

You can also omit the original port, for example if you want to test HTTP to HTTPS redirection on your new test server:

curl -L --connect-to example.org::test.internal: http://example.org/

Having learned the distinction, I'll probably mostly use --connect-to because while it's slightly longer and more complicated, it's also more convenient to be able to use the test server's hostname instead of having to keep looking up its IP.

For more reading, there's Daniel Stenberg's curl another host, which also covers merely changing the Host: header.

As far as I know, curl has no option to specifically change the TLS SNI by itself, although possibly you could achieve the same effect by artfully combining --resolve (or --connect-to) with explicitly setting the Host: header. Probably there's no case where you'd want to do this (or where Apache would let you get away with it). You can always use curl's --insecure option to ignore TLS certificate errors.

web/CurlTestingAlternateServer written at 21:52:58


Floating point NaNs as map keys in Go give you weird results

The last time around I learned that Go 1.21 may have a clear builtin partly because you can't delete map entries that have a floating point NaN as their key. Of course this isn't the only weird thing about NaNs as keys in maps, although you can pretty much predict all of them from the fact that NaNs never compare equal to each other.

First, just like you can't delete a map entry that has a key of NaN, you can't retrieve it either, including using the very same NaN that was used to add the entry:

m := map[float64]string{}
k := math.NaN()
m[k] = "Help"
v, ok := m[k]

At this point v is empty and ok is false; the NaN key wasn't found in the map although there is an entry with a key of NaN.

Second, you can add more than one entry with a NaN key, in fact you can add the same NaN key to the map repeatedly:

m[k] = "Help"
m[k] = "Me"
m[k] = "Out"
// Now m has at least three entries with NaN
// as their key.

You can retrieve the values of these entries only by iterating both the keys and values of the map with a ranged for loop:

for k, v := range m {
  if math.IsNaN(k) {
    fmt.Println(k, v)
  }
}
If you iterate only the keys, you run into the first issue; you can't use the keys to retrieve the values from the map. You have to extract the values directly somehow.

I don't think this behavior is strictly required by the Go specification, because the specification merely talks about things like 'if the map contains an entry with key x' (cf). I believe this would allow Go to have maps treat NaNs as keys specially, instead of using regular floating point equality on them. Arguably the current behavior is not in accordance with the specification (or at least how people may read it); in the cases above it's hard to say that the map doesn't contain an entry with the key 'k'.

(Of course the specification could be clarified to say that 'with key x' means 'compares equal to the key', which documents this behavior for NaNs.)

Incidentally, Go's math.NaN() currently always returns the same NaN value in terms of bit pattern. We can see this in src/math/bits.go. Unsurprisingly, Go uses a quiet NaN. Also, Go defers to the CPU's floating point operations to determine if a floating point number is a NaN:

// IEEE 754 says that only NaNs satisfy f != f.
// [...]
return f != f

If you want to manufacture your own NaNs with different bit patterns for whatever reason and then use them for something, see math.Float64frombits() and math.Float64bits() (learning the bit patterns of NaNs is up to you).

programming/GoNaNsAsMapKeys written at 22:03:02


Python dictionaries and floating point NaNs as keys

Like Go, Python's floating point numbers support NaNs with the usual IEEE-754 semantics, including not comparing equal to each other. Since Python will conveniently produce them for us, we can easily demonstrate this:

>>> k = float('nan')
>>> k == k
False
>>> k is k
True

Yesterday, I discovered that Go couldn't delete 'NaN' keys from maps (the Go version of dicts). If you initially try this in Python, it may look like it works:

>>> d = {k: "Help"}
>>> d
{nan: 'Help'}
>>> d[k]
'Help'
>>> del d[k]

However, all is not what it seems:

>>> d = {k: "Help", float('nan'): "Me"}
>>> d
{nan: 'Help', nan: 'Me'}
>>> d[float('nan')]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: nan

What's going on here is that Python dict indexing has a fast path for object identity, which comes into play when you look up something using exactly the same object that you used to set an entry. When you set a dict entry, Python saves the object you used as the key. If you ask a dict to look up an entry using that exact object, Python doesn't even bother calling the object's equality operation (what would be used for an '==' check); it just returns the value. This means that floating point NaNs have no chance to object that they're never equal to each other, and lookup will succeed. However, if you use a different object that is also a NaN, the lookup will fail because two NaNs never compare equal to each other.

This use of object identity in dict lookups does mean that the Python equivalent of iterating a Go map will always work:

>>> for k in d.keys():
...   d[k]
...
'Help'
'Me'

When you ask a dictionary for its keys, you of course get the literal Python objects that are the keys, which can always be used to look up the corresponding entry in the dict even if they're NaNs or otherwise uncomparable or inequal under normal circumstances.

One of the other things that this starts to show us is that Python is not making any attempt to intern NaNs, unlike things like True, False, and small integers. Let's show that more thoroughly:

>>> k2 = float('nan')
>>> k is k2
False
>>> import math
>>> k is math.nan
False

It might be hard to make all NaNs generated through floating point operations be the same interned object, but it would be relatively straightforward to make 'float("nan")' always produce the same Python object and for that Python object to also be math.nan. But Python doesn't do either of those; every NaN is a unique object. Personally I think that this is the right choice (whether or not it's deliberate); NaNs are supposed to all be different from each other anyway, so using separate objects is slightly better.

(I suspect that Python doesn't intern any floating point numbers, but I haven't checked the source code. On a quick check it doesn't intern 0.0 or +Inf; I didn't try any others. In general, I expect that interning floating point numbers makes much less sense and would result in much less object reuse than interning small integers and so on does.)

python/DictsAndNaNKeys written at 22:12:29


Go 1.21 may have a clear(x) builtin and there's an interesting reason why

Recently I noticed an interesting Go change in the development version, which adds type checking of a 'clear' builtin function. This was the first I'd heard of such a thing, but the CL had a helpful link to the Go issue, proposal: spec: add clear(x) builtin, to clear map, zero content of slice, ptr-to-array #56351. The title basically says what it's about, but it turns out that there's a surprising and interesting reason why Go sort of needs this.

On the surface you might think that this wasn't an important change, because you can always do this by hand even for maps. While there's no built in way to clear a map, you can use a for loop:

for k := range m {
   delete(m, k)
}

This for loop is less efficient than clearing a map in one operation, but it turns out that there is a subtle tricky issue that makes it not always work correctly. That issue is maps with floating point NaNs as keys (well, as the value of some keys). The moment a NaN is a key in your map, you can't delete it this way.

(Really, you can see it in this playground example using math.NaN().)

The cause of this issue with NaNs in maps is that Go follows the IEEE-754 standard for floating point comparison, and under this standard a NaN is never equal to anything, even another NaN or even itself. Although formally speaking delete() isn't defined in terms of specific key equality (in general maps don't quite specify things like that), in practice it works that way. Since delete() is implicitly based on key equality and NaNs never compare equal to each other, delete() can never remove key values that are NaNs, even if you got the key value from a 'range' over the map.

Of course you're probably not going to deliberately use NaN as a key value in a map. But you may well use floating point values as keys, and NaNs might sneak into your floating point values in all sorts of ways. If they do, and you have code that clears a map this way, you're going to get a surprise (hopefully a relatively harmless one). There's no way to fix this without changing how maps with floating point keys handle NaNs, and even that opens up various questions. Adding a clear() builtin is more efficient and doesn't open up the NaN can of worms.

This NaN issue was a surprise to me. Had you asked me before I'd read the proposal, I would have expected 'clear()' to be added only for efficiency and clarity. I had no idea there was also a correctness reason to have it.

(While the current change will likely be in Go 1.20, a clear builtin likely won't appear until Go 1.21 at the earliest.)

PS: If you suspect that this implies interesting and disturbing things for NaNs as key values in maps, you're correct. But that's for another entry.

programming/GoFutureClearBuiltin written at 23:36:52


Understanding how fast Ethernet really is (and in what units)

I've been around computer networking for a long time, but in all of that time I've never fully understood what the speed numbers of the various Ethernet standards actually meant. I knew what sort of bandwidth performance I could expect from '1 Gigabit' Ethernet as measured on TCP streams, but I didn't know exactly what '1G' meant, beyond that it was measured in bits (for example, was it powers of ten G or powers of two G). My quiet confusion was likely increased by the numbers that the Prometheus host agent reports, where a Gigabit Ethernet interface is reported as '125,000,000'.

('1G' in powers of ten decimal is 1,000,000,000. I'm writing it out with commas because otherwise comparing the two numbers by eye is painful.)

The Wikipedia page on Gigabit Ethernet starts out a bit confusing (for people like me) by talking about how the physical link generally uses '8b/10b encoding', which inflates the line rate by 25% over the data rate. However, what is '1G' about 1G Ethernet is the data rate, not the raw on the wire rate. That data rate is '1000 Mbit/s' (per the Wikipedia page), which tells us that we're dealing with powers of ten bit rates. When you divide that 1000 Mbit/s of data by 8 (to convert to bytes), you get 125 (decimal) Mbytes/s, and the network interface rate reported by the Prometheus host agent now makes sense.

(As a practical matter this means that the 8b/10b encoding is something we ignore. It's a coincidence that the line rate of '1250 Mbit/s' looks similar to the '125 Mbytes/s' data rate.)

If we then convert from powers of ten to powers of two, we would expect a theoretical bandwidth of around 119 MiBytes/sec (sort of using the official binary prefix notation). We're not going to get that bandwidth at the TCP data level, because this doesn't account for the overhead of either Ethernet framing or TCP itself.

I'm not as clear about how 10G Ethernet is encoded on the wire, but it doesn't matter for this because what we care about is still the data rate. 10G Ethernet has a data rate of 10,000 Mbit/s ('10 Gbit/s' in Wikipedia's list of interface bit rates), which translates to 1,250 Mbytes/sec (power of ten). This theoretically maps to 1192 MiBytes/s (ten times 1G Ethernet), but you're never going to get that bandwidth for TCP data flows because of the assorted overheads.

(This of course explains the Prometheus host agent's report that the 10G network speed is '1,250,000,000', and similarly that the 100 Mbit speed is '12,500,000'. All of which are much clearer when written out with commas for separation, or whatever character is used for this in your locale.)

tech/EthernetHowFastIsIt written at 23:15:49


Monitoring if our wireless network is actually working in locations

We provide a multi-building wireless network to the department (in addition to the university wide wireless network provided by the central IT people). For reasons beyond the scope of this entry, we don't have much innate visibility into how this network is doing in the various places it exists. This means that our only current ways of finding out whether or not it's currently working right somewhere are to wait for people to notice and then report problems (which doesn't always happen) or to go there ourselves. We've recently decided that we'd like to do better than this.

The obvious thing to do is to put some boxes on the wireless network at various (physical) locations, and then have our monitoring system check to see that they're still visible (for example, through pinging them). This is fairly brute force but it's also a good end to end test; if we can reach these boxes over the wireless network, they can see the network, associate with it, and contact our DHCP server to lease addresses.

We need these boxes to be inexpensive, because to be useful they'll have to be in relatively exposed locations where they might walk off. We have strong opinions that they need to be wall powered, not battery powered, because we don't want to be going around every year or so to re-find them all and replace their batteries with a new set. Ideally they'd do something useful, like report the ambient temperature around them. Unfortunately we haven't found anything that combines these three attributes together; inexpensive wifi temperature sensors seem to all be battery powered, for example.

What we're currently experimenting with is wifi controlled smart power plugs. These tick two out of the three ideal features; they're generally inexpensive, and they're powered by the wall instead of batteries. They're also available from reasonably reputable brands, which helps give us more assurance that a unit isn't going to burst into flames some day, although in actual deployment I suspect that we'll tape over their power outlet with a 'do not use' label.

(If you get the right model, you can even get Unix tools to talk to it.)

We've only just put a couple of these wifi smart power plugs on our network, but so far they do seem to ping consistently. We'll have to see if that keeps up, but I'm cautiously optimistic. Of course I'd love to discover a better option, but I honestly suspect that there isn't any (for example, I can see why hardware designers would want to avoid wall power if they can).

sysadmin/WirelessLivenessMonitoring written at 23:20:37
