Wandering Thoughts archives

2015-07-20

My brush with the increasing pervasiveness of smartphone GPS mapping

One of the things I do with my time is go bicycling with a local bike club. When you go on group bike rides, one of the things you generally want to have is directions for where the ride is going (if only to reassure yourself if you get separated from the group). When I started with the club back in 2006, these 'cue sheets' for rides were entirely a paper thing and entirely offline; you turned up at the start of the ride and the ride leader handed out a bunch of copies to anyone who wanted or needed one.

(By 2006 I believe that people were mostly creating new cue sheets in word processors and other tools, but some old ones existed only in scanned form that had been passed down through the years.)

Time rolled on and smartphones with GPS appeared. Various early adopters around the club started using smartphone apps to record their rides. People put these ride recordings online and other people started learning from them, spotting interesting new ways to get places and so on. Other people started taking these GPS traces and loading them on their own smartphones (and sometimes GPS devices) as informal guides to the route to supplement the official cue sheets. As time went on, some people started augmenting the normal online ride descriptions for upcoming rides with somewhat informal links to online GPS-based maps of the ride route.

Last year the club started a big push to put copies of the cue sheets online, and alongside the cue sheets it started digitizing many of the routes into GPS route files. For some of the rides, the GPS route files started being the primary authority for the ride's route; the printed cue sheet that the ride leader handed out at the start was generated from them. Finally, this year the club is really pushing people to print their own cue sheets instead of having the ride leader give them out at the start. It's not really hard to see why; even last year fewer and fewer people were asking for copies of the cue sheet at the start of rides and more and more people were saying 'I'm good, I've got the GPS information loaded into my smartphone'.

(This year, on the group rides I've led I could hardly give out more than a handful of cue sheets. And usually not because people had already printed their own.)

It doesn't take much extrapolation to see where this is going. The club is still officially using cue sheets for now, but it's definitely alongside the GPS route files and more and more cue sheets are automatically generated from the GPS route files. It wouldn't surprise me if by five years from now, having a smartphone with good GPS and a route following app was basically necessary to go on our rides. There are various advantages to going to only GPS route files, and smartphones are clearly becoming increasingly pervasive. Just like the club assumes that you have a bike and a helmet and a few other things, we'll assume you have a reasonably capable smartphone too.

(By then it's unlikely to cost more than, say, your helmet.)

In one way there's nothing particularly surprising about this shift; smartphones with GPS have been taking over from manual maps in many areas. But this is a shift that I've seen happen in front of me and that makes it personally novel. Future shock is made real by being a personal experience.

(It also affects me in that I don't currently have a smartphone, so I'm looking at a future where I probably need to get one in order to really keep up with the club.)

tech/SmartphoneGPSSpreadForMe written at 23:04:54

The OmniOS kernel can hold major amounts of unused memory for a long time

The Illumos kernel (which means the kernels of OmniOS, SmartOS, and so on) has an oversight which can cause it to hold down a potentially large amount of unused memory in unproductive ways. We discovered this on our most heavily used NFS fileserver; on a server with 128 GB of RAM, over 70 GB of RAM was being held down by the kernel and left idle for an extended time. As you can imagine, this didn't help the ZFS ARC size, which got choked down to 20 GB or so.

The problem is in kmem, the kernel's general memory allocator. Kmem is what is called a slab allocator, which means that it divides kernel memory up into a bunch of arenas for different-sized objects. Like basically all sophisticated allocators, kmem works hard to optimize allocation and deallocation; for instance, it keeps a per-CPU cache of recently freed objects so that in the likely case that you need an object again you can just grab it in a basically lock free way. As part of these optimizations, kmem keeps a cache of fully empty slabs (ones that have no objects allocated out of them) that have been freed up; this means that it can avoid an expensive trip to the kernel page allocator when you next want some more objects from a particular arena.

The problem is that kmem does not bound the size of this cache of fully empty slabs and does not age slabs out of it. As a result, a temporary usage surge can leave a particular arena with a lot of unused objects and slab memory, especially if the objects in question are large. In our case, this happened to the arena for 'generic 128 KB allocations'; we spent a long time with around six in use but 613,033 allocated. Presumably at one time we needed that ~74 GB of 128 KB buffers (probably because of an NFS overload situation), but we certainly didn't any more.
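(To check the arithmetic behind that ~74 GB figure, here is a small Python sketch. The numbers are the ones quoted above; the function and variable names are just mine for illustration, and I'm treating 'KB' and 'GB' as powers of two.)

```python
# Figures quoted in the entry; names here are illustrative only.
BUF_SIZE = 128 * 1024   # one 128 KB buffer, in bytes
ALLOCATED = 613_033     # buffers allocated in the arena
IN_USE = 6              # "around six" buffers actually in use


def idle_bytes(allocated, in_use, buf_size):
    """Bytes held by buffers that are allocated to the arena but not in use."""
    return (allocated - in_use) * buf_size


idle = idle_bytes(ALLOCATED, IN_USE, BUF_SIZE)
idle_gib = idle / 2**30  # convert bytes to GiB
print(f"{idle_gib:.1f} GiB sitting idle in 128 KB buffers")
```

Essentially all of the arena's memory is idle here, which is why the in-use count versus the allocated count is the thing to look at.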

Kmem can be made to free up these unused slabs, but in order to do so you must put the system under strong memory pressure by abruptly allocating enough memory to run the system basically out of what it thinks of as 'free memory'. In our experiments it was important to do this in one fast action; otherwise the system frees up memory through less abrupt methods and doesn't resort to what it considers extreme measures. The simplest way to do this is with Python; look at what 'top' reports as 'free mem' and then use up a bit more than that in one go.
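(A minimal sketch of the sort of Python I mean, assuming CPython. The function names and the 5% overshoot are my own illustrative choices, not anything the system requires; you read the 'free mem' figure from top yourself and pass it in as bytes.)

```python
def bytes_to_grab(free_mem_bytes, extra_percent=5):
    """Size of one allocation a bit larger than the reported free memory.

    extra_percent is an arbitrary illustrative margin, not a tuned value.
    """
    return free_mem_bytes + free_mem_bytes * extra_percent // 100


def grab_memory(nbytes):
    """Allocate nbytes in a single step and return the buffer.

    Repeating a non-zero byte makes CPython fill the whole buffer, so the
    pages are really written to rather than merely reserved.
    """
    return b"\x01" * nbytes


# On the affected machine this would be used roughly as follows, with
# free_mem taken from top's 'free mem' and converted to bytes:
#
#   hog = grab_memory(bytes_to_grab(free_mem))
#   del hog
```

Don't run this casually with a real free memory figure; as covered below, the whole point is to push the system into a state where it stalls while it frees things up.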

(You can verify that the full freeing has triggered by using dtrace to look for calls to kmem_reap.)

Unfortunately triggering this panic freeing of memory will likely cause your system to stall significantly. When we did it on our production fileserver we saw NFS stall for a significant amount of time, ssh sessions stop for somewhat less time, and for a while the system wasn't even responding to pings. If you have this problem and can't tolerate your system going away for five or ten minutes until things fully recover, well, you're going to need a downtime (and at that point you might as well reboot the machine).

The simple sign that your system may need this is a persistently high 'Kernel' memory use in mdb -k's ::memstat but a low ZFS ARC size. We saw 95% or so Kernel but ARC sizes on the order of 20 GB and of course the Kernel amount never shrank. The more complex sign is to look for caches in mdb's ::kmastat that have outsized space usage and a drastic mismatch between buffers in use and buffers allocated.
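(The 'more complex sign' check can be sketched in Python along these lines. This deliberately does not parse ::kmastat output itself; it assumes you have already pulled out each cache's name, buffer size, buffers in use and buffers allocated by whatever means. The CacheStat type, the cache names and both thresholds are hypothetical choices of mine.)

```python
from typing import NamedTuple


class CacheStat(NamedTuple):
    """One cache's figures, already extracted from ::kmastat by hand or script."""
    name: str
    buf_size: int  # bytes per buffer
    in_use: int    # buffers currently in use
    total: int     # buffers currently allocated


def suspicious_caches(stats, min_idle_bytes=2**30, max_use_ratio=0.01):
    """Return (name, idle_bytes) for caches with lots of idle space.

    A cache is flagged when its unused buffers add up to at least
    min_idle_bytes and at most max_use_ratio of its buffers are in use.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    flagged = []
    for s in stats:
        idle = (s.total - s.in_use) * s.buf_size
        ratio = s.in_use / s.total if s.total else 1.0
        if idle >= min_idle_bytes and ratio <= max_use_ratio:
            flagged.append((s.name, idle))
    # Biggest offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


example = [
    # Figures from this entry; the cache name is a made-up label.
    CacheStat("generic-128K", 128 * 1024, 6, 613_033),
    # An invented, healthy small-buffer cache for contrast.
    CacheStat("small-busy", 64, 90_000, 100_000),
]
result = suspicious_caches(example)
for name, idle in result:
    print(f"{name}: {idle / 2**30:.1f} GiB idle")
```

Only the first example cache gets flagged; the second one is mostly in use and small in total, so it passes both tests.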

(Note that arenas for small buffers may be suffering from fragmentation instead of or in addition to this.)

I think that this isn't likely to happen on systems where you have user level programs with fluctuating overall memory usage, because sooner or later just the natural fluctuation of user level programs is likely to push the system to do this panic freeing of memory. And if you use a lot of memory at the user level, well, that limits how much memory the kernel can ever use so you're probably less likely to get into this situation. Our NFS fileservers are kind of a worst case for this because they have almost nothing running at the user level and certainly nothing that abruptly wants several gigabytes of memory at once.

People who want more technical detail on this can see the illumos developer mailing list thread. Now that it's been raised to the developers, this issue is likely to be fixed at some point but I don't know when. Changes to kernel memory allocators rarely happen very fast.

solaris/KernelMemoryHolding written at 01:55:20


