2016-11-21
Link: RFC 6919: Further Key Words for Use in RFCs to Indicate Requirement Levels
If you read RFCs, you may know about the standard use of key words such as MUST and SHOULD, which come from RFC 2119. RFC 6919, issued April 1st, 2013, adds some useful additional key words like "MUST (BUT WE KNOW YOU WON'T)", "REALLY SHOULD NOT", and the like.
By itself this would be amusing and interesting. But what really makes RFC 6919 rewarding to read is that it shows usage examples for each of its new key words that are drawn from existing RFCs. If you have much exposure to how RFCs are actually implemented in the field, this will make you alternate laughter and sad sighs. To quote myself from when I first saw it:
RFC 6919 makes me laugh but it's sad laughter. Such honesty had to be published Monday.
(I was reminded of RFC 6919 by @whitequark's tweet, and was actually surprised to discover that I've never linked to it here on Wandering Thoughts. So now I'm fixing that.)
What I'd like in Illumos/OmniOS: progressive crash dumps
One of our fileservers had a kernel panic today as we were adding some more multipathed iSCSI disks to it. This was unfortunate but not fatal; we caught the panic almost right away and fixed things relatively fast. Which is unfortunate in its own way and brings me to my wish.
You see, this was perhaps our most important and core fileserver. Everything depends on it and everything eventually goes out to lunch if and while it's down. And in our experience, in our environment, making an OmniOS crash dump takes ages and may not succeed at the end of that (we've sat through over half an hour of the process only to have it fail). There was absolutely no way we could afford to let this fileserver sit there for minutes or tens of minutes to see if maybe it could successfully write out a crash dump this time around, so we forced a power cycle on it in order to get it back into service. The result is that we got nothing out of the panic; we don't even have the stack backtrace (it doesn't seem to have gotten written anywhere durable).
So now what I wish OmniOS had is what I'll call progressive crash dumps. A progressive crash dump would proceed in layers of detail. First it would write out very basic details (like the panic stack dump or the kernel message log) in a compact form, right away; this should hopefully take almost no time. After that had been pushed to the dump device, it would write another layer with more information that takes some more time (maybe a complete collection of various core kernel data tables, like all kernel stacks and the process table). As time went on it would write out more and more data with more and more layers of detail; if you had enough time, it would end up writing out the full crash dump that you get today.
(Dumpadm's -c argument doesn't have enough granularity to help, especially on fileservers where almost all the memory is already being consumed by kernel pages instead of user pages.)
Progressive crash dumps would ensure that even if you had to reboot the machine early you would get some information; the longer you could afford to wait, the more information you'd get. And if the overall dump wound up failing or hanging, at least you would recover however many layers could be written intact (and hopefully the very basic layers would be good, simply because they are basic and so should be easy and reliable to dump).
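To make the idea a bit more concrete, here is a toy sketch in Python of what a layered dump might look like on the dump device. Nothing here reflects the real Illumos dump format or kernel dump code; the record header layout and the names write_layers and read_intact_layers are made up purely for illustration. The point is only that if each layer is self-describing and checksummed, a reader can recover every layer that was fully written before the machine was power cycled.

```python
import io
import struct
import zlib

# Hypothetical per-layer record header (not any real dump format):
# 4-byte layer number, 4-byte payload length, 4-byte CRC32 of the payload.
HEADER = struct.Struct("<III")


def write_layers(dev, layers):
    """Write layers in order, cheapest first, flushing after each one."""
    for layer_id, payload in enumerate(layers):
        dev.write(HEADER.pack(layer_id, len(payload), zlib.crc32(payload)))
        dev.write(payload)
        # Push each layer out before starting the next, more expensive one.
        dev.flush()


def read_intact_layers(dev):
    """Return the payloads of every layer that was completely written."""
    recovered = []
    while True:
        header = dev.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # ran off the end of what was written
        layer_id, length, crc = HEADER.unpack(header)
        payload = dev.read(length)
        truncated = len(payload) < length
        if layer_id != len(recovered) or truncated or zlib.crc32(payload) != crc:
            break  # partial or corrupt layer; keep what we already have
        recovered.append(payload)
    return recovered


# Simulate a dump that was cut off partway through the last (largest) layer.
layers = [
    b"panic stack + kernel message log",
    b"all kernel stacks + process table",
    b"full memory image " * 1000,
]
full = io.BytesIO()
write_layers(full, layers)
cut_short = io.BytesIO(full.getvalue()[:-500])
early_layers = read_intact_layers(cut_short)
```

In this simulation the first two layers come back intact and the truncated third layer is simply dropped, which is the behavior I'd want from a real progressive dump.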
(This is a complete blue sky wish. It would likely take a completely new dump format, new kernel dump code, and significant changes to get all of the dump tools to deal with it, all of which adds up to a lot of new code in an area that has to be extremely reliable under extreme conditions and that most people don't use very much anyways. Even if we had the money to help fund this sort of thing, there would be much higher priority Illumos things we'd care about, like our 10G Ethernet issues.)
I've wound up feeling tentatively enthusiastic about Python 3
I know myself, so I know that I'm prone to bursts of enthusiasm with things that start abruptly and then wear off later into more moderate and sensible views (or all the way down to dislike). In the past I've been quite down on Python 3, and even recently I was only kind of lukewarm on it, but for no really good reason I've lately wound up feeling pretty enthused about working in it.
Part of this is certainly due to my recent positive experience with it, but I think it was building even before then. There was definitely a push from Eevee's Why should I use Python 3?, which left me feeling that there really were a number of interesting things in Python 3 that I'd kind of like to actually use; it may be the first thing that really sold me on Python 3 as having genuine attractions, instead of just being something that I'd have to put up with in the future.
I call this a tentative enthusiasm because it could burn out, not because I feel very tentative about it. Although I may be talking myself into it here, if I was starting a new Python program now I'd probably try to do it in Python 3 if that was practical (ie, if it didn't have to run on our OmniOS machines). If everywhere that DWiki ran had modern versions of Python 3, it'd be tempting to start a serious project to port it to Python 3 (going beyond my quick bring-up experiment to handle the tough issues, like bytes to Unicode conversions in the right places).
Unfortunately for my enthusiasm, I don't see much need for new Python code around here in the near future. I'm a sysadmin not a programmer, and beyond that we mostly prefer to write shell scripts. I tend to write at most a handful of new Python programs a year. I suppose that I could take some of my personal Python sysadmin programs and convert them to Python 3 for the experience, but that feels sort of make-work; there's no clear advantage to a straight conversion.
(The reason to convert DWiki itself to Python 3 is partly for longevity, since I already know I have to do it sometime, partly because I'd gain a lot of practical experience, and to be honest partly because it seems like an interesting challenge. Converting little utility programs is, well, a lot less compelling.)
PS: Part of this new enthusiasm is likely due to my slow shift into an attitude of 'let's not fight city hall, it takes too much work', as seen in my shift on Python indentation. Python 3 is the future of Python, so I might as well embrace it instead of bitterly clinging to Python 2 because I'm annoyed at the shift.
(Partly I'm writing this entry as a marker, so that I can later look back to see how I felt about things right now and maybe learn something from that.)