web/MyIfModifiedSinceHack written at 01:06:39
Giving in: pragmatic
If-Modified-Since handling for Tiny Tiny RSS
I wrote yesterday about how Tiny Tiny RSS drastically mishandles
If-Modified-Since headers for conditional GETs, but I didn't say anything about what my
response to it is. DWiki insists on strict equality checking between
If-Modified-Since and the
Last-Modified timestamp (for good
reasons), so Tiny Tiny RSS was basically
doing unconditional GETs all the time.
I could have left the situation like that, and I actually considered
it. Given the conditional GET irony I was
never saving any CPU time on successful conditional GETs, only
bandwidth, and I'm not particularly bandwidth constrained (either
here or potentially elsewhere; 'small' bandwidth allocations on
VPSes seem to be in the multiple TBs a month range by now). On the
other hand, these requests were using up quite a lot of bandwidth
because my feeds are big and Tiny Tiny RSS is quite popular, and
that unnecessary bandwidth usage irritated me.
(Most of the bandwidth that Wandering Thoughts normally uses
is in feed requests, eg today 87% of the bandwidth was for feeds.)
So I decided to give in and be pragmatic. Tiny Tiny RSS expects you
to be doing timestamp comparisons for
If-Modified-Since, so I added
a very special hack that does just that if and only if the user agent
claims to be some version of Tiny Tiny RSS (and various other conditions
apply, such as no
If-None-Match header being supplied). Looking at
my logs this appears to have roughly halved the bandwidth usage for
serving feeds, so I'm calling it worth it at least for now.
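This is not DWiki's actual code, but a minimal Python sketch of the shape of the special case might look like the following (the user agent substring and the header handling here are my assumptions, not DWiki's real logic):

```python
from email.utils import parsedate_to_datetime

def im_not_modified(headers, last_modified):
    """Decide whether a conditional GET can be answered with 304 Not Modified.

    The default is strict string equality between If-Modified-Since and
    our Last-Modified value; for user agents claiming to be Tiny Tiny
    RSS, fall back to an actual timestamp comparison.
    """
    ims = headers.get("If-Modified-Since")
    if ims is None or "If-None-Match" in headers:
        return False
    if "Tiny Tiny RSS" in headers.get("User-Agent", ""):
        # The pragmatic hack: 'at or after our timestamp' counts as
        # unmodified, since TT RSS sends times slightly in the future.
        try:
            return parsedate_to_datetime(ims) >= parsedate_to_datetime(last_modified)
        except (TypeError, ValueError):
            return False
    # Everyone else gets the strict equality check.
    return ims == last_modified
```

The timestamp comparison is deliberately confined to one user agent so that the strict (and correct) behaviour stays the default for everything else.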
I don't like putting hacks like this into my code (and it doesn't fully
solve Tiny Tiny RSS's problems with over-fetching feeds either), but I'm
probably going to keep it. The modern web is a world full of pragmatic
tradeoffs and is notably lacking in high-minded purity of implementation.
web/IfModifiedSinceHowNot written at 02:19:46
How not to generate
If-Modified-Since headers for conditional GETs
Recently I looked through my syndication feed stats (as I periodically
do) and noticed that the Tiny Tiny RSS program was both responsible
for quite a lot of feed fetching and also didn't seem to ever be
successfully doing conditional GETs. Most
things in this situation aren't even attempting conditional GETs,
but investigation showed that Tiny Tiny RSS was consistently sending
If-Modified-Since headers with times that were generally just a
bit after the actual
Last-Modified timestamp of the syndication
feed. For good reasons I require
strict equality of
If-Modified-Since values, so this ensured that
Tiny Tiny RSS never made a successful conditional GET.
Since I was curious, I got a copy of the current Tiny Tiny RSS code and
dug into it to see where this weird
If-Modified-Since value was coming
from and if there was anything I could do about it. The answer was worse
than I was expecting; it turns out that the I-M-S timestamp that Tiny
Tiny RSS sends has absolutely nothing to do with the Last-Modified
value that I sent it. Where it comes from is that whenever Tiny Tiny
RSS adds a new entry from a feed to its database it records the (local)
time at which it did this; the most recent such entry timestamp is then
the If-Modified-Since value that Tiny Tiny RSS sends when it next polls
the feed.
(You can see this in
update_rss_feed in include/rssfuncs.php in the
TT RSS source. Technically the time recorded for new entries is when TT
RSS started processing the updated feed, not the moment it added the
database record for a new entry.)
This is an absolutely terrible scheme, almost as bad as simply
generating random timestamps. There is a cascade of things that
can go wrong with it:
- It implicitly assumes that the clocks on the server and the client
are in sync, since
If-Modified-Since must be in the server's
time yet the timestamp is generated from client time.
- Tiny Tiny RSS loses if a feed publishes a new entry, TT RSS pulls the
feed, and then the feed publishes a second entry before TT RSS
finishes processing the first new entry. TT RSS's 'entry added'
timestamp and thus the
If-Modified-Since timestamp will be after
the revised feed's date, so the server will 304 further requests.
TT RSS will only pick up the second entry when a third entry is
published or the feed is otherwise modified so that its
Last-Modified date moves forward enough.
- If the feed deletes or modifies an entry and properly updates its
Last-Modified timestamp as a result of this, Tiny Tiny
RSS will issue what are effectively unconditional GETs until the
feed publishes a completely new entry (since the last time that
TT RSS saw a new entry will be before the feed's new
Last-Modified timestamp).
There are probably other flaws that I'm not thinking of.
(I don't think it's a specification violation to send an
If-Modified-Since header if you never got a Last-Modified header,
but if it is, that's another flaw in this scheme, since Tiny Tiny RSS
will totally do that.)
This scheme's sole virtue is that on a server which uses timestamp
comparisons for If-Modified-Since (instead of equality checks) it will
sometimes succeed in getting 304 Not Modified responses. Some of these
responses will even be correct, and when they aren't, it's not the
server's fault.
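For contrast, getting this right as a feed fetcher is almost trivial: save the server's Last-Modified value byte for byte and send it back unchanged. A sketch of that logic (the cache here is just a dict; real code would persist it somewhere):

```python
def conditional_headers(url, cache):
    """Build conditional GET headers for a feed fetch.

    We echo the server's Last-Modified value back verbatim as
    If-Modified-Since, never a locally generated timestamp (which is
    the mistake Tiny Tiny RSS makes). This works even against servers
    that insist on strict equality.
    """
    if url in cache:
        return {"If-Modified-Since": cache[url][0]}
    return {}

def record_response(url, cache, status, last_modified, body):
    """Record a fetch result; on a 304 Not Modified, reuse our saved copy."""
    if status == 304:
        return cache[url][1]
    if last_modified:
        cache[url] = (last_modified, body)
    return body
```

Because the stored value is the server's own string, clock skew, feed processing delays, and entry deletions simply cannot break the scheme.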
linux/SoftwareRaidShiftingMirrorII written at 02:05:37
An important additional step when shifting software RAID mirrors around
After going through all of the steps from yesterday's entry to move my mirrors from one disk to
another, I inadvertently discovered a vital additional step
you need to take here. The additional step is:
- After you've taken the old disk out of the mirror and shrunk the
mirror (steps 4 and 5), either destroy the old disk's RAID
superblock or physically remove the disk from your system.
I believe that RAID superblocks can be destroyed with the following
command (here /dev/sdb7 is the old disk):
mdadm --zero-superblock /dev/sdb7
Failure to do this may cause your system to malfunction either subtly
or spectacularly on boot (malfunctioning spectacularly is best because
that ensures you notice it). The culprit here is how a modern
Linux system assembles RAID arrays on boot.
Put simply, there is nothing that forces all of your RAID arrays to be
assembled using your current mirrors instead of the obsolete mirrors
on your old disk. Instead it seems to come down to which device is
processed first. If a partition on your old disk is processed first, it
wins the race and becomes the sole member of the RAID array (which
may then fail to activate because it doesn't have the full device set). If you're lucky your system now refuses
to boot; if you're unlucky, your system boots but with obsolete and
unmirrored filesystems and anything important written to them will cause
you a great deal of heartburn as you try to sort out the resulting mess.
(Linux software RAID appears to be at least smart enough to know that
your two current mirror devices and the old disk are not compatible and
so doesn't glue them all together. I don't know what GRUB's software
RAID code does here if your boot partition is on a software RAID mirror
that has had this happen to it.)
This points out core architectural flaws in both the asynchronous
assembly process and the approach of removing obsolete devices by
failing them first. If
mdadm had a 'remove active device' operation,
it could at least somehow mark the removed device's superblock as
'do not use to auto-assemble array, this device has been explicitly
removed'. If the assembly process was not asynchronous the way it is,
it could see that some mirror devices were more recent than others and
prefer them. But sadly, well, no.
(In theory a not yet activated software RAID array could be revised to
kick out the out of date device and replace it with the newer device
(although there are policy issues involved). This can't be done at all
once the array has been activated, or rather while the array is active.)
linux/SoftwareRaidShiftingMirror written at 19:51:05
Shifting a software RAID mirror from disk to disk in modern Linux
Suppose that you have a software RAID mirror and you want to migrate one
side of the mirror from one disk to another to replace the old disk.
The straightforward way is to remove the old disk, put in the new disk,
and resync the mirror. However this leaves you without a mirror at all
for the duration of the resync so if you can get all three disks online
at once what you'd like to do is add the new disk as a third mirror and
then remove the old disk later. Modern Linux makes this a little bit
more involved than you might expect.
The core complication is that your software RAID devices know how many
active mirrors they are supposed to have. If you add a device beyond
that, it becomes a hot spare instead of being an active mirror. To activate
it as a mirror you must add it then grow the number of active devices in the
mirror. Then to properly deactivate the old disk you need to do the reverse.
Here are the actual commands (for my future use if nothing else):
- Hot-add the new device:
mdadm -a /dev/md17 /dev/sdd7
If you look at
/proc/mdstat afterwards you'll see it marked as a spare.
- 'Grow' the number of active devices in the mirror:
mdadm -G -n 3 /dev/md17
- Wait for the mirror to resync. You may want to run the new disk in
parallel with the old disk for a few days to make sure that all is
well with it; this is fine. You may want to be wary about reboots
during this time.
- Take the old disk out by first manually failing it and then actually
removing it:
mdadm --fail /dev/md17 /dev/sdb7
mdadm -r /dev/md17 /dev/sdb7
- Finally, shrink the number of active devices in the mirror down to two:
mdadm -G -n 2 /dev/md17
You really do want to explicitly shrink the number of active devices
in the mirror. A mismatch between the number of actual devices and the
number of expected devices can have various undesirable consequences.
If a significant amount of time passed between steps three and four,
make sure that your
mdadm.conf still has
the correct number of devices configured in it for all of the arrays.
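For example, if your mdadm.conf carries explicit device counts, a stale entry might look like this (the UUID here is made up for illustration):

```
# stale: still says three devices after the mirror was shrunk back to two
ARRAY /dev/md17 metadata=1.2 num-devices=3 UUID=f5274d24:9a6b3c41:0e1d2c3b:4a5f6e7d
```

The num-devices= value is what needs to agree with what you ended up with after step five.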
Unfortunately marking the old disk as failed will likely get you warnings
from mdadm's status monitoring about a failed device. This is
the drawback of
mdadm not having a way to directly do 'remove an
active device' as a single action. I can understand why
mdadm doesn't have an operation for this, but it's still a bit annoying.
(Looking at this old entry makes it clear that I've
run into the need to grow and shrink the number of active mirror devices
before, but apparently I didn't consider it noteworthy at that point.)
sysadmin/UncertaintyScariness written at 00:34:47
The scariness of uncertainty
One of the issues that I'm facing right now (and have been for a while)
is that being uncertain can be a daunting thing. As sysadmins we deal
with uncertainty all of the time, of course, and if we were paralyzed
by it in general we'd never get anywhere. It's usually easy enough to
overcome uncertainty and move forward in small situations or important
situations (for various reasons). Where uncertainty can dig in is in
dauntingly big and complex projects that are not essential. If you don't
have to have whatever and building anything is clearly a lot of work for
an uncertain reward, it's very easy to defer and defer action in favour
of various stalling measures (or other work).
All of this sounds rather hand-waving, so let me tell you about my
project of gathering OS-level performance statistics. Or rather, my
non-project.
If you look around, there are a lot of options for gathering,
aggregating, and graphing OS performance stats (in tools, full systems,
and ecologies of tools). Beyond a certain basic level it's unclear
which of them are going to work best for us and which ones will be
crawling failures, but at the same time it's also clear that any of them
that look good are going to take a significant amount of work and time
to set up and try out (and I'm going to have to try them in production).
As a result I have been circling around this project for literally
years now. Every so often I poke and prod at the issue; I read more
about some tool or another, I look at pretty pictures, I hear about
something new, and so on and so forth. But I've never sat down to
really do something. I've always found higher priority things to do
or other excuses.
(Here in the academy this behavior in graduate students is well known
and gets called 'thesis avoidance'.)
The scariness of uncertainty is not the only reason for this, of course,
but it's a significant contributing factor. In a way it raises the
stakes for making a choice.
(The uncertainty comes from two directions. One is simply trying to
select which system to use; the other is whether or not the whole idea is
going to be worthwhile. The latter is a bit stupid since we're probably
not going to be left with a white elephant of a system that we ignore
and then quietly abandon, but the possibility gnaws at me and feeds
other uncertainties and doubts.)
I don't have any answers, but maybe writing this entry has made it
more likely that I do something here. And maybe I should embrace the
possibility of failure as a sign that I am finally taking enough risk.
(I feel divided about that idea but I need to think about it more and
then write another entry on it.)
solaris/ZFSNoAPIAnger written at 00:12:03
I'm angry that ZFS still doesn't have an API
Yesterday I wrote a calm rational explanation for why I'm not building
tools around '
zpool status' any more and said
that it ended up being only half of the story. The other half is that I
am genuinely angry that ZFS still does not have any semblance of an API,
so angry that I've decided to stop cooperating with ZFS's non-API and
make my own.
(It's not the hot anger of swearing, it's the slow anger of a blister
that keeps reminding you about its existence with every step you take.)
For at least the past six years it has been blindingly obvious that
ZFS should have an API so that people could build additional tools and
solutions on top of it. For all that is sane, stock ZFS doesn't even
have an alerting solution for pool problems. You can't miss that
unless you're blind, and say whatever you want about the ZFS developers,
I'm sure that they're not blind. I am and have been completely agnostic
about the exact format that this API could have taken, so long as it
existed. Stable, documented, script-friendly output from ZFS tools? A
documented C level library API? XML information dumps because everyone
loves XML? A web API? Whatever. I could have worked with any of them.
Instead we got nothing. We got nothing when ZFS was with Sun and despite
some vague signs of care we continue to get exactly nothing now that
ZFS is effectively with Illumos (and I'm pretty sure that Oracle hasn't
fixed the situation either). At this point it is clear that the ZFS
developers have different priorities and in an objective sense do not
care about this issue.
(Regardless of what you say, what you actually care about is shown by
what you work on.)
This situation has thoroughly gotten under my skin now that moving to
OmniOS is rubbing my nose in it again. So now I'm through with tacitly
cooperating with it by trying to wrestle and wrangle the ZFS commands
to do what I want. Instead I feel like giving '
zpool status' and
its friends a great big middle finger and then throwing them down a
well. The only thing I want to use them for now is as a relatively
authoritative source of truth if I suspect that something is wrong with
what my own tools are showing me.
(I describe zpool status as only 'relatively authoritative' because
it and other similar commands leave things out and otherwise mangle
what you are seeing, sometimes in ways that cause real problems.)
I will skip theories about why the ZFS developers did not develop
an API (either in Sun or later), partly because I am in a bad mood
after writing this and so am inclined to be extremely cynical.
solaris/ZFSNoMoreZpoolStatus written at 23:22:29
I'm done with building tools around '
zpool status' output
Back when our fileserver environment was young,
I built a number of local tools and scripts that relied on 'zpool
status' to get information about pools, pool states, and so on. The
problem with using '
zpool status' is of course that it is not an API,
it's something intended for presentation to users, and so as a result
people feel free to change its output from time to time. At the time
zpool's output seemed like the best option despite this, or more
exactly the best (or easiest) of a bad lot of options.
Well, I'm done with that.
We're in the process of migrating to OmniOS. As I've had to touch
scripts and programs to update them for OmniOS's changes in the output
of 'zpool status', I've instead been migrating them away from using
zpool at all in favour of having them rely on a local ZFS status
reporting tool. This migration isn't complete
(some tools haven't needed changes yet and I'm letting them be), but
it's already simplified my life in various ways.
One of those ways is that now we control the tools. We can guarantee
stable output and we can make them output exactly what we want. We
can even make them output the same thing on both our current Solaris
machines and our new OmniOS machines so that higher level tooling is
insulated from what OS version it's running on. This is very handy and
not something that would be easy to do with 'zpool status'.
The other, more subtle way that this makes my life better is that I now
have much more confidence that things are not going to subtly break on
me. One problem with using
zpool's output is that all sorts of things
can change about it and things that use it may not notice, especially
if the output starts omitting things to, for example, 'simplify' the
default output. Since our tools are abusing private APIs they may well
break (and may well break more than
zpool's output), but when they
break we can make sure that it's a loud break. The result is much more
binary; if our tools work at all they're almost certainly accurate. A
script's interpretation of
zpool's output is not necessarily so.
(Omitting things by default is not theoretical. Between S10U8 and a
later Solaris update, 'zfs list' went from including snapshots by default to
excluding them by default. This broke some of our code that was parsing
zfs list' output to identify snapshots, and in a subtle way; the
code just thought there weren't any when there were. This is of course
a completely fair change, since '
zfs list' is not an API and this
probably makes things better for ordinary users.)
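The 'loud break' principle can be sketched in a few lines; this is an illustrative example (not our actual tool, and the output format is invented for the sketch): when you control the data format, any deviation from it is an error to raise, not something to silently skim past.

```python
def parse_status_line(line):
    """Parse one line of a hypothetical fixed 'pool state health' format.

    Because we control the format, anything unexpected is treated as an
    error and raised loudly, instead of being silently skipped the way
    scripts scraping 'zpool status' output tend to do.
    """
    fields = line.split()
    if len(fields) != 3:
        raise ValueError("malformed status line: %r" % line)
    pool, state, health = fields
    if state not in ("ONLINE", "DEGRADED", "FAULTED", "OFFLINE"):
        raise ValueError("unknown pool state %r in %r" % (state, line))
    return {"pool": pool, "state": state, "health": health}
```

The payoff is exactly the binary behaviour described above: either the tool works and is almost certainly accurate, or it blows up where you can see it.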
I accept that rolling our own tools has some additional costs and has
some risks. But I'd rather own those costs and those risks explicitly
than have similar ones arise implicitly because I'm relying on a
necessarily imperfect understanding of 'zpool status' output.
Actually, writing this entry has made me realize that it's only half of
the story. The other half is going to take another entry.
programming/WhyIRejectPatches written at 02:09:04
Why I sometimes reject patches for my own software
I recently read Drew Crawford's Conduct unbecoming of a hacker (via), which
argues that you should basically always accept other people's patches
for your software unless they are clearly broken. Lest we bristle at
this, he gives the example of Firefox and illustrates what has happened
to many patches there.
On the whole I sympathize with this view, and I've even had some
pragmatic experience with it; a patch to mxiostat that I wasn't very
enthusiastic about initially has actually become something I use
routinely. But despite this there are certain sorts of patches I
will reject basically out of hand. Put simply they're patches that
I think will make the program worse for me, no matter how much
they might help the author of the patch (or other people).
This is selfish behavior on my part, but so far all of my public
software is things that I'm ultimately
developing for myself first. It's nice if other people use my programs
too but I don't expect any of them to get popular enough that other
people's usage is going to be my major motivation for maintaining and
developing them. So my priorities come first, and the furthest I'm
willing to go is that I'll accept patches that don't get in the way of
my own use of the programs.
(Drew Crawford's article has sort of convinced me that I should be more
liberal about accepting patches in general; he makes a convincing case
for 'accept now, bikeshed later'. So far this is mostly a theoretical
issue for my stuff.)
By the way, this would obviously be different if I was developing things
with the explicit goal of having them used by other people. In that case
I should (and hopefully would) suck it up and put the patch in unless I
had strong indications that it would make the program worse for a bunch
of people instead of just me. Maybe someday I'll write something like
that, but so far it's not the case.
spam/FutureSpamFilteringWorry written at 02:26:21
One of my worries: our spam filtering in the future
I've mentioned in the past that we rely on a commercial anti-spam system
for our spam filtering. What I haven't mentioned is that it isn't
supported on and doesn't run on any version of Ubuntu after Ubuntu
10.04 LTS. 10.04 is now rather long in the tooth and with the impending
release of Ubuntu 14.04 it will fall out of support in a bit over a
year. This doesn't leave us completely up the creek, as the vendor
supports Red Hat Enterprise 6, but it does raise a concern: is the
vendor still actually interested in this product?
(It's not as if the vendor is deliberately ignoring Ubuntu; the
most recent Linux distribution that the vendor supports was released
in 2011 (and that's Debian 6).)
Since I do have this concern, every so often I get to worry about how
we'd replace this commercial package (either because of the vendor
effectively dropping it or because of licensing problems, which have
been known to happen). Right now the commercial
system has three great virtues: it works quite well, it doesn't require
any administration, and it's basically a black box. I suppose that the
fact that it doesn't really cost us any money is a fourth virtue.
(The university has a site license, the costs for which are covered
by the central mail system.)
There are probably other commercial options, but I don't know how
much they'd cost or how well they work, and the thought of trying to
evaluate the alternatives fills me with dread. I know that there are
free alternatives (for both anti-spam and anti-virus stuff) but I
suspect that they are not hands free and automatically maintained black
boxes and I don't know how well they work. Evaluating the free options
would be somewhat less of a hassle than evaluating commercial options
(with free options there is no wrestling with vendors) but it wouldn't
be a picnic either.
One part of me thinks that I should spend some time on keeping current
with at least the free options for anti-spam filtering, just so I can be
prepared if the worst happens. Another part of me thinks that that's a
lot of work with no immediate payoff (in fact that doing the work now is
probably a complete waste of time) and that I should defer it until we
know we need a different anti-spam system, if ever.
I don't have any answers right now, just worries. So there you go.
linux/Fedora20LVMDriveRecovery written at 18:05:00
Recovering from a drive failure on Fedora 20 with LVM on software RAID
My office workstation runs on two mirrored disks. For various reasons
the mirroring is split; the root filesystem, swap, and /boot are
directly on software RAID while things like my home directory
filesystem are on LVM on top of software RAID. Today I had one of
those two disks fail when I rebooted after applying a kernel upgrade;
much to my surprise this caused the entire boot process to fail.
The direct cause of the boot failure was that none of the LVM-based
filesystems could be mounted. At first I thought that this was just
because LVM hadn't activated, so I tried things like
pvscan; much to
my surprise and alarm this reported that there were no physical volumes
visible at all. Eventually I noticed that the software RAID array that
LVM sits on top of was being reported as
inactive instead of active and
that I couldn't read from the
/dev entry for it.
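A quick way to spot this state is to look for 'inactive' arrays in /proc/mdstat; here is a small sketch of the check (the parsing is deliberately simple-minded):

```python
def inactive_arrays(mdstat_text):
    """Return the names of md arrays that /proc/mdstat reports as inactive.

    Array status lines look like 'md17 : active raid1 sda7[0] sdb7[1]'
    or 'md17 : inactive sda7[0]'.
    """
    found = []
    for line in mdstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == ":" and fields[2] == "inactive":
            found.append(fields[0])
    return found

# Usage: with open("/proc/mdstat") as f: print(inactive_arrays(f.read()))
```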
The direct fix was to run '
mdadm --run /dev/md17'. This activated the
array (and then udev activated LVM and systemd noticed that devices were
available for the missing filesystems and mounted them). This was only
necessary once; after a reboot (with the failed disk still missing) the
array came up fine. I was led to this fix by the mdadm manpage's
description of --run:
Attempt to start the array even if fewer drives were given than were
present last time the array was active. Normally if not all the
expected drives are found and
--scan is not used, then the array
will be assembled but not started. With
--run an attempt will be
made to start it anyway.
In theory this matched the situation; the last time the array was active
it had two drives and now it only had one. The mystery here is that the
exact same thing was true for the other mirrors (for
/, swap, and
/boot) and yet they were activated anyways despite the missing drive.
My only theory for what happened is that something exists that forces
activation of mirrors that are seen as necessary for filesystems but
doesn't force activation of other mirrors. This something is clearly
magical and hidden and of course not working properly. Perhaps this
magic lives in
mount (or the internal systemd equivalent); perhaps it
lives in systemd itself. It's pretty much impossible for me to tell.
(Of course since I have no idea what component is responsible I have no
particularly good way to report this bug to Fedora. What am I supposed
to report it against?)
(I'm writing this down partly because this may sometime happen to my
home system (since it has roughly the same configuration) and if I
didn't document my fix and had to reinvent it I would be very angry at
myself.)
These are my WanderingThoughts
This is part of CSpace, and is written by ChrisSiebenmann.
This is a DWiki.