2013-05-08
Thoughts on when to replace disks in a ZFS pool
One of the morals that you can draw from our near miss that I described
in yesterday's entry, where we might have lost a
large pool if things had gone a bit differently, is that the right
time to replace a disk with read errors is TODAY. Do not wait. Do
not put it off because things are going okay and you see no ZFS-level
errors after the dust settles. Replace it today because you never know
what is going to happen to another disk tomorrow.
Well, maybe. Clearly the maximally cautious approach is to replace
a disk any time it reports a hard read error (ie one that is seen
at the ZFS layer) or SMART reports an error. But the problem with
this for us is that we'd be replacing a lot of disks and at least
some of them may be good (or at least perfectly workable). For read
errors, our experience is that some but not all reported read errors are
transient errors in that they don't happen again if you do something
like (re)scrub the pool. And SMART error reports seem relatively
uncorrelated with actual errors reported by the backend kernels or seen
by ZFS.
In theory you could replace these potentially questionable disks, test
them thoroughly, and return them to your spares pool if they pass your
tests. In practice this would add more and more questionable disks to
your spares pool and, well, do you really trust them completely? I
wouldn't. This leaves either demoting them to some less important role
(if you have one that can use a potentially significant number of disks,
and maybe you do) or trying to return them to the vendor for a warranty
claim (and I don't know if the vendor will take them back under that
circumstance).
I don't have a good answer to this. Our current (new) approach is to
replace disks that have persistent read errors. On the first read error
we clear the error and schedule a pool scrub; if the disk then reports
more read errors (during the scrub, before the scrub, or in the next
while after the scrub), it gets replaced.
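(As a rough illustration, the policy above can be sketched in a few lines of Python. This is only a sketch; the DiskState class and the action names are made up for the example, and it doesn't model how long 'the next while after the scrub' lasts.)

```python
from dataclasses import dataclass


@dataclass
class DiskState:
    """Hypothetical per-disk record; not something any ZFS tool provides."""
    had_first_error: bool = False


def on_read_error(disk: DiskState) -> str:
    """Return the action to take when a disk reports a read error."""
    if not disk.had_first_error:
        # First read error: clear it and schedule a pool scrub.
        disk.had_first_error = True
        return "clear-and-scrub"
    # Any further read error means the errors are persistent.
    return "replace"


disk = DiskState()
print(on_read_error(disk))
print(on_read_error(disk))
```

The first call reports the 'clear and scrub' action and the second reports 'replace', which is the whole policy: one free pass, then the disk goes.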
(This updates some of our past thinking on when to replace disks. The general discussion there is still valid.)
ZFSDiskReplacementWhen written at 22:24:52
How ZFS resilvering saved us
I've said nasty things about ZFS before and I'll undoubtedly say some in
the future, but today, for various reasons, I want to take the positive
side and talk about how ZFS has saved us. While there are a number of
ways that ZFS routinely saves us in the small, there's been one big
near miss that stands out.
Our fundamental environment is ZFS pools with
vdevs of mirror pairs of disks. This setup costs space but, among other
things, it's safe from multi-disk failures unless you lose both sides
of a single mirror pair (at which point you've lost a vdev and thus the
entire pool). One day we came very close to that: one side of a mirror
pair died more or less completely and then, as we were resilvering on
to a spare disk, the other side of the mirror started developing read
errors. This was especially bad because read errors generally had the
effect of locking up this particular fileserver (for reasons we don't
understand), and in Solaris 10 update 8 rebooting a locked up fileserver
causes the pool resilver to lose all progress to date and start again
from scratch.
ZFS resilver saved us here in two ways. The obvious way is that it
didn't give up on the vdev when the second disk had some read errors.
Many RAID systems would have shrugged their shoulders, declared the
second disk bad too, and killed the RAID array (losing all data on it).
ZFS was both able and willing to be selective, declaring only specific
bits bad instead of ejecting the whole disk and destroying the pool.
(We were lucky in that no metadata was damaged, only file contents,
and we had all of the damaged files in backups.)
The subtle way is how ZFS let us solve the problem of successfully
resilvering the pool despite the fileserver's 'eventually lock up after
enough read errors' behavior. Because ZFS told us what the corrupt files
were when it found them and because ZFS only resilvers active data, we
could watch the pool's status during the resilver, see what files were
reported as having unrepairable problems, and then immediately delete
them; this effectively fenced the bad spots on the disk off from the
fileserver so that it wouldn't trip over them and explode (again). With
a traditional RAID system and a whole-device resync it would have been
basically impossible to fence the RAID resync away from the bad disk
blocks. At a minimum this would have made the resync take much, much
longer.
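(Watching the pool's status for damaged files can be automated. What follows is a minimal Python sketch that pulls the file list out of 'zpool status -v' style output. The sample text is invented for illustration and the parsing assumes the usual 'Permanent errors have been detected in the following files:' heading, so treat it as a starting point and not a tested tool.)

```python
MARKER = "Permanent errors have been detected in the following files:"

# Made-up sample in the general shape of 'zpool status -v' output.
SAMPLE = """\
  pool: tank
 state: DEGRADED
errors: Permanent errors have been detected in the following files:

        /tank/home/alice/data.bin
        /tank/home/bob/notes.txt
"""


def damaged_files(status_text: str) -> list[str]:
    """Return the entries listed under the permanent errors heading."""
    found = []
    in_list = False
    for line in status_text.splitlines():
        entry = line.strip()
        if MARKER in line:
            in_list = True
        elif entry.startswith("pool:"):
            # A new pool's report has started; stop collecting.
            in_list = False
        elif in_list and entry:
            found.append(entry)
    return found


for path in damaged_files(SAMPLE):
    print(path)
```

In our situation the next step was deleting each reported file by hand; whether you would want to automate that step too is a separate question.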
The whole experience was very nerve-wracking, because we knew we were
only one glitch away from ZFS destroying a very large pool. But in the
end ZFS got us through and we were able to tell users that we had very strong
assurances that no other data had been damaged by the disk problems.
ZFSResilverSave written at 00:15:12
2013-04-19
How ZFS deals with 'advanced format' disks with 4 KB physical sectors
These days it's very hard or impossible to buy new SATA disks that don't
have 4 KB physical sectors. This makes
the question of how ZFS deals with them a very interesting one and I'm
afraid that the answer is 'not well'.
First, the high speed basics. All ZFS vdevs have an internal property
called 'ashift' (normally visible only through zdb) that sets the
fundamental block size that ZFS uses for that vdev (the actual value
is the base-2 logarithm of that block size; a 512 byte block size is an
ashift of 9, a 4 KB one is an ashift of 12). The ashift value
for a new vdev is normally set based on the physical sector sizes
reported by the initial disk(s). The ashift for a vdev can't be
changed after the vdev is created and since vdevs can't be detached
from a pool, it's permanent after creation unless and until you destroy
the pool. Linux ZFS allows you to override the normal ashift with
a command line argument. Illumos ZFS only allows you to set the
low-level physical block size reported for disks (see here for details)
and thus indirectly control the ashift for new vdevs.
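(To make the arithmetic concrete, here is a small Python sketch of the relationship between sector size and ashift. The function names are mine, not anything in ZFS, and taking the largest reported sector size for a new vdev is my reading of the behavior described in this entry.)

```python
def ashift_for(sector_bytes: int) -> int:
    """Return the base-2 logarithm of a power-of-two sector size."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


def new_vdev_ashift(physical_sector_sizes: list[int]) -> int:
    """Pick the ashift for a new vdev from its initial disks.

    Assumption: the largest physical sector size among the initial
    disks decides the ashift.
    """
    return max(ashift_for(size) for size in physical_sector_sizes)


print(ashift_for(512))
print(ashift_for(4096))
print(new_vdev_ashift([512, 4096]))
```

A 512 byte sector gives 9 and a 4096 byte sector gives 12, and a new vdev with even one 4K disk in it comes out as ashift 12.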
It turns out that the basic rule of what ZFS will allow and not
allow is you cannot add a disk to a vdev if it has a larger
physical sector size than the vdev's ashift. Note that this is
the physical sector size, not the logical sector size. In
concrete terms you cannot add a properly reporting 4K disk to an
existing old vdev made from 512 byte disks, including replacing
a 512b drive with a 4K drive. It doesn't matter to ZFS that the new
4K disk is still addressable in 512-byte sectors and it would work
if ZFS didn't know it was a 4K disk; ZFS will generously save you
from yourself and refuse to allow this. In practice this means
that existing pools will have to be destroyed and recreated when
you need to replace their current disks with 4K drives, unless you
can find some way to lie to ZFS about the physical block size of
the new disks.
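(The basic rule can be written down as a one-line check. This is a Python sketch of the rule as I understand it from testing, not a copy of what the ZFS code actually does; can_add_disk is a name I made up.)

```python
def can_add_disk(vdev_ashift: int, physical_sector_bytes: int) -> bool:
    """Apply the rule: the disk's physical sector size must not be
    larger than the block size implied by the vdev's ashift."""
    return physical_sector_bytes <= (1 << vdev_ashift)


# An old ashift=9 vdev refuses a properly reporting 4K disk.
print(can_add_disk(9, 4096))
# An ashift=12 vdev accepts both kinds of disks.
print(can_add_disk(12, 512), can_add_disk(12, 4096))
```

Note that the logical sector size does not appear in the check at all, which is exactly the problem: a 4K disk that is addressable in 512-byte sectors is still rejected.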
(Sufficiently old versions of Solaris are different because they know
about ashift but do not know about physical sector sizes; they only
notice and know about logical sector sizes. The good news is that you
can replace your 512 byte disks with 4K disks and have things not
explode. The bad news is that there is no way to create new vdevs with
ashift=12.)
Since a 512b to 4K transition is probably inevitable in every disk
drive technology, you now want to create all new vdevs with ashift=12.
A vdev created with at least one 4K drive so that it gets an ashift
of 12 can thereafter freely mix 512b drives and 4K drives; as far as
I know you can even replace all of the 4K drives in it with 512b drives.
On Illumos the only way to do this is to set the
reported physical sector size of at least one disk in the new vdev
to 4K (if they aren't 4K disks already), at which point you become
unable to add them to existing pools created with 512-byte disks.
On old versions of Solaris (such as the Solaris 10 update 8 that
we're still running) this is impossible.
(The conflicting needs for disks to report as 4K sector drives or
512b sector drives depending on what you're doing with them is why
the Illumos 'solution' to this problem is flat out inadequate.)
The other issue is one of inherent default alignment in normal
operation. Many current filesystems will basically align almost all of
their activity on 4 KB or greater boundaries even if they think the disk
has 512b sectors, which means that they'll actually be issuing aligned
full block writes on 4K drives if the underlying partitions are properly
aligned. Unfortunately ZFS is not one of these filesystems. Even though
it normally writes a lot of data in 128 KB records, ZFS will routinely
do unaligned writes (even for these 128 KB records), including writes
that start on odd (512b) block numbers. If you do mix a 4K physical
sector drive into your old vdevs in one way or another this means that
you'll be doing a lot of unaligned partial writes.
(The performance penalty of this will depend on your specific setup
and write load.)
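(To see why an odd starting block number hurts, here is a Python sketch that counts how many 4K physical sectors a write only partly covers; each of those presumably forces the drive to do a read-modify-write. The 512 byte logical and 4096 byte physical sizes are the ones discussed in this entry; the function itself is just an illustration.)

```python
LOGICAL = 512    # bytes per logical (addressable) sector
PHYSICAL = 4096  # bytes per physical sector on a 4K drive


def partial_sectors(start_lba: int, length_bytes: int) -> int:
    """Count physical sectors that a write covers only in part."""
    if length_bytes <= 0:
        raise ValueError("length must be positive")
    start = start_lba * LOGICAL
    end = start + length_bytes
    first = start // PHYSICAL
    last = (end - 1) // PHYSICAL
    head_partial = start % PHYSICAL != 0
    tail_partial = end % PHYSICAL != 0
    if first == last:
        # The whole write sits inside one physical sector.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)


record = 128 * 1024
print(partial_sectors(8, record))  # starts on a 4K boundary
print(partial_sectors(1, record))  # starts on an odd 512b block
```

A 128 KB record that starts on a 4K boundary has no partial sectors, while the same record starting at block 1 has a partial sector at both ends.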
I'm not particularly pleased by all of
this. From my perspective the ZFS developers have done a quite good
job of destroying long term storage management under ZFS because as
we turn over our disk stock we're going to be essentially forced to
destroy and recreate terabytes of pools with all of the attendant user
disruption. With more planning and flexibility on the part of ZFS this
could have been a completely user-transparent non-issue. As it is,
forcing us to migrate data due to a drive technology change is the exact
opposite of painless long term storage management.
Disclaimer: this is primarily tested on current versions of Illumos,
specifically OmniOS. It's possible that ZFS on Linux or Solaris 11
behave differently and more sensibly, allowing you to replace 512b
disks with 4K disks and so on. Commentary is welcome.
(All of these bits of information are documented or semi-documented on
various web pages and mailing list threads around the Internet but I
couldn't find them all in one place and I couldn't find anything that
definitively and explicitly documented how 4K and 512b disks interacted
with vdevs with various ashift settings.)
Sidebar: what ZFS should do
Three things immediately and two over the longer range:
- allow 4K disks with a 512b logical sector size to be added to
existing ashift=9 vdevs. Possibly this should require a 'force'
flag and some sort of warning message. Note that this is already
possible if you make the disk lie to ZFS; the only thing this flag
does is remove the need for the lies.
- create all new vdevs with ashift=12 by default, because this is
the future-proof option, and provide a flag to turn this off for
people who really absolutely need to do this for some reason.
- allow people to specify the ashift explicitly during vdev
creation. Ideally there would be a pool default ashift (or
the default ashift for all new vdevs in a pool should be the
largest ashift on an existing vdev).
- change the block allocator so that even on ashift=9 pools as
much as possible is kept aligned on 4 KB boundaries.
- generalize this to create a new settable vdev or pool property
for the preferred alignment. This would be useful well beyond 4K
disks; for example, SSDs often internally have large erase block
sizes and are much happier with you if you write full blocks to
them.
(Some of this work may already be going on in the ZFS world, especially
things that would help SSDs.)
ZFS4KSectorDisks written at 15:10:13
2013-04-11
Something I'd like to be easier in Solaris's IPS
IPS is the 'Image Packaging System', which seems to be essentially the
default packaging system for Illumos distributions. Or at least it's
the packaging system for several of them (most importantly OmniOS) and
for Oracle's Solaris 11, if you care about the latter. IPS is in some
ways very clever and nifty
but as a sysadmin there are some bits I wish it did differently, or at
least easier. Particularly I wish that it made it easier to download and
archive complete packages.
You may be wondering how a package system can possibly make that hard.
I'm glad you asked. You see, IPS is not a traditional package system; if
you want an extremely crude simplification it's more like git. In this
git-like approach, the files for all packages are stored together in a
hash-based content store and 'packages' are mostly just indexes of what
hash identifier goes where with what permissions et al. This has various
nominal advantages but also has the drawback that there is no simple
package blob to download, the way there is in other packaging formats.
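(If the git comparison is too abstract, here is a toy Python sketch of the general idea of a hash-based content store with packages as indexes into it. Everything here, including the use of SHA-256 and the dictionary layout, is an assumption made for illustration; it is not IPS's actual on-disk format.)

```python
import hashlib


class ContentStore:
    """A toy content store: file data is keyed by its hash."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data  # identical content is stored once
        return digest


store = ContentStore()
script = b"#!/bin/sh\necho hello\n"

# A 'package' is just an index: path -> content hash plus permissions.
pkg_a = {"usr/bin/hello": {"hash": store.put(script), "mode": "0755"}}
pkg_b = {"usr/bin/greet": {"hash": store.put(script), "mode": "0755"}}

print(len(store.blobs))
print(pkg_a["usr/bin/hello"]["hash"] == pkg_b["usr/bin/greet"]["hash"])
```

Two packages that ship identical file contents share one stored blob, which is nice for the repository and is also why there is no single per-package file sitting around to copy.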
There are two related ways to get copies of IPS packages for yourself,
both using the low-level pkgrecv command (instead of the higher-level
pkg command). The most obvious way is to have pkgrecv just write
things out into a pkg(5) file ('pkgrecv -a -d ...'). The drawback
of this is that it really does write out everything it downloaded to
a single file. This is fine if you're just downloading one package but
it's not so great if you're using the -r switch to have pkgrecv
download a package and its dependencies. The more complex way is to
actually create your own local repo (which is a directory tree) with
'pkgrepo create /your/dir', then use pkgrecv (without -a) to
download packages into that repo. This gives you everything you want at
the cost of, well, having that repo instead of simple package files that
you can easily copy around separately and so on.
(Both pkgrecv variants also have the drawback that you have to give
them an explicit repository URL. Among other things this makes it hard
to deal with cross-repository dependencies, for example if a package
from an additional repository needs some new packages from the core
distribution repo.)
What I'd like is a high-level pkg command (or a command option) that
handled all of this complexity for me and wrote out separate pkg(5)
files for each separate package.
(In theory I could do this with a shell script if various pkg
subcommands had stable and sufficiently machine-parseable output.
I haven't looked into pkg enough to know if it does; right now
I'm at the point where I'm just poking around OmniOS.)
Sidebar: why sysadmins care about getting copies of packages
The simple answer is because sometimes we want to be able to (re)build
exact copies of some system, not 'the system but with some or all of
the packages updated to current versions'. We also don't want to have
to depend on a remote package source staying in operation or keeping
those packages around for us, because we've seen package sources go away
(or decide that they need to clean up before their disk space usage
explodes).
IPSPackageDownload written at 01:05:07
2013-04-08
Why ZFS still needs an equivalent of fsck
One of the things that is more or less a FAQ in ZFS circles is why ZFS
doesn't need an equivalent of fsck and why people asking for it are
wrong. Unfortunately, the ZFS people making that argument are, in the
end, wrong because they have not fully understood the purpose of fsck.
Fsck has two meta-purposes (as opposed to its direct purposes). The obvious one is checking and repairing
filesystem consistency when the filesystem gets itself into an
inconsistent state due to sudden power failure or the like; this is the
traditional Unix use of fsck. As lots of people will tell you, ZFS
doesn't need an external tool to do this because it is all built in.
ZFS even does traditional fsck one better in that it can safely do the
equivalent of periodic precautionary fscks in normal operation, by
scrubbing the pool.
(Our ZFS pools are scrubbed regularly and thus are far more solidly
intact than traditional filesystems are.)
The less obvious meta-purpose of fsck is putting as much of
your filesystem as possible back together when things explode
badly. ZFS manifestly needs something to do this job because
there are any number of situations today where ZFS will simply throw
up its hands and say 'it sucks to be you, I'm done here'. This is
not really solvable in ZFS either, because you really can't put
these sorts of serious recovery mechanisms into the normal kernel
filesystem layer; in many cases they would involve going to extreme
lengths and violating the guarantees normally provided by ZFS (cf). This means external user-level
tools.
(zdb does not qualify here because it is too low-level a tool. The
goal of fsck-level tools for disaster recovery is to give you a
relatively hands-off experience and zdb is anything but hands-off.)
PS: despite this logic I don't expect ZFS to ever get such a tool.
Writing it would be a lot of work, probably would not be popular with
ZFS people, and telling people 'restore from your backups' is much
simpler and more popular. And if they don't have (current) backups,
well, that's not ZFS's problem is it.
(As usual that is the wrong answer.)
ZFSWhyFsck written at 01:30:08
2013-03-29
Illumos-based distributions are currently not fully mature
As a sysadmin I'm used to my Unixes having certain amenities and
conveniences. I've come to accept that any non-hobbyist Unix
distribution that wants to be taken seriously (especially a free one)
will have things like an announcements or security updates mailing list,
a bug tracker, at least a somewhat visible security contact point,
and documentation about all of this (along with things like how often
security updates are made for any particular release and indeed the
release policy). Some form of signed or verified packages is state
of the art, along with the key infrastructure to support them.
While some of the various Illumos distributions are clearly
hobbyist projects that you can't expect this from, some are equally
clearly aspiring to be larger than that (swank websites are one sign
of this). But, well, they don't seem to have pretty much any of these
amenities that I'm used to. Does this matter or am I being too picky? I
think that it does.
(A certain number of the pretty websites started looking a bit bare
once I started following links.)
The surface reason is that these things are important for running
production systems; for example, I'd really like to know about security
fixes as soon as they're available for the obvious reason (we might
not apply them, but at least
we can assess the severity). The deeper reason is what the omission of
these things says to me about the distribution's current audience. To
put it one way, none of these things are needed by insiders who are
deeply involved in the distribution already; they know the security
update practices, they follow the main mailing lists, and so on. All
of the documentation and so on is for new people, for outsiders like
me, and the less it exists the more it feels like the distribution is
not yet mature enough to be sensibly used by outsiders like me.
(There are some bits of this infrastructure that you may want to think
about carefully beforehand, like bug trackers. But announce mailing
lists are trivial.)
I'm sure that all of this will change in time, at least for the Illumos
distributions that want to be used by outsiders like me. But right now
I can't help but feel that Illumos distributions are not yet fully
mature and up to the level of FreeBSD and modern Linux distributions
(regardless of what the quality of the underlying OS is).
IllumosImmature written at 01:27:19
2013-03-26
Reconsidering a ZFS root filesystem
A Twitter conversation from today:
@thatcks: Let's see if
fsck can heal this Solaris machine or if I get to reinstall it from
scratch. (Thanks to ILOMs I can do this from my desk.)
@bdha:
fsck? Solaris? Sadface.
@thatcks: I have horror
stories of corrupted zpool.cache files too. I don't know if you can
boot a ZFS-root machine in that situation.
@bdha: I've been there.
zpool.cache backups saved my ass.
Right now all of our Solaris fileservers have
(mirrored) UFS root filesystems instead of ZFS root filesystems and
in the past I've expressed some desire to see that continue in any
future ZFS fileservers we built. I've written about why before; the short version is that I've seen situations where
/etc/zfs/zpool.cache had to be deleted and recreated, and I'm not sure
this is even possible if your root filesystem is a ZFS filesystem.
Using UFS for root filesystems avoids this chicken and egg problem.
(Of course the whole situation around zpool.cache and ZFS pool
activation is a little bit mysterious, at least
in Solaris.)
Well, actually, that's not the only reason. The other reason is that
I still think of ZFS as fragile, as something
that will go from 'fine' to 'panics your system' under remarkably little
provocation. UFS is much more old-fashioned and will soldier on even
under relatively extreme circumstances (whether that's a wise idea is
another question). Under most circumstances I would rather have our
fileservers limping along than dead (even if the entire root filesystem
becomes inaccessible, as an extreme example).
But all of this is basically supposition (and thus superstition). UFS
certainly has its own problems (one of which I ran into today on our
test server) and I've never actually tried out a modern Illumos-based
system with ZFS root, both in normal operations and if I deliberately
start breaking stuff (and I certainly hope that some of the problems
I heard about years ago have been dealt with). It may well turn out
that ZFS root based systems are easier to deal with and recover than
I expect. They certainly have their own benefits (periodic scrubs are
reassuring, for example).
(And to be honest, I think it's quite possible that by the time we get
to it, a ZFS root will be the only thing that Illumos really supports
well. It's clear that
ZFS root is where all of the enthusiasm is and where most people think
we should be going.)
PS: root filesystem snapshot and snapshot rollback are not particularly
an advantage in our particular environment, since we basically don't
patch our fileservers. Of course periodic snapshots might save us in the
face of a corrupt zpool.cache in the live filesystem.
ZFSRootReconsidered written at 22:57:18
2013-02-26
Thinking about how much Solaris 11 is worth to us
As a result of some feedback I've gotten on earlier entries I've wound up thinking about what I'll
summarize as how much Solaris 11 is worth to us, ie what we might pay
for it. To start with, is it worth anything at all?
My answer is 'yes, under the right circumstances' (one of those
circumstances being that we get source code). Despite what I've
said in the past about Illumos and FreeBSD, Solaris 11 is still in many ways the least risky
option for us. It's not perfect but to put it one way it's the devil we
know. I still have uncertainties about Oracle's actual commitment to it
but then I have the same issues with Illumos.
So, how much would we pay for Solaris 11? Unfortunately I think the
answer to that is 'not very much'. It's not zero (we've paid for Solaris
before) but our actual budget is not very big and the direct benefits
to using Solaris 11 are only moderate. My guess is that $100 a server a
year would be acceptable (call it $1000 a year total), $200/server/year
would be at best marginal, and more than that is really unlikely. It'd
be very hard to argue that using Solaris 11 over a carefully validated
FreeBSD configuration would be worth $2k/year.
(To put it one way, the larger the amount of money involved the more it
looks like we (the sysadmins) are trying to just spend money instead of
taking the time to do our job to carefully build a working environment.
It would be one thing if the alternatives were clearly incapable and
Solaris 11 was the only choice, but they're not and it isn't. Given
the university attitude on staff time,
we can't even argue that the savings in staff time are worth the
expense.)
PS: the question of whether Oracle would give us either Solaris 11
source code or prices anywhere near this low is an entirely different
matter. My personal expectation is that either issue would be met with
the polite version of hysterical laughter, given that comparatively
speaking we're an insignificant flyspeck.
Solaris11Worth written at 21:17:44
2013-02-18
The strikes against Solaris 11 for us
A commenter on my entry thinking about FreeBSD for future ZFS-based
fileservers left a comment that contains any
number of things that I want to react to.
You shouldn't necessarily draw your conclusions in regards to Illumos
or Solaris 11 based on your experiences with Solaris 10. S10 is very
old and much behind S11 and Illumos, especially when it comes to ZFS.
I've written about our view on ZFS features before, although focused mostly on later versions of
Solaris 10. The quick version is that I still can't see any new ZFS
features that are especially enticing to us. It is vaguely possible that
Solaris 11 ZFS contains bug fixes for issues that we might encounter in
the future, but we certainly aren't encountering any serious issues today so
this is not very compelling. Not when set against the other costs.
Since this is long, I am going to give you my summary view of the other
costs up front. They are no Solaris source code, that we have to trust
Oracle to keep a licensing model I don't think they're very enthused
about, and that it costs anywhere between $7k/year and $20k/year and up
(depending on just what fileservers we wind up with).
(For that matter, we have to trust Oracle to keep going with Solaris at
all. I am far from convinced about this; Oracle is relatively ruthless
about things that do not make them good money, good Solaris development
is expensive, and I do not see how Solaris makes Oracle much money
especially over the long term.)
The very first cost is that moving to Solaris 11 means no
more source code access because Solaris 11 is closed source. This is a very big issue for us. No source
code makes DTrace almost useless to us (cf) and
DTrace was very important for solving a recent major performance
issue. Our ZFS spares system also relies crucially on
being able to extract and interpret non-public ZFS information (because
we have no real choice; the information we need is not available through
public interfaces).
Now, if you really care and value your data I would suggest to look at
Solaris 11. You can run it on 3rd party x86 hardware with full support
for relatively little money - $1k per CPU socket per year.
I have many reactions to this. One of them is that I completely reject
the idea that Solaris 11 is the only right choice if we 'really care and
value [our] data'. Paying money for something does not make it either
good or better than the alternatives; if anything, my experience has been
the exact opposite.
In addition there is a major issue here, which is that this approach
requires extending a significant amount of trust to Oracle. What happens
if in two years Oracle decides that this licensing scheme was a bad
idea and withdraws it, effective immediately (or just significantly
increases prices)? Don't say it can't or won't happen; Oracle has made
similar abrupt changes in Solaris licensing before. My personal view is
that this is especially likely to happen because a $1k per CPU socket
price is not something that I think of as attractive. I don't think
that Oracle actually wants people to use this program, which makes it
especially likely to change or disappear and thus dangerous.
For file servers a modern 2-socket server is going to be an overkill,
so Solaris would cost you $2k per year per server - this is not really
that much.
One way to put my reaction to this statement is that it shows the vast
gulf between (some sorts of) commercial businesses and an academic
environment. In an academic environment such as mine, $2k/server/year is
a very big sum of money; it is more than it would cost to replace the
server outright every year. We do not have (at current server usage)
even $7k/year to pay for Solaris 11 licenses, much less something like
$20k/year (for ten dual-socket fileservers, if our environment expands).
The only way we could even start trying to justify and get $1k/year per
server for software is if the software did something truly amazing and
essential. Solaris 11 does not qualify. If our only options were to pay
$1k/year per server or abandon ZFS entirely, I'm pretty sure that we
would be abandoning ZFS. Certainly in an argument between FreeBSD (free,
we get source code, etc) and Solaris 11 ($7k+/year, no source code, etc),
I do not think I could possibly successfully defend Solaris 11.
It's worth noting one subtle effect of a per-year, per-fileserver
licensing cost: it makes expanding our environment much more expensive.
In a non Solaris 11 world we could add more fileservers for just the
hardware costs and those are relatively low (especially if we (re)use
servers that we already have). If we licensed Solaris 11 any new
fileserver would be a $1k-$2k/year cost over and above the raw hardware
cost. This would probably mean no new fileservers.
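(The arithmetic behind these numbers is simple enough to put in a short Python sketch. The $1k per CPU socket per year figure is the one quoted in the comment; seven single-socket fileservers is my assumption for what produces the $7k/year figure, and the function name is made up.)

```python
PER_SOCKET_PER_YEAR = 1000  # dollars, as quoted in the comment


def yearly_license_cost(servers: int, sockets_per_server: int) -> int:
    """Return the yearly Solaris 11 licensing cost in dollars."""
    return servers * sockets_per_server * PER_SOCKET_PER_YEAR


# Assumed current usage: seven single-socket fileservers.
print(yearly_license_cost(7, 1))
# An expanded environment: ten dual-socket fileservers.
print(yearly_license_cost(10, 2))
# The extra yearly cost of adding one more dual-socket fileserver.
print(yearly_license_cost(11, 2) - yearly_license_cost(10, 2))
```

The last figure is the one that matters for expansion: every additional fileserver carries a recurring charge that the hardware alone would not.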
This would give you peace of mind for the next 5 years you mentioned,
commercial support, access to security updates and bug fixes, etc.
I will condense my reactions to this to just saying that our experiences
with Sun on all of these measures have not been all that good and I have
no reason to assume that Oracle will provide any better experiences than
Sun did. In practice I assume that commercial vendors will provide us no
support and perhaps some security updates, regardless of what they are
nominally paid to do.
I have additional reactions to other bits of the comment but they are
not really about Solaris 11 as such, so I think I will stop this entry
here.
StrikesAgainstSolaris11 written at 01:04:07
2013-01-30
Thinking about FreeBSD versus Illumos for our ZFS fileservers
One of our options as a replacement for Solaris 10
is to switch from Solaris to FreeBSD (now that I no longer believe you
need Solaris for ZFS). There are both advantages and
uncertainties to such a move and today I want to ramble on a bit about
how I see both sides.
On the one hand, we have no particular attraction to Solaris as Solaris.
On the other hand, Solaris is the devil we know and this particular
devil works pretty well for us (plus I've come around to the idea that
DTrace is kind of useful). Using some version
of Illumos preserves what we have now while allowing us to use new
hardware; we'd also get various improvements in things like package
management.
There are two or three drawbacks of Illumos: picking the right
distribution, the long term development path, and potentially hardware
support. The right distribution is not just a matter of what is good
today; given that we'll likely be running these machines for five years
or more, we care about the long term viability of the distribution and
ideally the continued availability of security updates for old versions.
The question of long term development in general is, well, is Illumos
going to survive five years or more in useful form (ie, something that
will run on a generic server as a generic server OS) or is it going to
wither away into at best a narrowly specialized thing? It takes a bunch
of work to keep developing a general server OS and there are already lots
of other things to drain away potential contributors.
I had to talk about Illumos's drawbacks because FreeBSD is the flipside
of them. With FreeBSD we get what is for us a new and untested platform
but it has a lot of momentum and history behind it, the policies
it operates under are clear, and it seems clearly supported well
into the future (since FreeBSD is basically the non-Linux Unix). FreeBSD also has the advantage of dropping a
lot of things about Solaris that I don't like in favour of a bunch of
well-proven technology and plain modern stuff that I am much happier
with.
In short, FreeBSD gives us a more attractive overall system at the cost
of some uncertainty over the pieces that we really care about. It also
seems likely that FreeBSD DTrace is less mature than Illumos DTrace and
as mentioned I sort of care about DTrace these
days. Of course this uncertainty can be somewhat mitigated with testing
and other people's experiences.
(It's not just an issue of things like normal functionality and
performance, although those matter. We also care about the dark corners,
which you can't test and you sort of have to take on either trust or
painful experience. Illumos lets us give more weight to some of our
painful Solaris experience, since it's likely that Illumos is going to
be very Solaris-like in many ways.)
PS: I consider it a feature that FreeBSD can be installed with non-ZFS
root filesystems. Given some of the ZFS failure modes I've seen I
actively prefer not to need ZFS to boot the machine. I'm not sure how
many Illumos derived distributions still support this (if any); my
impression was that the Solaris world was going full steam ahead to an
all-ZFS future.
ZFSFreeBSDvsIllumos written at 00:20:35
These are my WanderingThoughts
This is part of CSpace, and is written by ChrisSiebenmann.
Twitter: @thatcks