ZFS pool import needs much better error messages
One of the frustrating things about dealing with sufficiently damaged
ZFS pools is that 'zpool import' and friends do not generate very
detailed error messages. There are a lot of things that can go wrong
with a ZFS pool that will make it not importable, but 'zpool import'
has clear explanations for only some of them. For many others
all you get is a generic error in 'zpool import' status reporting
of, say:
The pool cannot be imported due to damaged devices or data.
(Here I'm talking about the results of just running 'zpool import'
to see available pools and their states and configuration, not
trying to actually import a pool. Here zpool has lots of room to
write explicit and detailed messages about what seems to be wrong
with your pool's configuration.)
This isn't just an issue of annoying and frustrating people with
opaque, generic error messages. Given that the error messages are
generic, it's quite easy for people to focus only on the obvious
problems that zpool import reports, even if those problems may
not be the reason the pool can't be imported. As it happens I have
a great example of this in action, in this SuperUser question.
When you read this question, can you figure out what's wrong? Both
the SuperUser ZFS community and the ZFS on Linux mailing list
couldn't.
(I believe that everything you need to figure out what's going on
is actually in the information in the question and the code behind
'zpool import' actually knows what the problem is. This assumes
that my diagnosis is correct, of course.)
Perhaps zpool import should not be fully verbose by default, as
there's a certain amount of information that may only make sense
to people who know a fair bit about how ZFS works. But it certainly
should be possible to get this information with, eg, a verbose
switch instead of having to reverse engineer it from zdb output.
If nothing else, this means that you can get a verbose report and
show it to ZFS experts in the hope that they can tell you what's
wrong.
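
(In the absence of such a switch, the closest thing I can think of
is to gather the raw material yourself. What follows is only a
minimal sketch in Python, not anything that zpool provides. It
assumes that 'zpool' and 'zdb' are on your PATH and that you have
the privileges to run them; it relies on 'zpool import' with no
arguments listing importable pools and on 'zdb -l <device>' dumping
the on-disk labels of a device. The function names are my own
invention and the device paths are whatever you supply. The result
still needs someone who can read zdb label output, which is rather
the point of this entry.)

```python
import subprocess
from typing import Callable, Iterable, List

# A runner takes an argv list and returns the command's output as text.
Runner = Callable[[List[str]], str]


def run_command(argv: List[str]) -> str:
    """Run a command and return its combined stdout and stderr."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            check=False,  # a failing 'zpool import' is exactly what we want to capture
        )
    except OSError as exc:
        # For example, zpool or zdb is not installed on this machine.
        return f"(could not run {argv[0]}: {exc})\n"
    return proc.stdout


def report_commands(devices: Iterable[str]) -> List[List[str]]:
    """List the commands to run: the pool scan, then one label dump per device."""
    commands = [["zpool", "import"]]
    commands.extend(["zdb", "-l", device] for device in devices)
    return commands


def build_report(devices: Iterable[str], runner: Runner = run_command) -> str:
    """Run every command and join the labelled outputs into one text report."""
    sections = []
    for argv in report_commands(devices):
        sections.append("$ " + " ".join(argv) + "\n" + runner(argv))
    return "\n".join(sections)
```

Something like print(build_report(["/dev/sda1", "/dev/sdb1"])), with
your own device paths substituted, would give you one block of text
to paste into a mailing list message. The runner parameter exists
only so that the report format can be exercised without ZFS present.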
On a purely pragmatic level I think that zpool import should be
really verbose and detailed when a pool can't be imported. 'My pool
won't import' is one of the most stressful experiences you can have
with ZFS; to get unclear, generic errors at this point is extremely
frustrating and does not help one's mood in the least. This is
exactly the time when large amounts of detail are really, really
appreciated, even if they're telling you exactly how far up the
creek you are.
(This means that I would very much like a 'zpool import -v <pool>'
option that describes exactly what the import is doing or trying
to do and then covers all of the problems that it detected with the
pool configuration, all the things the kernel said to it, and so
on. A report of 'I am asking the kernel to import a pool made up
of the following devices in the following vdev structure' is not
too verbose.)
PS: while this example is from ZFS on Linux and FreeBSD, I've looked at the current Illumos code for zpool and libzfs, and as far as I can see it would have exactly the same problem here.
(Part of the issue is that zpool import and libzfs have what you
could call less than ideal reporting if a pool is marked as active
on some other system and also has configuration problems. But even
if it reported multiple errors I think that the real problem here
would remain obscure; the current 'zpool import
' code appears to
deliberately suppress printing out parts of the information necessary.)