Some notes on using Solaris kstat(s) in a program
Solaris (and Illumos, OmniOS, etc.) has for a long time had a 'kstat'
system for systematically providing and exporting kernel statistics
to user programs. Like many such systems in many OSes, kstat doesn't
need root permissions; all or almost all of the kstats are public
and can be read by anyone. If you're a normal Solaris sysadmin,
you've mostly interacted with this system via kstat(1) (as I have)
or perhaps Perl, for which there is the Sun::Solaris::Kstat module.
Because I didn't want to write Perl, I opted to do it the hard way;
I wrote a program that talks more or less directly to the C kstat
library. When you do this, you are directly exposed to some kstat
concepts that kstat(1) and the Perl bindings normally hide from you.
The stats that kstat shows you are normally given a four-element
name of the form module:instance:name:statistic. This is actually
kind of a lie. A 'kstat' itself is the module:instance:name triplet,
and is a handle for a bundle of related statistics (for example,
all of the per-link network statistics exposed by a particular
network interface). When you work at the C library level, getting
a statistic is a three-step process; you get a handle to the kstat,
you make sure the kstat has loaded the data for its statistics, and
then you can read out the actual statistic (how you do this depends
on what type of kstat you have).
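The handle, load, read sequence described above can be sketched roughly as follows. This is a minimal sketch under some assumptions, not a definitive implementation: it assumes a Solaris or illumos system (the libkstat part is only compiled when `__sun` is defined, and needs `-lkstat`), it uses `kstat_lookup()` to get the handle (a libkstat call that the text above doesn't name), and `struct stat_name`, `split_stat_name()` and `read_named_stat()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: the parts of "module:instance:name:statistic".
 * The first three fields identify the kstat; the last one picks a single
 * statistic inside it. */
struct stat_name {
    char module[64];
    int  instance;
    char name[64];
    char statistic[64];
};

/* Split a four-element name. Returns 0 on success, -1 if it is malformed. */
int split_stat_name(const char *full, struct stat_name *out)
{
    assert(full != NULL && out != NULL);
    /* %63[^:] reads up to 63 characters that are not a colon. */
    int n = sscanf(full, "%63[^:]:%d:%63[^:]:%63[^:]",
                   out->module, &out->instance, out->name, out->statistic);
    return (n == 4) ? 0 : -1;
}

#if defined(__sun)
#include <kstat.h>

/* Fetch one numeric statistic from a named kstat. Returns 0 and stores
 * the value in *value (signed values are converted to unsigned), or
 * returns -1 on any failure. */
int read_named_stat(const struct stat_name *sn, unsigned long long *value)
{
    int rc = -1;
    kstat_ctl_t *kc = kstat_open();
    if (kc == NULL)
        return -1;

    /* Step 1: get a handle to the kstat (the module:instance:name part). */
    kstat_t *ksp = kstat_lookup(kc, (char *)sn->module, sn->instance,
                                (char *)sn->name);

    /* Step 2: have the library copy the kstat's data from the kernel. */
    if (ksp != NULL && ksp->ks_type == KSTAT_TYPE_NAMED &&
        kstat_read(kc, ksp, NULL) != -1) {
        /* Step 3: pick one statistic out of the loaded data. */
        kstat_named_t *kn = kstat_data_lookup(ksp, (char *)sn->statistic);
        if (kn != NULL) {
            rc = 0;
            switch (kn->data_type) {
            case KSTAT_DATA_INT32:
                *value = (unsigned long long)kn->value.i32;
                break;
            case KSTAT_DATA_UINT32:
                *value = kn->value.ui32;
                break;
            case KSTAT_DATA_INT64:
                *value = (unsigned long long)kn->value.i64;
                break;
            case KSTAT_DATA_UINT64:
                *value = kn->value.ui64;
                break;
            default:            /* character or string data */
                rc = -1;
                break;
            }
        }
    }
    (void) kstat_close(kc);
    return rc;
}
#else
/* Not Solaris or illumos: there is no libkstat to talk to. */
int read_named_stat(const struct stat_name *sn, unsigned long long *value)
{
    (void)sn;
    (void)value;
    return -1;
}
#endif
```

Calling `split_stat_name("unix:0:system_misc:ncpus", &sn)` and then `read_named_stat(&sn, &v)` should, on a Solaris-family machine, leave the CPU count in `v`; this is the same statistic that `kstat -p unix:0:system_misc:ncpus` reports.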
This arrangement makes sense to a system programmer, because if we
peek behind the scenes we can see a two-stage interaction with the
kernel. When you call kstat_open() to start talking with the
library, the library loads the index of all of the available kstats
from the kernel into your process but it doesn't actually retrieve
any data for them from the kernel. The relatively expensive
operation of copying a kstat's data from the kernel to user space
only happens when you explicitly ask for it. Since there are a huge
number of kstats, this cost saving is quite important.
(For example, a random OmniOS machine here has 3,627 kstats right now. How many you have will vary depending on how many network interfaces, disks, CPUs, ZFS pools, and so on there are in your system.)
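One way to get a count like this is to walk the index that `kstat_open()` loads, without reading any statistics data at all. The following is a minimal sketch, assuming a Solaris or illumos system (the real code is only compiled when `__sun` is defined, and needs `-lkstat`); `count_kstats()` is a hypothetical name, and whether this matches how the number above was obtained is an assumption.

```c
#include <stdio.h>

#if defined(__sun)
#include <kstat.h>

/* Count the kstats in the index that kstat_open() loaded. This only
 * walks the in-process chain of kstat_t headers; kstat_read() is never
 * called, so no statistics data is copied out of the kernel.
 * Returns -1 if the library cannot be opened. */
long count_kstats(void)
{
    kstat_ctl_t *kc = kstat_open();
    if (kc == NULL)
        return -1;

    long n = 0;
    for (kstat_t *ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next)
        n++;

    (void) kstat_close(kc);
    return n;
}
#else
/* Not Solaris or illumos: there is no libkstat to talk to. */
long count_kstats(void)
{
    return -1;
}
#endif
```

Each `kstat_t` in the chain already carries its `ks_module`, `ks_instance`, `ks_name` and `ks_type`, so you can filter on those before deciding which kstats are worth a `kstat_read()`.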
A kstat can have its statistics data in several different forms.
The most common form is a 'named' kstat, where the statistics are
in a list of name=value structs. If you're dealing with this sort
of kstat, you can look up a specific named statistic in the data
with kstat_data_lookup() or just go through the whole list
manually (it's fully introspectable). The next most common form is
I/O statistics, where the data is simply a C struct (a kstat_io_t).
There are also a few kstats that return other structs as 'raw
data', but to find and understand them you get to read the kstat(1)
source code. Only named-data kstats really have statistics with
names; everyone else really just has struct fields.
(kstat(1) and the Perl module hide this complexity from you by
basically pretending that everything is a named-data kstat.)
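To make the difference between the data forms concrete, here is a minimal sketch that loads one kstat and prints it according to its `ks_type`. It assumes a Solaris or illumos system (the real code is only compiled when `__sun` is defined, and needs `-lkstat`); `print_named()` and `dump_kstat()` are hypothetical names, only four of the `kstat_io_t` fields are shown, and raw kstats are deliberately left undecoded.

```c
#include <stdio.h>

#if defined(__sun)
#include <kstat.h>

/* Print one named-kstat entry according to its declared data type. */
static void print_named(const kstat_named_t *kn)
{
    switch (kn->data_type) {
    case KSTAT_DATA_CHAR:       /* fixed 16-byte field, maybe unterminated */
        printf("%s=%.16s\n", kn->name, kn->value.c);
        break;
    case KSTAT_DATA_INT32:
        printf("%s=%d\n", kn->name, (int)kn->value.i32);
        break;
    case KSTAT_DATA_UINT32:
        printf("%s=%u\n", kn->name, (unsigned)kn->value.ui32);
        break;
    case KSTAT_DATA_INT64:
        printf("%s=%lld\n", kn->name, (long long)kn->value.i64);
        break;
    case KSTAT_DATA_UINT64:
        printf("%s=%llu\n", kn->name, (unsigned long long)kn->value.ui64);
        break;
    case KSTAT_DATA_STRING: {
        const char *s = KSTAT_NAMED_STR_PTR(kn);
        printf("%s=%s\n", kn->name, s != NULL ? s : "");
        break;
    }
    default:
        printf("%s=?\n", kn->name);
        break;
    }
}

/* Look up module:instance:name, load its data, and print it.
 * Returns the number of statistics printed, or -1 on failure or for
 * kstat types this sketch does not decode (raw, interrupt, timer). */
int dump_kstat(const char *module, int instance, const char *name)
{
    int printed = -1;
    kstat_ctl_t *kc = kstat_open();
    if (kc == NULL)
        return -1;

    kstat_t *ksp = kstat_lookup(kc, (char *)module, instance, (char *)name);
    if (ksp != NULL && kstat_read(kc, ksp, NULL) != -1) {
        if (ksp->ks_type == KSTAT_TYPE_NAMED) {
            /* Named data: an array of ks_ndata name=value structs. */
            const kstat_named_t *kn = ksp->ks_data;
            for (uint_t i = 0; i < ksp->ks_ndata; i++)
                print_named(&kn[i]);
            printed = (int)ksp->ks_ndata;
        } else if (ksp->ks_type == KSTAT_TYPE_IO) {
            /* I/O data: one plain struct; the "names" are its fields. */
            const kstat_io_t *io = ksp->ks_data;
            printf("nread=%llu nwritten=%llu reads=%u writes=%u\n",
                   (unsigned long long)io->nread,
                   (unsigned long long)io->nwritten,
                   (unsigned)io->reads, (unsigned)io->writes);
            printed = 4;
        }
    }
    (void) kstat_close(kc);
    return printed;
}
#else
/* Not Solaris or illumos: there is no libkstat to talk to. */
int dump_kstat(const char *module, int instance, const char *name)
{
    (void)module;
    (void)instance;
    (void)name;
    return -1;
}
#endif
```

For instance, `dump_kstat("unix", 0, "system_misc")` would take the named branch, while a disk kstat would normally take the I/O branch; which disk kstats exist depends on the machine.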
Once read from the kernel, kstat statistics data does not automatically
update. Instead it's up to you to update it whenever you want to,
by calling kstat_read() on the relevant kstat again. What happens
to the kstat's old data is indeterminate, but I think that you
should assume it's been freed and is no longer something you should
try to look at.
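In practice this suggests copying values out of the kstat's data before you re-read it, and looking statistics up afresh after every `kstat_read()`. Here is a minimal sketch of sampling one counter twice under that cautious assumption. It assumes a Solaris or illumos system (the real code is only compiled when `__sun` is defined, and needs `-lkstat`), it only handles 64-bit unsigned counters, and `read_u64()` and `sample_delta()` are hypothetical names.

```c
#include <stdio.h>

#if defined(__sun)
#include <kstat.h>
#include <unistd.h>

/* Load fresh data for ksp and copy one 64-bit counter out by value.
 * Returns 0 on success, -1 on failure. */
static int read_u64(kstat_ctl_t *kc, kstat_t *ksp, const char *stat,
                    unsigned long long *out)
{
    if (kstat_read(kc, ksp, NULL) == -1)
        return -1;
    /* Look the statistic up again after every kstat_read(); a pointer
     * obtained before the read may refer to memory that is gone. */
    kstat_named_t *kn = kstat_data_lookup(ksp, (char *)stat);
    if (kn == NULL || kn->data_type != KSTAT_DATA_UINT64)
        return -1;
    *out = kn->value.ui64;
    return 0;
}

/* Sample one counter from module:instance:name twice, secs seconds
 * apart, and store the increase in *delta. Returns 0 or -1. */
int sample_delta(const char *module, int instance, const char *name,
                 const char *stat, unsigned secs, unsigned long long *delta)
{
    int rc = -1;
    unsigned long long before, after;
    kstat_ctl_t *kc = kstat_open();
    if (kc == NULL)
        return -1;

    kstat_t *ksp = kstat_lookup(kc, (char *)module, instance, (char *)name);
    if (ksp != NULL && ksp->ks_type == KSTAT_TYPE_NAMED &&
        read_u64(kc, ksp, stat, &before) == 0) {
        (void) sleep(secs);
        /* Nothing updates by itself: ask for the data again. */
        if (read_u64(kc, ksp, stat, &after) == 0) {
            *delta = after - before;
            rc = 0;
        }
    }
    (void) kstat_close(kc);
    return rc;
}
#else
/* Not Solaris or illumos: there is no libkstat to talk to. */
int sample_delta(const char *module, int instance, const char *name,
                 const char *stat, unsigned secs, unsigned long long *delta)
{
    (void)module;
    (void)instance;
    (void)name;
    (void)stat;
    (void)secs;
    (void)delta;
    return -1;
}
#endif
```

A call such as `sample_delta("link", 0, "net0", "rbytes64", 10, &d)` would give the bytes received over ten seconds, where `net0` is a made-up interface name that you would replace with one of your own.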
This brings us to the issue of how the kstat library manages its
memory and what bits of memory may change out from underneath you
when, which is especially relevant if you're doing something
complicated while talking to it (as I am). I believe the answer
is that kstat_read() changes the statistics data for a kstat and
may reallocate it, kstat_chain_update() may cause random kstat
structs and their data to be freed out from underneath you, and
kstat_close() obviously frees and destroys everything.
(The Perl Kstat module has relatively complicated handling of its shadow references to kstat structs after it updates the kstats chain. My overall reaction is 'there are dragons here' and I would not try to hold references to any old kstats after a kstat chain update. Restarting all your lookups from scratch is perhaps a bit less efficient, but it's sure to be safe.)
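The 'restart your lookups from scratch' approach might look something like the following minimal sketch. It assumes a Solaris or illumos system (the real code is only compiled when `__sun` is defined, and needs `-lkstat`), it relies on `kstat_chain_update()` returning 0 when nothing changed, -1 on error, and a new chain ID otherwise, it only handles 64-bit unsigned counters, and `poll_stat()` is a hypothetical name.

```c
#include <stdio.h>

#if defined(__sun)
#include <kstat.h>
#include <unistd.h>

/* Print module:instance:name:stat once per round, for the given number
 * of rounds, dropping the cached kstat handle whenever the chain
 * changes. Returns the number of successful reads, or -1 if the
 * library could not be opened. */
int poll_stat(const char *module, int instance, const char *name,
              const char *stat, int rounds, unsigned interval_secs)
{
    int good = 0;
    kstat_t *ksp = NULL;
    kstat_ctl_t *kc = kstat_open();
    if (kc == NULL)
        return -1;

    for (int i = 0; i < rounds; i++) {
        if (i > 0)
            (void) sleep(interval_secs);

        kid_t id = kstat_chain_update(kc);
        if (id == -1)
            break;                      /* the update itself failed */
        if (id != 0 || ksp == NULL) {
            /* The chain changed (or this is the first round): treat
             * every old kstat_t pointer as dead and look it up again. */
            ksp = kstat_lookup(kc, (char *)module, instance, (char *)name);
        }
        if (ksp == NULL || ksp->ks_type != KSTAT_TYPE_NAMED)
            continue;                   /* not there (yet), try later */

        if (kstat_read(kc, ksp, NULL) == -1) {
            ksp = NULL;                 /* force a fresh lookup next round */
            continue;
        }
        kstat_named_t *kn = kstat_data_lookup(ksp, (char *)stat);
        if (kn != NULL && kn->data_type == KSTAT_DATA_UINT64) {
            printf("%s:%d:%s:%s %llu\n", module, instance, name, stat,
                   (unsigned long long)kn->value.ui64);
            good++;
        }
    }
    (void) kstat_close(kc);
    return good;
}
#else
/* Not Solaris or illumos: there is no libkstat to talk to. */
int poll_stat(const char *module, int instance, const char *name,
              const char *stat, int rounds, unsigned interval_secs)
{
    (void)module;
    (void)instance;
    (void)name;
    (void)stat;
    (void)rounds;
    (void)interval_secs;
    return -1;
}
#endif
```

The sketch keeps only one cached pointer, so throwing it away is cheap; a program that tracks many kstats would redo all of its lookups at the point where `kstat_chain_update()` reports a change.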
In general, once I slowly wrapped my mind around what the kstat library was doing I found it reasonably pleasant to use. As with the nvpair library, the hard part was understanding the fundamental ideas in operation. Part of this was (and is) a terminology issue; 'kstat' and 'kstats' are in common use as labels for what I'm calling a kstat statistic here, which makes it easy to get confused about what is what.
(I personally think that this is an unfortunate name choice, since 'kstat' is an extremely attractive name for the actual statistics. Life would be easier if kstats were called 'kstat bundles' or something.)