Wandering Thoughts archives

2019-03-24

A bit more on ZFS's per-pool performance statistics

In my entry on ZFS's per-pool stats, I said:

In terms of Linux disk IO stats, the *time stats are the equivalent of the use stat, and the *lentime stats are the equivalent of the aveq field. There is no equivalent of the Linux ruse or wuse fields, ie no field that gives you the total time taken by all completed 'wait' or 'run' IO. I think that there's ways to calculate much of the same information you can get for Linux disk IO from what ZFS (k)stats give you, but that's another entry.

The discussion of the *lentime stats in the manpage and the relevant header file is very complicated and abstruse. I am sure it makes sense to people for whom the phrase 'a Riemann sum' is perfectly natural, but I am not such a person.

Having ground through a certain amount of arguing with myself and experimentation, I now believe that the ZFS *lentime stats are functionally equivalent to the Linux ruse and wuse fields. They are not quite identical, but you can use them to make the same sorts of calculations that you can for Linux. In particular, I believe that an almost completely accurate value for the average service time for ZFS pool IOs is:

avgtime = (rlentime + wlentime) / (reads + writes)
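In practice you usually want this over a time interval instead of since boot, which means taking two snapshots of the stats and working from the differences. Here is a minimal sketch of that, assuming each snapshot has already been parsed into a dict keyed by the field names; the function name and the dict layout are my own invention, and I'm assuming the *lentime values are in nanoseconds. IOs that are outstanding at either end of the interval only contribute part of their time, so the result is an approximation.

```python
def avg_service_time_ns(prev, cur):
    """Approximate average time per completed IO between two snapshots.

    prev and cur are dicts with 'rlentime', 'wlentime', 'reads' and
    'writes' keys (a hypothetical layout; parse your kstats however
    you like). The *lentime values are assumed to be nanoseconds.
    Returns None if no IO completed in the interval.
    """
    d_time = (cur["rlentime"] - prev["rlentime"]) + (cur["wlentime"] - prev["wlentime"])
    d_ios = (cur["reads"] - prev["reads"]) + (cur["writes"] - prev["writes"])
    if d_ios <= 0:
        # No completed IO, so there is nothing to average over.
        return None
    return d_time / d_ios


# Made-up numbers purely for illustration.
prev = {"rlentime": 1_000_000_000, "wlentime": 200_000_000, "reads": 100, "writes": 50}
cur = {"rlentime": 1_600_000_000, "wlentime": 300_000_000, "reads": 160, "writes": 90}
avg_ns = avg_service_time_ns(prev, cur)
print(avg_ns / 1_000_000, "ms")
```

With these made-up numbers, 700,000,000 ns of accumulated time is spread over 100 completed IOs, for an average of 7 ms per IO.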

The important difference between the ZFS *lentime metrics and Linux's ruse and wuse is that Linux's times include only completed IOs, while the ZFS numbers also include the running time for currently outstanding IOs (which are not counted in reads and writes). However, much of the time this is only going to be a small difference and so the 'average service time' you calculate will be almost completely right. This is especially true if you're doing this over a relatively long time span compared to the actual typical service time, and if there's been lots of IO over that time.
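To convince myself of how outstanding IOs get into the numbers, it helps to model the accounting. The sketch below is a simplified model of my understanding, not the real kernel code: at every queue entry or exit, it adds (current queue length × time since the last change) to a running lentime total. The function name and the (start, end) representation of IOs are purely illustrative.

```python
def model_lentime(ios):
    """Simplified model of how a *lentime counter accumulates.

    ios is a list of (start, end) times; end is None for an IO that is
    still outstanding. The counter is only updated on queue entry or
    exit, by adding queue_length * elapsed_time since the last change.
    """
    events = []
    for start, end in ios:
        events.append((start, +1))      # IO enters the queue
        if end is not None:
            events.append((end, -1))    # IO leaves the queue
    events.sort()

    lentime = 0
    qlen = 0
    last = 0
    for when, change in events:
        lentime += qlen * (when - last)
        last = when
        qlen += change
    return lentime


# Two completed IOs (4 and 2 time units long) plus one still outstanding.
completed_only = model_lentime([(0, 4), (1, 3)])
with_inflight = model_lentime([(0, 4), (1, 3), (2, None)])
print(completed_only, with_inflight)
```

With only the two completed IOs, the total is 6, exactly the sum of their individual times. Adding the outstanding IO that started at time 2 raises the total to 8, because it has been sitting in the queue for 2 time units as of the last update, even though it hasn't been counted as a completed IO yet.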

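When this difference does matter, you're going to get an average service time that is higher than it really should be, because the time accumulated by outstanding IOs is in the numerator while those IOs are not yet counted in the denominator. This is not a terribly bad problem; it's at least not hiding issues by appearing too low.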
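To get a feel for how much the outstanding IOs can inflate things, here is a small illustration with invented numbers. The function and its parameters are hypothetical; it just redoes the division with some extra accumulated time from not yet completed IOs added to the total.

```python
def calculated_avg_us(completed, true_avg_us, inflight_us):
    """Average you'd calculate when the time total also includes
    inflight_us microseconds accumulated by IOs that haven't completed.
    """
    total_us = completed * true_avg_us + inflight_us
    return total_us / completed


# Same true average (5000 us) and same outstanding time (30000 us),
# but very different numbers of completed IOs.
busy = calculated_avg_us(1000, 5000, 30000)
quiet = calculated_avg_us(10, 5000, 30000)
print(busy, quiet)
```

With 1000 completed IOs the calculated average is 5030 us, which is under 1% too high. With only 10 completed IOs it comes out as 8000 us, which is badly off, but in the direction of looking worse than reality.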

ZFSPerPoolStatsII written at 00:49:53

