Wandering Thoughts archives


Some more notes on Linux's ionice and kernel IO priorities

Long ago, Linux gained some support for block IO priorities, with some limitations that I noticed the first time I looked into this. These days the Linux kernel has more ways to schedule and limit IO, for example in cgroups v2 and its IO controller. However, ionice is still there, and since I just looked at it again (for reasons outside the scope of this entry), I want to note some more things.

First, ionice and the IO priorities it sets only apply to read IO and synchronous write IO, per ioprio_set(2) (this is the underlying system call that ionice uses to set priorities). This is understandable, since IO priorities are attached to processes, while asynchronous write IO is generally issued by completely different kernel tasks, in situations where the urgency of doing the write is unrelated to the IO priority of the process that originally did the write. Still, it's a somewhat unfortunate limitation, since write IO is often the slowest thing and the source of the largest impacts on overall performance.
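To make the ioprio_set(2) interface a bit more concrete, here is a minimal Python sketch of how an IO priority value is packed and handed to the system call. The constants follow the kernel's linux/ioprio.h; the syscall numbers are an assumption that only holds on x86-64, and the function names are my own illustrative choices, not anything that ionice itself exposes.

```python
import ctypes
import os

# Constants as defined in the kernel's linux/ioprio.h.
IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE = range(4)
IOPRIO_WHO_PROCESS = 1

# Assumption: x86-64 Linux. Syscall numbers differ on other architectures.
SYS_ioprio_set = 251


def ioprio_value(io_class, level=0):
    """Pack a scheduling class and a 0-7 priority level into one value."""
    if not 0 <= level <= 7:
        raise ValueError("priority level must be between 0 and 7")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def ioprio_split(value):
    """Unpack an ioprio value into (class, data)."""
    return value >> IOPRIO_CLASS_SHIFT, value & ((1 << IOPRIO_CLASS_SHIFT) - 1)


def set_own_ioprio(io_class, level=0):
    """Set the calling thread's IO priority (who == 0 means 'myself')."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.syscall(
        ctypes.c_long(SYS_ioprio_set),
        ctypes.c_int(IOPRIO_WHO_PROCESS),
        ctypes.c_int(0),
        ctypes.c_int(ioprio_value(io_class, level)),
    )
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Calling set_own_ioprio(IOPRIO_CLASS_IDLE) should be roughly what 'ionice -c 3' does for a process before it starts its work. As covered above, this only influences the reads and synchronous writes that are issued on the process's behalf.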

IO priorities are only effective with some Linux kernel IO schedulers, such as BFQ. They aren't effective with the 'none' scheduler, since it does no IO scheduling at all, and 'none' is also the default scheduler for NVMe drives. I'm (still) unable to tell if IO priorities work if you're using software RAID instead of sitting your (supported) filesystem directly on top of a SATA, SAS, or NVMe disk. I believe that IO priorities are unlikely to work with ZFS, partly because ZFS often issues read IOs through its own kernel threads instead of directly from your process, and those kernel threads probably aren't trying to copy around IO priorities.
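Since whether ionice does anything depends on the active scheduler, it can be worth checking what a disk is actually using. The kernel reports this in /sys/block/<device>/queue/scheduler, with the active scheduler in square brackets. The following is a small sketch of reading that file; the function names and the sysfs_root parameter are my own, and the handling of a bare 'none' is an assumption about what devices without a scheduler report.

```python
def parse_scheduler_line(text):
    """Return (active, available) from a queue/scheduler line.

    The line looks like 'none [mq-deadline] kyber bfq', where the
    bracketed name is the scheduler currently in use.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    # Assumption: devices with no scheduler support may report a bare 'none'.
    if active is None and available == ["none"]:
        active = "none"
    return active, available


def read_scheduler(device, sysfs_root="/sys"):
    """Read and parse the scheduler line for a block device such as 'sda'."""
    path = f"{sysfs_root}/block/{device}/queue/scheduler"
    with open(path) as f:
        return parse_scheduler_line(f.read())
```

For example, read_scheduler("nvme0n1") on a typical system would be expected to report 'none' as the active scheduler, in which case ionice'ing things that read from that drive won't accomplish anything.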

Even if they pass through software RAID, IO priorities apply at the level of disk devices (of course). This means that each side of a software RAID mirror will do IO priorities only 'locally', for IO issued to it, and I don't believe there will be any global priorities for read IO to the overall software RAID mirror. I don't know if this will matter in practice. Since IO priorities only apply to disks, they obviously don't apply (on the NFS client) to NFS read IO. Similarly, IO priorities don't apply to data read from the kernel's buffer/page caches, since this data is already in RAM and doesn't need to be read from disk. This can give you an ionice'd program that is still 'reading' lots of data (and because that data keeps being used, it will be less likely to be evicted from kernel caches).
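Because any prioritization happens per member disk, the scheduler that matters for a software RAID array is the one on each underlying disk, not anything on the md device itself. As a rough sketch, the members of an array can be found through /sys/block/<md>/slaves and each one's scheduler looked up from there. The helper names and the sysfs_root parameter are hypothetical, and this assumes members are either whole disks or partitions of ordinary disks.

```python
import os


def member_queue_dir(member_path):
    """Return the queue/ directory that governs a RAID member, or None.

    A whole disk has queue/ in its own sysfs directory; a partition
    uses the queue/ of its parent disk, one directory up.
    """
    real = os.path.realpath(member_path)
    for candidate in (real, os.path.dirname(real)):
        queue = os.path.join(candidate, "queue")
        if os.path.isdir(queue):
            return queue
    return None


def md_member_schedulers(md_name, sysfs_root="/sys"):
    """Map each member of an md array to its raw scheduler line."""
    slaves_dir = os.path.join(sysfs_root, "block", md_name, "slaves")
    result = {}
    for member in sorted(os.listdir(slaves_dir)):
        queue = member_queue_dir(os.path.join(slaves_dir, member))
        if queue is None:
            result[member] = None
            continue
        with open(os.path.join(queue, "scheduler")) as f:
            result[member] = f.read().strip()
    return result
```

If md_member_schedulers("md0") shows different active schedulers on the two sides of a mirror, reads that land on one side may be prioritized while reads that land on the other are not.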

Since we mostly use some combination of software RAID, ZFS, and NFS, I don't think ionice and IO priorities are likely to be of much use for us. If we want to limit the impact a program's IO has on the rest of the system, we need different measures.

IoniceNotesII written at 23:03:23
