Suppose, not entirely hypothetically, that you want to test how well
some new storage hardware and disks stand up to a lot of write cache
flush operations; you don't
care about high-level filesystem operations, you just want to hammer on
the disks and the disk interconnects.
I will cut to the chase: the simplest and most direct way of doing
this on Linux is to call
fsync() on a (or the) disk block device.
This appears to always generate a SCSI SYNCHRONIZE CACHE command
(or the SATA equivalent), although this will be a no-op in the disk
if there are no cached writes.
(The one exception is that the Linux SCSI layer does not issue cache
synchronization operations unless it thinks that the disk's write cache
is enabled. Since SATA disks go through the SCSI layer these days, this
applies to SATA drives too.)
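As a rough illustration of the approach, here is a minimal Python sketch that opens a device and calls fsync() on it in a loop. The function name `hammer_fsync` and the default count are made up for this example. It assumes that a read-only descriptor is enough for fsync() on Linux, and that you run it as root against a real block device such as /dev/sdX.

```python
import os
import sys
import time


def hammer_fsync(path, count=1000):
    """Call fsync() on `path` `count` times and return the elapsed seconds."""
    # Opened read-only so that nothing here can write to the device;
    # this assumes Linux, where fsync() works on a read-only descriptor.
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.monotonic()
        for _ in range(count):
            os.fsync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    dev = sys.argv[1]
    n = int(sys.argv[2]) if len(sys.argv) > 2 else 1000
    elapsed = hammer_fsync(dev, n)
    print(f"{n} fsync() calls on {dev} took {elapsed:.3f}s")
```

With no cached writes each flush should be cheap for the disk itself, so what this mostly exercises is the command path to the drive. If the kernel thinks the write cache is disabled, no flushes will be sent at all.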
fsync() on files on filesystems can issue write cache flush
operations under at least some circumstances, depending on the exact
filesystem. However I can't follow the ext3 and ext4 code clearly
enough to be completely sure that they always flush the write cache on
fsync() one way or another, although I suspect that they do. In any
case I generally prefer testing low-level disk performance using the
raw block devices.
(It appears that under at least some circumstances, calling fsync()
on things on extN filesystems will not directly flush the disk's write
cache but will instead simply issue writes that are intended to bypass
it. These may then get translated into write flushes on disks that don't
support such a write bypass.)
Sidebar: Where this lives in the current 3.12.0-rc6 code
Since I went digging for this in the kernel source and would hate to have
not written it down if I ever need it again:
- blkdev_fsync() in fs/block_dev.c is what handles fsync() on block
devices; it calls blkdev_issue_flush().
- blkdev_issue_flush() in block/blk-flush.c issues a WRITE_FLUSH BIO
operation. WRITE_FLUSH is a bitmap of BIO flags, including the flag
that marks the request as a flush.
- sd_prep_fn() in drivers/scsi/sd.c catches flush requests
and calls scsi_setup_flush_cmnd(), which sets the SCSI command to
SYNCHRONIZE_CACHE.
- For SATA disks, SYNCHRONIZE_CACHE is translated into an ATA flush
command (such as ATA_CMD_FLUSH_EXT) in ata_get_xlat_func() and
ata_scsi_flush_xlat() in drivers/ata/libata-scsi.c.
Whether or not the disk has write cache enabled comes into this through
the SCSI layer:
- sd_revalidate_disk() in drivers/scsi/sd.c configures the general
block layer flush settings for the particular device based on whether
the device has WCE (write cache enabled) and possibly supports FUA
(which is what lets writes bypass the write cache). This is done by
calling blk_queue_flush().
- blk_queue_flush() in block/blk-settings.c sets the request queue's
flush_flags field to the value passed in.
- generic_make_request_checks() in block/blk-core.c filters out
flush flags and flush requests for queues that did not advertise
them, ie any 'SCSI' drive that didn't advertise at least WCE.
For a given SCSI drive, the state of WCE is reported in the sysfs
cache_type attribute and the state of FUA in, surprise, FUA.
This includes SATA drives (which are handled as SCSI drives, more or
less).
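To check what the kernel currently believes about each drive, something like the following Python sketch can dump those two sysfs attributes. It assumes the attributes live under /sys/class/scsi_disk/<H:C:T:L>/, which is where I'd expect to find them on a normal system. The function name `read_cache_settings` is just for illustration.

```python
import glob
import os


def read_cache_settings(sysfs_root="/sys/class/scsi_disk"):
    """Return {device: {"cache_type": ..., "FUA": ...}} for each SCSI disk."""
    settings = {}
    for devdir in sorted(glob.glob(os.path.join(sysfs_root, "*"))):
        info = {}
        for attr in ("cache_type", "FUA"):
            try:
                with open(os.path.join(devdir, attr)) as f:
                    info[attr] = f.read().strip()
            except OSError:
                # Attribute missing or unreadable on this system.
                info[attr] = None
        settings[os.path.basename(devdir)] = info
    return settings


if __name__ == "__main__":
    for dev, info in read_cache_settings().items():
        print(dev, info)
```

A drive whose cache_type does not indicate an enabled write cache is one where, per the filtering described above, flush requests should never reach the disk.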
For more kernel internal details, see this bit of kernel documentation.
This may be useful to understand the interaction of various bits and
pieces in the source code I've inventoried above.
By the way, git grep turns out to be really handy for this sort
of thing.
PS: I don't know if there's a straightforward way to force FUA writes.
You'd expect that O_SYNC writes would do it but I can't prove it
from my reading of kernel source code so far, although I haven't dug
deeply on this.
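For anyone who wants to experiment, this is a minimal Python sketch of issuing an O_SYNC write. Whether the kernel actually turns such a write into a FUA write is exactly the open question above, so treat this only as a way to generate the writes, not as a confirmed way to force FUA. The helper name `sync_write` is hypothetical.

```python
import os


def sync_write(path, data, offset=0):
    """Write `data` at `offset` through a descriptor opened with O_SYNC."""
    # WARNING: pointing this at a block device overwrites whatever is
    # stored at that offset. Use a scratch disk or an ordinary file.
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    try:
        # pwrite() returns the number of bytes actually written.
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)
```

A call such as `sync_write("/tmp/scratch.img", b"\0" * 4096)` only returns once the kernel considers the data to be on stable storage; how it gets there (a FUA write, or a plain write followed by a cache flush) is not something this code can tell you.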