Wandering Thoughts archives


Things that limit the performance of hardware acceleration

Suppose that you have an infinitely fast hardware accelerator, one that can compute something of interest in no time at all. What external issues limit the total performance advantage that you can get by putting this hardware accelerator in a system?

I can think of the following limiters:

  • main memory speed: the latency and bandwidth of system RAM limit how fast you can interact with system memory.

  • the speed limits of any underlying hardware that you're talking to. For example, hardware RAID cannot go faster (over the long term) than the speed of the underlying disks, and anything that talks to a network is limited by the network's latency and bandwidth constraints.

  • the setup and transaction costs for passing commands and data between you and the CPU. For instance, how many PCI reads and writes does it take to tell your hardware accelerator to do something, or to determine its status?

    (When thinking about this, it's important to also consider the speed impacts of any necessary memory barriers.)

  • some sorts of interrupts, and in general any need for CPU involvement and decisions in your actions. Having to wait for CPU involvement is effectively a pipeline stall in your processing, with the lost throughput that you'd expect from that.

    (Interrupts are not necessarily a performance limit by themselves, since they may just be a notification to the CPU that it can pay attention to you. They generally will incur transaction costs, though.)
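
To get a feel for how these limiters add up, here is a rough back-of-the-envelope model in Python. It assumes the accelerator itself takes zero time, so all that is left is moving data through memory, paying for each transaction, and any time spent waiting on the CPU. The function names, parameters, and numbers are all made up for illustration; they are not measurements of any real hardware.

```python
def accelerated_time(bytes_moved, mem_bandwidth, transactions,
                     transaction_cost, cpu_wait=0.0):
    """Seconds per operation when the accelerator's compute time is zero.

    bytes_moved / mem_bandwidth  -> time spent moving data through RAM
    transactions * transaction_cost -> bus reads/writes to drive the device
    cpu_wait -> time stalled waiting for the CPU to make a decision
    """
    transfer = bytes_moved / mem_bandwidth
    overhead = transactions * transaction_cost
    return transfer + overhead + cpu_wait


def speedup(software_time, **limits):
    """Best-case speedup over doing the work in software on the CPU."""
    return software_time / accelerated_time(**limits)


# Hypothetical numbers: a 100 microsecond software operation on 64 KiB,
# 10 GB/s of usable memory bandwidth, 4 bus transactions at 1 us each.
example = speedup(
    100e-6,
    bytes_moved=64 * 1024,
    mem_bandwidth=10e9,
    transactions=4,
    transaction_cost=1e-6,
)
```

With these invented numbers, even an infinitely fast accelerator is only about 9.5 times faster than software, and close to 40% of the remaining time is transaction overhead rather than data movement.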

My impression is that a lot of the increasing sophistication of hardware in general has been driven by reducing the transaction costs of operations, starting with DMA and moving upwards from there. There once was a day when the OS poked a bunch of control registers for each operation; these days, the OS writes all of that information to control blocks in memory, then pokes the hardware once to point it at the control blocks.
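
The difference between the two styles can be sketched with a toy model that does nothing but count register writes. Everything here (`FakeDevice`, the register names, the shape of a control block) is hypothetical and stands in for no real device or driver interface; the only point is how the number of expensive bus transactions scales with the number of operations.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDevice:
    """Toy device that only counts register (MMIO-style) writes."""
    register_writes: int = 0
    regs: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)

    def write_register(self, name, value):
        self.register_writes += 1
        self.regs[name] = value
        if name == "go":
            # Old style: run the one operation described by the registers.
            self.completed.append(
                (self.regs["src"], self.regs["dst"], self.regs["length"])
            )
        elif name == "doorbell":
            # New style: value stands in for a pointer to control blocks
            # that the device then reads from memory on its own.
            self.completed.extend(value)


def submit_via_registers(dev, ops):
    """Poke a set of control registers for every single operation."""
    for src, dst, length in ops:
        dev.write_register("src", src)
        dev.write_register("dst", dst)
        dev.write_register("length", length)
        dev.write_register("go", 1)


def submit_via_control_blocks(dev, ops):
    """Write control blocks to ordinary memory, then poke the device once."""
    control_blocks = [tuple(op) for op in ops]  # plain memory writes
    dev.write_register("doorbell", control_blocks)
```

For eight operations, the register style costs 32 register writes in this model while the control block style costs one, and both leave the device having done the same work.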

tech/HardwareAccelerationPerfLimits written at 00:51:58
