Two ends of hardware acceleration
One way to categorize hardware acceleration is as a continuum between two sorts: doing something instead of having the CPU do it, and doing something that the CPU can't even come close to doing fast enough.
If you have a choice, you obviously want to be on the latter end of the scale. Ideally you'll have a fairly solid proof that the best software implementation on a general CPU can't possibly be fast enough, because the problem demands hardware characteristics that general CPUs don't have (very low-latency access to a lot of memory, for example, as you might get with extensive lookup tables). This gives you a reasonable amount of confidence that Moore's Law as applied to general CPU performance is not about to eat your lunch in a few years.
Life at the other end of the scale is much more difficult, because you run into the hardware RAID problem, namely that you need to find people for whom the problem is important and who are also CPU constrained. (It is a tragic mistake to merely find people with your problem; to put it one way, there are a lot more people with slow disks than people who will pay much money to speed them up.)
On a side note, sometimes doing it instead of the CPU can be a sales pitch in its own right, but you have to be in a special circumstance. The best example of this is hardware cryptographic modules for signing things, where the attraction is that the CPU (and its buggy, vulnerable software) gets nowhere near your signing keys.
What I think about why graphics cards keep being successful
Graphics cards are the single most pervasive and successful sort of hardware accelerator in the computer world; they are a shining exception to the general pattern of hardware acceleration failing. Given my views, I'm interested in figuring out why graphics cards are such an exception.
Here's my current thinking on why graphics cards work, in point form (and in no particular order):
- avid users (ie, gamers) are CPU constrained during operation as well as graphics constrained.
- avid users will pay significant amounts of money for graphics cards,
and will do so on a regular basis.
- there is essentially no maximum useful performance limit; so far,
people and programs can always use more graphics power.
- GPUs have found various ways of going significantly faster than the
CPU, ways that the CPU currently cannot match, including:
- significant parts of the problem they're addressing are naturally (and often embarrassingly) parallel; this makes it relatively simple to speed things up by just throwing more circuitry at the problem.
- they have almost always used high-speed (or highly parallel) memory interfaces, getting around the memory bandwidth limit that constrains CPUs.
- while GPUs have problems with the costs of having the CPU actually
talk to them, they have found a number of ways to amortize that
overhead and work around it.
(For example, these days you rarely do individual graphics operations one by one; instead you batch them up and do them in bulk.)
- GPU vendors are successful enough to spend a lot of money on hardware design.
- GPU vendors iterate products rapidly, often faster than CPU vendors.
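The "embarrassingly parallel" point is worth making concrete. Here's a minimal Python sketch (all names are made up for illustration) of why per-pixel work splits so cleanly: each output depends only on its own input, so you can throw as many workers at it as you like with no coordination.

```python
# Hypothetical sketch: an embarrassingly parallel per-pixel operation.
from concurrent.futures import ThreadPoolExecutor

def shade(pixel):
    # stand-in for an independent per-pixel computation
    return min(255, pixel * 2)

def shade_image(pixels, workers=4):
    # Each pixel is processed independently, so the work divides cleanly
    # across any number of workers; a GPU does the same thing with
    # thousands of hardware threads. (Threads keep this sketch simple;
    # a real CPU version would need processes or native code to get an
    # actual parallel speedup in Python.)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(shade, pixels))
```

The important property is the absence of data dependencies between pixels: more circuitry (or more workers) translates directly into more throughput.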
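The batching point comes down to simple arithmetic. Assuming (with made-up numbers) that every CPU-to-GPU submission costs a fixed overhead on top of the per-item work, batching pays that overhead once per batch instead of once per operation:

```python
# Hypothetical cost model for amortizing per-submission overhead.
OVERHEAD = 100   # fixed cost of one CPU-to-GPU submission (made-up units)
PER_ITEM = 1     # cost of processing one item (made-up units)

def cost_one_by_one(n):
    # n separate submissions: the overhead is paid n times
    return n * (OVERHEAD + PER_ITEM)

def cost_batched(n):
    # one submission covering all n items: the overhead is paid once
    return OVERHEAD + n * PER_ITEM

# For 1000 items: 101000 units one by one versus 1100 units batched.
```

With these numbers, batching 1000 operations is nearly a hundred times cheaper, which is why modern graphics APIs push you so hard toward submitting work in bulk.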
I think that many of these reasons can be inverted to explain why hardware acceleration is a hard problem, but that's another entry.