The importance of numerical literacy
Via Slashdot comes a report of a new biometric airport security system for detecting 'hostile intent' (their term). According to the Wall Street Journal article:
In the latest Israeli trial, the system caught 85% of the role-acting terrorists, meaning that 15% got through, and incorrectly identified 8% of innocent travelers as potential threats, according to corporate marketing materials.
The company's goal is to prove it can catch at least 90% of potential saboteurs -- a 10% false-negative rate -- while inconveniencing just 4% of innocent travelers.
Sounds good, doesn't it?
Actually, it's useless in practice. Ask yourself this question: if someone is flagged by the machine, what are the odds that they're a real terrorist? The answer turns out to be 'really, really low'; unless a lot of terrorists are flying, the handful of real terrorists caught is utterly dwarfed by the flood of innocent people flagged, even at the 4% false positive rate that is the company's goal.
In summary: any time what you're testing for is rare, anything more than a microscopic false positive rate is going to swamp you with noise. (This effect is also commonly seen in medical tests.)
This makes a nice illustration of the importance of numerical literacy. At first blush this system sounds effective; missing only 10% of the terrorists and 'inconveniencing just 4% of innocent travelers' sounds good. You have to think about the actual numbers involved in practice to see the problem, and the people putting the marketing materials together are probably hoping that you won't.
Sidebar: some actual numbers
Let's assume that 10 terrorists are flying every day in the US, and that at least a million people fly every day, again in the US. The system will flag roughly 40,000 people; of those, nine are terrorists. (And this is making a generous assumption on how many bad people are flying; the actual number is likely to be much, much lower.)
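That arithmetic can be sketched in a few lines of Python. The traveler and terrorist counts are the assumptions above, and the rates are the company's goal figures, not anything they've actually achieved:

```python
travelers = 1_000_000   # people flying per day in the US (assumed)
terrorists = 10         # terrorists flying per day (a generous assumption)
detect_rate = 0.90      # company's goal: catch 90% of real terrorists
false_pos = 0.04        # company's goal: flag only 4% of innocent travelers

innocents = travelers - terrorists
flagged_innocent = innocents * false_pos      # roughly 40,000 innocent people
flagged_terrorist = terrorists * detect_rate  # 9 real terrorists
total_flagged = flagged_innocent + flagged_terrorist

print(f"flagged per day: {total_flagged:,.0f}")
print(f"of which terrorists: {flagged_terrorist:.0f}")
print(f"odds a flagged person is a terrorist: 1 in {total_flagged / flagged_terrorist:,.0f}")
```

So a flagged traveler has something like a 1 in 4,400 chance of actually being a terrorist, which is why the machine going off tells you essentially nothing.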
Do things get better if we have a second, completely independent test with the same false positive and false negative rates, and we only alarm on people who trigger both? Not really; about 1,608 people trigger both, of which about 8 are terrorists. We've improved all the way up to roughly a 1 in 200 chance that the person is a terrorist, and we've missed about a fifth of the people we really want to catch.
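The two-test case is the same sketch with the rates squared, since a person alarms only if both independent tests flag them (again using the company's goal figures and the assumed counts from above):

```python
travelers = 1_000_000   # people flying per day in the US (assumed)
terrorists = 10         # terrorists flying per day (a generous assumption)
detect_rate = 0.90      # each test catches 90% of real terrorists
false_pos = 0.04        # each test flags 4% of innocent travelers

innocents = travelers - terrorists
# Independence means the combined rates are the products of the per-test rates.
both_innocent = innocents * false_pos ** 2      # ~1,600 innocents trigger both
both_terrorist = terrorists * detect_rate ** 2  # ~8.1 terrorists trigger both
total_both = both_innocent + both_terrorist

print(f"alarmed on per day: {total_both:,.1f}")
print(f"of which terrorists: {both_terrorist:.1f}")
print(f"odds an alarmed person is a terrorist: 1 in {total_both / both_terrorist:.0f}")
print(f"terrorists missed per day: {terrorists - both_terrorist:.1f}")
```

Note what squaring does: the false positive pool shrinks by a factor of 25, but the detection rate also drops from 90% to 81%, so you pay for the cleaner alarms by letting more real terrorists through.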
(One million people a day is actually a bit low; see here. I am also using the company's goal figures, not their current results, to be as favorable to them as possible.)