Statisticians who have delivered their own judgments on the state of Olympic judging have a troubling message for fans of such sports as gymnastics, diving, boxing and dressage: There is no surefire way to remove national bias from the results.
In gymnastics, the highest and lowest execution scores are tossed out in an attempt to weed out possibly biased appraisals. But if more than one score is biased upward, or downward, that won't do the trick. In diving, officials from competitors' countries are kept off the semifinal and final panels where possible, but they can judge earlier rounds. In the Winter Olympics, figure skating tried obscuring which judge awarded which score, on the theory that this would thwart collusion by judges. One study found that the change increased bias.
Eric Zitzewitz, the Dartmouth College economist who wrote that paper, says it is difficult to design one system to prevent all forms of bad judging, be it corruption (suspected but never proven in figure skating a decade ago), intentional bias or unconscious bias. "It's very hard to design a perfect system in a vacuum," he says. "It depends on the specific form of activity you're trying to prevent."
Some sports, such as gymnastics, use a "trimmed mean," in which the highest and lowest scores are discarded and the remaining ones averaged. The assumption is that the biased scores are likely to be extreme ones.
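To make the arithmetic concrete, here is a minimal Python sketch of a trimmed mean that drops one highest and one lowest score. The function name and the sample panels are invented for illustration and are not taken from any real competition. It also shows the weakness noted earlier: a single inflated score is discarded, but two inflated scores are not both removed.

```python
def trimmed_mean(scores):
    """Drop one highest and one lowest score, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to trim both ends")
    ordered = sorted(scores)
    kept = ordered[1:-1]  # discard the single lowest and single highest
    return sum(kept) / len(kept)


# Hypothetical five-judge panels (illustrative numbers only).
honest = [9.0, 9.0, 9.0, 9.0, 9.0]
one_biased = [9.0, 9.0, 9.0, 9.0, 9.8]   # one judge inflates the score
two_biased = [9.0, 9.0, 9.0, 9.8, 9.8]   # two judges inflate the score

print(trimmed_mean(honest))      # the baseline result
print(trimmed_mean(one_biased))  # the lone inflated score is trimmed away
print(trimmed_mean(two_biased))  # one inflated score survives the trim
```

With one inflated score the result matches the honest panel. With two, only one of them is discarded, so the surviving one pulls the average above the honest result.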
Warren Smith, a retired mathematician from Temple University, advocates the trimmed mean for judged sports, finding it superior to some other Olympic systems. But Yale University statistician John W. Emerson found a drawback to the approach when he studied the 2000 diving scores. One Chinese judge showed the least evidence of bias toward any particular country's competitors. If he rated one diver higher than another by a certain margin, that tended to be the consensus margin between those divers, regardless of uniform.
The problem is that he tended to award higher scores to everyone—either out of enthusiasm or a different sense of what constituted a perfect dive. That meant his scores were more likely to be tossed than those of judges who did appear to systematically favor one country's divers over another.
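The effect can be sketched with a toy panel in Python. Everything here is hypothetical: the judge labels, the scoring rules and the dive qualities are made-up assumptions, not Prof. Emerson's data or method. One judge adds the same bonus to every dive, another adds a smaller bonus only for home-country divers, and the code counts how often each judge's score is the highest or lowest on the panel and would therefore be discarded.

```python
from collections import Counter

# Hypothetical scoring rules: each takes the consensus quality of a dive
# and whether the diver is from the nationalistic judge's home country.
JUDGES = {
    "strict": lambda q, home: q - 0.1,
    "neutral": lambda q, home: q,
    "mild": lambda q, home: q + 0.1,
    "generous": lambda q, home: q + 0.5,  # high for everyone, no favoritism
    "nationalistic": lambda q, home: q + (0.3 if home else 0.0),
}

# Made-up dives: (consensus quality, diver is from the home country).
DIVES = [(8.0, True), (8.5, False), (7.5, True), (9.0, False)]


def count_discards(judges, dives):
    """Count how often each judge gives the panel's highest or lowest score."""
    discarded = Counter({name: 0 for name in judges})
    for quality, home in dives:
        scores = {name: rule(quality, home) for name, rule in judges.items()}
        discarded[max(scores, key=scores.get)] += 1  # the trimmed high score
        discarded[min(scores, key=scores.get)] += 1  # the trimmed low score
    return discarded


counts = count_discards(JUDGES, DIVES)
for name in JUDGES:
    print(name, counts[name])
```

In this toy setup the generous judge's score is trimmed on every dive while the nationalistic judge's score never is, even though only the latter treats divers differently by country.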
"If you do an incorrect analysis, you think he's guilty as sin," Prof. Emerson says.