In a major step toward an era of personalized medicine, researchers reported Wednesday that they have sequenced the complete DNA material of more than 1,000 people from 14 population groups in Europe, Africa, East Asia and the Americas.
The $120 million 1000 Genomes Project involved 700 scientists from laboratories in the U.S., Canada, China, Japan, Nigeria and Kenya, among others. Their results, published in Nature, offer the closest look yet at the differences in humankind's biological instruction set, documenting how myriad rare mutations may underpin many diseases and set the people of one locale apart from another in ways that shape their health.
All told, the scientists identified 38 million variations in the chemical letters of DNA that make up each of the average person's 23,000 or so genes and the DNA regions that control them—about 98 percent of all the estimated human variation in the world.
"We are getting to the point where an individual genome sequence can be a useful part of diagnosis," said statistical geneticist Gilean McVean at Oxford University in England, who led the effort. "If there is a variation that is present in just one in 100 people, we have found it."
The immense compendium of genetic code—a catalog of human variation equal to 16 million file cabinets of data, or 30,000 DVDs—is meant to serve as a standard reference against which doctors could one day compare a patient's genome profile, even during a routine checkup.
Though far from complete, the data are already straining the capacity of most laboratories' computers to store and analyze them. Moreover, the researchers expect to add genetic data from 1,500 more people within a few months. Earlier this year, Amazon.com Inc. volunteered to store the vast database in its cloud services, where anyone could access it freely.
Generally, all humans share about 99 percent of the DNA code that shapes development, health, personality and other traits. But the common genetic variations that most people share account for only a fraction of the risk of inherited disease.