Researchers locate geographic origins from DNA

One day soon, you may be able to pinpoint the geographic origins of your ancestors based on analysis of your DNA.

A study published online this week in Nature by an international team that included Cornell researchers describes the use of DNA to predict the geographic origins of individuals from a sample of Europeans, often within a few hundred kilometers of where they were born.

"What we found is that within Europe, individuals with all four grandparents from a given region are slightly more similar genetically to one another, on average, than to individuals from more distant regions," said Carlos Bustamante, associate professor of biological statistics and computational biology at Cornell and the paper's senior author. John Novembre, an assistant professor in the Department of Ecology and Evolutionary Biology at the University of California, Los Angeles, was lead author of the study, which also included researchers from GlaxoSmithKline, the University of Chicago and the University of Lausanne (Switzerland).

"When these minute differences are compounded across the whole of their genome, we have surprisingly high power to predict where in Europe they came from," Bustamante added.

This is one of the first studies to examine genome-wide patterns of genetic variation across a large sample of Europeans, and to use these data to predict ancestry. The methodology has wide-ranging implications for using DNA samples from unrelated individuals to identify genes underlying complex diseases, as well as for forensics, personalized genomics and the study of recent human history.

Using data from a sample of almost 3,200 Europeans supplied by GlaxoSmithKline, the team analyzed more than 500,000 genetic points known as single nucleotide polymorphisms (SNPs), which are minute sequence variations in DNA. The researchers focused their analysis on individuals whose grandparents were all believed to come from the same country. The team reduced the data to two dimensions and plotted the result, revealing that genetically similar individuals clustered together on the plot in such a way that the major geographic features of Europe became distinguishable.

"What is really surprising is that when we summarize the data from 500,000 SNPs in just two dimensions, we see this striking map of Europe," said Novembre. "We can recognize the Iberian peninsula, the Italian peninsula, southeastern Europe, Turkey and Cyprus."
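The article does not name the technique used to summarize the SNP data in two dimensions. Principal component analysis (PCA) is one standard way to do this, and the sketch below assumes it. The data are simulated rather than real: two hypothetical populations whose allele frequencies differ slightly at each SNP. The function names, sample sizes and the size of the frequency shift are all illustrative assumptions.

```python
import numpy as np

def top_two_components(genotypes):
    """Project individuals onto their first two principal components.

    genotypes: (n_individuals, n_snps) array of allele counts coded 0, 1 or 2.
    Returns an (n_individuals, 2) array of coordinates.
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)  # center each SNP column
    # Thin SVD; the columns of u scaled by s are the principal-component scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :2] * s[:2]

def simulate_genotypes(n_per_group=50, n_snps=5000, shift=0.2, seed=0):
    """Simulate two hypothetical populations (not the study's data).

    Each SNP's allele frequency in the second group is offset from the
    first by a small random amount in [-shift, shift].
    """
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.3, 0.7, size=n_snps)
    freq_b = freq_a + rng.uniform(-shift, shift, size=n_snps)
    group_a = rng.binomial(2, freq_a, size=(n_per_group, n_snps))
    group_b = rng.binomial(2, freq_b, size=(n_per_group, n_snps))
    labels = np.array([0] * n_per_group + [1] * n_per_group)
    return np.vstack([group_a, group_b]), labels

if __name__ == "__main__":
    genotypes, labels = simulate_genotypes()
    coords = top_two_components(genotypes)
    for group in (0, 1):
        mean_pc1 = coords[labels == group, 0].mean()
        print(f"group {group}: mean position on first component = {mean_pc1:.2f}")
```

No single simulated SNP separates the two groups, but the first component, which pools small differences across thousands of SNPs, places them in distinct clusters. That is the same compounding effect Bustamante describes.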

The resolution of the genetic map was so precise that the investigators were able to find genetic differences among the French-, German- and Italian-speaking Swiss individuals, with French speakers being more similar to the French, German speakers to Germans and Italian speakers to Italians.

Based on these observations, Novembre and colleagues from the University of Chicago developed a novel algorithm for classifying individuals geographically based on their patterns of DNA variation.

For well-sampled countries, this approach placed 50 percent of individuals within 310 kilometers (193 miles) of their reported origin and 90 percent within 700 km (435 miles). Across all populations, 50 percent of individuals were placed within 540 km (336 miles) of their reported origin and 90 percent within 840 km (522 miles). The findings excluded individuals with grandparents from different countries, since these individuals were assigned locations between their grandparents' origins. Next steps include inferring origins for people with recent ancestry from multiple locations and performing similar analyses for populations on other continents.
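Accuracy figures of this kind can be computed by measuring the distance between each individual's predicted and reported locations, then reading off the distance that covers a given share of individuals. The sketch below is one way to do that and is not the study's code: it assumes locations are latitude and longitude pairs, uses the haversine great-circle formula with a mean Earth radius of 6,371 km, and all function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed for this sketch)
KM_PER_MILE = 1.609344     # exact definition of the international mile

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_covering(distances, percent):
    """Smallest distance within which at least `percent` percent of values fall.

    percent is an integer from 1 to 100; distances must be non-empty.
    """
    ordered = sorted(distances)
    needed = -(-percent * len(ordered) // 100)  # ceiling division
    return ordered[max(0, needed - 1)]

def km_to_miles(km):
    """Convert kilometers to miles."""
    return km / KM_PER_MILE

if __name__ == "__main__":
    # Hypothetical (predicted, reported) coordinate pairs, for illustration only.
    pairs = [
        ((46.5, 6.6), (47.4, 8.5)),
        ((48.9, 2.4), (45.8, 4.8)),
        ((41.9, 12.5), (45.5, 9.2)),
        ((52.5, 13.4), (48.1, 11.6)),
    ]
    errors = [haversine_km(*pred, *rep) for pred, rep in pairs]
    for percent in (50, 90):
        d = distance_covering(errors, percent)
        print(f"{percent} percent placed within {d:.0f} km ({km_to_miles(d):.0f} miles)")
```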

The study was funded by the Giorgi-Cavaglieri Foundation, the Swiss National Science Foundation, the National Science Foundation and National Institutes of Health in the United States, and GlaxoSmithKline.
