New method connects proteins with mutations that lead to genetic disease

By Krishna Ramanujan

January 18, 2012


Robotic machinery, such as this Tecan Freedom Evo, is necessary for biomedical systems studies carried out by Haiyuan Yu's group. From left, Xiujuan Wang, Xiaomu Wei, Haiyuan Yu and Jishnu Das.

For the first time, a new computational method allows researchers to identify which specific molecular mechanisms are altered by genetic mutations in proteins that lead to disease. And they can apply this method to any genetic disease.

Why is this important? Although researchers have produced long lists of disease-associated genes and their mutations, they have little information linking those genes and mutations to their effects on the functions of the proteins the genes express. Specific areas on the surfaces of proteins interact with other proteins, and mutations in such interface areas could disrupt those protein-protein interactions and lead to disease, according to the study published in the journal Nature Biotechnology.

The goal is to find the mutations that drive cancer and other complex genetic diseases, said Haiyuan Yu, an assistant professor in the Department of Biological Statistics and Computational Biology and the Weill Institute for Cell and Molecular Biology, and the paper's senior author. Xiujuan Wang, a postdoctoral researcher in Yu's lab, is the paper's lead author.

The method incorporates massive databases of more than 50,000 protein structures, thousands of protein-protein interactions and 3,949 disease genes associated with 3,453 diseases.

The computational analysis of disease-associated genes and their mutations reveals, in 3-D, where those mutations alter the proteins. Though the results are computational and based on probabilities, the researchers conducted physical experiments in a few cases that completely verified the computational analysis.

The problem, Yu said, is that current large-scale genome sequencing projects and genome-wide association studies may identify many mutations for each disease gene or each disease. "If the mutations are enriched on a protein's interaction interface [areas where proteins interact with other proteins], then they are more likely to be a driver of disease," Yu said. By combining existing knowledge of genes, mutations and protein structures, "it's a very nice method to determine driver mutations, and allows us to say, we should focus on these mutations and not on those."

For example, in an analysis of colorectal cancer, Yu and colleagues found that out of nine genetic mutations that affect two proteins involved in DNA repair, three of those mutations showed up on the interface areas of the two proteins. "If they are on the interface, they are more likely to be drivers of disease," Yu said. "Then we did experiments that completely verified the computational analysis," he added.
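The idea of mutations being "enriched" on an interface can be illustrated with a simple calculation. The sketch below is not the method from the study, which the article does not describe in detail. It only compares an observed count of interface mutations (three of nine, as in the colorectal cancer example) with the count expected if mutations fell at random. The function names and the 10 percent interface fraction are hypothetical values chosen for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def interface_enrichment(n_on_interface, n_mutations, interface_fraction):
    """Compare observed interface mutations with a random-placement baseline.

    interface_fraction is the share of the protein's residues that lie on
    the interaction interface (an assumed input, not a figure from the study).
    Returns (fold enrichment, one-sided binomial p-value).
    """
    expected = n_mutations * interface_fraction
    fold = n_on_interface / expected
    p_value = binomial_tail(n_on_interface, n_mutations, interface_fraction)
    return fold, p_value


# Counts from the colorectal cancer example: 3 of 9 mutations on the interface.
# The 0.10 interface fraction is a made-up value for illustration only.
fold, p_value = interface_enrichment(3, 9, 0.10)
print(f"fold enrichment: {fold:.2f}, p-value: {p_value:.4f}")
```

A fold enrichment above 1 means more mutations land on the interface than random placement would predict. With only nine mutations, any such estimate is rough, which is one reason experimental follow-up of the kind Yu describes matters.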

The method can be applied to any disease, mutation and protein, he said.

Co-authors include researchers at Weill Cornell Medical College in New York City and Maastricht University in the Netherlands.

The study was funded by a startup fund from Cornell to Yu and National Institutes of Health grants awarded to Weill Cornell Medical College.
