The University of Arizona

Scattered disease-linked variants and convergent functions: discovery from big data integration

Series: Statistics GIDP Colloquium
Location: Math 501
Presenter: Haiquan Li, Assistant Professor, Director for Translational Bioinformatics, Department of Medicine, University of Arizona

Genome-wide association studies (GWAS) have identified thousands of disease-linked single nucleotide polymorphisms (SNPs) in the human genome. Most of them have a small effect size (OR<1.4) and are located independently across multiple chromosomes. It remains unclear how they collectively cause disease, a question tied to the issue of missing heritability. Classic tests of genetic interactions suffer from insufficient power. Here, we will present an integrative approach that leverages several omics datasets to obtain additional information beyond genotypes and thus reduce the number of hypotheses. We combine traditional semantic similarity of genes’ functions with very deep network permutations (100K times) to quantify the empirical significance of the downstream functional similarity of any pair of SNPs. This approach enabled us to discover a fundamental biological mechanism for complex diseases: SNPs associated with the same disease are more likely to be associated with the same downstream genes, or with functionally similar genes, than SNPs associated with unrelated diseases (OR>12). We also found that 40-50% of prioritized SNP pairs have significant genetic interactions in three independent GWAS datasets. These results provide a new biological interpretation of genetic interactions and a “roadmap” of disease mechanisms emerging from GWAS SNPs, especially those outside coding regions.
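To make the idea of an empirical, permutation-based significance concrete, here is a minimal Python sketch. It is not the presenter's pipeline: the toy SNP-to-gene map, the gene-to-term annotations, the Jaccard overlap used in place of a real semantic similarity measure, the best-match averaging, and all function names are assumptions made for illustration. The permutation simply relabels gene nodes while keeping the network fixed, and the default of 1,000 permutations is far below the 100K mentioned in the abstract.

```python
import random

# Hypothetical toy data: SNP -> downstream genes, gene -> functional terms.
snp_genes = {
    "snpA": {"g1", "g2"},
    "snpB": {"g2", "g3"},
    "snpC": {"g4"},
    "snpD": {"g5", "g6"},
}
gene_terms = {
    "g1": {"t1", "t2"},
    "g2": {"t1", "t2", "t3"},
    "g3": {"t2", "t3"},
    "g4": {"t7"},
    "g5": {"t8", "t9"},
    "g6": {"t9"},
}


def gene_similarity(a, b, terms):
    """Jaccard overlap of term sets (an assumed stand-in for semantic similarity)."""
    union = terms[a] | terms[b]
    return len(terms[a] & terms[b]) / len(union) if union else 0.0


def snp_pair_score(s1, s2, mapping, terms):
    """Average best-match gene similarity between the two SNPs' gene sets."""
    g1, g2 = mapping[s1], mapping[s2]
    best1 = [max(gene_similarity(a, b, terms) for b in g2) for a in g1]
    best2 = [max(gene_similarity(a, b, terms) for a in g1) for b in g2]
    return (sum(best1) + sum(best2)) / (len(best1) + len(best2))


def permuted_terms(terms, rng):
    """Reassign term sets among genes at random; the SNP-gene network is unchanged."""
    genes = sorted(terms)
    shuffled = genes[:]
    rng.shuffle(shuffled)
    return {g: terms[s] for g, s in zip(genes, shuffled)}


def empirical_p(s1, s2, mapping, terms, n_perm=1000, seed=0):
    """Return (observed score, empirical p-value) for one SNP pair."""
    rng = random.Random(seed)
    observed = snp_pair_score(s1, s2, mapping, terms)
    hits = sum(
        snp_pair_score(s1, s2, mapping, permuted_terms(terms, rng)) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    score, p = empirical_p("snpA", "snpB", snp_genes, gene_terms)
    print(f"observed similarity = {score:.3f}, empirical p = {p:.4f}")
```

With the add-one correction, the smallest p-value that n permutations can report is 1/(n+1), which is one reason a very deep permutation count is needed before any SNP pair can reach a stringent threshold.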
