Statistical Genetics and Genomics

Program Description

Statistical genetics and genomics aims to develop novel, efficient, and powerful statistical, probabilistic, and computational methods for the analysis of genetic and genomic data. As outlined by Dr. Francis Collins and others in their Nature (2003) report, "A Vision for the Future of Genomics Research: A Blueprint for the Genomic Era", phase two of the Human Genome Project has three major components: genomics to biology, genomics to health, and genomics to society. New experimental technologies have generated, and will continue to generate, increasingly high-dimensional and complex genomic data sets. Analyzing these data and drawing inferences from them are intrinsically statistical tasks that fall within the realm of statistical genetics and genomics.

Genomics has become a central discipline of biomedical research. Data from genomics, comparative genomics, and high-throughput sequencing enable biologists to conduct research in systems biology. There is a great need for new statistical methods for analyzing and integrating increasingly complex genetic and genomic data. The University of Pennsylvania has very strong research programs in the life sciences, including many prominent programs in genetics and genomics such as cancer genomics, cardiovascular genomics, and the human gut microbiome. The statistical genetics and genomics program works closely with these research programs.

Research in statistical genetics and genomics at Penn covers many areas. Several ongoing NIH R01 grants support the development of new statistical methods for the analysis of genetic and genomic data. These grants focus on methods for the analysis of genetic pathways and networks (H. Li), methods for admixture mapping (M. Li), methods for copy number variation (CNV) analysis (R. Feng), and methods for the analysis of mother-child designs (J. Chen). Other methodological research includes methods for adjusting for population stratification (M. Li and N. Mitra), methods for studying allele-specific gene expression (R. Xiao), methods for cancer risk prediction (J. Chen), and methods for eQTL data analysis (M. Li and H. Li). Faculty in this program also have active research projects related to the analysis of next-generation sequencing data, including methods for rare variant association analysis, analysis of RNA-seq data, sequencing-based CNV analysis, and RNA editing. Besides methods research, all faculty in the program actively participate in several large-scale genome-wide association studies (GWAS) and develop new methodologies for analyzing such data, including novel methods for CNV analysis, methods for pathway-based analysis of GWAS data, and methods for adjusting for population stratification. These collaborations have led to publications in Science, Nature, the New England Journal of Medicine, and Nature Genetics. Other collaborative areas include the analysis of next-generation sequencing data and of human microbiome data.

