Genetic risk prediction is a major focus in human genetics research. Continued success in genome-wide association studies (GWAS) over the past decade has facilitated the development of polygenic risk scores (PRS) that aggregate the effects of millions of genetic variants for many complex traits. Compared to earlier methods that require individual-level data for model training, PRS, which rely only on marginal association data from GWAS, are much more generally applicable due to the wide availability of summary statistics. However, several lingering challenges create a significant gap between PRS methodology and applications. In this talk, I will mainly discuss two issues: model fine-tuning and cross-ancestry portability. First, most PRS models include tuning parameters that improve predictive performance when properly selected. However, existing model-tuning methods require individual-level genetic data independent from both training and testing samples, and such data almost never exist in practice. We introduced an innovative statistical framework named PUMAS to conduct classic model-tuning procedures (e.g., cross-validation) using GWAS summary statistics alone as input. Building on this framework, I will also introduce our recent effort to benchmark PRS models and perform ensemble learning without requiring external data for model fitting. Second, PRS trained from existing GWAS are known to have substantially reduced predictive accuracy in non-European populations, limiting their clinical utility and raising concerns about health disparities across ancestral populations. I will introduce our new approach, named X-Wing, designed to tackle this problem. X-Wing quantifies genetic effect correlation between populations, employs annotation-dependent statistical regularization, and combines population-specific PRS through ensemble learning, which leads to substantially improved performance compared to existing methods.
I will give many examples to demonstrate the superior performance of these approaches. We believe that, taken together, these new advances resolve several fundamental problems that currently lack a solution and will shed important light on the future application of genetic prediction for human complex traits.
Dr. Lu received his bachelor’s degree from Tsinghua University (Mathematics, 2012) and his Ph.D. from Yale University (Biostatistics, 2017). He is currently an Assistant Professor in the Department of Biostatistics and Medical Informatics at UW-Madison. The Lu Lab develops and employs statistical methods to identify and interpret genetic associations for human complex traits. In particular, areas of expertise in the Lu Lab include genome-wide association studies, non-coding genome annotation, genetic correlation estimation, polygenic risk prediction, and gene-environment interaction.