Statistical methods can target exploration, prediction, or inference. While big-data applications have emphasized prediction, inference remains important; in particular, inference is closely tied to assessing the uncertainty of coefficients and predictions. Data-driven methods for model selection and tuning minimize prediction error by trading bias for variance, but they are rarely, if ever, able to narrow confidence intervals or increase certainty. If used naively, popular methods of data-driven model selection and tuning lead to overconfidence. Post-selection inference, a non-naive method of accounting for the effects of data-driven model tuning, relies on strong assumptions. Researchers should recognize how hard it is to quantify uncertainty reliably when they use data-driven model tuning, and in many cases should abstain from tuning altogether.
Please join the event.
About Benjamin Bolker
Dr. Benjamin Bolker completed an undergraduate degree in mathematics and physics at Yale University and a Ph.D. in Zoology at Cambridge University, working on the dynamics of measles epidemics. He did a postdoc at Princeton University in ecology and evolutionary biology on spatial dynamics of plant and host-parasite communities, then began a faculty position in the Department of Zoology (later Biology) at the University of Florida in 1999. He moved to McMaster University in 2010, where he has a joint appointment in Mathematics & Statistics and Biology and directs the School of Computational Science and Engineering. His research ranges broadly across ecology, evolution, and epidemiology, applying mathematical, statistical, and computational tools. He is especially interested in problems that involve parasites and disease, spatial population dynamics, estimation and inference of model parameters from observational data, or all three. In addition to many research papers, he is the author of two books (Ecological Models and Data in R and A Very Short Introduction to Infectious Disease, the latter with Marta Wayne) and the author or maintainer of several widely used R packages.