Data science is transforming many of the traditional ways in which we approach scientific problems. While the abundance of data and algorithms generates a lot of excitement in statistical modeling, serious concerns are being raised about how to reliably and efficiently extract scientific knowledge from data and models.
In this talk, I will address particular reliability and efficiency issues that arose during my PhD work on a neuroscience project. Understanding how primates process visual information and recognize objects in an image is a major problem in neuroscience. First, I will describe how we reliably infer the functional properties of neurons in the visual cortex via the stability-driven DeepTune modeling framework. Given high-performing predictive models with various architectures, I will discuss questions such as: What can we learn from these predictive models about the properties of neurons? How much should we trust the model-based interpretations?
Second, I will present new theoretical results on MCMC sampling algorithms on continuous state spaces. MCMC sampling is useful for understanding the functional mappings of predictive models in the aforementioned project, and it is ubiquitous in Bayesian inference. Yet practical step-size choices for many modern sampling algorithms, such as the Metropolis-adjusted Langevin algorithm (MALA) and Hamiltonian Monte Carlo (HMC), remain difficult. I will establish explicit convergence guarantees for MALA and HMC when sampling from log-concave distributions, and provide intuition for their fast convergence to guide practical step-size tuning.
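For readers unfamiliar with MALA, the following is a minimal sketch of where the step size enters the algorithm. It is an illustration only, not the analysis or tuning rule from the talk: the target (a standard Gaussian, which is log-concave), the function names, and the step size `h` are all assumptions chosen for the example.

```python
import math
import random


def potential(x):
    # f(x) = ||x||^2 / 2, so the target density is proportional to exp(-f(x)).
    # Assumed target for illustration: a standard Gaussian.
    return 0.5 * sum(v * v for v in x)


def grad_potential(x):
    # Gradient of f for the standard Gaussian target.
    return list(x)


def log_proposal(y, x, h):
    # Log density (up to a constant) of proposing y from x:
    # y ~ Normal(x - h * grad f(x), 2h * I).
    g = grad_potential(x)
    sq = sum((yi - xi + h * gi) ** 2 for yi, xi, gi in zip(y, x, g))
    return -sq / (4.0 * h)


def mala(x0, h, n_steps, seed=0):
    """Run MALA with step size h; return (samples, acceptance_rate)."""
    rng = random.Random(seed)
    x = list(x0)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        g = grad_potential(x)
        # Langevin proposal: gradient step plus Gaussian noise.
        y = [xi - h * gi + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
        # Metropolis-Hastings correction for the asymmetric proposal.
        log_ratio = (potential(x) - potential(y)
                     + log_proposal(x, y, h) - log_proposal(y, x, h))
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_steps
```

Calling, for example, `mala([3.0, -3.0], h=0.5, n_steps=5000)` returns the chain and its acceptance rate. In this sketch, a larger `h` makes bolder proposals that are rejected more often, while a smaller `h` is accepted more often but moves slowly, which is the trade-off that makes step-size tuning difficult in practice.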
This talk is based on two separate projects. The first is joint work with Reza Abbasi-Asl, Adam Bloniarz, Michael Oliver, Ben D.B. Willmore, Jack L. Gallant, and Bin Yu; the second is joint work with Raaz Dwivedi, Martin Wainwright, and Bin Yu.