Highly Scalable Gaussian Processes for Spatial Data Analysis

When and Where

Thursday, October 29, 2020 3:30 pm to 4:30 pm
Zoom, Passcode: 244689


Sudipto Banerjee, UCLA


The BIG DATA problem in multivariate geostatistics has spurred a rich literature on scalable methodologies for analysing massive multivariate spatial datasets. A variety of scalable spatial processes within the Bayesian paradigm have been found especially attractive because they are flexible and fit naturally into hierarchical model settings. However, a major computational bottleneck for obtaining full Bayesian inference, including inference for latent processes, arises from slow MCMC sampling over a high-dimensional parameter space. In this talk, I will discuss how a Matrix-variate Gaussian Process can be used for high-dimensional spatial data with the help of spatial processes constructed using Directed Acyclic Graphs (DAGs). Two specific classes of highly scalable spatial processes, the Nearest Neighbour Gaussian Process (Datta et al., 2016) and the Meshed Gaussian Process (Peruzzi et al., 2020+), will be compared and illustrated. I will present some case studies on remotely sensed data collected over tens of millions of locations.

Please join the event.

About Sudipto Banerjee

Sudipto Banerjee is Professor and Chair of the Department of Biostatistics in the School of Public Health at the University of California, Los Angeles (UCLA). Dr. Banerjee specializes in developing fast and scalable Bayesian methods for high-dimensional spatial and spatiotemporal data. His work has been recognized by many awards and fellowships, including the George W. Snedecor Award from the Committee of Presidents of Statistical Societies (COPSS) and elected fellowships in the Institute of Mathematical Statistics (IMS), the American Statistical Association (ASA), and the International Society for Bayesian Analysis (ISBA).