Precise High-Dimensional Asymptotics for AdaBoost via Max-Margins and Min-Norm Interpolants

When and Where

Thursday, November 04, 2021, 3:30 pm to 4:30 pm
Online

Speakers

Pragya Sur, Harvard University

Description

This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the max-min-L1-margin and the min-L1-norm interpolant. In turn, this will characterize the generalization error of AdaBoost, when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we will provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin. Our theory provides several insights into properties of AdaBoost; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation. Our statistical and computational arguments can handle (1) finite-rank spiked covariance models for the feature distribution and (2) variants of AdaBoost corresponding to general Lq-geometry, for q in [1,2]. This is based on joint work with Tengyuan Liang.
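For readers unfamiliar with the two objects named in the abstract, the following is a minimal sketch of how they are commonly written. The notation is an assumption made for illustration, not taken from the talk: training pairs (x_i, y_i), i = 1, ..., n, with features x_i in R^p and labels y_i in {-1, +1}.

```latex
% Assumed notation: (x_i, y_i), i = 1, ..., n, with x_i \in \mathbb{R}^p and y_i \in \{-1, +1\}.
% Max-min L1 margin: the largest worst-case margin over the unit L1 ball.
\kappa_n \;=\; \max_{\|\theta\|_1 \le 1} \; \min_{1 \le i \le n} \; y_i \, x_i^\top \theta

% Min-L1-norm interpolant: the smallest-L1-norm vector that classifies every
% training point with margin at least one.
\hat{\theta} \;\in\; \arg\min_{\theta \in \mathbb{R}^p} \|\theta\|_1
\quad \text{subject to} \quad y_i \, x_i^\top \theta \ge 1, \qquad i = 1, \dots, n
```

Under this notation, when the data are linearly separable (so that \kappa_n > 0), the two problems are equivalent up to rescaling: the minimal norm satisfies \|\hat{\theta}\|_1 = 1/\kappa_n.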

Please join the event.

About Pragya Sur

Pragya Sur is an Assistant Professor in the Statistics Department at Harvard University. Her research broadly spans high-dimensional statistics, statistical machine learning, and robust inference and prediction for multi-study/multi-environment heterogeneous data. She is also interested in applications of large-scale statistical methods to computational neuroscience and genetics. Her research is currently supported by a William F. Milton Fund and an NSF DMS award. Previously, she was a postdoctoral fellow at the Center for Research on Computation and Society, Harvard John A. Paulson School of Engineering and Applied Sciences. She received a Ph.D. in Statistics from Stanford University in 2019, where her thesis was awarded the Theodore W. Anderson Theory of Statistics Dissertation Award.