CCSI Weekly Seminar
Pragya Sur (Harvard)
Room 116
Join fellow CCSI participants for a weekly seminar.
Title: Precise high-dimensional asymptotics for AdaBoost via max-margins & min-norm interpolants
This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the max-min-L1-margin and the min-L1-norm interpolant. In turn, this will characterize the generalization error of AdaBoost, when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we will provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin. Our theory provides several insights into properties of AdaBoost; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation. Our statistical and computational arguments can handle (1) finite-rank spiked covariance models of the feature distribution and (2) variants of AdaBoost corresponding to general Lq-geometry, for q in [1,2]. This is based on joint work with Tengyuan Liang.