Abstracts

Tuesday, September 3rd, 2013

9:00 am – 10:30 am

The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the computational and statistical sciences. That classical perspectives from these fields are not adequate to address emerging problems in "Big Data" is apparent from their sharply divergent nature at an elementary level—in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results can be invoked. Indeed, if data are a statistician's principal resource, why should more data be burdensome in some sense? Shouldn't it be possible to exploit the increasing inferential strength of data at scale to keep computational complexity at bay? I present several research vignettes that attempt to bring together computational and statistical thinking.

Related Links

A scalable bootstrap for massive data. A. Kleiner, A. Talwalkar, P. Sarkar and M. I. Jordan. Journal of the Royal Statistical Society, Series B, in press.
arxiv.org/abs/1112.5016

On statistics, computation and scalability. M. I. Jordan. Bernoulli, 19, 1378-1390, 2013.
www.cs.berkeley.edu/~jordan/papers/BEJSP17.pdf

Computational and statistical tradeoffs via convex relaxation. V. Chandrasekaran and M. I. Jordan. Proceedings of the National Academy of Sciences, 110, E1181-E1190, 2013.
www.pnas.org/content/110/13/E1181.abstract
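
To give a flavor of the first reference above, here is a minimal NumPy sketch of the Bag of Little Bootstraps idea from Kleiner et al.: quality assessments are computed on small subsamples, with multinomial weights standing in for full-size resamples, and then averaged. Function names and defaults here are illustrative, not taken from the paper's code.

```python
import numpy as np

def bag_of_little_bootstraps(data, estimator, assess, s=20, r=100, gamma=0.6, rng=None):
    """Sketch of the Bag of Little Bootstraps (BLB) procedure.

    estimator(subset, weights) -> statistic computed from a weighted sample;
    assess(stats) -> quality measure over r resampled statistics (e.g. std).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    b = int(n ** gamma)                              # size of each little subset
    assessments = []
    for _ in range(s):
        subset = data[rng.choice(n, size=b, replace=False)]
        stats = []
        for _ in range(r):
            # Multinomial counts emulate a size-n bootstrap resample supported on b points.
            weights = rng.multinomial(n, np.full(b, 1.0 / b))
            stats.append(estimator(subset, weights))
        assessments.append(assess(np.asarray(stats)))
    return float(np.mean(assessments))               # average the per-subset assessments

# Toy use: standard error of the mean of a large sample (analytically ~ 1/sqrt(n) here).
rng = np.random.default_rng(0)
x = rng.exponential(size=100_000)
se = bag_of_little_bootstraps(x, lambda d, w: np.average(d, weights=w), np.std, rng=rng)
print(se, 1.0 / np.sqrt(len(x)))
```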

11:00 am – 12:30 pm

For many computational problems, it is beneficial to see them through the prism of high-dimensional geometry. For example, one can represent an object (e.g., an image) as a high-dimensional vector whose coordinates record hundreds or more features (e.g., pixels). Often, direct or classical solutions to such problems suffer from the so-called "curse of dimensionality": the performance guarantees tend to have exponential dependence on the dimension. Modern tools from high-dimensional computational geometry address this obstacle.

These lectures will give an overview of some key problems and tools in the area, including nearest neighbor search, dimensionality reduction, embeddings and intrinsic dimension, and may touch upon some connections to related areas, such as sublinear algorithms.
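
As a concrete example of dimensionality reduction in this spirit, a plain Gaussian random projection in the style of the Johnson-Lindenstrauss lemma approximately preserves pairwise Euclidean distances; the sketch below (with illustrative sizes, not taken from the lectures) checks this numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 5_000, 200             # n points in dimension d, projected down to k

X = rng.standard_normal((n, d))                 # synthetic high-dimensional point set
P = rng.standard_normal((d, k)) / np.sqrt(k)    # Gaussian Johnson-Lindenstrauss map
Y = X @ P

# Pairwise distances should be preserved up to a small multiplicative distortion.
for _ in range(5):
    i, j = rng.choice(n, size=2, replace=False)
    ratio = np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
    print(f"projected / original distance: {ratio:.3f}")
```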

The second session of this talk will take place on Wednesday, September 4 from 9:00 am – 10:30 am.

2:00 pm – 3:00 pm

Random matrices have come to play a significant role in computational mathematics and statistics. Established methods from random matrix theory have led to striking advances in these areas, but ongoing research has generated difficult questions that cannot be addressed without new tools. The purpose of this tutorial is to introduce some recent techniques, collectively called matrix concentration inequalities, that can simplify the study of many types of random matrices. These results parallel classical tail bounds for scalar random variables, such as the Bernstein inequality, but they apply directly to matrices. In particular, matrix concentration inequalities can be used to control the spectral norm of a sum of independent random matrices by harnessing basic properties of the summands. Many variants and extensions are now available, and the outlines of a larger theory are starting to emerge. These new techniques have already led to advances in many areas, including partial covariance estimation, randomized schemes for low-rank matrix decomposition, relaxation and rounding methods for combinatorial optimization, construction of maps for dimensionality reduction, techniques for subsampling large matrices, analysis of sparse approximation algorithms, and many others.
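
For orientation, one representative result of this type is the matrix Bernstein inequality, stated here informally in the style of the notes listed under Relevant Papers below (constants vary slightly across versions):

```latex
% Matrix Bernstein inequality (informal statement).
% X_1, ..., X_n : independent, zero-mean, symmetric d x d random matrices
% with \|X_k\| \le R almost surely.
\[
  \sigma^2 \;=\; \Bigl\| \sum_{k=1}^{n} \mathbb{E}\bigl[X_k^2\bigr] \Bigr\|,
  \qquad
  \mathbb{P}\Bigl\{ \Bigl\| \sum_{k=1}^{n} X_k \Bigr\| \ge t \Bigr\}
  \;\le\; 2d \,\exp\!\Bigl( \frac{-t^2/2}{\sigma^2 + R t / 3} \Bigr)
  \quad \text{for all } t \ge 0 .
\]
```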

The second session of this talk will take place on Tuesday, September 3 from 3:15 pm – 4:15 pm; the third session of this talk will take place on Tuesday, September 3 from 4:30 pm – 5:30 pm.

Relevant Papers

J. A. Tropp, "User-Friendly Tools for Random Matrices: An Introduction", 2012. Available at users.cms.caltech.edu/~jtropp/notes/Tro12-User-Friendly-Tools-NIPS.pdf
J. A. Tropp, "User-Friendly Tail Bounds for Sums of Random Matrices", FOCM 2012. Available at users.cms.caltech.edu/~jtropp/papers/Tro12-User-Friendly-FOCM.pdf
L. Mackey et al., "Matrix concentration inequalities via the method of exchangeable pairs," 2012. Available at users.cms.caltech.edu/~jtropp/papers/MJCFT12-Matrix-Concentration.pdf

3:15 pm – 4:15 pm

Random matrices have come to play a significant role in computational mathematics and statistics. Established methods from random matrix theory have led to striking advances in these areas, but ongoing research has generated difficult questions that cannot be addressed without new tools. The purpose of this tutorial is to introduce some recent techniques, collectively called matrix concentration inequalities, that can simplify the study of many types of random matrices. These results parallel classical tail bounds for scalar random variables, such as the Bernstein inequality, but they apply directly to matrices. In particular, matrix concentration inequalities can be used to control the spectral norm of a sum of independent random matrices by harnessing basic properties of the summands. Many variants and extensions are now available, and the outlines of a larger theory are starting to emerge. These new techniques have already led to advances in many areas, including partial covariance estimation, randomized schemes for low-rank matrix decomposition, relaxation and rounding methods for combinatorial optimization, construction of maps for dimensionality reduction, techniques for subsampling large matrices, analysis of sparse approximation algorithms, and many others.
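
As a small numerical companion (a toy check, not part of the tutorial): for a matrix Gaussian series sum_k g_k A_k with fixed symmetric matrices A_k, bounds of this kind predict that the expected spectral norm is at most about sigma * sqrt(2 log 2d), where sigma^2 = || sum_k A_k^2 ||. The NumPy sketch below compares the two quantities on a synthetic ensemble.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, trials = 50, 200, 200

# Fixed symmetric coefficient matrices A_k (drawn once, then held fixed).
A = rng.standard_normal((n, d, d)) / np.sqrt(d)
A = (A + A.transpose(0, 2, 1)) / 2

# Matrix variance parameter: sigma^2 = || sum_k A_k^2 ||.
sigma = np.sqrt(np.linalg.norm(np.einsum('kij,kjl->il', A, A), 2))

# Empirical spectral norm of sum_k g_k A_k over independent Gaussian draws g.
norms = [np.linalg.norm(np.einsum('k,kij->ij', rng.standard_normal(n), A), 2)
         for _ in range(trials)]

print("mean empirical spectral norm :", np.mean(norms))
print("sigma * sqrt(2 log 2d) scale :", sigma * np.sqrt(2 * np.log(2 * d)))
```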

The first session of this talk will take place on Tuesday, September 3 from 2:00 pm – 3:00 pm; the third session of this talk will take place on Tuesday, September 3 from 4:30 pm – 5:30 pm.

Relevant Papers

J. A. Tropp, "User-Friendly Tools for Random Matrices: An Introduction", 2012. Available at users.cms.caltech.edu/~jtropp/notes/Tro12-User-Friendly-Tools-NIPS.pdf
J. A. Tropp, "User-Friendly Tail Bounds for Sums of Random Matrices", FOCM 2012. Available at users.cms.caltech.edu/~jtropp/papers/Tro12-User-Friendly-FOCM.pdf
L. Mackey et al., "Matrix concentration inequalities via the method of exchangeable pairs," 2012. Available at users.cms.caltech.edu/~jtropp/papers/MJCFT12-Matrix-Concentration.pdf

4:30 pm – 5:30 pm

Random matrices have come to play a significant role in computational mathematics and statistics. Established methods from random matrix theory have led to striking advances in these areas, but ongoing research has generated difficult questions that cannot be addressed without new tools. The purpose of this tutorial is to introduce some recent techniques, collectively called matrix concentration inequalities, that can simplify the study of many types of random matrices. These results parallel classical tail bounds for scalar random variables, such as the Bernstein inequality, but they apply directly to matrices. In particular, matrix concentration inequalities can be used to control the spectral norm of a sum of independent random matrices by harnessing basic properties of the summands. Many variants and extensions are now available, and the outlines of a larger theory are starting to emerge. These new techniques have already led to advances in many areas, including partial covariance estimation, randomized schemes for low-rank matrix decomposition, relaxation and rounding methods for combinatorial optimization, construction of maps for dimensionality reduction, techniques for subsampling large matrices, analysis of sparse approximation algorithms, and many others.

The first session of this talk will take place on Tuesday, September 3 from 2:00 pm – 3:00 pm; the second session will take place on Tuesday, September 3 from 3:15 pm – 4:15 pm.

Relevant Papers

J. A. Tropp, "User-Friendly Tools for Random Matrices: An Introduction", 2012. Available at users.cms.caltech.edu/~jtropp/notes/Tro12-User-Friendly-Tools-NIPS.pdf
J. A. Tropp, "User-Friendly Tail Bounds for Sums of Random Matrices", FOCM 2012. Available at users.cms.caltech.edu/~jtropp/papers/Tro12-User-Friendly-FOCM.pdf
L. Mackey et al., "Matrix concentration inequalities via the method of exchangeable pairs," 2012. Available at users.cms.caltech.edu/~jtropp/papers/MJCFT12-Matrix-Concentration.pdf

Wednesday, September 4th, 2013

9:00 am – 10:30 am

For many computational problems, it is beneficial to see them through the prism of high-dimensional geometry. For example, one can represent an object (e.g., an image) as a high-dimensional vector whose coordinates record hundreds or more features (e.g., pixels). Often, direct or classical solutions to such problems suffer from the so-called "curse of dimensionality": the performance guarantees tend to have exponential dependence on the dimension. Modern tools from high-dimensional computational geometry address this obstacle.

These lectures will give an overview of some key problems and tools in the area, including nearest neighbor search, dimensionality reduction, embeddings and intrinsic dimension, and may touch upon some connections to related areas, such as sublinear algorithms.
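
A tiny experiment conveys the "curse of dimensionality" the abstract refers to: for random points, the contrast between a query's nearest and farthest neighbour shrinks as the dimension grows. The snippet below is a toy illustration, not material from the lectures.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000   # random points per experiment

# In high dimension, nearest and farthest neighbours become nearly equidistant.
for d in (2, 10, 100, 1_000):
    X = rng.standard_normal((n, d))
    q = rng.standard_normal(d)
    dists = np.linalg.norm(X - q, axis=1)
    print(f"d={d:5d}  farthest/nearest distance ratio = {dists.max() / dists.min():.2f}")
```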

The first session of this talk will take place on Tuesday, September 3 from 11:00 am – 12:30 pm.

11:00 am – 12:30 pm

The introduction of randomization over the last decade into the design and analysis of algorithms for matrix computations has provided a new paradigm, particularly appropriate for many very large-scale applications, as well as a complementary perspective to traditional numerical linear algebra approaches to matrix computations. These novel approaches have been motivated by technological developments in many application areas that permit the automatic generation of large data sets, and they are of particular interest for large-scale data analysis since many data sets are naturally modeled as matrices.

This tutorial will describe the theory underlying randomized algorithms for matrix problems (such as matrix multiplication, least-squares regression, least absolute deviations regression, and low-rank matrix approximation) as well as where that theory is headed. Although the use of randomization is sometimes a relatively straightforward application of Johnson-Lindenstrauss ideas, Euclidean spaces are much more structured objects than general metric spaces, and thus the best randomized matrix algorithms, both in theory and in numerical analysis and data analysis practice, take this into account. An emphasis will be placed on highlighting a few key concepts, several of which have natural statistical interpretations, that are responsible for the improvements in worst-case theory and also for the usefulness of these algorithms in large-scale numerical and data applications.
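
To make the flavor of these methods concrete, here is a minimal "sketch-and-solve" least-squares example in NumPy. A dense Gaussian sketch is used purely for simplicity; the faster structured or sparse sketches developed in this line of work would replace it in practice, and the problem sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 20_000, 50, 500                     # tall least-squares problem, sketch size k

A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)           # exact solution for reference

S = rng.standard_normal((k, n)) / np.sqrt(k)               # Gaussian sketching matrix
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)   # solve the compressed problem

residual = lambda x: np.linalg.norm(A @ x - b)
print("relative increase in residual:", residual(x_sketch) / residual(x_exact) - 1.0)
```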

The second session of this talk will take place on Wednesday, September 4 from 2:00 pm – 3:30 pm.

2:00 pm – 3:30 pm

The introduction of randomization over the last decade into the design and analysis of algorithms for matrix computations has provided a new paradigm, particularly appropriate for many very large-scale applications, as well as a complementary perspective to traditional numerical linear algebra approaches to matrix computations. These novel approaches have been motivated by technological developments in many application areas that permit the automatic generation of large data sets, and they are of particular interest for large-scale data analysis since many data sets are naturally modeled as matrices.

This tutorial will describe the theory underlying randomized algorithms for matrix problems (such as matrix multiplication, least-squares regression, least absolute deviations regression, and low-rank matrix approximation) as well as where that theory is headed. Although the use of randomization is sometimes a relatively straightforward application of Johnson-Lindenstrauss ideas, Euclidean spaces are much more structured objects than general metric spaces, and thus the best randomized matrix algorithms, both in theory and in numerical analysis and data analysis practice, take this into account. An emphasis will be placed on highlighting a few key concepts, several of which have natural statistical interpretations, that are responsible for the improvements in worst-case theory and also for the usefulness of these algorithms in large-scale numerical and data applications.
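
As a complementary sketch for the low-rank approximation problem mentioned above, the following NumPy snippet implements a basic randomized range finder in the spirit of the widely used randomized SVD framework (often attributed to Halko, Martinsson, and Tropp); the sizes, oversampling parameter, and test matrix are illustrative.

```python
import numpy as np

def randomized_range_finder(A, rank, oversample=10, rng=None):
    """Orthonormal basis for an approximate range of A via a random test matrix."""
    rng = np.random.default_rng() if rng is None else rng
    G = rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ G)        # QR of the sketched columns
    return Q

rng = np.random.default_rng(0)
m, n, r = 2_000, 1_000, 20

# Synthetic matrix with rapidly decaying spectrum plus a little noise.
U = np.linalg.qr(rng.standard_normal((m, r)))[0]
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
A = (U * np.logspace(0, -3, r)) @ V.T + 1e-4 * rng.standard_normal((m, n))

Q = randomized_range_finder(A, r, rng=rng)
A_approx = Q @ (Q.T @ A)              # rank-(r + oversample) approximation
print("relative approximation error:", np.linalg.norm(A - A_approx) / np.linalg.norm(A))
```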

The first session of this talk will take place on Wednesday, September 4 from 11:00 am – 12:30 pm.

4:00 pm – 5:30 pm

Machine learning and computational statistics problems involving large datasets have proved to be a rich source of interesting and challenging optimization problems in recent years. The challenges arising from the complexity of these problems and the special requirements for their solutions have brought a wide range of optimization algorithms into play. We start this talk by surveying the application space, outlining several important analysis and learning tasks, and describing the contexts in which such problems are posed. We then describe optimization approaches that are proving to be relevant, including stochastic gradient methods, sparse optimization methods, first-order methods, coordinate descent, higher-order methods, and augmented Lagrangian methods. We also discuss parallel variants of some of these approaches.
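
For a concrete sense of the simplest of these methods, here is a hedged stochastic gradient sketch on a synthetic least-squares problem; the step-size schedule and problem sizes are illustrative, not recommendations from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50_000, 20

# Synthetic least-squares problem: minimize f(w) = (1/2n) * ||A w - b||^2.
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
f = lambda w: 0.5 * np.mean((A @ w - b) ** 2)

w = np.zeros(d)
print("initial objective:", f(w))
for t in range(1, 100_001):
    i = rng.integers(n)                      # pick one data point at random
    grad_i = (A[i] @ w - b[i]) * A[i]        # unbiased estimate of the full gradient
    w -= grad_i / (t + 50)                   # diminishing step size, roughly 1/t
print("final objective:  ", f(w))
```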

The second session of this talk will take place on Thursday, September 5 from 9:00 am – 10:30 am.

Thursday, September 5th, 2013

9:00 am – 10:30 am

Machine learning and computational statistics problems involving large datasets have proved to be a rich source of interesting and challenging optimization problems in recent years. The challenges arising from the complexity of these problems and the special requirements for their solutions have brought a wide range of optimization algorithms into play. We start this talk by surveying the application space, outlining several important analysis and learning tasks, and describing the contexts in which such problems are posed. We then describe optimization approaches that are proving to be relevant, including stochastic gradient methods, sparse optimization methods, first-order methods, coordinate descent, higher-order methods, and augmented Lagrangian methods. We also discuss parallel variants of some of these approaches.
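
Complementing the stochastic gradient sketch after the first session, here is a minimal proximal gradient (ISTA) sketch for the Lasso, one standard instance of the sparse optimization methods mentioned above; the regularization level and dimensions are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
n, d, s = 200, 1_000, 10                 # n samples, d features, s true nonzeros

X = rng.standard_normal((n, d)) / np.sqrt(n)
theta_true = np.zeros(d)
theta_true[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
y = X @ theta_true + 0.01 * rng.standard_normal(n)

# ISTA for the Lasso: minimize (1/2)||y - X theta||^2 + lam * ||theta||_1.
lam = 0.05
L = np.linalg.norm(X, 2) ** 2            # Lipschitz constant of the smooth part
theta = np.zeros(d)
for _ in range(500):
    grad = X.T @ (X @ theta - y)
    theta = soft_threshold(theta - grad / L, lam / L)

print("recovered support size:", int(np.count_nonzero(theta)))
print("estimation error:", np.linalg.norm(theta - theta_true))
```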

The first session of this talk will take place on Wednesday, September 4 from 4:00 pm – 5:30 pm.

11:00 am – 12:30 pm

The area of high-dimensional statistics concerns problems in which the ambient dimension is of the same order or substantially larger than the sample size. Although its roots are classical (dating back to Kolmogorov), it has become the focus of increasing attention in the modern era of big data. In this tutorial lecture, we survey some recent progress in different areas, including sparse estimation, covariance estimation, low-rank matrix recovery, graphical model selection, and non-parametric regression, with emphasis on the techniques used to obtain non-asymptotic results.
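
As one representative flavor of such a non-asymptotic guarantee (stated loosely; exact conditions and constants differ across references), consider sparse linear regression with an s-sparse parameter:

```latex
% Sparse linear regression: y = X\theta^* + w, with w \sim N(0, \sigma^2 I_n),
% \theta^* \in \mathbb{R}^d having at most s nonzero entries, and X \in \mathbb{R}^{n \times d}
% satisfying a restricted eigenvalue condition with constant \kappa > 0.
% With regularization \lambda_n \asymp \sigma \sqrt{\log d / n}, any Lasso solution
%   \widehat{\theta} \in \arg\min_\theta \tfrac{1}{2n}\|y - X\theta\|_2^2 + \lambda_n \|\theta\|_1
% satisfies, with high probability,
\[
  \bigl\| \widehat{\theta} - \theta^* \bigr\|_2
  \;\lesssim\;
  \frac{\sigma}{\kappa} \sqrt{\frac{s \log d}{n}} ,
\]
% a bound that holds for finite n and d rather than only in the limit.
```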

The second session of this talk will take place on Thursday, September 5 from 2:00 pm – 3:30 pm.

2:00 pm – 3:30 pm

The area of high-dimensional statistics concerns problems in which the ambient dimension is of the same order or substantially larger than the sample size. Although its roots are classical (dating back to Kolmogorov), it has become the focus of increasing attention in the modern era of big data. In this tutorial lecture, we survey some recent progress in different areas, including sparse estimation, covariance estimation, low-rank matrix recovery, graphical model selection, and non-parametric regression, with emphasis on the techniques used to obtain non-asymptotic results.

The first session of this talk will take place on Thursday, September 5 from 11:00 am – 12:30 pm.

4:00 pm – 5:30 pm

One natural way to deal with the challenge of Big Data is to make the data smaller. That is, to seek a compact (sublinear) representation of the data so that certain properties are (approximately) preserved. We can think of these as a generalization of sufficient statistics for properties of the data. The area of "streaming algorithms" seeks algorithms which can build such a summary as information is processed incrementally. An important class of streaming algorithms are sketches: carefully designed random projections of the input data that can be computed efficiently under the constraints of the streaming model. These have the attractive property that they can be easily computed in parallel over partitions of the input. They aim at optimizing a variety of properties: the size of the summary; the time required to compute the summary; the number of 'true' random bits required; and the accuracy guarantees that result.

This tutorial will present some of the powerful results that have emerged from these efforts:

  • sketches for sparse recovery over high dimensional inputs, with applications to compressed sensing and machine learning;
  • effective sketches for Euclidean space, providing fast instantiations of the Johnson-Lindenstrauss Lemma;
  • techniques for sampling from the support of a vector, and estimating the size of the support set;
  • applications of these ideas to large graph computations and linear algebra;
  • lower bounds on what is possible to summarize, arising from Communication Complexity and Information Complexity.
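
As a concrete instance of a sketch in this sense, here is a minimal Count-Min sketch in Python, a standard summary for approximate frequency queries over a stream; the hashing here is simplified for illustration.

```python
import numpy as np

class CountMinSketch:
    """Minimal Count-Min sketch: approximate frequency counts from a stream."""

    def __init__(self, width=2_000, depth=5, seed=0):
        self.width, self.depth = width, depth
        self.table = np.zeros((depth, width), dtype=np.int64)
        # Per-row hash seeds; Python's built-in hash() is used only for illustration.
        self.seeds = np.random.default_rng(seed).integers(1, 2**31, size=depth)

    def _cols(self, item):
        return [hash((int(s), item)) % self.width for s in self.seeds]

    def update(self, item, count=1):
        for row, col in enumerate(self._cols(item)):
            self.table[row, col] += count

    def query(self, item):
        # Each row can only over-estimate (collisions add), so take the minimum.
        return int(min(self.table[row, col] for row, col in enumerate(self._cols(item))))

# Stream of items containing one heavy hitter.
rng = np.random.default_rng(1)
stream = ["heavy"] * 5_000 + [f"item{int(i)}" for i in rng.integers(0, 50_000, size=50_000)]
cms = CountMinSketch()
for x in stream:
    cms.update(x)
print("estimated count of 'heavy':", cms.query("heavy"), "(true count 5000; never underestimated)")
```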

The second session of this talk will take place on Friday, September 6 from 9:00 am – 10:30 am.

Friday, September 6th, 2013

9:00 am – 10:30 am

One natural way to deal with the challenge of Big Data is to make the data smaller. That is, to seek a compact (sublinear) representation of the data so that certain properties are (approximately) preserved. We can think of these as a generalization of sufficient statistics for properties of the data. The area of "streaming algorithms" seeks algorithms which can build such a summary as information is processed incrementally. An important class of streaming algorithms are sketches: carefully designed random projections of the input data that can be computed efficiently under the constraints of the streaming model. These have the attractive property that they can be easily computed in parallel over partitions of the input. They aim at optimizing a variety of properties: the size of the summary; the time required to compute the summary; the number of 'true' random bits required; and the accuracy guarantees that result.

This tutorial will present some of the powerful results that have emerged from these efforts:

  • sketches for sparse recovery over high dimensional inputs, with applications to compressed sensing and machine learning;
  • effective sketches for Euclidean space, providing fast instantiations of the Johnson-Lindenstrauss Lemma;
  • techniques for sampling from the support of a vector, and estimating the size of the support set;
  • applications of these ideas to large graph computations and linear algebra;
  • lower bounds on what is possible to summarize, arising from Communication Complexity and Information Complexity.
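
Complementing the Count-Min example given with the first session, here is a hedged AMS-style sketch for the second frequency moment (the squared Euclidean norm of the frequency vector). For brevity it stores random ±1 labels in a dictionary; a true streaming implementation would instead use 4-wise independent hash functions to keep the space sublinear.

```python
import numpy as np

def ams_f2_estimate(stream, repetitions=200, seed=0):
    """AMS-style estimate of the second frequency moment F2 = sum_i f_i^2.

    Each repetition keeps one counter z = sum_i s(i) * f_i for a random
    +/-1 labelling s of the item space; E[z^2] = F2, and averaging over
    repetitions reduces the variance.
    """
    rng = np.random.default_rng(seed)
    signs = {}                                   # lazily drawn +/-1 labels per item
    z = np.zeros(repetitions)
    for item in stream:
        if item not in signs:
            signs[item] = rng.choice([-1.0, 1.0], size=repetitions)
        z += signs[item]                         # one streaming update per repetition
    return float(np.mean(z ** 2))

rng = np.random.default_rng(1)
stream = list(rng.integers(0, 1_000, size=20_000))
true_f2 = float(sum(c * c for c in np.bincount(stream)))
print("true F2     :", true_f2)
print("sketched F2 :", ams_f2_estimate(stream))
```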

The first session of this talk will take place on Thursday, September 5 from 4:00 pm – 5:30 pm.

11:00 am – 12:30 pm

Why is it that graphs play such a key role in essentially all applications of mathematics to real-life problems? The main reason is that there are so many situations in the natural sciences, engineering, the social sciences, and beyond where we are trying to understand a complex system that consists of interacting pairs: molecules in a biological system, two companies engaged in a business transaction, and so on. But what do you do when the underlying interactions may involve more than two parties? We are currently much less prepared to deal with such problems. It is natural to introduce into the game higher-dimensional simplicial complexes (a graph is a one-dimensional simplicial complex). This leads us into a fascinating set of problems and challenges, some of which are already resolved and many of which remain wide open. As it turns out, graphs are just one example of a basic combinatorial object that is inherently one-dimensional and whose higher-dimensional counterparts suggest themselves. I will introduce recent work on high-dimensional permutations and tournaments.
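
For concreteness, one common formalization (used, for example, in Linial and Luria's work on this topic) identifies a two-dimensional permutation with an n x n x n array of 0s and 1s having exactly one 1 on every axis-parallel line, which is the same thing as an order-n Latin square; a permutation matrix is the one-dimensional case. The small Python check below only illustrates this definition.

```python
import numpy as np

def is_two_dimensional_permutation(L):
    """Check that an n x n array is a Latin square on the symbols 0..n-1.

    Setting A[i, j, k] = 1 exactly when L[i, j] == k gives an n x n x n
    0/1 array with exactly one 1 on every axis-parallel line, i.e. a
    two-dimensional permutation in the sense described above.
    """
    n = L.shape[0]
    symbols = set(range(n))
    return (all(set(L[i, :]) == symbols for i in range(n)) and
            all(set(L[:, j]) == symbols for j in range(n)))

# A cyclic Latin square: L[i, j] = (i + j) mod n.
n = 5
L = np.add.outer(np.arange(n), np.arange(n)) % n
print(is_two_dimensional_permutation(L))   # True
```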

These lectures are based on joint papers with several coauthors.

The second session of this talk will take place on Friday, September 6 from 2:00 pm – 3:30 pm.

2:00 pm – 3:30 pm

Why is it that graphs play such a key role in essentially all applications of mathematics to real-life problems? The main reason is that there are so many situations in the natural sciences, engineering, the social sciences, and beyond where we are trying to understand a complex system that consists of interacting pairs: molecules in a biological system, two companies engaged in a business transaction, and so on. But what do you do when the underlying interactions may involve more than two parties? We are currently much less prepared to deal with such problems. It is natural to introduce into the game higher-dimensional simplicial complexes (a graph is a one-dimensional simplicial complex). This leads us into a fascinating set of problems and challenges, some of which are already resolved and many of which remain wide open. As it turns out, graphs are just one example of a basic combinatorial object that is inherently one-dimensional and whose higher-dimensional counterparts suggest themselves. I will introduce recent work on high-dimensional permutations and tournaments.

These lectures are based on joint papers with several coauthors.

The first session of this talk will take place on Friday, September 6 from 11:00 am – 12:30 pm.

4:00 pm – 5:30 pm

There are two ways randomness enters the analysis of Big Data: either the input data or the algorithm may "toss coins" (or even both). The former involves stochastic models of the data; the goal in theory is to obtain provable algorithms that work in the worst case under such models. The talk will recap mixture models and their use in clustering, planted dense graphs, and other problems and algorithms that work under such models. Coin tossing by the algorithm will be touched upon, depending on the coverage of this topic in other lectures.
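
To make the mixture-model setting concrete, here is a toy NumPy sketch (an illustration of the general spectral idea, not the specific algorithms of the talk): data drawn from a mixture of two well-separated spherical Gaussians in high dimension are projected onto a top singular direction of the centered data matrix before clustering.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sep = 1_000, 200, 6.0

# Two spherical Gaussians in R^d whose means differ along one random direction.
direction = rng.standard_normal(d)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, d)) + np.outer(2 * labels - 1, sep / 2 * direction)

# Spectral heuristic: project onto the top right singular vector of the
# centered data and split at zero; the raw coordinates are too noisy for
# the cluster structure to be read off directly.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[0]
pred = (scores > 0).astype(int)

accuracy = max(np.mean(pred == labels), np.mean(pred != labels))  # label names are arbitrary
print("clustering accuracy after spectral projection:", accuracy)
```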