Two Concise Representations for High-Dimensional Visual Data
Succinct data models have been widely used for imaging tasks such as recognition, denoising and superresolution. Their success depends crucially on the ability to learn models that accurately reflect the physical and statistical structure of the processes that generate the data. We describe two situations in which it is possible to do this, with provable performance guarantees. In the context of visual object recognition, we describe concise models which provably identify objects under arbitrary distant illumination. Our results show how to build a model that provably accepts any image that is sufficiently close to an image of the object of interest, and reject any that is not. We prove bounds that relate the complexity of the representation to the complexity of the object to be recognized, as measured by its defect from convexity. Our proofs and practical algorithms leverage t he the structure in the data—in particular, the existence of sparse and low-rank decompositions for the direct illumination operator. For other visual data problems such as denoising or superresolution, the ability to learn sparse models from data appears to be crucial to achieving good practical performance. For this problem, there is well-developed practice, but theoretical guarantees for the efficient algorithms are difficult to obtain, due to the bilinear nature of the problem. Here, again, we describe one situation in which mathematical guarantees are possible—namely, if the target dictionary is square and nonsingular, and the coefficients satisfy an appropriate random model. In this case, a simple heuristic based on convex programming guarantees to efficiently recover the target coefficients and dictionary.
This talk describes joint work with Yuqian Zhang (Columbia University), Cun Mu (Columbia University), Han-Wen Kuo (Columbia University), Dan Spielman (Yale University) and Huan Wang (Yahoo! Research).