Succinct representations of data, especially high-dimensional data, are paramount for promptly extracting actionable information and for reducing the storage and communication requirements of sharing and archiving data. A fundamental observation, first made in the image processing community, is that a more efficient signal representation is possible if one uses an over-complete dictionary learned from the signals themselves rather than a fixed basis. Given a large number of signal samples, the dictionary learning problem aims to construct a dictionary whose columns, also known as atoms, can be linearly combined to represent the data efficiently. In this context, a more efficient data representation is one that uses fewer atoms to achieve the desired signal reconstruction quality.
Usually, the dictionary is learned by fitting a linear model to training data via a regularized least-squares criterion, where the regularization encourages sparsity in the vectors of regression coefficients. Naturally, the data samples used for learning the dictionary are assumed to adhere to the nominal process that generated the data. However, in many cases it is not possible to screen all data samples to guarantee that no datum behaves as an outlier, that is, deviates significantly from the remaining data. Since least-squares criteria are sensitive to outliers, a dictionary learning method that is robust to them is needed.
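For concreteness, a regularized least-squares criterion of the kind described above is commonly written as follows. The notation is assumed here for illustration only: $y_n \in \mathbb{R}^M$ denotes the $n$-th of $N$ training samples, $D = [d_1, \ldots, d_Q] \in \mathbb{R}^{M \times Q}$ the dictionary with atoms $d_q$, $s_n \in \mathbb{R}^Q$ the vector of regression coefficients for $y_n$, and $\lambda > 0$ a tuning parameter. The $\ell_1$-norm penalty and the unit-norm bound on the atoms are one common choice, not necessarily the one adopted in this work.

$$
\min_{D,\,\{s_n\}_{n=1}^{N}} \; \sum_{n=1}^{N} \left( \frac{1}{2} \left\| y_n - D s_n \right\|_2^2 + \lambda \left\| s_n \right\|_1 \right)
\quad \text{subject to} \quad \left\| d_q \right\|_2 \le 1, \; q = 1, \ldots, Q.
$$

In a criterion of this form every sample enters through a squared residual, so a single datum with a large residual can dominate the fit. The bound on the atom norms removes the scaling ambiguity between $D$ and the coefficients $s_n$.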