Sparsely populated data sets are utilized in numerous technical fields. For example, signals representing audio, images and video can be well-approximated by a small subset of elements from a dictionary. Neurophysiological data obtained from the brain cortex has shown that the human brain in effect performs sparse coding of stimuli in a parallel manner using a large number of interconnected neurons. In this context, a sparse code refers to a representation where a relatively small number of neurons are active, with the majority of neurons in the population being inactive or showing low activity. In more general contexts, a sparse code can be viewed as the representation of a signal by a small subset of elements, taken from a typically large dictionary of elements. The dictionary itself describes a domain in which the signal is interpreted, such as the frequency domain. It has been discovered that many physical signals can be represented by sparse codes, once interpreted in a suitable domain. Provided the domain is well-chosen, the representation of signals by sparse codes can greatly simplify many tasks of signal processing and transmission and/or make them more efficient. As a result, sparse coding has been used in recent years as a strong tool for the processing of image, video, and sound; see, for example, U.S. Pat. No. 7,783,459, which is incorporated herein by reference, and the following articles: R. Pichevar, H. Najaf-Zadeh, and L. Thibault, “A biologically-inspired low-bit-rate universal audio coder,” in Audio Eng. Society Conv., Austria, 2007; R. Pichevar and H. Najaf-Zadeh, “Pattern extraction in sparse representations with application to audio coding,” in European Signal Processing Conf., Glasgow, UK, 2009; L. Perrinet, M. Samuelides, and S. Thorpe, “Coding static natural images using spiking event times: do neurons cooperate?” IEEE Transactions on Neural Networks, vol. 15(5), pp. 1164-1175, 2004; K. Herrity, A. Gilbert, and J. Tropp, “Sparse approximation via iterative thresholding,” in IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, 2006.
For example, image data that has been transformed by successive applications of wavelet transforms tends to have large portions occupied by zero and near-zero values, especially if the data is subjected to a data quantization step prior to encoding. As another example, the '459 patent teaches a neural network that implements a Local Competitive Algorithm (LCA) to sparsely code images using Gabor kernels as dictionary elements. U.S. patent application Ser. No. 13/188,915, filed Jul. 22, 2011, which is assigned to the assignee of the present application and is incorporated herein by reference, discloses a Perceptual LCA (PLCA) for perceptual sparse coding of time-dependent audio signals using Gammatone or Gammachirp dictionary elements.
A sparse data set can be represented as a data array or a vector wherein a large proportion of data are zeros, with non-zero values sparsely distributed across the data set. As an illustration, audio coding is based on a sequence of steps that is conveniently described using mathematical notation as follows. Given a vector n representing a digitized audio signal, the first step is to represent n in a suitable transform domain, denoted by a transformation function ψ(.), as follows:

$$x = \psi(n) \qquad (1)$$

where a resulting length-L vector x often contains only M<<L significant, i.e. non-zero, elements, where the level of significance is determined by either perceptual or mathematical criteria. Generally, x can be multidimensional, but is assumed to be vectorized in the context of this specification. The property that x only contains M<<L non-negligible elements is referred to as sparseness of the vector x, and is at the foundation of transform coding. Generally, in the context of the present application we will be referring to a dataset or a vector representing it as sparse if M is less than L.
In a typical application, encoding and storing the signal n includes the following steps: i) transforming n into x; ii) perceptually or otherwise thresholding the vector x to retain only M non-zero values, yielding a sparse vector $\hat{x}$; iii) quantizing the M non-zero values of $\hat{x}$; and iv) encoding them together with their positions in the vector or data set.
At a receiver, or at playback in the context of audio, the procedure amounts to decoding the quantized values and positions, placing them into a vector of length L, and then applying the inverse of the transformation ψ(.) to recover an approximation $\hat{n}$ of n.
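The encoding and decoding steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the coder of any cited reference: an orthonormal DCT stands in for the transformation ψ(.), the thresholding simply keeps the M largest-magnitude coefficients rather than applying a perceptual criterion, the quantizer is uniform with a step size `step`, and the function names `dct`, `idct`, `encode` and `decode` are hypothetical.

```python
import math

def dct(n):
    """Orthonormal DCT-II, used here only as a stand-in for psi(.)."""
    L = len(n)
    x = []
    for k in range(L):
        s = sum(n[t] * math.cos(math.pi * (t + 0.5) * k / L) for t in range(L))
        scale = math.sqrt(1.0 / L) if k == 0 else math.sqrt(2.0 / L)
        x.append(scale * s)
    return x

def idct(x):
    """Inverse of dct() above (orthonormal DCT-III)."""
    L = len(x)
    n = []
    for t in range(L):
        s = x[0] * math.sqrt(1.0 / L)
        s += sum(x[k] * math.sqrt(2.0 / L)
                 * math.cos(math.pi * (t + 0.5) * k / L) for k in range(1, L))
        n.append(s)
    return n

def encode(n, M, step):
    """Transform, keep the M largest coefficients, quantize them uniformly.

    Returns the positions of the retained coefficients and their integer
    quantization levels; both must be stored or transmitted.
    """
    x = dct(n)
    by_magnitude = sorted(range(len(x)), key=lambda k: abs(x[k]), reverse=True)
    positions = sorted(by_magnitude[:M])
    levels = [round(x[k] / step) for k in positions]
    return positions, levels

def decode(positions, levels, L, step):
    """Place dequantized values in a length-L vector and invert the transform."""
    x_hat = [0.0] * L
    for k, q in zip(positions, levels):
        x_hat[k] = q * step
    return idct(x_hat)

# Usage: a signal built from two DCT basis functions is sparse in this domain.
L = 32
n = [3.0 * math.cos(math.pi * (t + 0.5) * 2 / L)
     + 1.5 * math.cos(math.pi * (t + 0.5) * 7 / L) for t in range(L)]
positions, levels = encode(n, M=2, step=0.5)
n_hat = decode(positions, levels, L, step=0.5)
print(positions, levels, max(abs(a - b) for a, b in zip(n, n_hat)))
```

Note that the decoder needs the positions as well as the quantized values, which is the source of the overhead discussed next.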
One drawback inherent to handling and transmitting sparse vectors, regardless of how they were obtained, is the potentially large overhead required to encode the positions of the active elements in $\hat{x}$; see, for example, R. G. Baraniuk, “Compressive Sensing,” Lecture Notes in IEEE Signal Processing Magazine, Vol. 24, No. 4, pp. 118-120, July 2007, which is incorporated herein by reference. A straightforward approach to encoding the M positions from a position set of length L would require a binary vector of length $\log_2 \binom{L}{M} \le L$.
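To give a feel for the size of this position overhead, the following sketch compares three ways of signalling M active positions out of L. The function name `position_bits` and the two comparison schemes (a one-bit-per-position bitmap and a list of fixed-width indices) are assumptions introduced here for illustration; only the combinatorial count, log2 of the binomial coefficient of L and M, comes from the text above, rounded up to a whole number of bits.

```python
import math

def position_bits(L, M):
    """Bits needed to signal which M of L positions are active.

    bitmap        : one flag bit per position (L bits)
    index_list    : M indices of ceil(log2 L) bits each
    combinatorial : ceil(log2 C(L, M)), enough to index every possible
                    choice of M positions out of L
    """
    return {
        "bitmap": L,
        "index_list": M * math.ceil(math.log2(L)),
        "combinatorial": math.ceil(math.log2(math.comb(L, M))),
    }

# Usage: a length-1024 vector with 32 retained coefficients.
print(position_bits(1024, 32))
```

Even the combinatorial count, which never exceeds L, is a cost paid on top of the bits spent on the M quantized values themselves.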
Thus, it is an object of the present invention to address this deficiency by providing a compressive coder for sparse data sets that eliminates the need to transmit or store at least some of the sparse data positions, thereby reducing the overhead.