According to Nyquist theory, a signal x(t) whose signal energy is supported on the frequency interval [−B,B] may be reconstructed from samples {x(nTS)} of the signal x(t), provided the rate fS=1/TS at which the samples are captured is sufficiently high, i.e., provided that fS is greater than 2B. Similarly, for a signal whose signal energy is supported on the frequency interval [A,B], the signal may be reconstructed from samples captured with sample rate greater than B−A. A fundamental problem with any attempt to capture a signal x(t) according to Nyquist theory is the large number of samples that are generated, especially when B (or B−A) is large. The large number of samples taxes memory resources and the capacity of transmission channels.
Nyquist theory is not limited to functions of time. Indeed, Nyquist theory applies more generally to any function of one or more real variables. For example, Nyquist theory applies to functions of two spatial variables such as images, to functions of time and two spatial variables such as video, and to the functions used in multispectral imaging, hyperspectral imaging, medical imaging and a wide variety of other applications. In the case of an image I(x,y) that depends on spatial variables x and y, the image may be reconstructed from samples of the image, provided the samples are captured with sufficiently high spatial density. For example, given samples {I(nΔx,mΔy)} captured along a rectangular grid, the horizontal and vertical densities 1/Δx and 1/Δy should be respectively greater than 2Bx and 2By, where Bx and By are the highest x and y spatial frequencies occurring in the image I(x,y). The same problem of overwhelming data volume is experienced when attempting to capture an image according to Nyquist theory. The modern theory of compressive sensing is directed to such problems.
Compressive sensing relies on the observation that many signals (e.g., images or video sequences) of practical interest are not only band-limited but also sparse or approximately sparse when represented using an appropriate choice of transformation, for example, a transformation such as a Fourier transform, a wavelet transform or a discrete cosine transform (DCT). A signal vector v is said to be K-sparse with respect to a given transformation T when the transformation of the signal vector, Tv, has no more than K non-zero coefficients. A signal vector v is said to be sparse with respect to a given transformation T when it is K-sparse with respect to that transformation for some integer K much smaller than the number L of components in the transformation vector Tv.
A signal vector v is said to be approximately K-sparse with respect to a given transformation T when the coefficients of the transformation vector, Tv, are dominated by the K largest coefficients (i.e., largest in the sense of magnitude or absolute value). In other words, if the K largest coefficients account for a high percentage of the energy in the entire set of coefficients, then the signal vector v is approximately K-sparse with respect to transformation T. A signal vector v is said to be approximately sparse with respect to a given transformation T when it is approximately K-sparse with respect to the transformation T for some integer K much less than the number L of components in the transformation vector Tv.
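The two definitions above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the array `coeffs` stands in for the transformation vector Tv, the function names are hypothetical, and the 99% energy threshold is an arbitrary choice, since the text does not quantify "a high percentage."

```python
import numpy as np

def energy_fraction_of_top_k(coeffs, k):
    """Fraction of the total energy carried by the k largest-magnitude coefficients."""
    energy = np.abs(np.asarray(coeffs, dtype=float)) ** 2
    total = energy.sum()
    if total == 0.0:
        return 1.0  # an all-zero vector is trivially dominated by any k coefficients
    top_k = np.sort(energy)[::-1][:k]
    return float(top_k.sum() / total)

def is_k_sparse(coeffs, k):
    """Exact K-sparsity: no more than k non-zero coefficients."""
    return int(np.count_nonzero(coeffs)) <= k

def is_approximately_k_sparse(coeffs, k, threshold=0.99):
    """Approximate K-sparsity; the threshold is an assumed value, not from the text."""
    return energy_fraction_of_top_k(coeffs, k) >= threshold

# Hypothetical transform coefficients Tv: two dominant entries plus small residue.
coeffs = np.array([10.0, 0.0, -8.0, 0.1, 0.0, 0.05, 0.0, -0.02])
exactly_2_sparse = is_k_sparse(coeffs, 2)
approximately_2_sparse = is_approximately_k_sparse(coeffs, 2)
```

Here the vector has five non-zero coefficients, so it is not 2-sparse, but the two largest coefficients carry nearly all of the energy, so it is approximately 2-sparse under the assumed threshold.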
Given a sensing device that captures images with N samples per image and in conformity to the Nyquist condition on spatial rates, it is often the case that there exists some transformation and some integer K very much smaller than N such that the transform of each captured image will be approximately K-sparse. The set of K dominant coefficients may vary from one image to the next. Furthermore, the value of K and the selection of the transformation may vary from one context (e.g., imaging application) to the next. Examples of typical transforms that might work in different contexts include the Fourier transform, the wavelet transform, the DCT, the Gabor transform, etc.
Compressive sensing specifies a way of operating on the N samples of an image so as to generate a much smaller set of samples from which the N samples may be reconstructed, given knowledge of the transform under which the image is sparse (or approximately sparse). In particular, compressive sensing invites one to think of the N samples as a vector v in an N-dimensional space and to imagine projecting the vector v onto each vector in a series of M vectors {R(i): i=1, 2, . . . , M} in the N-dimensional space, where M is larger than K but still much smaller than N. Each projection gives a corresponding real number S(i), e.g., according to the expression S(i)=<v,R(i)>, where the notation <v,R(i)> represents the inner product (or dot product) of the vector v and the vector R(i). Thus, the series of M projections gives a vector U including M real numbers: U_i=S(i). Compressive sensing theory further prescribes methods for reconstructing (or estimating) the vector v of N samples from the vector U of M real numbers and the series of measurement vectors {R(i): i=1, 2, . . . , M}. For example, according to one method, one should determine the vector x that has the smallest length (in the sense of the L1 norm) subject to the condition that ΦTx=U, where Φ is a matrix whose rows are the transposes of the vectors R(i), and where T is the transformation under which the image is K-sparse or approximately K-sparse.
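The measurement step described above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the sizes N and M, the random test vector, and the choice of Gaussian measurement vectors are all assumptions made for the example, and the L1 reconstruction step is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 64, 16                      # assumed sizes: N samples, M << N measurements
v = rng.random(N)                  # stand-in for the vector of N image samples
R = rng.standard_normal((M, N))    # assumed measurement vectors R(i), one per row

# One projection at a time: S(i) = <v, R(i)>.
S = np.array([np.dot(v, R[i]) for i in range(M)])

# Equivalent matrix form: Phi has the transposes of the R(i) as its rows,
# so the whole measurement vector is U = Phi v.
Phi = R
U = Phi @ v
```

Stacking the vectors R(i) as rows of Φ turns the M separate inner products into a single matrix-vector product, which is the form used in the reconstruction condition.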
Compressive sensing is important because, among other reasons, it allows reconstruction of an image based on M measurements instead of the much larger number of measurements N recommended by Nyquist theory. Thus, for example, a compressive sensing camera would be able to capture a significantly larger number of images for a given size of image store, and/or transmit a significantly larger number of images per unit time through a communication channel of given capacity.
As mentioned above, compressive sensing operates by projecting the image vector v onto a series of M vectors. As discussed in U.S. Pat. No. 8,199,244, issued Jun. 12, 2012 (invented by Baraniuk et al.) and illustrated in FIG. 1, an imaging device (e.g., camera) may be configured to take advantage of the compressive-sensing paradigm by using a digital micromirror device (DMD) 40. An incident light field 10 passes through a lens 20 and then interacts with the DMD 40. The DMD includes a two-dimensional array of micromirrors, each of which is configured to independently and controllably switch between two orientation states. Each micromirror reflects a corresponding portion of the incident light field based on its instantaneous orientation. Any micromirrors in a first of the two orientation states will reflect their corresponding light portions so that they pass through lens 50. Any micromirrors in a second of the two orientation states will reflect their corresponding light portions away from lens 50. Lens 50 serves to concentrate the light portions from micromirrors in the first orientation state onto a photodiode (or photodetector) situated at location 60. Thus, the photodiode generates a signal whose amplitude at any given time represents a sum of the intensities of the light portions from the micromirrors in the first orientation state.
The compressive sensing is implemented by driving the orientations of the micromirrors through a series of spatial patterns. Each spatial pattern specifies an orientation state for each of the micromirrors. The output signal of the photodiode is digitized by an A/D converter 70. In this fashion, the imaging device is able to capture a series of measurements {S(i)} that represent inner products (dot products) between the incident light field and the series of spatial patterns without first acquiring the incident light field as a pixelized digital image. The incident light field corresponds to the vector v of the discussion above, and the spatial patterns correspond to the vectors R(i) of the discussion above.
The incident light field may be modeled by a function I(x,y,t) of two spatial variables and time. Assuming for the sake of discussion that the DMD comprises a rectangular array, the DMD implements a spatial modulation of the incident light field so that the light field leaving the DMD in the direction of the lens 50 might be modeled by {I(nΔx,mΔy,t)*M(n,m,t)}, where m and n are integer indices, and where I(nΔx,mΔy,t) represents the portion of the light field that is incident upon the (n,m)th mirror of the DMD at time t. The function M(n,m,t) represents the orientation of the (n,m)th mirror of the DMD at time t. At sampling times, the function M(n,m,t) equals one or zero, depending on the state of the digital control signal that controls the (n,m)th mirror. The condition M(n,m,t)=1 corresponds to the orientation state that reflects onto the path that leads to the lens 50. The condition M(n,m,t)=0 corresponds to the orientation state that reflects away from the lens 50.
The lens 50 concentrates the spatially-modulated light field {I(nΔx,mΔy,t)*M(n,m,t)} onto a light sensitive surface of the photodiode. Thus, the lens and the photodiode together implement a spatial summation of the light portions in the spatially-modulated light field:
$$S(t) = \sum_{n,m} I(n\Delta x,\, m\Delta y,\, t)\, M(n,m,t).$$
Signal S(t) may be interpreted as the intensity at time t of the concentrated spot of light impinging upon the light sensing surface of the photodiode. The A/D converter captures measurements of S(t). In this fashion, the compressive sensing camera optically computes an inner product of the incident light field with each spatial pattern imposed on the mirrors. The multiplication portion of the inner product is implemented by the mirrors of the DMD. The summation portion of the inner product is implemented by the concentrating action of the lens and also the integrating action of the photodiode.
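The optical inner product described above can be imitated numerically. The following Python sketch is a simplified model under several assumptions: the light field is frozen at one instant and represented by a small random array, the mirror patterns are random 0/1 masks, and the function name `photodiode_reading` is hypothetical. It ignores noise, optics, and A/D quantization.

```python
import numpy as np

rng = np.random.default_rng(1)

rows, cols, num_patterns = 8, 8, 5        # assumed mirror-array size and pattern count
light = rng.random((rows, cols))          # stand-in for I(n*dx, m*dy) at a fixed time
masks = rng.integers(0, 2, size=(num_patterns, rows, cols))  # M(n, m) in {0, 1}

def photodiode_reading(light, mask):
    """Model one measurement: mirrors multiply, lens and photodiode sum."""
    modulated = light * mask              # multiplication step (DMD mirrors)
    return float(modulated.sum())         # summation step (lens plus photodiode)

# One reading per spatial pattern, as the A/D converter would capture them.
S = np.array([photodiode_reading(light, mask) for mask in masks])
```

Flattening each mask into a row vector and the light field into a column vector gives the same numbers as a matrix-vector product, which is how the masks correspond to the measurement vectors R(i).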
The image carried in the incident light field may be reconstructed by executing any of a wide variety of reconstruction algorithms on the intensity measurements of the signal S(t). One common feature of many reconstruction algorithms is that they generate successive estimates of the image and, for each of the image estimates, apply the transform Φ to the image estimate, i.e., the transform specified by the sensing matrix Φ. Thus, to improve the efficiency of the reconstruction algorithms, improved mechanisms are needed for computing transforms based on the sensing matrices useful in compressive sensing.
In the context of compressive sensing, it is generally desirable for the spatial patterns applied by the DMD to be incoherent with respect to the sparsity basis (i.e., the basis in which the image is sparse). One basic way to achieve incoherence is to generate rows of a structured matrix M_ST (such as a Hadamard transform or discrete Fourier transform) and then apply a randomizing permutation P_Rand of fine resolution to each of the generated rows. Thus, the spatial patterns are effectively drawn from rows of the product matrix M_ST P_Rand. However, such random permutations may be expensive to implement in hardware, especially when the size of the image (the number of columns in the matrix M_ST) is large. Thus, to improve the efficiency of compressive sensing, improved mechanisms for randomizing the spatial patterns applied by the DMD (or other signal modulating device) are needed.
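The basic randomization approach described above, a structured matrix followed by a random permutation of each row's entries, can be sketched in Python. This is an illustrative sketch only: it assumes a Sylvester-type Hadamard matrix as the structured matrix, a small size N = 16, and a seeded NumPy generator for the permutation. The function name `sylvester_hadamard` is hypothetical.

```python
import numpy as np

def sylvester_hadamard(n):
    """Build the n x n Hadamard matrix by Sylvester's doubling construction."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(0)

N = 16                               # assumed number of pixels (columns)
M_ST = sylvester_hadamard(N)         # structured matrix
perm = rng.permutation(N)            # random reordering of the pixel positions

# Permuting the columns of M_ST is the same as multiplying by a permutation
# matrix P_Rand on the right; each row of the result is one candidate pattern.
patterns = M_ST[:, perm]
P_Rand = np.eye(N, dtype=int)[:, perm]
```

Because a column permutation only reorders the entries within each row, the permuted rows remain mutually orthogonal. The cost noted in the text comes from applying a fine-grained permutation of length N to every row, which is cheap in this small software example but not necessarily in hardware.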
Furthermore, because the objects being imaged by the compressive sensing camera may be moving and/or lighting conditions may be dynamic, it may be necessary for the DMD to apply the sequence of spatial patterns to the incident light field over a short period of time, to effectively capture a snapshot of the incident light field. Thus, mechanisms for efficiently generating rows of the sensing matrix Φ are needed. Furthermore, the sensing matrix Φ should have a structure that admits fast generation of rows.
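As one illustration of what "a structure that admits fast generation of rows" can mean, the following Python sketch generates a single row of a Sylvester-type Hadamard matrix directly from its index, without building the full matrix. This is an assumed example of such a structure, not a method taken from the text, and the function name `hadamard_row` is hypothetical. It relies on the known property that entry (i, j) of the Sylvester Hadamard matrix equals (−1) raised to the number of 1 bits in the bitwise AND of i and j.

```python
import numpy as np

def hadamard_row(i, n):
    """Row i of the n x n Sylvester Hadamard matrix, computed from bit parities."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if not 0 <= i < n:
        raise ValueError("row index out of range")
    bits = i & np.arange(n)                 # bitwise AND of i with every column index
    parity = np.zeros(n, dtype=bits.dtype)
    while np.any(bits):                     # accumulate the parity of the set bits
        parity ^= bits & 1
        bits >>= 1
    return 1 - 2 * parity                   # parity 0 -> +1, parity 1 -> -1

row = hadamard_row(3, 4)
```

Each row costs on the order of N log N bit operations and needs no storage for the other rows, so rows can be produced on demand as patterns are applied.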