Throughout most of the 20th century, cameras captured images on film through a photochemical process, producing pictures that represent the original scene observed by the camera. Toward the latter part of the 20th century, solid-state image sensors in the form of CCD (charge-coupled device) and CMOS (complementary metal-oxide-semiconductor) image sensors took the place of film, enabling today's ubiquitous digital camera. Digital cameras do not require film and have the advantage of capturing images electronically as digital data, which may be stored easily for later editing, processing, and printing. In some applications, the digital imagery may be sent to a computer for real-time processing in order to generate an output. These latter configurations may be referred to as cameras, machine vision systems, or vision sensors.
FIG. 1 depicts an exemplary generic digital camera 101. A lens 103 focuses light 105 from the environment 116 surrounding the camera 101 onto the focal plane 107 of an image sensor 109. The lens 103 is shown in the figure as a single element lens, but alternatively it may be a pinhole or it may comprise a set of lens elements and/or reflective elements, e.g. mirrors. In all such design configurations, the lens 103 (or other optics) is positioned a distance above the focal plane 107, forming cavity 117, so that light 105 is focused to form an image on the focal plane 107. The lens 103 may be fixed at one location a predetermined distance above the focal plane 107, or the lens 103 may be configured so that it may be moved closer to or further from the focal plane 107 to bring the image into focus. An opaque enclosure 111 supports the lens 103 and ensures that the only light striking the image sensor 109 is light coming through the lens 103. The image sensor 109 may be electronically interfaced with the rest of the camera electronics via wire bonds 113 or another connection method. A processor 115, typically a microcontroller, a DSP (digital signal processor) chip, or other digital circuit, extracts a digital image from the image sensor 109 based on the image formed on the focal plane 107. The digital image may be processed, stored, and/or transmitted as an output, depending on the configuration of the camera 101 and its application.
In earlier cameras the image sensor 109 would be replaced by film, which as described above captures images photochemically. The photochemical process of “developing the film” may thus conceptually replace the function performed by the image sensor 109 and the processor 115.
While the exemplary generic digital camera 101 shown in FIG. 1 has the advantage of relative simplicity and maturity, it has several significant disadvantages. First, the enclosure 111 and the mechanism for mounting the lens 103 need to be rigid and constructed to hold the lens 103 at the desired location as well as to form cavity 117. This potentially results in a bulky and heavy structure. Second, there are significant trade-offs between camera specifications such as F-stop, focal length, and field of view. These trade-offs are such that constructing a camera with both a small F-stop (to gather large amounts of light) and a high resolution requires a lens design having multiple large lens elements disposed in a vertically stacked configuration and a heavy structure to support them, making the camera bulky and expensive to manufacture. An additional requirement that the camera have a field of view approaching 180 degrees further increases the complexity of the lens design.
FIG. 2 depicts a prior art camera 201 optimized for sensing visual motion or optical flow in one direction. This camera 201 is described at length in U.S. Pat. No. 6,194,695 incorporated herein by reference in its entirety. This camera 201 comprises an iris 203, an optional lens 205, cavity 219, a focal plane chip 207, an analog to digital converter (ADC) 209, and a digital computer 211 which generates an output 217. The iris 203 and lens 205 focus light onto the focal plane 207 in a manner that preserves visual information along one axis. The lens 205, at a predetermined distance from the focal plane 207 forming cavity 219, may be placed “out of focus” with respect to the focal plane chip 207 to optically smooth the image formed on the focal plane chip 207. The focal plane chip 207 generates photoreceptor signals 213, and the digital computer 211 contains an algorithm 215 that acquires these photoreceptor signals 213 and processes them to compute a linear optical flow measurement. This measurement forms the output 217.
The camera 201 of FIG. 2 may be simplified by removing the lens 205. In this manner, the iris 203 is effectively an elongated pinhole, which causes individual photoreceptor circuits on the focal plane chip 207 to have a rectangular response to the visual field. This causes the image focused on the focal plane chip 207 to be smoothed along the long dimension of the iris 203, which preserves information in the perpendicular direction. The photoreceptor circuits may also be shaped as elongated rectangles oriented in the same direction as the iris to increase light sensitivity, as shown in FIGS. 4A and 4B of the aforementioned U.S. Pat. No. 6,194,695.
The computer 211 generates an optical flow measurement based on the photoreceptor signals 213 and sends the optical flow measurement to the output 217. Optical flow represents the relative motion between a camera and other objects in the environment. Algorithms for measuring optical flow between two successive images are well known in the art. The output of such algorithms may be a measurement of, for example, how many pixels or fractions of a pixel the texture appeared to move between two sequential images. Sample optical flow algorithms include Srinivasan's Image Interpolation Algorithm and the Lucas Kanade algorithm, both of which are referenced below.
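As a rough illustration of what a linear optical flow measurement looks like, the following is a minimal one-dimensional gradient-based sketch in the spirit of the Lucas Kanade approach. It is not the algorithm 215 of the referenced patent. The function name `estimate_shift_1d`, the sign convention, and the synthetic ramp data are assumptions made purely for illustration, and the estimate is only meaningful for displacements well under one pixel.

```python
def estimate_shift_1d(prev, curr):
    """Estimate the displacement, in pixels, between two 1-D intensity rows.

    Assumes brightness constancy and a small shift, so that
    curr[i] ~= prev[i] - d * Ix[i], and solves for d by least squares.
    A positive result means the texture moved toward higher pixel indices.
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(prev) - 1):
        ix = (prev[i + 1] - prev[i - 1]) / 2.0  # central-difference spatial gradient
        it = curr[i] - prev[i]                  # temporal difference between frames
        num += ix * it
        den += ix * ix
    if den == 0.0:
        raise ValueError("no texture: spatial gradient is zero everywhere")
    return -num / den


# Hypothetical test data: a linear intensity ramp displaced by a quarter pixel.
ramp_prev = [2.0 * i for i in range(16)]
ramp_curr = [2.0 * (i - 0.25) for i in range(16)]
shift = estimate_shift_1d(ramp_prev, ramp_curr)
print(shift)
```

The result is expressed in fractions of a pixel, which matches the kind of output described above. A practical implementation would typically add pre-filtering and handle larger displacements.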
The camera of FIG. 2 has the same disadvantages as does the camera of FIG. 1 described above.
It is desirable to implement cameras and vision sensors that overcome some of the aforementioned disadvantages and limitations. In particular, it is desirable to have a camera structure that is able to acquire a high resolution image over a large field of view but has a shape that has a low profile and is effectively flat.
FIG. 3 depicts a prior art “TOMBO” camera 301 described in U.S. Pat. No. 7,009,652, which is incorporated herein by reference in its entirety. The acronym TOMBO stands for “Thin Observation Module by Bound Optics”. The camera 301 comprises a lens array 303, a restricting structure 311, and an image sensor 305. The image sensor 305 contains a pixel array 313 located at the focal plane of the lenses of lens array 303. Instead of using a single lens as shown in FIG. 1, the camera 301 of FIG. 3 utilizes lens array 303 to form an array of images on the pixel array 313. A single aperture unit 307 comprises a single lens and its corresponding set of pixels (which may be referred to as its subimage 309) on the image sensor 305, and is similar to the camera 101 of FIG. 1. Restricting structure 311 isolates adjacent aperture units and prevents light from crossing over between adjacent single aperture units. The restricting structure 311 has a predetermined thickness and forms a cavity between each individual lens element and the corresponding portion of the pixel array 313 that captures a subimage. The image sensor 305 grabs the resulting subimages, which will appear as a tiling of low resolution images generated from the visual field. This tiling of images obtained by the image sensor 305 may be referred to as a “raw image” for purposes of discussion.
A processor, not shown, contains an algorithm that extracts the subimages from the pixel array 313 and reconstructs a high resolution image of the visual field. The algorithm exploits the fact that the individual subimages generated by each aperture unit are similar but not exactly the same, since each lens may be laterally offset from the pixel array 313 on the focal plane by a different sub-pixel amount. The algorithm proposed by Tanida et al. models the camera 301 as
$$y = Hx \qquad (1)$$
where x is a vector that represents the visual field, y is a vector that represents the raw image captured by the pixel array, and H is a matrix that models the transfer function implemented by the camera 301. The vector x may be an ideal high resolution image that would be captured by the conventional camera structure shown in FIG. 1. The purpose of the algorithm is thus to reconstruct x from raw image y. The reconstruction proceeds by determining H through a combination of analytical and empirical analysis, obtaining a pseudoinverse matrix H* of the transfer function H, and computing
$$x = H^{*} y \qquad (2)$$
to reconstruct the high resolution image x representing the visual field from y.
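The reconstruction in Equations (1) and (2) can be sketched numerically as follows. This is a toy example under stated assumptions, not the implementation of Tanida et al.: the 4-by-4 matrix `H`, the scene `x_true`, and the pixel weights are hypothetical values chosen to mimic two low resolution one-dimensional subimages that view the same scene with different sub-pixel offsets. NumPy's `numpy.linalg.pinv` is used to compute the pseudoinverse.

```python
import numpy as np

# Hypothetical transfer function H: each row is one low resolution pixel,
# expressed as a weighted sum of four high resolution scene samples.
H = np.array([
    [0.5, 0.5, 0.0, 0.0],  # aperture A, pixel 0
    [0.0, 0.0, 0.5, 0.5],  # aperture A, pixel 1
    [0.0, 0.5, 0.5, 0.0],  # aperture B, pixel 0 (offset by one sample)
    [0.0, 0.0, 0.0, 0.5],  # aperture B, pixel 1 (partly outside the scene)
])

x_true = np.array([1.0, 3.0, 2.0, 5.0])  # assumed high resolution scene
y = H @ x_true                           # Equation (1): raw image

H_pinv = np.linalg.pinv(H)               # pseudoinverse H* of H
x_rec = H_pinv @ y                       # Equation (2): reconstruction

print(x_rec)
```

In this toy case H happens to be invertible, so the scene is recovered to within floating-point error. In a real camera H is much larger and only approximately known, which is the source of the computational and calibration burden discussed in this section.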
The camera 301 shown in FIG. 3 has the advantage of being able to acquire higher resolution images from a thinner optical structure relative to that described above for FIGS. 1 and 2. For example, the light gathering ability of a single low F-stop lens is obtained instead through the distributed light gathering ability of the lens array 303. However, in spite of its perceived elegance, this apparatus suffers from two particular disadvantages. First, the lens array 303 and restricting structure 311 are complex and may be difficult to manufacture inexpensively. It also suffers from a bulky structure. Second, the proposed method of reconstructing the high resolution image x from y requires both accurate knowledge of the transfer function H and a significant number of computations to evaluate Equation (2). These weaknesses may limit the utility of the camera 301 in many practical applications.
FIG. 4 illustrates Snell's Law, a fundamental law of optics that dictates how a ray of light 401 will travel when it passes between two different transparent mediums. In FIG. 4, the ray of light 401 originates in a first medium 403, passes through a second medium 405, and exits back into the first medium 403 on the other side. Let the index of refraction of the first medium 403 be n1 and the index of refraction of the second medium 405 be n2. Let θ1 and θ2 be the respective angles of incidence of the ray 401 as it passes across the boundary 407 between the two mediums, as shown in the figure. The angle of incidence of a ray is defined as the angle between that ray and normal 408, with normal 408 being perpendicular to the boundary 407 between the two mediums. Snell's Law dictates that:
$$n_1 \sin\theta_1 = n_2 \sin\theta_2 \qquad (3)$$
or
$$\sin\theta_2 = \frac{n_1}{n_2} \sin\theta_1. \qquad (4)$$
In the case of FIG. 4, the index of refraction of the second medium 405 is higher than that of the surrounding first medium 403. For example, the first medium 403 may be air while the second medium 405 may be plastic or glass. As a result, the angle θ2 will be less than θ1. One important observation is that if the second medium 405 has a higher index of refraction than the first medium 403, the value |sin θ2| is bounded by the value n1/n2, since sin θ1 cannot exceed one in magnitude. As a result, θ2 cannot be larger than an angle called a critical angle, which is denoted by θc:
$$\theta_2 < \theta_c = \sin^{-1}\frac{n_1}{n_2} \qquad (5)$$
The phenomenon of the critical angle will have application in the teachings that follow. From the point of view of an observer inside the second medium, the hemisphere of visual field on the first medium's side of the boundary 407 will be compressed to a cone having an angular diameter of 2θc. This cone is often referred to as “Snell's window”. This phenomenon can be observed, for example, from underwater in a swimming pool by looking outward at the world above.
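The relationships in Equations (3) through (5) can be checked numerically with a short sketch. The function names `refraction_angle` and `critical_angle` are introduced here only for illustration, and the index of refraction used for water (about 1.33) is an assumed nominal value.

```python
import math


def refraction_angle(theta1, n1, n2):
    """Equation (4): refracted angle (radians) for a ray passing from n1 into n2."""
    s = (n1 / n2) * math.sin(theta1)
    if abs(s) > 1.0:
        raise ValueError("no refracted ray: total internal reflection")
    return math.asin(s)


def critical_angle(n1, n2):
    """Equation (5): critical angle (radians) inside the denser medium n2."""
    if n1 > n2:
        raise ValueError("critical angle requires n2 >= n1")
    return math.asin(n1 / n2)


# Assumed nominal indices: air (first medium) and water (second medium).
n_air, n_water = 1.0, 1.33

theta_c = critical_angle(n_air, n_water)
window_deg = math.degrees(2.0 * theta_c)  # angular diameter of Snell's window

# A ray arriving at grazing incidence (90 degrees) refracts to the critical angle.
grazing = refraction_angle(math.pi / 2.0, n_air, n_water)

print(math.degrees(theta_c), window_deg)
```

With these assumed indices, the full hemisphere above the water surface is compressed into a cone a little under 100 degrees wide, which is the swimming pool observation described above.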