FIG. 1 depicts a generic camera system 10 that acquires and forms an image 20 of a typically three-dimensional target object 30 located at distance Z from the camera system. Camera system 10 typically includes a lens or optical system 40 through which optical energy rays 50 from the target object pass, to form image 20 on an array or other medium 60. For purposes of the present invention, camera system 10 (or more simply, camera 10) can include any type of image forming camera, and may, for example, include two-dimensional focal plane arrays, scanned linear arrays, or scanned single pixel configurations. Camera system 10 may utilize any imaging modality and wavelength, including without limitation, radar, visible or IR light, acoustic energy, etc. Thus, in the broadest sense, “distortion” as used herein need not be restricted to distortion created by an optical lens in a camera system under calibration. Further, camera system 10 may include three-dimensional range or time-of-flight (TOF) cameras such as disclosed in U.S. Pat. No. 6,323,942 (2001) CMOS-Compatible Three-Dimensional Imaging System, as well as two-dimensional intensity or RGB cameras. Indeed, camera system 10 may be implemented as an analog or digital electronic camera, as well as a film-based camera whose image has been scanned into electronic form.
But to utilize a camera as a device to measure geometry of a three-dimensional scene, e.g., target object 30, it is important to accurately calibrate the interior orientation of the camera, namely the relationship between every point in the camera-captured image 20 and optical energy rays 50 in three-dimensional space exterior to the camera. For example, plane or medium 60 may be defined as a plurality of points, or pixels, each point uniquely identifiable in terms of coordinates (xi,yi). In an ideal camera system, there would be a perfectly linear 1:1 mapping between each pixel location on plane 60 and a corresponding portion (Xi,Yi) of object 30. Indeed, a uniquely identifiable vector ray (or boresight line) could be drawn between portions of object 30 and the pixel locations on plane 60 receiving intensity (or color) information from those object portions. Stated differently, in a perfectly calibrated system, there would be a known linear association between every pixel location on plane 60 and a boresight angle or vector direction towards a corresponding point on the calibration target object.
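By way of illustration only, the ideal linear association described above may be sketched for a distortion-free pinhole model. The focal length `f` (expressed in pixel units), the principal point `(cx, cy)`, and the function names are assumptions introduced for this sketch and do not appear in the disclosure.

```python
import math


def pixel_to_boresight(x, y, f, cx, cy):
    """Unit boresight vector for pixel (x, y) of an ideal pinhole camera.

    f, cx and cy are hypothetical intrinsic parameters (pixel units).
    """
    dx, dy, dz = (x - cx) / f, (y - cy) / f, 1.0
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    return (dx / norm, dy / norm, dz / norm)


def pixel_to_target(x, y, f, cx, cy, Z):
    """Point (X, Y) where the boresight ray of pixel (x, y) meets a plane at distance Z."""
    return ((x - cx) * Z / f, (y - cy) * Z / f)
```

Under these assumptions, a pixel offset from the principal point scales linearly into a target offset: doubling the pixel offset doubles (X, Y) at a fixed Z.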
But in practice, identifying points in the captured image and locating these points in the target can be time-consuming. While the distance Z is known, the relationship between points or pixels (xi,yi) in the captured image, and points (Xi,Yi) in the target is not known. One could of course determine individual points in the target, one-by-one, and determine their pixel location counterpart in the captured image. The results of such time-consuming calibration could then be used to construct a look-up-table that correlated real-world target location coordinates (Xi,Yi) of any point on the target that is imaged, to each pixel location (xi,yi) in the captured image. Thereafter when imaging other targets at other distances Z, the look-up-table could be consulted to learn to which real-world locations the individual pixel locations correspond. Generally, camera system imperfections, perhaps due to imperfect optics 40, result in a mapping that is not perfectly linear. The result is distortion in image 20 captured by camera 10 from object 30, with distortion generally being more severe at the corners than at the center of the captured image. It is a function of a calibration system to try to arrive at a mapping that preferably can also be used to correct for such distortions. It will be appreciated that while lens 40 typically distorts the captured image, the nature of the distortion will generally be a smooth function between adjacent pixel locations in the camera-captured image 20.
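The look-up-table and the corner-heavy distortion described above may be illustrated with the following minimal sketch. It assumes a one-coefficient radial distortion model, which is a common textbook simplification and not a model stated in the disclosure; the coefficient `k1`, the intrinsics `f`, `cx`, `cy`, and all function names are hypothetical.

```python
import math


def distort(xn, yn, k1):
    """Apply an assumed one-coefficient radial model to normalized coordinates."""
    scale = 1.0 + k1 * (xn * xn + yn * yn)
    return (xn * scale, yn * scale)


def displacement_pixels(X, Y, Z, f, k1):
    """Pixel distance between the ideal and the distorted image of target point (X, Y)."""
    xn, yn = X / Z, Y / Z
    xd, yd = distort(xn, yn, k1)
    return f * math.hypot(xd - xn, yd - yn)


def build_lut(target_points, Z, f, cx, cy, k1):
    """Look-up-table from (rounded) pixel location to the target point it images."""
    lut = {}
    for X, Y in target_points:
        xd, yd = distort(X / Z, Y / Z, k1)
        pixel = (round(f * xd + cx), round(f * yd + cy))
        lut[pixel] = (X, Y)
    return lut
```

In this assumed model the displacement grows with the cube of the radial distance from the image center, so it varies smoothly between adjacent pixels and is largest toward the corners. A table built this way contains one entry per measured target point, which is why point-by-point construction is slow.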
In calibrating a camera, typically a picture is taken of a planar calibration target object with a known pattern at a known distance Z from the camera. If calibration were perfect, or if optical lens 40 were perfect, there would be a perfectly linear 1:1 mapping between every pixel (xi,yi) in plane 60 associated with camera 10 and every point (Xi,Yi) in the calibration target. In reality, distortion, e.g., from lens 40, occurs. In FIG. 1, one might use as target object 30 a planar calibration target, perhaps having a pattern of parallel horizontal lines. But such a calibration target cannot be used satisfactorily to arrive at a desired linear mapping at all points, as essentially no useful calibration information is obtained regarding calibration target points removed from the various parallel horizontal lines. A second target, perhaps with parallel vertical lines, could contribute additional calibration information, as might using a third calibration target with concentric circles. But using multiple calibration targets to obtain additional calibration data would of course lengthen the time needed to accomplish calibration.
To more efficiently implement calibration in obtaining a desired interior orientation, it is preferred to estimate a dense correspondence field between points (Xi,Yi) in a known calibration target pattern and points or pixels (xi,yi) in the camera-captured image 20 formed of the target pattern on plane 60. But when optics 40 are not ideal and exhibit distortion, the desired correspondence is not necessarily a projective transformation or other simple parametric model. In general, the desired correspondence requires representation as a flow field.
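The distinction between a simple parametric model and a flow field may be sketched as follows. This is an illustrative sketch only: it assumes target coordinates are expressed in the same units as pixel coordinates so that a displacement is meaningful, and the function names and the `correspond` callable are hypothetical.

```python
def projective_map(x, y, H):
    """Parametric model: one 3x3 matrix H (nine numbers) describes every pixel."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def build_flow_field(width, height, correspond):
    """Non-parametric model: store one displacement vector (u, v) per pixel.

    correspond(x, y) is any callable returning the target point for a pixel.
    """
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            X, Y = correspond(x, y)
            row.append((X - x, Y - y))
        flow.append(row)
    return flow


def apply_flow(flow, x, y):
    """Recover the target point for pixel (x, y) from the stored displacement."""
    u, v = flow[y][x]
    return (x + u, y + v)
```

The projective map is fully described by the entries of `H`, whereas the flow field holds `width * height` independent vectors and can therefore represent a smooth warp that no single matrix reproduces.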
Several automatic methods for estimating the correspondence field are known in the art. If the target pattern contains distinct features, feature extraction and matching may be used to provide a sparse mapping that can be interpolated into a dense field. If the target pattern contains rich texture, general image-to-image optical flow estimation techniques can be used.
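As one hedged illustration of interpolating a sparse mapping into a dense field, the sketch below uses inverse-distance weighting. This interpolation scheme is chosen here only for brevity and is not identified in the disclosure; the function name, the `sparse` dictionary layout, and the `power` parameter are assumptions.

```python
def interpolate_dense(sparse, width, height, power=2.0):
    """Fill a height x width grid of displacements from sparse feature matches.

    sparse maps a matched pixel (x, y) to its displacement (u, v).
    Unmatched pixels receive an inverse-distance-weighted average.
    """
    dense = []
    for y in range(height):
        row = []
        for x in range(width):
            if (x, y) in sparse:
                row.append(sparse[(x, y)])  # keep measured values exactly
                continue
            wsum = usum = vsum = 0.0
            for (fx, fy), (u, v) in sparse.items():
                dist2 = (x - fx) ** 2 + (y - fy) ** 2
                w = 1.0 / dist2 ** (power / 2.0)
                wsum += w
                usum += w * u
                vsum += w * v
            row.append((usum / wsum, vsum / wsum))
        dense.append(row)
    return dense
```

For example, with matches only at the two ends of a five-pixel row, the pixel midway between them receives the average of the two measured displacements. The quality of the dense field then depends on how many features are matched and how evenly they cover the image.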
But such prior art techniques have several drawbacks, including sensitivity to orientation and scale of the pattern. Frequently, prior art calibration techniques are simply too sensitive to variations in brightness and contrast, and to distortion in the camera lens. In short, such prior art techniques cannot always reliably provide a desired dense correspondence under practical environmental conditions.
What is needed is a method and system to calibrate a camera system, with substantial independence from orientation and scale of the calibration pattern, variations in brightness and contrast, and optical component imperfections. Calibration should include imaging a calibration target with a pattern that rapidly facilitates spatial calibration of the camera under calibration, such that pixels or points (xi,yi) in the captured image can be identified and located with respect to real-world coordinates (Xi,Yi) in the calibration target. Use of a suitable target facilitates construction of a mapping that relates each pixel in a captured image to a real-world coordinate of a target object, a distance Z away. Since most camera systems introduce optical distortion, preferably some knowledge of the distortion characteristics of the camera system under calibration can be initially determined. This a priori distortion information is then used to create a calibration target pattern that preferably is pre-distorted, such that the camera-captured image will be substantially undistorted. Analysis of distortion in the camera-captured image should enable relatively rapid and accurate linear mapping with a dense correspondence.
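The notion of pre-distorting a target pattern from a priori distortion information may be illustrated, under stated assumptions, as inverting an assumed lens model. The sketch below again assumes a one-coefficient radial model with hypothetical coefficient `k1` and inverts it by fixed-point iteration; neither the model nor the inversion method is taken from the disclosure.

```python
def distort(xn, yn, k1):
    """Assumed lens model: one-coefficient radial distortion of normalized coordinates."""
    scale = 1.0 + k1 * (xn * xn + yn * yn)
    return (xn * scale, yn * scale)


def predistort(xd, yd, k1, iterations=20):
    """Find the pattern point that the assumed lens model maps onto (xd, yd).

    Uses fixed-point iteration, which converges for mild distortion.
    """
    xn, yn = xd, yd
    for _ in range(iterations):
        scale = 1.0 + k1 * (xn * xn + yn * yn)
        xn, yn = xd / scale, yd / scale
    return (xn, yn)
```

If each feature of the desired undistorted image is placed on the target at its `predistort` location, the assumed lens model returns it to the intended position, so the captured image is substantially undistorted to the extent the a priori model matches the actual camera.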
The present invention provides such a method and system.