A composite image is generally created by combining a series of overlapping images, thus representing a scene having a larger horizontal and/or vertical field of view than a standard image. For example, a series of individual images captured by an image capture device can be combined to form a panoramic image of the skyline, of the horizon, or of a tall building.
To properly combine (or stitch together) the individual images, the overlapping features at the edges of the images must be matched. Matching techniques range from the relatively simple to the relatively sophisticated. An example of the former is manually aligning a series of images in a two-dimensional space until the edges appear properly lined up, as judged by the naked eye.
A more sophisticated technique uses computer-implemented stitching software to calculate a best fit between adjacent images. Stitching software is widely available commercially. The best fit can be calculated in a variety of ways. For example, pattern detection can be used to match image features, either automatically (i.e., without manual intervention) or quasi-automatically (e.g., by first manually aligning the images to provide a coarse fit, then refining the fit by computer).
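As a rough illustration of what calculating a best fit can mean, the following minimal sketch searches for the overlap width that minimizes the mean squared difference between the right edge of one image and the left edge of the next. It assumes grayscale images stored as lists of equal-length rows and a purely horizontal offset; the function names `overlap_error` and `best_overlap` are hypothetical and are not taken from any particular stitching product, which would typically use more robust feature matching.

```python
def overlap_error(left, right, overlap):
    """Mean squared difference between the last `overlap` columns of `left`
    and the first `overlap` columns of `right` (grayscale, lists of rows)."""
    total = 0.0
    count = 0
    for row_l, row_r in zip(left, right):
        for k in range(overlap):
            diff = row_l[len(row_l) - overlap + k] - row_r[k]
            total += diff * diff
            count += 1
    return total / count


def best_overlap(left, right, min_overlap=1):
    """Try every candidate overlap width and return the one with the lowest error."""
    max_overlap = min(len(left[0]), len(right[0]))
    return min(range(min_overlap, max_overlap + 1),
               key=lambda width: overlap_error(left, right, width))


# Two crops of one synthetic scene that share exactly two columns.
scene = [[c * c + r for c in range(10)] for r in range(3)]
left = [row[:6] for row in scene]
right = [row[4:] for row in scene]
print(best_overlap(left, right))
```

An exhaustive search like this only handles pure translation; it does not account for the stretching or rotation of features between images.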
In general, determining the positions of best fit can be challenging because when a three-dimensional scene (i.e., the overall scene being photographed) is mapped to multiple two-dimensional planes (i.e., the collection of individual images), pixels in different images may no longer match in the two-dimensional planes. That is, image features may be stretched, shrunken, rotated, or otherwise distorted from one image to another.
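To make the distortion concrete, the sketch below projects the corners of a square through an idealized pinhole camera, once facing the square head-on and once yawed slightly. The pinhole model, the unit focal length, the axis convention (x to the right, z forward), and the names `project` and `edge_height` are all assumptions chosen for illustration.

```python
import math


def project(point, yaw=0.0, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z) for a camera rotated by
    `yaw` radians about its vertical axis (assumed convention)."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    # Express the point in the rotated camera's frame (inverse rotation).
    xc = c * x - s * z
    zc = s * x + c * z
    return (focal * xc / zc, focal * y / zc)


def edge_height(x, yaw):
    """Projected height of the vertical edge of a square at horizontal
    position `x`, spanning y = -1..1 at depth z = 4."""
    top = project((x, 1.0, 4.0), yaw)
    bottom = project((x, -1.0, 4.0), yaw)
    return top[1] - bottom[1]


# Head-on, both vertical edges project to the same height.
print(edge_height(-1.0, 0.0), edge_height(1.0, 0.0))
# After a small yaw, one edge projects taller than the other, so the
# square appears as a trapezoid in the image.
print(edge_height(-1.0, 0.3), edge_height(1.0, 0.3))
```

Under these assumptions, the same square yields differently shaped outlines in the two images, which is why pixels in the overlap may not line up under a simple shift.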
Such distortions often arise because of differences in the poses (or orientations) of the image capture device from shot to shot. For example, two images captured side by side may not be exactly aligned, with one being incrementally pitched (e.g., due to the photographer leaning forward or backward) relative to the other. Alternatively, one image may be incrementally rolled (e.g., due to the photographer leaning to the left or right) relative to the other, or incrementally yawed (e.g., due to the photographer panning the image capture device horizontally) relative to the other. More generally, the pose of the image capture device can vary from shot to shot due to a combination of such pitching, rolling, and/or yawing.
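One common way to express such a pose numerically is as a rotation matrix built from the three angles. The sketch below is a minimal example under assumed conventions: pitch is a rotation about the camera's horizontal (x) axis, yaw about its vertical (y) axis, and roll about its optical (z) axis, composed in the order yaw, then pitch, then roll. The axis assignments, sign conventions, composition order, and function names are illustrative assumptions; other conventions are equally valid.

```python
import math


def rot_x(angle):
    """Rotation about the x axis (assumed here to represent pitch)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(angle):
    """Rotation about the y axis (assumed here to represent yaw)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(angle):
    """Rotation about the z axis (assumed here to represent roll)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def pose_matrix(pitch, roll, yaw):
    """Combine the three angles (radians) into one rotation matrix,
    using the assumed order R = R_yaw * R_pitch * R_roll."""
    return matmul(rot_y(yaw), matmul(rot_x(pitch), rot_z(roll)))


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]


# A quarter-turn yaw swings the forward (z) axis onto the x axis.
print(rotate(pose_matrix(0.0, 0.0, math.pi / 2), [0.0, 0.0, 1.0]))
```

Because rotations do not commute, a combined pitch, roll, and yaw generally depends on the order chosen, which is one reason a pose may also be expressed in other terms.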
In light of the foregoing, a market exists for a technology to determine the pose (e.g., expressed in pitch, roll, and/or yaw terms or otherwise) of an image capture device when capturing an image to facilitate generation of composite images.