1. Field of the Invention
The presently claimed and disclosed invention(s) relate to mosaicked oblique images and methods for making and using same. More particularly, the presently claimed and disclosed invention(s) use a methodology whereby separate obliquely captured aerial images are combined into at least one single oblique-mosaic image. The at least one single oblique-mosaic image is visually pleasing and geographically accurate.
2. Background of the Art
In the remote sensing/aerial imaging industry, imagery is used to capture views of a geographic area, to measure objects and structures within the images, and to determine the geographic locations of points within the images. Such images are generally referred to as “geo-referenced images” and come in two basic categories:
1. Captured Imagery: these images have the appearance with which they were captured by the camera or sensor employed.
2. Projected Imagery: these images have been processed and converted such that they conform to a mathematical projection.
All imagery starts as captured imagery, but as most software cannot geo-reference captured imagery, that imagery is then reprocessed to create the projected imagery. The most common form of projected imagery is the ortho-rectified image. This process aligns the image to an orthogonal or rectilinear grid (composed of rectangles). The input image used to create an ortho-rectified image is a nadir image—that is, an image captured with the camera pointing straight down.
It is often quite desirable to combine multiple images into a larger composite image such that the image covers a larger geographic area on the ground. The most common form of this composite image is the “ortho-mosaic image” which is an image created from a series of overlapping or adjacent nadir images that are mathematically combined into a single ortho-rectified image.
Each input nadir image, as well as the output ortho-mosaic image, is composed of discrete pixels (individual picture elements) of information or data. As part of the process for creating an ortho-rectified image, and hence an ortho-mosaic image, an attempt is made to reproject (move within a mathematical model) each pixel within the image such that the resulting image appears as if every pixel in the image were a nadir pixel—that is, that the camera is directly above each pixel in the image.
This ortho-rectification process is needed because it is not currently possible to capture an image where every pixel is nadir to (directly below) the camera unless: (1) the camera used is as large as the area of capture, or (2) the camera is placed at an infinite distance above the area of capture such that the angle from the camera to the pixel is so close to straight down that it can be considered nadir. The ortho-rectification process creates an image that approximates the appearance of being captured with a camera where the area on the ground each pixel captures is considered nadir to that pixel, i.e., directly below that pixel. This is done by creating a mathematical model of the ground, generally in a rectilinear grid (a grid formed of rectangles), and reprojecting from the individual captured camera image into this rectilinear grid. The reprojection moves the pixels from their relative non-nadir locations within the individual images to their nadir positions within the rectilinear grid, i.e., the image is warped to line up with the grid.
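The second condition above can be illustrated numerically. The following is a minimal sketch, assuming flat ground and a hypothetical helper named `off_nadir_angle_deg` (not part of the disclosure), showing how the angle away from straight down shrinks toward zero as the camera height grows.

```python
import math

def off_nadir_angle_deg(camera_height, ground_offset):
    """Angle, in degrees, between straight down and the ray from the camera
    to a ground point (flat ground assumed).

    camera_height: height of the camera above the ground
    ground_offset: horizontal distance from the camera's nadir point,
                   in the same units as camera_height
    """
    return math.degrees(math.atan2(ground_offset, camera_height))

# A ground point 500 units from the nadir point, seen from increasing heights.
for height in (1_000.0, 10_000.0, 1_000_000.0):
    print(height, off_nadir_angle_deg(height, 500.0))
```

The angle never reaches exactly zero for any finite height, which is why a true all-nadir capture is not possible and a reprojection step is used instead.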
When creating an ortho-mosaic, this same ortho-rectification process is used; however, instead of using only a single input nadir image, a collection of overlapping or adjacent nadir images is used, and they are combined to form a single composite ortho-rectified image known as an ortho-mosaic. In general, the ortho-mosaic process entails the following steps:
- A rectilinear grid is created, which results in an ortho-mosaic image where every grid pixel covers the same amount of area on the ground.
- The location of each grid pixel is determined from the mathematical definition of the grid. Generally, this means the grid is given an X and Y starting or origin location and an X and Y size for the grid pixels. Thus, the location of any pixel is simply the origin location plus the number of pixels times the size of each pixel. In mathematical terms: X_pixel = X_origin + X_size × Column_pixel and Y_pixel = Y_origin + Y_size × Row_pixel.
- The available nadir images are checked to see if they cover the same point on the ground as the grid pixel being filled. If so, a mathematical formula is used to determine where that point on the ground projects up onto the camera's pixel image map, and the resulting pixel value is then transferred to the grid pixel. During this selection process, two important steps are taken:
  - When selecting the image to use to provide the pixel value, a mathematical formula is used to select an image that minimizes building lean, the effect where buildings appear to lean away from the camera. This is accomplished in a number of ways, but the most common is to pick the image where the grid pixel reprojects as close to the camera center, and hence as close to that camera's nadir point, as possible.
  - When determining the source pixel value to use, the ground elevation is taken into account to ensure the correct pixel value is selected. Changes in elevation cause the apparent location of the pixel to shift when captured by the camera. A point on the ground that is higher up will appear farther from the center of the image than a point on the ground in the same location that is lower down. For instance, the top of a building will appear farther from the center of an image than the bottom of a building. By taking the ground elevation into account when determining the source pixel value, the net effect is to “flatten” the image out such that changes in pixel location due to ground elevation are removed.
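The grid arithmetic and the two selection considerations above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the image choice is approximated by ground distance to each camera's nadir point, and the elevation shift uses a simple similar-triangles model for a straight-down camera over a flat datum. None of these details are taken from the disclosure.

```python
import math

def grid_pixel_location(x_origin, y_origin, x_size, y_size, column, row):
    """Ground location of a grid pixel: origin plus pixel count times pixel size."""
    return (x_origin + x_size * column, y_origin + y_size * row)

def pick_nearest_nadir(ground_xy, camera_nadir_points):
    """Index of the image whose camera nadir point is closest to the ground point.

    Assumption: closeness to the nadir point stands in for 'reprojects as
    close to the camera center as possible'.
    """
    return min(
        range(len(camera_nadir_points)),
        key=lambda i: math.dist(ground_xy, camera_nadir_points[i]),
    )

def apparent_ground_radius(true_radius, camera_height, elevation):
    """Radial distance from the nadir point at which an elevated point appears
    when its viewing ray is extended down to the datum plane.

    By similar triangles: apparent = true * H / (H - h), so higher points
    appear farther from the image center.
    """
    if elevation >= camera_height:
        raise ValueError("point must lie below the camera")
    return true_radius * camera_height / (camera_height - elevation)

# Hypothetical grid: half-unit pixels, Y decreasing with row number.
print(grid_pixel_location(500000.0, 4100000.0, 0.5, -0.5, column=10, row=20))

# Three hypothetical camera nadir points; the second is nearest to (10, 10).
print(pick_nearest_nadir((10.0, 10.0), [(100.0, 100.0), (12.0, 9.0), (-50.0, 0.0)]))

# Base and roof of a 50-unit-tall building, 100 units from the nadir point.
print(apparent_ground_radius(100.0, 1000.0, 0.0),
      apparent_ground_radius(100.0, 1000.0, 50.0))
```

Inverting `apparent_ground_radius` for a known ground elevation is one way to picture the "flattening" step: the source pixel is looked up at the shifted location so that the shift is removed in the output grid.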
Because the rectilinear grids used for the ortho-mosaic are generally the same grids used for creating maps, the ortho-mosaic images bear a striking similarity to maps and, as such, are generally very easy to use from a direction and orientation standpoint. However, because their appearance is dictated by mathematical projections rather than by what a single camera normally captures, and because they are captured looking straight down, they present a view of the world to which we are not accustomed. As a result, many people have difficulty determining what it is they are looking at in the image. For instance, they might see a yellow rectangle in the image and not realize that what they are looking at is the top of a school bus. Or they might have difficulty distinguishing between two commercial properties since the only thing they can see of the properties in the ortho-mosaic is their rooftops, whereas most of the distinguishing properties are on the sides of the buildings. An entire profession, the photo interpreter, has arisen to address these difficulties, as these individuals have years of training and experience specifically in interpreting what they are seeing in nadir or ortho-mosaic imagery.
Since an oblique image, by definition, is captured at an angle, it presents a more natural appearance because it shows the sides of objects and structures—what we are most accustomed to seeing. In addition, because oblique images are not generally ortho-rectified, they are still in the natural appearance that the camera captures as opposed to the mathematical construction of the ortho-mosaic image. This combination makes it very easy for people to look at something in an oblique image and realize what that object is. Photo interpretation skills are not required when working with oblique images.
Oblique images, however, present another issue. Because people have learned navigation skills on maps, the fact that oblique images are not aligned to a map grid, like ortho-mosaic images, makes them much less intuitive when attempting to navigate or determine direction on an image. When an ortho-mosaic is created, because it is created to a rectilinear grid that is generally a map grid, the top of the ortho-mosaic image is north, the right side is east, the bottom is south, and the left side is west. This is how people are generally accustomed to orienting and navigating on a map. But an oblique image can be captured from any direction and the top of the image is generally “up and back,” meaning that vertical structures point towards the top of the image, but that the top of the image is also closer to the horizon. However, because the image can be captured from any direction, the horizon can be in any direction, north, south, east, west, or any point in between. If the image is captured such that the camera is pointing north, then the right side of the image is east and the left side of the image is west. However, if the image is captured such that the camera is pointing south, then the right side of the image is west and the left side of the image is east. This can cause confusion for someone trying to navigate within the image.
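The dependence of left and right on the capture direction can be made concrete with a short sketch. The helper names are hypothetical, and the sketch assumes the top of the image lies along the camera heading, with the heading measured in degrees clockwise from north.

```python
COMPASS_POINTS = ["north", "northeast", "east", "southeast",
                  "south", "southwest", "west", "northwest"]

def compass_name(bearing_deg):
    """Name of the nearest of eight compass points for a bearing in degrees."""
    return COMPASS_POINTS[round((bearing_deg % 360) / 45) % 8]

def image_side_directions(camera_heading_deg):
    """Compass direction toward each side of an oblique image.

    Assumption: the top of the image ('up and back') lies along the
    camera heading, so the other sides follow in 90-degree steps.
    """
    return {
        "top": compass_name(camera_heading_deg),
        "right": compass_name(camera_heading_deg + 90),
        "bottom": compass_name(camera_heading_deg + 180),
        "left": compass_name(camera_heading_deg + 270),
    }

print(image_side_directions(0))    # camera pointing north
print(image_side_directions(180))  # camera pointing south
```

For a north-pointing camera the right side is east; for a south-pointing camera the right side is west, matching the two cases described above.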
Additionally, because the ortho-mosaic grid is generally a rectilinear grid, by mathematical definition, the four cardinal compass directions meet at right angles (90-degrees). But with an oblique image, because it is still in the original form the camera captured and has not been reprojected into a mathematical model, it is not necessarily true that the compass directions meet at right angles within the image. Because, in the oblique perspective, you move towards the horizon as you move up in the image, the image covers a wider area on the ground near the top of the image than near the bottom of the image. If you were to paint a rectangular grid on the ground and capture it with an oblique image, the lines along the direction the camera is pointing would appear to converge in the distance, and the lines across the direction the camera is pointing would appear to be more widely spaced in the front of the image than in the back of the image. This is the perspective view we are all used to seeing: things are smaller in the distance than close up, and parallel lines, such as railroad tracks, appear to converge in the distance. By contrast, if an ortho-mosaic image were created over this same painted rectangular grid, it would appear as a rectangular grid in the ortho-mosaic image, since all perspective is removed as an incidental part of the ortho-mosaic process.
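The painted-grid thought experiment can be reproduced with a small sketch. It assumes an idealized pinhole camera over flat ground, looking along the +Y ground axis and tilted up from nadir by a given angle. The function `project_ground_point` and its parameters are hypothetical and are not drawn from the disclosure.

```python
import math

def project_ground_point(x, y, camera_height, tilt_deg, focal=1.0):
    """Project ground point (x, y, 0) into an idealized pinhole image.

    The camera sits at (0, 0, camera_height), looks along +Y, and its optical
    axis is tilted tilt_deg up from straight down. Returns (u, v), where u
    increases to the right and v increases toward the horizon.
    """
    t = math.radians(tilt_deg)
    # Distance of the point along the optical axis.
    depth = y * math.sin(t) + camera_height * math.cos(t)
    if depth <= 0:
        raise ValueError("point is behind the camera")
    u = focal * x / depth
    v = focal * (y * math.cos(t) - camera_height * math.sin(t)) / depth
    return u, v

HEIGHT, TILT = 1000.0, 45.0

# Two ground lines 20 units apart running away from the camera:
# their separation in the image shrinks with distance (convergence).
for y in (800.0, 1200.0):
    left_u, _ = project_ground_point(-10.0, y, HEIGHT, TILT)
    right_u, _ = project_ground_point(10.0, y, HEIGHT, TILT)
    print("image width at y =", y, ":", right_u - left_u)

# Cross lines painted every 100 units: their image spacing shrinks toward the back.
rows = [project_ground_point(0.0, y, HEIGHT, TILT)[1]
        for y in (800.0, 900.0, 1000.0, 1100.0, 1200.0)]
print("row gaps:", [b - a for a, b in zip(rows, rows[1:])])
```

Both printed sequences decrease with distance from the camera, which is the convergence and foreshortening described above; an ortho-rectified rendering of the same grid would show constant widths and gaps.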
Because of these fundamental differences in perspective and appearance, the creation of an ortho-mosaic image by the process described above does not work well for oblique images. Because the camera's optical axis (an imaginary line through the center of the lens or optics that follows the aim of the camera) is typically pointed at an angle of 45-degrees or more from nadir (pointed 45-degrees or more up from straight down), the effects of building lean, elevation differences, and non-square pixels are all exaggerated—effects that are considered negative qualities in an ortho-mosaic image. In the ortho-mosaic industry, requirements are generally placed on the image capture process such that they limit the amount of obliqueness to as little as 5-degrees from nadir so as to minimize each of these negative effects.
In addition, if the admirable properties of an oblique image are to be maintained, namely seeing the sides of structures and the natural appearance of the images, then clearly a process that attempts to remove vertical displacements, and hence the sides of the buildings, and that warps the image to fit a rectilinear grid is not a viable choice. A new process is needed, one which meets the following desirable qualities in an effort to preserve the admirable properties of the oblique image:
- If the oblique perspective is to be maintained, the pixels cannot be aligned to a rectilinear grid, or even a trapezoidal grid. Instead, the pixels are preferably aligned to the natural perspective that a camera captures.
- As part of the oblique perspective, the pixels in the image cannot all measure the same size on the ground, as pixels in the foreground of the image cover a much smaller area on the ground than pixels in the background of the image; that is, by definition, part of the natural perspective of a camera.
- Because the pixels are so far from nadir, the effects of building lean become extreme and the standard solutions employed in the ortho-mosaic process do not do an adequate job of compensating for this effect. New techniques must be developed to better compensate for this effect.
- If the effects of changes in elevation are backed out, the resulting image has a very unnatural appearance: the vertical sides of buildings can warp and twist, which is something we are not accustomed to seeing, and therefore, when looking at such an image, we have a tendency to “reject” it. Thus, to keep the buildings, structures, and objects within an image looking natural, it is preferable to leave the effects of elevation in the perspective of the image and instead account for them in another manner.
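The difference in ground coverage between foreground and background pixels can be estimated with a rough sketch. It assumes flat ground and, as a simplification that is not taken from the disclosure, that each pixel row subtends the same small angle; the function names are hypothetical.

```python
import math

def ground_distance(camera_height, tilt_deg, offset_deg):
    """Horizontal distance from the camera's nadir point to where a viewing
    ray meets flat ground. The ray is offset_deg above (positive) or below
    (negative) an optical axis tilted tilt_deg up from straight down.
    """
    angle_deg = tilt_deg + offset_deg
    if not -90.0 < angle_deg < 90.0:
        raise ValueError("ray does not meet the ground")
    return camera_height * math.tan(math.radians(angle_deg))

def pixel_footprint_length(camera_height, tilt_deg, offset_deg, pixel_angle_deg):
    """Ground length covered, along the viewing direction, by one pixel row
    subtending pixel_angle_deg (equal-angle pixels are an assumption)."""
    near_edge = ground_distance(camera_height, tilt_deg, offset_deg)
    far_edge = ground_distance(camera_height, tilt_deg, offset_deg + pixel_angle_deg)
    return far_edge - near_edge

# Camera 1000 units up, optical axis 45 degrees from nadir.
for offset in (-15.0, 0.0, 15.0):  # foreground, image center, background
    print(offset, pixel_footprint_length(1000.0, 45.0, offset, 0.01))
```

Under these assumptions the background row covers roughly three times the ground length of the foreground row, so no single ground size can be assigned to every pixel of an oblique image.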
Because of these issues, the common practice in the industry is to provide oblique imagery as a series of individual images. However, some of the same benefits of the ortho-mosaic also apply to an oblique-mosaic (an image created from a collection of overlapping or adjacent oblique images), namely the fact that the mosaic covers a larger geographic area than each or any of the individual images that were used to create it. This invention details a means by which a quality oblique-mosaic can be created, overcoming the above limitations.