As an image synthesizing/transforming system in the related art that synthesizes and transforms images picked up by a plurality of real cameras, there is, for example, the system set forth in International Publication WO00/64175. This system is explained below with reference to FIG. 10.
The image synthesizing transforming system in the related art is configured to have an imaging means 110 and an image processing portion 120. The imaging means 110 includes a plurality of cameras 111, 112 and frame memories 113, 114 corresponding to respective cameras 111, 112. Images input from respective cameras 111, 112 are written into the corresponding frame memories 113, 114.
The image processing portion 120 includes an image synthesizing means 121, a mapping table looking-up means 122, and a video signal generating means 123. The mapping table looking-up means 122 includes a transformation address memory 131 for storing transformation addresses (a mapping table) indicating correspondences between position coordinates of output pixels and position coordinates of input pixels, and a degree-of-necessity memory 132 for recording the degree of necessity of each input pixel at that time.
The image synthesizing means 121 generates data of the output pixels by adding data of the respective pixels in the frame memories 113, 114 according to the designated degrees of necessity, based on the transformation addresses (mapping table) recorded in the mapping table looking-up means 122. The video signal generating means 123 outputs the data of the output pixels generated by the image synthesizing means 121 as an image signal. The above processes are carried out based on an appropriate synchronizing signal, such as an input image signal.
By following the mapping table looking-up means 122, the image synthesizing means 121 synthesizes in real time the images input from the two different cameras 111, 112. Because the output image is generated while the pixel positions are changed, the input images from a plurality of different cameras can be joined smoothly and transformed into an image viewed from a virtual viewpoint. However, in order to execute the image synthesis in real time, the mapping table used in the image synthesis must be recorded in advance in the mapping table looking-up means 122.
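The table-driven synthesis described above can be illustrated by the following minimal sketch. It is not the implementation of the cited system; the function name `synthesize`, the data layout (one list of `(camera_id, x, y, weight)` tuples per output pixel), and the sample values are assumptions made purely for illustration. The weights play the role of the degrees of necessity.

```python
def synthesize(frames, mapping_table):
    """Build an output image by table look-up (illustrative sketch only).

    frames: dict mapping a camera id to a 2D list of pixel values,
            standing in for the frame memories.
    mapping_table: 2D list with one entry per output pixel; each entry is a
            list of (camera_id, x, y, weight) tuples, where weight stands in
            for the degree of necessity of that input pixel.
    """
    output = []
    for table_row in mapping_table:
        out_row = []
        for entries in table_row:
            value = 0.0
            # Weighted sum of every input pixel that feeds this output pixel.
            for camera_id, x, y, weight in entries:
                value += weight * frames[camera_id][y][x]
            out_row.append(value)
        output.append(out_row)
    return output


# Hypothetical 2x2 frame memories for cameras 111 and 112.
frames = {
    111: [[10, 20], [30, 40]],
    112: [[100, 200], [300, 400]],
}
# One output row of two pixels: the first is taken from camera 111 alone,
# the second blends one pixel from each camera with equal weight.
mapping_table = [
    [[(111, 0, 0, 1.0)], [(111, 1, 0, 0.5), (112, 0, 0, 0.5)]],
]
result = synthesize(frames, mapping_table)
```

Because every output pixel needs only look-ups, multiplications, and additions, the per-frame cost is small once the table exists; the expensive part is computing the table itself.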
Next, the procedure for forming the mapping table will be explained. In order to form the mapping table, the coordinates of the pixels of the respective camera images that correspond to the respective pixels of the synthesized image viewed from the virtual viewpoint (the installing position of the virtual camera) must be decided. The procedure for deciding this correspondence is divided into two phases: a phase in which the positions of the points on a global coordinate system that correspond to the respective pixels of the synthesized image viewed from the virtual viewpoint are calculated, and a phase in which the coordinates of the pixels on the real camera that correspond to the calculated positions of the points on the global coordinate system are calculated.
The relationships finally recorded in the mapping table are only the relationships between the respective pixels of the synthesized image viewed from the virtual viewpoint and the pixels of the respective camera images (real images). The procedure for forming the mapping table is therefore not limited to one that passes through points on the global coordinate system. However, a mapping table formed via points on the global coordinate system is well suited to forming a synthesized image in which the surrounding circumstances can easily be correlated with actual distances and positional relationships, because the meaning of the synthesized image in the global coordinate system, which is the coordinate system of the real world, is made clear.
The relationship between the pixel position [mi]=(xi,yi) of the virtual camera and the camera coordinate [Pi]=(Xi,Yi,Zi) of the virtual camera is defined as follows.
xi=Xi/Zi (where Zi is not 0)
yi=Yi/Zi (where Zi is not 0)
The transformation from the camera coordinate [Pi] of the virtual camera to the global coordinate [Pw] is executed by using a three-dimensional rotation [Ri] and a translation [Ti] as follows.
[Pw]=[Ri][Pi]+[Ti]
Similarly, the transformation from the global coordinate [Pw] into the camera coordinate [Pr] of the real camera is executed by using a three-dimensional rotation [Rr] and a translation [Tr] as follows.
[Pr]=[Rr][Pw]+[Tr]
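The two transformations above can be chained as in the following sketch, which applies [Pw]=[Ri][Pi]+[Ti] and then [Pr]=[Rr][Pw]+[Tr]. The helper names `mat_vec`, `transform`, and `virtual_to_real`, as well as the numeric values, are assumptions introduced only for this illustration; matrices are plain 3x3 nested lists.

```python
def mat_vec(R, p):
    """Multiply a 3x3 matrix R (nested lists) by a 3-vector p."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]


def transform(R, T, p):
    """Apply the rigid transformation R*p + T to the point p."""
    q = mat_vec(R, p)
    return [q[i] + T[i] for i in range(3)]


def virtual_to_real(Ri, Ti, Rr, Tr, Pi):
    """Map a virtual-camera coordinate to the real-camera coordinate.

    Returns the intermediate global coordinate Pw as well as Pr.
    """
    Pw = transform(Ri, Ti, Pi)   # [Pw] = [Ri][Pi] + [Ti]
    Pr = transform(Rr, Tr, Pw)   # [Pr] = [Rr][Pw] + [Tr]
    return Pw, Pr


# Hypothetical example: both rotations are the identity, so only the
# vectors Ti and Tr shift the point.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Pw, Pr = virtual_to_real(
    IDENTITY, [0.0, 0.0, 2.0],
    IDENTITY, [1.0, 0.0, 0.0],
    [0.5, 0.5, 1.0],
)
```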
The transformation from the camera coordinate system of the virtual camera to the global coordinate system and the transformation from the global coordinate system to the camera coordinate system of the real camera are schematically shown in FIG. 11. That is, an image M represented by the camera coordinate system C of the virtual camera and an image M′ represented by the camera coordinate system C′ of the real camera are correlated with each other via a global coordinate system O of the image.
Also, the transformation from the camera coordinate [Pr]=(Vxe, Vye, Vze) of the real camera to a two-dimensional coordinate [Mr]=(xr, yr) of the real camera on the viewing screen is executed based on the perspective projection transformation by using a focal length fv as follows.
xr=(fv/Vze)·Vxe
yr=(fv/Vze)·Vye
The position obtained by converting this coordinate into units of pixels and correcting it for the lens distortion of the real camera corresponds to the position of the pixel on the real camera. Methods for correcting the lens distortion include a method that utilizes a table in which relationships between the distance from the lens center and the amount of correction are recorded, and a method that approximates the distortion by using a mathematical distortion model.
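The projection and correction steps can be sketched as follows. The perspective projection follows the relations xr=(fv/Vze)·Vxe and yr=(fv/Vze)·Vye given above. Everything else is an assumption made for illustration: the one-coefficient radial model in `distort` (coefficient `k1`) is merely one common mathematical approximation, not the model of the cited system, and `pixel_pitch`, `cx`, and `cy` are hypothetical parameters for the conversion into pixel units.

```python
def project(Pr, fv):
    """Perspective projection of a real-camera coordinate (Vxe, Vye, Vze)."""
    Vxe, Vye, Vze = Pr
    if Vze == 0:
        raise ValueError("Vze must not be 0 for perspective projection")
    return (fv / Vze) * Vxe, (fv / Vze) * Vye


def distort(xr, yr, k1):
    """Hypothetical radial distortion model with a single coefficient k1.

    The point is scaled by (1 + k1 * r^2), where r is its distance from
    the lens center. k1 = 0 leaves the point unchanged.
    """
    r2 = xr * xr + yr * yr
    scale = 1.0 + k1 * r2
    return xr * scale, yr * scale


def to_pixel(xr, yr, pixel_pitch, cx, cy):
    """Convert a viewing-screen coordinate into pixel units.

    pixel_pitch is the assumed size of one pixel in the same unit as xr, yr;
    (cx, cy) is the assumed pixel position of the lens center.
    """
    return xr / pixel_pitch + cx, yr / pixel_pitch + cy


# Hypothetical values chosen only to make the arithmetic easy to follow.
xr, yr = project((1.0, 0.5, 2.0), fv=4.0)
xd, yd = distort(xr, yr, k1=0.0)
u, v = to_pixel(xd, yd, pixel_pitch=0.5, cx=320.0, cy=240.0)
```

A table-based correction would replace `distort` with a look-up indexed by the distance from the lens center.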
At this time, since the three-dimensional profile of the subject existing on the global coordinate system is unknown, a scale factor λ (λ is a real number other than 0) of [Pi] is indeterminate in the transformation from the pixel position [mi] of the virtual camera to the camera coordinate [Pi] of the virtual camera. That is, in FIG. 12, all points on a straight line l, e.g., a point K and a point Q, are projected onto the same pixel position X (xi, yi). Therefore, one point on the straight line l is decided by assuming an appropriate projection model as the profile of the object viewed from the virtual viewpoint. That is, the intersection point between the projection model and the straight line l is set as the point on the global coordinate system.
For example, the Zw=0 plane on the global coordinate system may be used as the projection model. If an appropriate projection model is set in this manner, the correspondences between the respective pixels [mi] on the synthesized image viewed from the virtual viewpoint and the pixels [Mr] on the real camera image can be computed by the above procedures.
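As a sketch of how the Zw=0 plane resolves the unknown scale factor λ: a pixel (xi, yi) corresponds to the camera coordinates λ·(xi, yi, 1), so the point on the global coordinate system is λ·[Ri](xi, yi, 1)+[Ti], and requiring its Zw component to be 0 fixes λ. The function name `intersect_ground` and the sample camera pose are assumptions for illustration. Rejecting λ ≤ 0 (an intersection behind the virtual camera) is also an added assumption, not something stated in the source.

```python
def intersect_ground(Ri, Ti, xi, yi):
    """Intersect the viewing ray of pixel (xi, yi) with the plane Zw = 0.

    Returns the global coordinate [Xw, Yw, Zw] of the intersection, or None
    if the ray is parallel to the plane or only meets it behind the camera.
    """
    d = [xi, yi, 1.0]  # ray direction in virtual-camera coordinates
    # Rotate the direction into the global coordinate system.
    dw = [sum(Ri[r][c] * d[c] for c in range(3)) for r in range(3)]
    if dw[2] == 0:
        return None  # ray never reaches the plane
    lam = -Ti[2] / dw[2]  # solve lam * dw_z + Ti_z = 0 for lam
    if lam <= 0:
        return None  # assumption: ignore intersections behind the camera
    return [lam * dw[k] + Ti[k] for k in range(3)]


# Hypothetical virtual camera 2 units above the plane, looking straight down
# (a rotation of 180 degrees about the X axis).
LOOK_DOWN = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ground_point = intersect_ground(LOOK_DOWN, [0.0, 0.0, 2.0], 0.5, 0.25)
no_hit = intersect_ground(IDENTITY, [0.0, 0.0, 2.0], 0.5, 0.25)
```

Repeating this for every output pixel, and then projecting each resulting point into each real camera, is what makes the formation of the mapping table computationally heavy.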
Computing these correspondences requires a great deal of computation, e.g., coordinate calculation of the points on the projection model, transformation between the camera coordinate system and the global coordinate system, and, if the number of cameras is large, computation to decide onto which camera each coordinate on the projection model is projected.
Meanwhile, demand for images having a wide visual field is growing as monitoring cameras used for surveillance, car-mounted cameras used for driving assistance, and the like come into wide use. It is therefore requested that an image picked up by a single camera using a super-wide-angle lens such as a fisheye lens, or images picked up by a plurality of cameras, be synthesized/transformed to provide an image that can be viewed as if it had been picked up by one camera. Applications have also appeared in which only a necessary area is extracted from an image having a wide visual field, deformed, and displayed, or in which the image is transformed into a pseudo-image picked up by a virtual camera and displayed.
Executing such synthesis/transformation of the image by applying the above related art requires a large amount of computation, as described above. A computing unit having very high computing power is therefore required to execute the computation in real time, which is not practical. As a result, the mainstream approach is to execute the computation in advance, record the correspondences between input images and output images as a mapping table, and then synthesize/transform the image in real time while looking up the mapping table.
Because the mapping table depends on the installation position of the actual camera, using a previously computed mapping table requires the actual camera to be set at exactly the same position as the camera installation position assumed when the mapping table was computed. This requirement is not very practical. Moreover, even if the camera has been installed exactly, it must be restored to the original installation position whenever it is displaced for any reason in the course of use, which is also not practical.
Since it is not practical to adjust the installation position of the actual camera physically in this manner, the mapping table should preferably be computed after the camera is installed. In this case, if the mapping table is computed inside the image synthesizing/transforming equipment, a high-performance computing unit that can execute a huge amount of computation is needed. However, since the high-performance computing unit is not ordinarily used after the mapping table has been computed, this approach is also not practical.
Also, if the mapping table is computed by an external high-performance computing unit, the computed mapping table must be transferred from the external unit into the image synthesizing/transforming equipment. For example, if the image synthesizing/transforming system is built into a device in a vehicle or the like, it is not practical to provide on the outside of the device a dedicated interface that is used to transfer the mapping table but is not ordinarily used.
Accordingly, it is expected that a device having a previously provided external interface is used for this purpose as well. In this case, the mapping table requires a data transmission of (number of pixels)×(mapping data capacity per pixel), and thus a high-speed transfer environment is needed. At present, the CAN BUS is available as an interface that can execute data transmission in the vehicle. However, this interface is intended to transfer control data and is not intended to transfer large data such as the mapping table. Thus, this approach cannot be said to be practical.
The object of the present invention is to provide an inexpensive image synthesizing/transforming system that can calculate a mapping table after the cameras are installed without a high-performance computing unit, that has wide versatility, and that is easy to maintain.
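To give a sense of scale for the expression (number of pixels)×(mapping data capacity per pixel), the following sketch computes the table size and a lower bound on the transfer time. The figures are assumptions chosen only for illustration and do not come from the source: a 640×480 output image, 4 bytes of mapping data per pixel, and a bit rate of 1 Mbit/s, which is the nominal maximum of classical CAN. Frame overhead is ignored, so the real transfer time would be longer.

```python
def mapping_table_bytes(width, height, bytes_per_pixel):
    """Size of the mapping table: (number of pixels) x (data per pixel)."""
    return width * height * bytes_per_pixel


def min_transfer_seconds(total_bytes, bits_per_second):
    """Lower bound on the transfer time, ignoring all protocol overhead."""
    return total_bytes * 8 / bits_per_second


# Assumed figures (not from the source): VGA output, 4 bytes per pixel,
# and the 1 Mbit/s nominal maximum bit rate of classical CAN.
table_size = mapping_table_bytes(640, 480, 4)
seconds = min_transfer_seconds(table_size, 1_000_000)
```

Under these assumptions the table occupies 1,228,800 bytes and would need roughly 10 seconds of uninterrupted bus time even before overhead is counted, on a bus that is meant to carry control data.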