The 3D reconstruction of a scene consists in obtaining, from successive 2D images of the scene taken from different viewpoints, a so-called 3D reconstructed image such that each pixel of the reconstructed image (that is, each point where the reconstruction declares that a scene element exists) is associated with the coordinates of the corresponding scene point, defined in a frame X, Y, Z tied to the scene.
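By way of a non-limiting illustration of this pixel-to-scene association, the following sketch assumes an ideal pinhole camera (no distortion) with a hypothetical focal length and principal point, and back-projects a pixel at a known depth into scene coordinates X, Y, Z:

```python
def backproject(u, v, depth, f, cx, cy):
    """Map pixel (u, v) at a given depth to scene coordinates (X, Y, Z),
    assuming an ideal pinhole camera with focal length f (in pixels)
    and principal point (cx, cy); no lens distortion is modeled."""
    X = (u - cx) * depth / f
    Y = (v - cy) * depth / f
    Z = depth
    return X, Y, Z

# Example: the principal-point pixel maps onto the optical axis.
print(backproject(320.0, 240.0, 10.0, f=500.0, cx=320.0, cy=240.0))  # (0.0, 0.0, 10.0)
```

In a real reconstruction the depth is not given but is recovered by matching pixels between views, as discussed below; the sketch only fixes the geometric convention.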
Conventional mosaicing, so-called 2D mosaicing, consists, on the basis of successive images of a scene, in projecting them successively onto a principal plane of the scene and in assembling them to produce a mosaic of the scene.
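When the scene is well approximated by a principal plane, this projection amounts to warping each image onto the mosaic plane with a 3×3 homography. A minimal sketch of the per-pixel mapping, using a hypothetical homography H chosen here as a pure translation, is:

```python
def apply_homography(H, u, v):
    """Project pixel (u, v) into the mosaic plane using a 3x3 homography H
    (row-major nested lists); returns the dehomogenized coordinates."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w

# Hypothetical homography: a pure translation of (100, 50) pixels.
H = [[1.0, 0.0, 100.0],
     [0.0, 1.0, 50.0],
     [0.0, 0.0, 1.0]]
print(apply_homography(H, 10.0, 20.0))  # (110.0, 70.0)
```

As the prior art discussed below notes, this planar model is only conformal when the scene is actually flat; 3D relief breaks the single-homography assumption.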
Techniques for passive 3D scene reconstruction on the basis of cameras are described in various reference works:
- R. Horaud & O. Monga, Vision par Ordinateur: Outils Fondamentaux, Editions Hermès, 1995. http://www.inrialpes.fr/movi/people/Horaud/livre-hermes.html
- Olivier Faugeras, Three-Dimensional Computer Vision, MIT Press, 1993.
- Frédéric Devernay, INRIA Grenoble, course "Vision par ordinateur 3-D". http://devernay.free.fr/cours/vision/
- Tébourbi Riadh, SUP'COM 2005, IMAGERIE 3D, 08/10/2007.
- Gary Bradski, Learning OpenCV: Computer Vision with the OpenCV Library, 2008.
These works all describe techniques for 3D scene reconstruction on the basis of pairs of stereoscopic images originating from cameras positioned at different viewpoints. These may either be fixed cameras positioned at various sites in space, or a single camera whose position varies over time. The basic principle is always the same: the camera images are matched two by two to form a stereoscopic 3D reconstruction of the portion of space viewed by the cameras.
They also explain the principle of epipolar rectification, in which the focal-plane image of each camera is rectified, according to the attitude of the camera, onto a so-called rectification plane, so as to facilitate the matching between the images of the stereoscopic pair and enable the 3D reconstruction. The method has been optimized in various ways by different authors, but always relies on the same principle: it is first necessary to correct the optical distortions of the camera, and thereafter to use the relative attitudes of the two cameras to determine the rectification plane on the basis of which the matching and the 3D reconstruction are performed.
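The benefit of rectification is that matched points end up on the same image row, so the correspondence search is one-dimensional and depth follows directly from the horizontal disparity. A minimal sketch of this classical relation, with hypothetical numerical values, is:

```python
def depth_from_disparity(f_pixels, baseline_m, disparity_pixels):
    """For a rectified stereoscopic pair, matched points lie on the same
    image row and the depth follows from the disparity: Z = f * B / d,
    where f is the focal length in pixels, B the stereo base in meters,
    and d the horizontal disparity in pixels."""
    if disparity_pixels <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return f_pixels * baseline_m / disparity_pixels

# Hypothetical values: f = 1000 px, stereo base B = 0.5 m, disparity d = 20 px.
print(depth_from_disparity(1000.0, 0.5, 20.0))  # 25.0 (meters)
```

The formula also makes visible the drawback noted below: with a stereo base limited to the spacing between the cameras, a small error on d produces a large error on Z at long range.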
Other techniques of passive 3D reconstruction exist in the literature, for example the so-called silhouetting techniques, which are not considered here since they apply to particular cases and require prior knowledge of the scene.
Among the techniques of active reconstruction of a scene, those based on lidar may be cited; they make it possible to reconstruct the 3D mesh of the scene directly through a distance computation.
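The distance computation in question converts each lidar return (a measured range along a known beam direction) into a 3D point. A minimal sketch, assuming a simple azimuth/elevation beam model:

```python
import math

def lidar_point(range_m, azimuth_rad, elevation_rad):
    """Convert one lidar return (range plus beam direction) into scene
    coordinates (X, Y, Z): the direct distance computation at the heart
    of active reconstruction."""
    horiz = range_m * math.cos(elevation_rad)   # projection onto the X-Y plane
    x = horiz * math.cos(azimuth_rad)
    y = horiz * math.sin(azimuth_rad)
    z = range_m * math.sin(elevation_rad)
    return x, y, z

# A return at 100 m along the X axis, at zero elevation.
print(lidar_point(100.0, 0.0, 0.0))  # (100.0, 0.0, 0.0)
```

Accumulating such points over the scan yields the point cloud whose angular density and exploitability are discussed below.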
Among the reference works may be cited:
- MATIS studies for the IGN: Clément Mallet, "Using Full Waveform Lidar Data for Mapping of Urban Areas", doctoral thesis, 2010.
- Frédéric Bretar, "Couplage de données laser aéroporté et photogrammétriques pour l'analyse de scènes tridimensionnelles", doctoral thesis, 2006.
An interesting article shows that these techniques have limits when reconstructing 3D objects of complex (for example concave) shape: Nicolas Loémie, Laurent Gallo, Nicole Cambou, Georges Stamon, "Structuration plane d'un nuage de points 3D non structuré et détection des zones d'obstacles", Vision Interface conference, 1999.
Concerning mosaicing, the following reference works may be cited:
- L. G. Brown, "A Survey of Image Registration Techniques", ACM Computing Surveys, vol. 24, no. 4, 1992.
- Lionel Robinault, "Mosaïque d'images multirésolution et applications", doctoral thesis, Université de Lyon, 2009.
If one summarizes the prior art relating to 3D reconstruction, it may be said that 3D reconstruction may be partially obtained by using:
- Pairs of cameras producing a spatially stereoscopic image of the scene, whose images are fused to produce a 3D reconstruction and optionally a mosaic of the scene. This solution exhibits several drawbacks:
  - the cameras are difficult to calibrate (problems of vibration);
  - inaccuracy in restitution of the 3D reconstruction on account of a stereo base limited by the spacing between the cameras;
  - restitution of low field and low extent on account of the limited optical field of the cameras.
Moreover, the finalized 3D reconstruction is not straightforward to obtain, since it is constructed by assembling local 3D reconstructions (resulting from the stereoscopic restitution of two, often small-field, images). These local reconstructions may be very noisy on account of the limited number of images used to construct them, of the limited field of the cameras, and of the fact that the reconstruction planes, which depend on the respective attitudes of the cameras, have a geometry that is difficult to measure accurately: the relative position and relative geometry of the cameras used for the 3D reconstruction are often inaccurate in practice when the cameras are 1 or 2 meters apart and liable to vibrate with respect to one another, and this is still more evident when the cameras are motorized. The precise way of assembling the intermediate 3D reconstructions is never described in detail, and in practice many errors are noted in the finalized 3D reconstruction, which in any event remains small in spatial and angular extent (typically less than 200 m × 200 m in spatial extent, with an angular extent of typically less than 30°).
Finally, the rectification and matching method itself, which depends on the attitudes of the cameras and entails a preliminary step of derotation of the focal plane in the rectification process, implies that there are typical cases where the 3D reconstruction exhibits holes, especially if the system undergoes rotational motions over time.
Lastly, the stereoscopic system poorly restores planes that are almost perpendicular to the line of sight of one of the two cameras (this is the problem of the restitution of pitched roofs in aerial or satellite stereoscopic imaging).
- A moving low-field or medium-field camera. The 3D reconstruction is then limited by the path and the orientation of the camera and is therefore not omnidirectional; moreover, the reconstruction may exhibit holes on account of unchecked motions of the camera or of non-overlaps between images in the course of its motion. The algorithms used for 3D reconstruction impose a reconstruction in a frame tied to, or close to, the focal plane of the camera, thereby limiting the possibilities of reconstruction (a single principal reconstruction plane, and very limited reconstruction when the camera changes orientation). The result of the reconstruction is also very noisy and may exhibit numerous errors, on account of the small overlap between images, of a constant reconstruction plane for the reconstructed scene (from which the camera may deviate), and of the use of algorithms which exploit only two images, separated by a relatively small distance, for the 3D reconstruction. The mosaic obtained by overlaying the successive images on the ground is inoperative and is not conformal when the scene is not flat and/or comprises 3D elements.
- Active sensors, that is to say sensors with telemetry. Here again the 3D reconstruction is not omnidirectional and is not necessarily segmented, the measurements being obtained in the form of point clouds that are difficult to exploit in an automatic manner. Moreover, the mesh obtained by these active sensors has the drawback of being angularly non-dense (typically fewer than 4 points per m² for airborne applications at 1 km height). At present the technique is not suitable for producing a textured image of the scene and must almost always be corrected manually.
All the previous solutions are unsuitable for obtaining a 3D mosaic or a 3D reconstruction of a 3D scene of large dimension, that is to say greater than 500 m × 500 m. The instantaneous 3D mosaics obtained exhibit deformations and are limited in angular extent (typically less than 30°) or in spatial extent. Assembling the mosaics is complex when the terrain is three-dimensional, and the final result does not conform to the geometry of the scene.
The drawbacks of the prior-art procedures set out above are not limiting; other drawbacks are described in the present patent.