Metric 3D scene geometry reconstruction involves generating points that map out scenes in 3D space using sets of 2D images depicting the scenes from various vantage points. For example, 3D scene geometry reconstruction has been used to generate points that define buildings and other physical structures in a geographic area (example scene) using aerial photographs of the geographic area (example 2D image set), such as photographs taken by drones flying over the geographic area.
Metric 3D scene geometry reconstruction has been performed using a variety of inputs, including reference ground points, camera extrinsics, camera intrinsics, and georegistration information. “Camera extrinsics” refers to the position of the camera within 3D space and the camera's orientation relative to a coordinate frame axis. “Camera intrinsics” refers to the radial distortion of the camera's lens, the focal length of the camera, the principal point, and other optical factors affecting the image captured by a camera. Reference ground points are points in 3D space for features within a scene that can be correlated across multiple images and can be used to triangulate the relative pose of the cameras viewing the features. Georegistration information includes information that identifies the geographic location at which an image was captured, such as GPS coordinates.
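As an illustration only, the roles of camera extrinsics and camera intrinsics can be sketched with a simple pinhole projection of a 3D point into image coordinates. The names below (Extrinsics, Intrinsics, project_point), the world-to-camera convention Xc = R(X - C), and the single-coefficient radial distortion model are assumptions chosen for this sketch, not features of any particular reconstruction technique described here.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Extrinsics:
    """Hypothetical container: camera pose in the world coordinate frame."""
    rotation: Sequence[Sequence[float]]    # 3x3 world-to-camera rotation matrix
    position: Tuple[float, float, float]   # camera center in world coordinates


@dataclass
class Intrinsics:
    """Hypothetical container: optical parameters of the camera."""
    focal_length: float                    # focal length, in pixels
    principal_point: Tuple[float, float]   # (cx, cy), in pixels
    k1: float = 0.0                        # assumed one-term radial distortion


def project_point(point, ext: Extrinsics, intr: Intrinsics) -> Tuple[float, float]:
    """Project a 3D world point to pixel coordinates (u, v)."""
    # Extrinsics: express the point in the camera frame, Xc = R (X - C).
    d = [point[i] - ext.position[i] for i in range(3)]
    xc, yc, zc = (
        sum(ext.rotation[r][c] * d[c] for c in range(3)) for r in range(3)
    )
    if zc <= 0:
        raise ValueError("point is not in front of the camera")

    # Perspective division onto the normalized image plane.
    x, y = xc / zc, yc / zc

    # Intrinsics: radial distortion, then focal length and principal point.
    scale = 1.0 + intr.k1 * (x * x + y * y)
    cx, cy = intr.principal_point
    return (intr.focal_length * scale * x + cx,
            intr.focal_length * scale * y + cy)
```

In this sketch, a reference ground point observed in several images would yield one such projection per camera, and reconstruction amounts to finding the points and camera parameters that best explain the observed pixel locations.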
Methods for reconstructing metric 3D scene geometries from collections of 2D images on small scales (e.g., a small number of images included in a source image set) have used a variety of techniques, such as structure-from-motion techniques and other photogrammetric reconstruction techniques. These techniques have exhibited high-order polynomial-time algorithmic growth, which has made them inefficient when applied to large-scale image sets (e.g., “internet-scale” image sets).