The ability to generate, capture, process, and render data on computers, particularly mobile phones, has evolved with the advancement of silicon chip manufacturing. Likewise, the ability to generate a 3D model of an object from a series of 3D images has evolved with the advancement of image capture technology and related data processing. Detailed 3D models are used in a variety of applications, such as human modeling, augmented reality, virtual reality, spatial modeling, portable mapping, and video games. The models are usually created from image frame data obtained from an image capture device, such as a digital camera or a depth sensor.
Those skilled in the art have developed multi-view photogrammetric structure from motion (“SFM”) 3D reconstruction methods that use video image data, or more fundamentally a series of still photos, to estimate the 3D geometry of a landscape or a space. In general, the method matches corresponding points between successive 2D images and records the relative positions of the image recording device. This process is also known as photogrammetry.
Multiple successive images taken using SFM, stereo-vision, time-of-flight (“ToF”) depth sensors, structured light depth sensors, light detection and ranging (“LIDAR”) or any other depth-sensing technologies may be used to create a disparity map or an approximated 3D point cloud. Points within the 3D point cloud are connected to create a surface structure commonly known as a 3D mesh. Various methods are known to extrapolate a 3D model of the target subject from the 3D mesh, some applying a texture based on high-resolution image data.
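By way of illustration only, the conversion of a depth map into an approximated 3D point cloud may be sketched as follows. The sketch assumes a simple pinhole camera model. The function name depth_to_point_cloud and the intrinsic parameters fx, fy, cx, and cy are hypothetical and are not taken from any of the technologies named above.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into a list of (X, Y, Z) points.

    Assumptions (illustrative only): `depth` is a list of rows of depth
    values, (fx, fy) are focal lengths in pixels, and (cx, cy) is the
    principal point of an ideal pinhole camera.
    """
    points = []
    for v, row in enumerate(depth):          # v is the pixel row
        for u, z in enumerate(row):          # u is the pixel column
            if z <= 0:
                continue                     # skip pixels with no valid depth
            x = (u - cx) * z / fx            # pinhole back-projection
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

Each pixel having a valid depth value yields one point. Connecting neighboring points into a 3D mesh is a separate step which this sketch does not attempt.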
The 3D mesh generation process typically uses one or more methods to incorporate the image capture device's location and orientation to improve point cloud accuracy. The process may also incorporate standard color camera data which are then projected onto the final 3D model.
Various software applications may be employed to perform 3D reconstruction by standard 2D color image analysis, such as Acute3D, Agisoft Photoscan, Autodesk 123D Catch, Sketchup, and VisualSFM (registered trademarks). Reconstruction of a 3D mesh from images may also be achieved by RGBA-Depth analysis using software applications such as ReconstructMe, Occipital's Skanect and Structure SDK, Geomagic, Artec Studio, Microsoft's SDK for Kinect, or Microsoft's MobileFusion (registered trademarks).
Hardware and software designed to reconstruct 3D models have become significantly more sophisticated over the last couple of decades. The hardware required to capture the data and process it with 3D reconstruction software has become more compact and portable. Likewise, software applications for processing and rendering 3D models have become more efficient and effective at producing very detailed and realistic 3D models. The hardware in most flagship smartphones is powerful enough to run sophisticated 3D reconstruction software, but the typical native software often crashes due to incompatibility with diverse hardware configurations.
Moreover, the brand-specific hardware used in many consumer-based computing products typically requires a specific software platform which is not compatible with other consumer brands' hardware. Therefore, a software advancement such as Microsoft's Kinect Fusion™ may not operate on a different brand's device without adapting its code. Such brand-specific development limits a consumer's ability to enjoy many technological advancements.
The general approach to processing 3D image-related data employs sequential processing logic and is coupled to a specific hardware configuration. Data is acquired from one or more sensors and input into certain algorithm computing modules (“modules”). Each module sequentially processes a different type of data, such as depth map data, tracking data or color image data, at different stages of the 3D reconstruction process. The device may employ a Central Processing Unit (“CPU”), a Graphics Processing Unit (“GPU”), an Application Specific Integrated Circuit (“ASIC”), a Digital Signal Processor (“DSP”), a Field-Programmable Gate Array (“FPGA”), or a combination thereof to compose a final 3D mesh. Standard CPUs in typical smartphones and tablets are too slow to process the data in real time without reducing the resolution and/or speed of reconstruction. Consequently, current 3D reconstruction software applications that primarily employ CPUs often cause a standard, consumer-oriented portable computing device to “crash” due to a system timeout. Alternatively, the software might not run at all because it requires more memory or a higher processor speed than is available. In such cases the data must be uploaded to a render server, which the average consumer may not be able to access or know how to operate.
There is a need to be able to record and process continuous, real-time depth image and/or video capture data on the average hand-held portable device, and simultaneously generate a 3D model on the device without the aid of ancillary or external processing devices. A “ring buffer” data structure and “thread-safe processing pipeline” approach has been developed to fulfill this need and is disclosed in one embodiment of the invention. Under this approach, the modules within the 3D reconstruction system have been decoupled and ring buffers have been inserted. Each module may enqueue data into a connected ring buffer without sending the data directly to another module. Data may be temporarily stored in the corresponding ring buffer until one of the modules dequeues the data for use in processing a subsequent or corresponding portion of the 3D model. A multitude of algorithms are commonly known in the art for enqueueing and dequeuing data to and from ring buffers, and no specific algorithm is required for the invention to operate as intended.
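By way of illustration only, one of the many possible ring buffer implementations may be sketched as follows. Because no specific enqueueing or dequeuing algorithm is required, the class name RingBuffer, the method names enqueue and dequeue, and the timeout parameter are hypothetical choices made for this sketch, which assumes a fixed capacity and a single lock shared by two condition variables.

```python
import threading


class RingBuffer:
    """Fixed-capacity FIFO ring buffer (illustrative sketch, not a required design)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0       # index of the oldest stored item
        self._count = 0      # number of items currently stored
        lock = threading.Lock()
        # Both conditions share one lock so every state change is atomic.
        self._not_empty = threading.Condition(lock)
        self._not_full = threading.Condition(lock)

    def enqueue(self, item, timeout=None):
        """Store item, waiting up to `timeout` seconds for a free slot.

        Returns True on success and False if the buffer stayed full.
        """
        with self._not_full:
            if not self._not_full.wait_for(
                lambda: self._count < self._capacity, timeout
            ):
                return False
            tail = (self._head + self._count) % self._capacity
            self._slots[tail] = item
            self._count += 1
            self._not_empty.notify()   # wake one waiting consumer
            return True

    def dequeue(self, timeout=None):
        """Remove the oldest item, waiting up to `timeout` seconds for data.

        Returns (True, item) on success and (False, None) on timeout.
        """
        with self._not_empty:
            if not self._not_empty.wait_for(lambda: self._count > 0, timeout):
                return False, None
            item = self._slots[self._head]
            self._slots[self._head] = None           # release the reference
            self._head = (self._head + 1) % self._capacity
            self._count -= 1
            self._not_full.notify()    # wake one waiting producer
            return True, item

    def __len__(self):
        with self._not_empty:
            return self._count
```

In such a sketch, a module that produces depth map data would call enqueue on the ring buffer connected to its output, and a downstream module would call dequeue on the same ring buffer when it is ready to process that data, so that neither module calls the other directly.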
The system may process a large amount of data in multiple threads without crashing. If a module becomes “blocked” (i.e., “backed-up” or “starved” of data), other modules may continue to receive, process, and enqueue data to buffers. The ability to run modules in separate threads permits continuous data processing, which will allow a “blocked” module to resume processing and prevent a crash.
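By way of illustration only, the decoupling of modules into separate threads may be sketched as follows. To keep the sketch self-contained, the bounded queue.Queue of the Python standard library is used as a stand-in for the ring buffers, which is an assumption of the sketch rather than a feature of the system. The module functions slow_depth_filter and fast_mesher, the STOP sentinel, and the capacity value are likewise hypothetical.

```python
import queue
import threading
import time

STOP = object()  # hypothetical end-of-stream marker passed through the buffers


def run_module(work, inbox, outbox):
    """Run one module: dequeue from inbox, process, enqueue to outbox."""
    while True:
        item = inbox.get()        # waits here while the module is starved of data
        if item is STOP:
            outbox.put(STOP)      # forward the shutdown signal downstream
            return
        outbox.put(work(item))    # waits here while the downstream buffer is full


def slow_depth_filter(frame):
    time.sleep(0.001)             # simulate a module that falls behind
    return frame * 2


def fast_mesher(depth):
    return depth + 1


def run_pipeline(frames, capacity=4):
    """Push frames through two decoupled modules and collect the results."""
    raw = queue.Queue(maxsize=capacity)        # stand-in for a ring buffer
    filtered = queue.Queue(maxsize=capacity)   # stand-in for a ring buffer
    meshed = queue.Queue()                     # unbounded, drained after the run
    threads = [
        threading.Thread(target=run_module, args=(slow_depth_filter, raw, filtered)),
        threading.Thread(target=run_module, args=(fast_mesher, filtered, meshed)),
    ]
    for thread in threads:
        thread.start()
    for frame in frames:
        raw.put(frame)
    raw.put(STOP)
    for thread in threads:
        thread.join()

    results = []
    while True:
        item = meshed.get()
        if item is STOP:
            break
        results.append(item)
    return results
```

In this sketch the faster module simply waits on its input buffer whenever the slower module falls behind, and resumes as soon as data arrives, so the temporary imbalance between the two modules does not halt the pipeline.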
Furthermore, owing to its versatile data structures, processing pipeline, and module implementations, the system may run 3D reconstruction software on a diverse suite of devices without re-compiling the software. Software developers may employ the invention to implement their algorithms across a diverse range of hardware and software platforms, which will promote broader access to cutting-edge software developments.