Numerous algorithms are available for creating a three-dimensional (3D) model from a set of two-dimensional (2D) images, with each 2D image being taken from a different angle relative to the captured scene (see, e.g., T. Moons, L. Van Gool, and M. Vergauwen, “3D Reconstruction from Multiple Images Part 1: Principles”, Foundations and Trends in Computer Graphics and Vision, Vol. 4, pages 287-404, 2010; and J. Li, E. Li, Y. Chen, L. Xu, “Visual 3D Modeling from Images and Videos”, Technical report, Intel Labs China, 2010). In simple terms, such algorithms reverse the process of obtaining 2D images from a 3D model. For instance, “123D Catch” is an iPad app by Autodesk which makes it possible to turn pictures into a 3D model (see, e.g., http://www.123dapp.com/catch).
The 3D model can subsequently be used for rendering images, such as 2D images for arbitrary viewing directions, or images which achieve a stereoscopic 3D effect by encoding each eye's image using filters of different, usually chromatically opposite, colours, typically red and cyan (known in the art as ‘anaglyph 3D’).
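By way of illustration only, the red and cyan encoding mentioned above can be sketched as follows. The sketch assumes that two views of the scene, one per eye, have already been rendered as equally sized grids of (red, green, blue) pixel values; the function name `make_anaglyph` and the image representation are hypothetical and are not taken from any of the cited works.

```python
def make_anaglyph(left, right):
    """Combine a left-eye and a right-eye image into one red/cyan anaglyph.

    Each image is a list of rows, each row a list of (r, g, b) tuples.
    This representation is an assumption made for the sketch.
    """
    same_shape = len(left) == len(right) and all(
        len(l_row) == len(r_row) for l_row, r_row in zip(left, right)
    )
    if not same_shape:
        raise ValueError("left and right images must have the same dimensions")

    # Red channel from the left-eye view, green and blue (cyan) from the
    # right-eye view, so that red/cyan glasses route one view to each eye.
    return [
        [(l_px[0], r_px[1], r_px[2]) for l_px, r_px in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]
```

For example, `make_anaglyph(left_view, right_view)` returns a single image of the same size, which may be viewed through a red filter over the left eye and a cyan filter over the right eye.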
The extent to which a 3D scene can be recreated depends on the number and spatial relation of the 2D images used. If the scene is captured in 2D images from a wider range of viewpoints, the 3D effect can be extended over a greater angle.
Even though algorithms for creating a 3D model from a set of 2D images exist, their use remains a niche activity as it often requires specialist equipment and/or extensive preparation. Creation of 3D models is particularly problematic when the scene is dynamic rather than static, e.g., where the scene contains a person. In such a case, a set of 2D images needs to be captured from different angles relative to the scene within a relatively short time interval, requiring use of synchronised cameras.
A potentially attractive scenario is where the members of a group of people, e.g., guests at a wedding, a birthday party, or the like, hereinafter also referred to as participants, each capture one 2D image, and these images are collated and used to create a 3D model of the captured scene.
There are two obstacles which hamper a quick and simple recreation of a 3D scene, in particular a non-static one. Firstly, the 2D images need to either be selected after they have been captured, or their capturing needs to be coordinated, so that they are captured at approximately the same time, i.e., within a relatively short time interval. This requires either accurate time synchronisation or triggering between different cameras, or capturing a large number of images and selecting, from the set of captured images, those which are appropriate for reconstructing a 3D model.
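The selection of images after capture, as described above, may be sketched as follows. The sketch assumes that each captured image is accompanied by a device identifier and a capture time expressed in seconds on a common clock, which itself presupposes synchronised clocks; the function name `select_synchronous` and the parameter `max_spread` are hypothetical. It returns the largest group of images whose capture times all lie within the given interval.

```python
def select_synchronous(captures, max_spread):
    """Return the largest group of captures taken within max_spread seconds.

    captures: list of (device_id, timestamp_seconds) tuples (assumed format).
    max_spread: maximum allowed difference between the earliest and the
        latest capture time in the selected group.
    """
    ordered = sorted(captures, key=lambda capture: capture[1])
    best = []
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it spans at most max_spread.
        while ordered[end][1] - ordered[start][1] > max_spread:
            start += 1
        # Keep the first window found with the most captures.
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return best
```

For instance, given captures at 0.0 s, 5.0 s, 5.2 s, 5.3 s and 9.0 s and a `max_spread` of 0.5 s, the three captures between 5.0 s and 5.3 s are selected. The sketch considers only the time of capture; whether the selected images cover sufficiently different angles relative to the scene would have to be checked separately.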
Secondly, the selected images need to be transferred or exchanged between the participants' devices, which may be an issue where the participants do not know each other, do not wish to exchange contact information to facilitate sharing of images, or are not willing to spend time ‘pairing’ their camera with the cameras of other participants. For most situations, the level of coordination and networking of cameras which is required for creating a 3D model from a set of 2D images is overly complicated.