The editing of films and video images, for example to rearrange action sequences, is well known. However, the movie and video cameras used to capture the images that are later edited do not store with those images any machine-understandable record of image and camera position. Accordingly, the edited films and videos permit one to view the images in only one predetermined order, determined by the editor. If some other ordering of the image presentation is desired, it must be achieved through a difficult manual editing process.
A computerized, interactive editing process is described in a doctoral thesis “Cognitive Space in the Interactive Movie Map: An Investigation of Spatial Learning in Virtual Environments”, by Robert Mohl, 1981, submitted at MIT. In a demonstration carried out using images recorded at Aspen, Colo., the viewer is permitted to select film clips taken by a camera that is arranged to simulate driving down a street. At each intersection, the viewer chooses to turn left, turn right, or to proceed straight ahead. The viewer thereby simulates driving around streets in Aspen, Colo.
In other fields, it is known to gather, along with images, information concerning the position of the camera. Governmental and private agencies use satellites and airplanes to record images of positionally referenced data, such as land features or clouds. Each image frame contains positional references to the image tilt or plane of the camera. Present methods commonly either constrain the orientation of the camera to a fixed position, i.e. up and down, or use features captured in the image frames to derive relative positions and orientations of successive images when combining the image frames to form a map or the like.
Devices are known which combine images by matching features common to each of two or more images, that is, by superimposing the images.
One aspect of the present invention is recording positional data along with images. A number of methods are known whereby one may locate an object and describe the position of an object relative to a positional reference. For example, a magnetic device is known which can determine its position and orientation within a known magnetic field. Satellite systems and radio signal triangulation can also be used to determine position precisely. Inertial position determination systems are also known and are widely used in inertial navigational systems.
An object of this invention is providing an image data gathering device which encodes positional and/or spatial information by capturing both camera position and camera orientation information along with image data. This information permits images to be joined or sequenced for viewing without the distortions that can result from attempting to match the edges of adjoining images together.
A further object of this invention is providing three-dimensional image reconstruction of objects using frames shot from different viewpoints and perspectives through the provision of a triangulation reference.
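By way of illustration only, the triangulation idea can be sketched in a simplified plan view: two frames that view the same object from different recorded camera positions define two sight lines, and the object lies where they cross. The function name `triangulate_2d`, the two-dimensional simplification, and the convention that yaw is measured in degrees counterclockwise from the +x axis are assumptions made for this sketch and are not taken from the specification.

```python
import math

def triangulate_2d(p1, yaw1_deg, p2, yaw2_deg):
    """Intersect two sight lines in the plan view.

    p1, p2 are (x, y) camera positions; yaw angles are in degrees,
    counterclockwise from the +x axis (an assumed convention).
    Returns the (x, y) crossing point, or None if the lines are parallel.
    """
    d1 = (math.cos(math.radians(yaw1_deg)), math.sin(math.radians(yaw1_deg)))
    d2 = (math.cos(math.radians(yaw2_deg)), math.sin(math.radians(yaw2_deg)))
    # 2-D cross product of the two direction vectors
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None  # sight lines are (nearly) parallel: no usable fix
    # Solve p1 + t*d1 = p2 + s*d2 for t
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

if __name__ == "__main__":
    print(triangulate_2d((0.0, 0.0), 45.0, (2.0, 0.0), 135.0))
```

The sketch does not check whether the crossing point lies in front of both cameras, and a full three-dimensional reconstruction would also need pitch and roll.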
Still another object of this invention is providing a camera path map which allows images to be selected based upon the position and orientation of the camera from the map. For example, an operator can learn the location of an object in a film clip, such as an escalator. Images of the escalator may then be quickly and automatically located by selecting other frames which point to that same escalator from different camera positions.
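The escalator example can be sketched as a simple filter over frames that carry a recorded position and heading. This is a minimal sketch under stated assumptions: the `Frame` record, the function `frames_viewing`, the plan-view (two-dimensional) simplification, and the 20-degree half field of view are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Frame:
    index: int       # frame number in the recording
    x: float         # camera position in the plan view (arbitrary units)
    y: float
    yaw_deg: float   # heading, degrees counterclockwise from +x (assumed)

def frames_viewing(frames, target_x, target_y, half_fov_deg=20.0):
    """Return the frames whose optical axis points within half_fov_deg
    of the bearing from the camera to the target location."""
    hits = []
    for f in frames:
        bearing = math.degrees(math.atan2(target_y - f.y, target_x - f.x))
        # Wrap the angular difference into [-180, 180)
        diff = (bearing - f.yaw_deg + 180.0) % 360.0 - 180.0
        if abs(diff) <= half_fov_deg:
            hits.append(f)
    return hits

if __name__ == "__main__":
    frames = [
        Frame(0, 0.0, 0.0, 0.0),
        Frame(1, 0.0, 0.0, 90.0),
        Frame(2, 10.0, 10.0, -120.0),
    ]
    for f in frames_viewing(frames, 5.0, 0.0):
        print(f.index)
```

A fuller version might also use the rangefinder distance to reject frames in which the object is too far away or is hidden behind a nearer subject.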
Another object of the invention is providing a compact and practical image and positional data recording system which uses commonly available equipment. A system having accelerometers mounted directly upon the recorder, eliminating the need for a restrained or gimballed platform, permits greater freedom of motion for the recording device as well as reduced cost and complexity.
Briefly described, the invention resides in a video camera that is integrated with a tracking data acquisition unit containing accelerometers and gimbal-mounted gyroscopes, and optionally a rangefinder. As the operator of the video camera moves about taking a motion picture of the environment, a microprocessor and logic associated with the accelerometers and gyroscopes sense all rotational motions of the camera by means of sensors associated with the gimbals and sense all translational motions of the camera by means of sensors associated with the accelerometers. The rangefinder provides information to the microprocessor and logic concerning the distance from the camera to the subject being photographed.
From data presented by these sensors, the microprocessor and logic compute and generate a modulated audio signal that is encoded with a continuous record of acceleration in the X, Y and Z directions as well as with a continuous record of the pitch, roll, and yaw of the camera and of the distance to the subject. This audio tracking information data signal is recorded on the audio track of the same video tape upon which the video images are being recorded by the camera. In this manner, the video tape recording captures, along with the sequence of images, the tracking data from which the precise position of the camera, its precise orientation, and the position of the subject may later be computed.
Later, the recorded audio tracking information data and video data are played back into a computer. Images are selected from the sequence of images and are retained, in compressed form, in a database. Each image is then linked to computed positional information that defines, for each image, the location and orientation of the camera and, optionally, the distance to the subject and the subject location. This positional information is derived through computation from the tracking information retrieved from the video tape audio track, as will be explained below.
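As a rough illustration of how position can be computed from a record of acceleration, the samples may be integrated twice over time. The sketch below is not the computation of the detailed description; it assumes, for simplicity, that the accelerations have already been rotated into a fixed reference frame using the recorded pitch, roll, and yaw, that gravity has been removed, and that the camera starts at rest at the origin. The function name `integrate_path` and the fixed sample interval `dt` are hypothetical.

```python
def integrate_path(samples, dt):
    """Integrate acceleration samples twice to obtain a camera path.

    samples: iterable of (ax, ay, az) in a fixed reference frame with
             gravity already removed (an assumption of this sketch).
    dt:      sample interval in seconds.
    Returns a list of (x, y, z) positions, starting at the origin.
    """
    vx = vy = vz = 0.0
    x = y = z = 0.0
    path = [(x, y, z)]
    for ax, ay, az in samples:
        # First integration: acceleration -> velocity
        vx += ax * dt
        vy += ay * dt
        vz += az * dt
        # Second integration: velocity -> position
        x += vx * dt
        y += vy * dt
        z += vz * dt
        path.append((x, y, z))
    return path

if __name__ == "__main__":
    # Constant acceleration along X for one second, sampled at 10 Hz
    path = integrate_path([(1.0, 0.0, 0.0)] * 10, dt=0.1)
    print(path[-1])
```

Because each step accumulates the previous one, small sensor errors grow with time, which is one reason a practical system would need periodic correction against a known reference.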
Next, special computer programs can aid an individual using the computer in navigating through the images, using the positional information to organize the images in ways that make it easy for the user to browse through the images presented on the graphics screen. Several such programs are described below, and a complete description is presented of a movie mapper program which presents the user with a plan view and elevational views of the camera path plotted as a graph alongside views of selected images, with the path marked to show the position and orientation of the camera. The user, by clicking at any point on this path with a computer mouse, may instantly retrieve and view an image captured at the chosen point. Additionally, by clicking upon diamonds, arrows, and the like displayed as overlays superimposed upon an image, the user may command the program to search for and find the nearest image which gives a view rotated slightly to the left or right, or which maintains the same view but advances forward in the direction of the view or moves backward. One may also jump forward and turn simultaneously. A wider field of view may be obtained by automatically choosing images and aligning them into a panorama. The user is thus enabled to navigate through the images in the manner of navigating a boat, to the extent permitted by the nature and variety of the images in the database.
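The search for the nearest image that is turned or advanced relative to the current view might be sketched as follows. This is a simplified plan-view illustration, not the movie mapper program itself: the `Shot` record, the functions `nearest_turned` and `nearest_ahead`, and the 10-degree heading tolerance are assumptions introduced here.

```python
import math
from typing import NamedTuple, Optional, Sequence

class Shot(NamedTuple):
    index: int
    x: float
    y: float
    yaw_deg: float  # degrees counterclockwise from +x (assumed convention)

def angle_diff(a: float, b: float) -> float:
    """Signed difference a - b wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def nearest_turned(shots: Sequence[Shot], current: Shot,
                   turn_deg: float, tol_deg: float = 10.0) -> Optional[Shot]:
    """Closest shot whose heading is about current heading + turn_deg."""
    wanted = current.yaw_deg + turn_deg
    candidates = [s for s in shots
                  if s.index != current.index
                  and abs(angle_diff(s.yaw_deg, wanted)) <= tol_deg]
    if not candidates:
        return None
    return min(candidates,
               key=lambda s: math.hypot(s.x - current.x, s.y - current.y))

def nearest_ahead(shots: Sequence[Shot], current: Shot,
                  tol_deg: float = 10.0) -> Optional[Shot]:
    """Closest shot with a similar heading that lies in front of the camera."""
    hx = math.cos(math.radians(current.yaw_deg))
    hy = math.sin(math.radians(current.yaw_deg))
    best, best_dist = None, math.inf
    for s in shots:
        if s.index == current.index:
            continue
        if abs(angle_diff(s.yaw_deg, current.yaw_deg)) > tol_deg:
            continue
        # Projection of the offset onto the viewing direction
        along = (s.x - current.x) * hx + (s.y - current.y) * hy
        if along <= 0.0:
            continue  # behind or beside the camera
        dist = math.hypot(s.x - current.x, s.y - current.y)
        if dist < best_dist:
            best, best_dist = s, dist
    return best
```

A backward move could be handled the same way by keeping only shots with a negative projection, and a combined jump and turn by applying both the heading and the projection tests at once.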
Further objects and advantages are apparent in the drawings and in the detailed description which follows.