1. Field of the Invention
The present invention relates generally to systems and user interfaces for interacting with three-dimensional content on mobile devices and is in the general technical fields of Human-Computer Interaction (HCI), mobile multimedia platforms, mobile displays, and mobile virtual worlds. In particular, it is in the fields of motion estimation and spatial and gestural interaction.
2. Description of the Related Art
The amount of three-dimensional content available on the Internet and in other contexts, such as video games and medical imaging, is increasing at a rapid pace. Consumers are becoming more accustomed to hearing about “3D” in various contexts, such as movies, games, and online virtual cities. However, mobile devices have so far not been adapted to let users navigate through and interact with 3D content in a significant way. Unlike in the desktop setting, where the user may have external controllers such as mice, joysticks, or game controllers available, mobile users still mostly use buttons and keys, both physical and virtual, to interact with 3D content.
In addition, today's mobile devices do not provide an immersive user experience with 3D content because their displays allow only a limited field of view (FOV). This is because display size is limited by the size of the device; for example, a non-projection display cannot be larger than the mobile device that contains it. Therefore, existing solutions for mobile displays limit the immersive experience for the user. Furthermore, 3D content such as virtual worlds is difficult to navigate on mobile devices, and small-screen mobile devices do not provide good awareness of the virtual surroundings.
Previously, there have been a number of approaches to detecting the ego-motion speed of a mobile device (i.e., the motion speed of the mobile device itself relative to a fixed framework, such as the world/environment around it, detected with sensors on the device itself). One method is to use a single (visual) imager or image sensor (e.g., a built-in camera on a cellphone) to detect the overall optic flow of the background scenery in real time. However, this approach does not easily distinguish between shifting (linear) motion and rotational motion, since the optic flow fields of these types of motion may be very similar.
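The similarity between the two flow fields can be illustrated with a minimal sketch. It assumes a pinhole camera, a scene at constant depth, a narrow field of view, and one common sign convention for the flow equations. The focal length, depth, shift, and yaw values are hypothetical and chosen only for illustration; none of them come from the systems described above.

```python
# Illustrative sketch only: pinhole camera, constant-depth scene, one image row (y = 0).
# All numeric values below are assumptions, not measurements.

def flow_u_translation(x, f, tx, depth):
    """Horizontal flow (pixels/frame) at column x caused by a sideways shift tx.

    For a scene at constant depth the result does not depend on x.
    """
    return -f * tx / depth

def flow_u_rotation(x, f, yaw):
    """Horizontal flow (pixels/frame) at column x on row y = 0 caused by a yaw rotation."""
    return -(f + x * x / f) * yaw

f = 500.0      # focal length in pixels (assumed)
depth = 2.0    # scene depth in metres (assumed)
tx = 0.02      # sideways shift per frame in metres (assumed)
yaw = 0.01     # yaw rotation per frame in radians (assumed)

columns = range(-100, 101, 50)  # sample columns across a narrow field of view
shift_flow = [flow_u_translation(x, f, tx, depth) for x in columns]
turn_flow = [flow_u_rotation(x, f, yaw) for x in columns]

# Largest difference between the two flow fields over the sampled columns.
max_gap = max(abs(s - t) for s, t in zip(shift_flow, turn_flow))
print(shift_flow, turn_flow, max_gap)
```

Under these assumptions both motions produce a flow of roughly 5 pixels per frame across the sampled row, and the largest difference between them is a fraction of a pixel, which a single imager would find hard to separate from noise.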
Another method to detect ego-motion uses inertial sensors. Although such sensors can distinguish rotational motion from shifting motion (by using both rotational and linear accelerometers), this approach does not allow for direct measurement of ego-motion speed, since the sensors measure acceleration (not speed), which must then be combined with elapsed time to calculate ego-motion speed. This calculation is not very precise, particularly for slow motions and low accelerations, which are typical conditions in user interface applications with gestural and motion control, particularly on mobile devices.
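The imprecision of deriving speed from acceleration can be sketched as follows. This is a simplified illustration, assuming a constant sensor bias and a fixed sample rate; the bias, sample rate, and motion speed are hypothetical values, and real inertial sensors also exhibit noise and drift that are not modelled here.

```python
# Illustrative sketch only: speed estimated by accumulating acceleration samples.
# The bias, sample interval, and speed values are assumptions.

def integrate_speed(accel_samples, dt, v0=0.0):
    """Estimate speed by summing acceleration samples multiplied by the sample interval."""
    v = v0
    for a in accel_samples:
        v += a * dt
    return v

true_speed = 0.05   # m/s, slow constant motion typical of a gesture (assumed)
bias = 0.02         # m/s^2, constant accelerometer bias (assumed)
dt = 0.01           # s, sample interval for a 100 Hz sensor (assumed)
n = 1000            # number of samples, i.e. 10 seconds of data

# The device moves at constant speed, so the true acceleration is zero
# and the sensor reports only its bias.
measured = [0.0 + bias] * n

# Starting from an (incorrect) assumption of rest, the estimate is pure bias error.
estimated_speed = integrate_speed(measured, dt)
print(estimated_speed, true_speed)
```

In this scenario the accelerometer never observes the constant motion itself, and the small bias alone accumulates into an estimated speed of about 0.2 m/s after 10 seconds, several times the assumed true speed.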
On cellphones equipped with both gravity/orientation sensors and imaging sensors, games and other 3D content browsing applications that employ motion control use only the gravity/orientation sensor, not the imaging sensor. Applications that use motion control for gaming currently use only two degrees of movement for measuring motion speed (specifically, rotation speed along the pitch and roll axes). Some systems may use an additional sensor, such as a digital compass, which enables measuring a third rotational degree of movement (yaw orientation and possibly yaw motion speed). However, none of these systems can detect the speed of translational motions (e.g., linear motions with little or no acceleration).
In one prior system, optic flow data (from imaging sensors) and data from orientation sensors (or other types of inertial sensors) are combined. Mukai and Ohnishi studied the recovery of 3D shape from an image sequence using a video camera and a gyro sensor. (T. Mukai and N. Ohnishi, “Object shape and camera motion recovery using sensor fusion of a video camera and a gyro sensor,” Information Fusion, vol. 1, no. 1, pp. 45-53, 2000). Since rotation and translation have similar effects on the image, leading to unreliable recovery, the orientation sensor output is used to discriminate between the two situations and improve the accuracy of the 3D shape recovery. However, this approach is limited by the following assumptions:
a. There has to be a rigid object that is fixed in the environment and has feature points that can be tracked.
b. The object's surface should be composed of polygons or be close to that.
c. The translation and rotation are performed mostly along the X coordinate (X translation and yaw).
d. The camera is always pointing at the object.
In addition, this system focuses mostly on recovering the 3D image of the object.
Ego-motion detection of a mobile device (its own motion relative to a fixed framework, e.g., the world around it) has been investigated in detail in the past, and a number of different approaches have been tried. However, many of them have successfully detected only certain kinds of motion or degrees of movement. As a result, the user is limited to only one or two degrees of movement.
It would be desirable to have a device that can measure ego-motion speed accurately without having to add any hardware, such as sensors. The user should not have to wear or hold any other device or sensor in order to track the motion speed of the device the user is holding. It would be desirable to use these ego-motion speed measurements in the interaction method called position-dependent rendering, without the need to add any hardware, such as control mechanisms like a joystick.