Although video games and video game consoles are prevalent in many homes, game controllers, with their myriad buttons and joysticks, are still intimidating and confusing to people who do not often play video games. For these people, using a game controller to interact with the game is an obstacle to enjoying it. Also, where the game is a dance game, an additional controller is often required in the form of a dance mat or dance pad. These dance mats have specific input sections (similar to buttons on a traditional controller) that react to pressure from the user's feet. But these mats take up a lot of space and are often single-use controllers: they are used just for dance games and must be rolled up and stored when not in use.
To increase a user's feeling of immersion in the game, as well as to overcome the cumbersome nature of game controllers or dance mats for users not familiar with them, some game platforms forgo the use of traditional controllers and utilize cameras instead. The cameras detect a user's physical movements, e.g., the waving of his arm or leg, and then interpret those movements as input to the video game. This allows the user to use a more natural-feeling input mechanism he is already familiar with, namely the movement of his body, and removes the barrier to entry caused by the many-buttoned controller.
One example of a camera-based controller is the EyeToy camera, manufactured by Logitech and used with the Sony PlayStation 2 game console. The EyeToy and similar devices typically include a camera and a microphone. The EyeToy sends a 640×480 pixel video stream to the PlayStation, and the game executing on the PlayStation parses the frames of the video, e.g., calculating gradations of color between pixels in the frame, to determine what in the camera's field of view is the user (“player”) and what is the background (“not player”). Then, differences between successive frames of the stream are used to detect and recognize the user's movements, which in turn drive the user's interaction with the game console.
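The frame-to-frame comparison described above can be illustrated with a minimal sketch. This is not the EyeToy's actual algorithm, which is not specified here; it is a generic frame-differencing example in which the function names (`motion_mask`, `motion_centroid`), the grayscale frame representation, and the threshold value are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Assumption: a frame is a grid of grayscale intensities (0-255),
# stored as a list of rows. A real camera stream would be in color.
Frame = List[List[int]]


def motion_mask(prev: Frame, curr: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark each pixel whose intensity changed by more than `threshold`."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_centroid(mask: List[List[bool]]) -> Optional[Tuple[float, float]]:
    """Return the (row, col) centroid of changed pixels, or None if nothing moved."""
    points = [
        (r, c)
        for r, row in enumerate(mask)
        for c, moved in enumerate(row)
        if moved
    ]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


# Hypothetical 4x4 frames: a bright region shifts one pixel to the right.
frame_a = [
    [0, 0, 0, 0],
    [0, 200, 0, 0],
    [0, 200, 0, 0],
    [0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0],
    [0, 0, 200, 0],
    [0, 0, 200, 0],
    [0, 0, 0, 0],
]

mask = motion_mask(frame_a, frame_b)
print(motion_centroid(mask))
```

Note that a difference mask of this kind only indicates where something changed between frames; it does not identify which part of the user's body produced the change.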
Other cameras used by game platforms include the DreamEye for the Sega Dreamcast, the PlayStation Eye (a successor to the EyeToy) for Sony's PlayStation 3, and the Xbox Live Vision for Microsoft's Xbox 360. Each of these is a typical single-input camera that can stream video or take still photographs, and some, such as the PlayStation Eye, additionally provide a microphone for audio input.
Microsoft is currently developing a depth-aware camera system in the form of Project Natal. A Natal system provides an RGB camera, a depth sensor, a multi-array microphone, and software that processes the inputs from the camera, depth sensor, and microphone. Beneficially, the Natal software provides, based on the input, a three-dimensional skeleton that roughly maps to the user's body. Specifically, rather than just determining a difference between “player” and “not player” like prior game cameras, Natal determines what is the user's right hand, left hand, head, torso, right leg, and left leg. This skeleton is preserved as a user moves his body in the camera's field of view, allowing for the tracking of specific limbs. This skeleton framework, however, is the extent of what Natal provides. Namely, no user interface is provided by Natal, and users must still use a game controller to interact with a game or menu system.
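To make the idea of a tracked skeleton concrete, the following is a minimal sketch of how per-limb positions might be represented and compared between two frames. It does not reflect the actual Natal software interface; the `Skeleton` class, the joint names, the coordinate units, and the helper functions are hypothetical and chosen only to mirror the body parts listed above.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumption: each joint is a 3D point (x, y, z) in arbitrary units.
Vec3 = Tuple[float, float, float]

# Hypothetical joint names mirroring the body parts named in the text.
JOINTS = ("head", "torso", "left_hand", "right_hand", "left_leg", "right_leg")


@dataclass
class Skeleton:
    """One frame of tracked joint positions, keyed by joint name."""
    joints: Dict[str, Vec3]


def joint_displacement(before: Skeleton, after: Skeleton, joint: str) -> float:
    """Euclidean distance a single joint moved between two frames."""
    return math.dist(before.joints[joint], after.joints[joint])


def most_active_joint(before: Skeleton, after: Skeleton) -> str:
    """Name of the joint that moved the farthest between two frames."""
    return max(JOINTS, key=lambda j: joint_displacement(before, after, j))


# Hypothetical frames: only the right hand moves.
rest = Skeleton({name: (0.0, 0.0, 2.0) for name in JOINTS})
wave = Skeleton({**rest.joints, "right_hand": (0.3, 0.4, 2.0)})

print(most_active_joint(rest, wave))
print(joint_displacement(rest, wave, "right_hand"))
```

Because each joint keeps its identity from frame to frame, a game can ask which limb moved, something a plain “player” versus “not player” segmentation cannot answer.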
Other systems, based on non-camera technologies, have also been developed that attempt to track a user's movements. For example, the Nintendo Wii provides players with a “Wii remote” that the user holds in his hand. The Wii remote is used as an infrared-based pointing device and has a built-in accelerometer to track changes in the Wii remote's motion. The Wii remote is often paired with a “nunchuk” (which also has an accelerometer) that is held in the player's other hand, allowing the Wii to, in a sense, track the movements (or at least changes in the movements) of the user's hands. Another technology based on a hand-held controller is Sixense, which is demonstrated at http://www.sixense.com.
High-end motion capture (“mocap”) systems have also been used to track a user's movements. Typically, mocap systems involve the user wearing a body suit that has dozens of white spheres placed at relevant locations on the body. The mocap cameras detect these spheres and use them to infer positional information about the user's body. Mocap systems, however, are expensive and not practical for the average user.