In many computing applications, a user manipulates or controls an application or game using specific user input hardware devices. Examples of such hardware devices include game controllers, remote controls, keyboards and mice. Such controls can be difficult to learn and hence create a barrier to adoption of the application or game. An example of this is a computer game which is controlled by a game controller. To play the game successfully, the user first has to learn how the manipulation of the game controller relates to the control of the game (e.g. which button controls which aspect of an on-screen character). This initial learning period may be sufficient to dissuade a user from playing the game. Furthermore, the movements used to operate an input device generally do not correlate closely to the resulting action in the game or application. For example, the movement of a joystick or the pressing of a button does not correspond closely to the movement of a bat or racket in a game environment.
Motion-based controller devices can be used to reflect the movement of the user more accurately in the application or game. However, hardware input devices are still operated by the user in such systems (e.g. held, pointed or swung). Camera-based user input does not use input devices. Rather, a camera captures images of the user, and these images are interpreted as input gestures or movements. However, camera-based user input produces a large amount of image data, which needs to be processed in real time to accurately control a game or application. For example, the captured camera images should be segmented in real time so that a user in the foreground of the camera image is separated from any surrounding background, enabling the user's gestures and pose to be analyzed.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known camera-based user input techniques.