Many computing applications, such as computer games, multimedia applications, or the like, use controls to allow users to manipulate game characters or other aspects of an application. Typically, such controls are input using, for example, controllers, remotes, keyboards, mice, or the like. Unfortunately, such controls can be difficult to learn, thus creating a barrier between a user and such games or applications. Furthermore, such controls may be different from the actual game actions or other application actions for which the controls are used, thus diminishing the experience for the user. For example, a game controller input that causes a game character to swing a baseball bat may not correspond to an actual motion of swinging the baseball bat.
One solution is to use a video game system that tracks the motion of a user or other objects in a scene using visual and/or depth images. The tracked motion is then used to update an application. A user can therefore manipulate game characters or other aspects of the application by moving the user's body and/or objects around the user, rather than (or in addition to) using controllers, remotes, keyboards, mice, or the like. One challenge with such a system is keeping track of who is playing the game (or otherwise interacting with the application) as users move into, out of, and back into the field of view of the system.