Until recently, digital cameras could only capture images. Face-tracking technology has since become a common feature of consumer digital cameras (see, e.g., U.S. Pat. Nos. 7,403,643, 7,460,695, and 7,315,631, and U.S. application Ser. Nos. 12/063,089 and 12/479,593, which are incorporated by reference). The most recent implementations feature hardware-coded face tracking in an IP-core. Applications to date have been limited to optimizing the exposure and acquisition parameters of a final image. Yet there are many additional applications which have even greater potential to enrich the user experience. The rapid deployment of face tracking technology in cameras suggests that other, more advanced face analysis techniques will soon become feasible and begin to offer even more sophisticated capabilities in such consumer devices.
The detailed analysis of facial expression is one such technique which can offer a wide range of new consumer applications for mobile and embedded devices. In the context of managing our personal image collections, it is useful to be able to sort images according to the people in those images. It would be even more useful if it were possible to determine their emotions and thus enable images to be further sorted and categorized according to the emotions of the subjects in an image.
Other consumer device applications can also gain from such capabilities. Many such devices now feature a camera facing the user, e.g., most mobile smartphones, and thus user interfaces could respond directly to our facial expressions, as set forth in certain embodiments of the present invention. In one example, an e-learning system could match its level of difficulty to the degree of puzzlement on the student's face. In another example, a home health system could monitor the level of pain from an elderly person's facial expression. In further examples, other domains such as entertainment and computer gaming, automotive, or security can also benefit from such applications. As the underlying expression recognition technologies improve in accuracy, the range of applications will grow further.
Computer gaming has grown from its humble origins to become a global industry rivaling the movie industry in terms of scale and economic impact. The technology of gaming continues to improve and evolve at a very rapid pace both in terms of control interface and the graphical display of the gaming world. Today's user interfaces feature more sophisticated techniques for players to interact and play co-operatively with one another. It is possible to have real-time video and audio links between the real players so they can co-ordinate their group gameplay.
However, the emphasis remains on the player being drawn into the artificial game world of the computer. There is still little scope for the conventional gaming environment to reach back to the players, sensing and empathizing with their moods and feelings. Given the sophistication of modern AI game engines, it is believed that this is a missed opportunity and that gaming engines can advantageously evolve in accordance with embodiments described below to develop and provide methods to empathize with individual game players.
Y. Fu et al. have presented a framework of multimodal human-machine or human-human interaction via real-time humanoid avatar communication for real-world mobile applications (see, e.g., Hao Tang; Yun Fu; Jilin Tu; Hasegawa-Johnson, M.; Huang, T. S., “Humanoid Audio-Visual Avatar With Emotive Text-to-Speech Synthesis,” Multimedia, IEEE Transactions on, vol. 10, no. 6, pp. 969-981, October 2008, incorporated by reference). Their application is based on a face detector and a face tracker. The face of the user is detected, the movement of the head is tracked across different angles, and these movements are sent to the 3D avatar. This avatar is used for low-bit-rate virtual communication. A drawback of this approach is that the shape of the avatar needs to be specified by the user and that forward-backward movement of the user is not detected, so the avatar appears as a fixed-distance portrait in the display.