The explosive growth of computer networks such as the Internet has provided a convenient way for computer users to obtain information, in the form of text, graphics, and audio and video segments, from remote sites. A computer connected to the Internet or to another computer network (e.g., an intranet) can also be used by a computer user to interact in real time with other computer users connected to the network. For example, a computer user may participate in an animated computer video game with a remote computer user. Some of these animated computer video games utilize a computer-generated virtual three-dimensional (3D) environment in which the computer user controls animated virtual characters.
Also of increasing popularity are virtual 3D chat environments, in which computer users interact with each other through animated 3D virtual actors (sometimes called avatars) that are controlled by and represent the computer users. In this type of chat environment, for example, each computer user is provided a 3D display of a room in which the virtual actors or avatars are rendered according to which users are communicating with each other. The arrangement and positioning of the virtual actors provide the computer users with a 3D display indicating which computer users are communicating with each other. This type of graphical indication is not possible in a conventional chat environment that uses text as a communication interface.
This form of communication within a virtual 3D environment, while holding much promise, also has a number of associated problems. For example, users often have difficulty comprehending and navigating the virtual 3D environment, locating within the simulated environment the virtual actors of the computer users with whom they wish to communicate, and arranging their actors so that all of the users conversing together can see each other's actors.
These types of problems are similar to the problems that have been faced by cinematographers since filmmaking began a century ago. Over the years, filmmakers have developed conventions or rules of film that allow actions to be communicated comprehensibly and effectively. These rules of film, although rarely stated explicitly, are so common that they are taken for granted and well understood by audiences. These cinematography conventions or rules of film utilize camera positions, scene structure, and "inter-shot" consistency rules to convey cinematographic information. For example, audiences understand well a scene that begins with a high elevation view of a landscape and passes to a lower elevation view of the landscape dominated by a roadway and an automobile, and then a close-up of a person in an automobile.
Cinematography conventions or rules of film for controlling camera positions and scene structure have received relatively little attention in the computer graphics community. A number of computer animation systems attempt to apply some cinematographic principles to computer graphics in limited applications including: an animation planning system using off-line planning of didactic presentations to explain complex tasks; the creation of semi-autonomous actors who respond to natural language commands; and the assembling of short sequences of video clips from a library of video footage. However, these systems typically pay little or no attention to camera placement or inter-shot consistency rules (e.g., it would appear inconsistent for an actor who exits a scene to the left of a frame to re-enter it from the right).
Some interactive animation systems have been described for finding the best camera placement when interactive tasks are performed. But these systems neither attempt to create sequences of scenes, nor do they apply rules of cinematography in developing their specifications.
Automating cinematographic principles for a 3D virtual application executed by a computer presents difficult problems not found in conventional filmmaking. While informal descriptions of various rules of cinematography are mentioned in a variety of texts, they typically have not been defined explicitly enough to be expressed in a formal language capable of execution by a computer. In addition, performing automatic image or "camera" control in real time on a computer imposes constraints that are more difficult to overcome than those faced by human directors. Human directors typically work from a script that is agreed upon in advance and can edit the raw footage off-line at a later time. This is not possible for an interactive 3D computer application executed in real time.
In accordance with the present invention, the problems with automating cinematographic principles are overcome. The present invention includes a virtual cinematography method for capturing or rendering events in virtual 3D environments in accordance with the automated cinematographic principles. The method includes accepting a description of events that have occurred within a specified time period (e.g., one computer clock tick). Events are typically in a selected form such as (subject, verb, object). For example, a (B, talk, A) event means that virtual actor B is talking to virtual actor A. The accepted events are interpreted to produce an appropriate camera specification which is used to view the virtual actors.
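By way of illustration only, the (subject, verb, object) event form described above can be sketched in a few lines of code. The names Event and interpret_events, and the mapping of a "talk" event to a two-shot, are hypothetical assumptions for this sketch and are not taken from the present description:

```python
from collections import namedtuple

# Illustrative (subject, verb, object) event form, e.g. Event("B", "talk", "A")
# means that virtual actor B is talking to virtual actor A.
Event = namedtuple("Event", ["subject", "verb", "object"])

def interpret_events(events):
    """Map the events of one clock tick to a simple camera specification.

    Returns a dict naming the shot type and the actors to frame; an actual
    implementation would produce full camera geometry.
    """
    for ev in events:
        if ev.verb == "talk":
            # Frame both the speaker and the listener together.
            return {"shot": "two-shot", "actors": [ev.subject, ev.object]}
    # No conversational event: fall back to a wide establishing shot.
    return {"shot": "wide", "actors": []}

spec = interpret_events([Event("B", "talk", "A")])
```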
The method uses two main components: camera modules and cinematographic idioms. The camera modules are responsible for the low-level geometric placement of specific cameras in a scene and for making subtle changes in the positions of the virtual actors to best frame each camera shot. The cinematographic idioms describe the cinematographic logic used for combining camera shots into sequences (e.g., animation sequences). The camera modules and the cinematographic idioms are used together to create virtual films and animations.
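The low-level geometric placement performed by a camera module can be sketched as follows. This is a hypothetical "external" two-shot camera placed behind one actor and aimed at the other; the function name, the back-off distance, and the placement heuristic are assumptions made for this sketch only:

```python
def external_camera(pos_a, pos_b, distance=3.0, height=1.5):
    """Place a camera behind actor A on the A-B line, aimed at actor B.

    pos_a and pos_b are (x, y, z) positions; returns (camera_position,
    look_at_point). Only the horizontal (x, y) direction is used for
    placement; the camera is raised by `height` above actor A.
    """
    ax, ay, az = pos_a
    bx, by, bz = pos_b
    dx, dy = bx - ax, by - ay            # horizontal direction from A toward B
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    ux, uy = dx / norm, dy / norm        # unit direction A -> B
    # Back off from A, opposite the direction of B, and raise the camera.
    cam = (ax - ux * distance, ay - uy * distance, az + height)
    return cam, pos_b
```

For example, with actor A at the origin and actor B four units away along the x-axis, the camera is placed three units behind A and aimed at B.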
The method is used to implement a real-time camera controller based on a finite state machine for automatic virtual cinematography, called a virtual cinematographic application module (VC). The VC is used in virtual reality and other interactive applications to improve upon the fixed point-of-view shots or ceiling mounted cameras that such applications typically employ. VC also helps improve "intelligent-agent" user interfaces by allowing the users to see themselves with an agent at camera positions that appear natural.
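The finite-state-machine character of the camera controller can be illustrated with a minimal sketch in which each state is a shot and incoming events drive the transitions between shots. The class name, state names, and transition rules below are illustrative assumptions, not the actual idioms of the VC:

```python
class TwoTalkIdiom:
    """Minimal finite state machine for filming a two-actor conversation.

    States are shot names; (subject, verb, object) events drive transitions.
    """

    def __init__(self):
        self.state = "external_A"  # start on the shot favoring actor A

    def step(self, event):
        subject, verb, obj = event
        if verb == "talk":
            # Cut to the external shot favoring the current speaker.
            self.state = "external_A" if subject == "A" else "external_B"
        elif verb == "leave":
            # An actor leaving the scene ends the conversation; go wide.
            self.state = "wide"
        return self.state

idiom = TwoTalkIdiom()
shots = [idiom.step(e) for e in [("B", "talk", "A"),
                                 ("A", "talk", "B"),
                                 ("A", "leave", None)]]
```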
The foregoing and other features and advantages of the present invention will be more readily apparent from the following detailed description, which proceeds with reference to the accompanying drawings.