This application is a national stage entry of PCT/US98/16636, International Filing Date: Nov. 8, 1998.
The present invention relates to display systems for playing multi-media works, and more particularly, to a sound processing system that alters a sound track in response to the cropping of an image associated with the sound track.
Multi-media works consisting of still or moving images with narration, background sounds, and background music are becoming common. Such works may be found on the Internet or on CD-ROM. Systems for displaying motion pictures with sound on computers and other data processing systems are also common; programs such as VIDEO FOR WINDOWS are used to reproduce such works on computers. Furthermore, 3-dimensional modeling of sound can be specified in VRML 2.0. In a VRML 2.0 compliant browser, the sound generated by the components of a scene is specified by providing a separate sound track for each sound source together with the location of that sound source in the scene. The sound observed by a listener facing any direction at any position relative to the sound sources can then be reproduced by combining the individual sound sources.
Unlike fixed display systems, computer-based display systems allow the viewer to crop, enlarge, and display a portion of a digital image, scroll the enlarged image, and display the enlarged image in another cropping frame. However, for either a still image or a moving image, prior art audio data processing systems do not alter the sound tracks in response to the alterations in the image being displayed. In general, the same sounds are reproduced independent of the cropping frame chosen by the user. VIDEO FOR WINDOWS does not provide the ability to crop the motion picture image and to display the cropped image on the screen. For that reason, a conventional AVI file, which is the motion picture file format used by VIDEO FOR WINDOWS, normally does not include data for controlling multiple audio streams in response to the position of a cropping frame in the motion picture image. Therefore, if the video stream is associated with multiple audio streams, a conventional program such as VIDEO FOR WINDOWS lacks the ability to control the audio signals decoded from the multiple audio streams in response to the user defining the position of a cropping frame in the motion picture image.
While VRML 2.0 provides the data needed to generate a sound track corresponding to the point of view of the user, thereby creating a 3-dimensional sound image that can be changed in response to cropping, etc., systems implementing VRML 2.0 do not alter the “sound image” in response to changes in the visual image. Furthermore, the sound model implemented by VRML 2.0 is customized to implement 3-dimensional sound effects, and is poorly suited for applications that process audio data linked to 2-dimensional images. Therefore, none of the existing programs can automatically control the audio to match the user's definition of a cropping frame in the motion picture image.
Broadly, it is the object of the present invention to provide an improved audio processing system for use with multi-media works.
It is a further object of the present invention to provide an audio processing system that alters the audio playback in response to changes in the scene selected by the user.
These and other objects of the present invention will become apparent to those skilled in the art from the following detailed description of the invention and the accompanying drawings.
The present invention is a display system for performing a multi-media work that includes image data representing a still or moving image, and sound data associated with the image data. The system includes a display for displaying an image derived from the image data, an audio playback system for combining and playing first and second audio tracks linked to the image, and a pointing system for selecting a region of the image on the display in response to commands from a user of the display system. The system also includes a playback processor for altering the combination of the first and second audio tracks played by the audio playback system in response to the pointing system selecting a new region of the image. The playback processor also alters the display such that the portion of the image selected by the pointing system is centered in the display. In one embodiment of the invention, the first and second audio tracks include sound tracks to be mixed prior to playback. In this embodiment, the image includes data specifying gains to be used in the mixing for the sound tracks when the selected region of the display is centered at predetermined locations in the image. If the predetermined locations do not include the center of the selected region, the playback system interpolates the data for the predetermined locations to provide the gains to be used in mixing the sound tracks. In another embodiment of the invention, the multi-media work includes data for specifying images at multiple resolutions. In this embodiment, the pointing system further selects one of the resolutions in response to input from the user. The playback processor then alters the combination of the first and second audio tracks played by the audio playback system in response to both the selected region and the selected resolution.
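By way of illustration only, the gain interpolation described above might be sketched as follows. The summary does not state which interpolation is used, so inverse-distance weighting is assumed here purely as one plausible choice. The names interpolate_gains, mix, and gain_points are hypothetical and do not appear in the specification.

```python
from math import hypot


def interpolate_gains(center, gain_points, eps=1e-9):
    """Return one gain per sound track for a cropping frame centered at `center`.

    gain_points is a list of ((x, y), [gain_track0, gain_track1, ...]) entries,
    i.e. the gains stored for predetermined locations in the image.
    Assumption: inverse-distance weighting; the source only says "interpolates".
    """
    cx, cy = center
    weighted = []
    for (x, y), gains in gain_points:
        d = hypot(x - cx, y - cy)
        if d < eps:
            # The center coincides with a predetermined location: use its gains.
            return list(gains)
        weighted.append((1.0 / d, gains))
    total = sum(w for w, _ in weighted)
    n_tracks = len(gain_points[0][1])
    return [sum(w * g[i] for w, g in weighted) / total for i in range(n_tracks)]


def mix(tracks, gains):
    """Mix equal-length lists of samples, applying one gain per track."""
    n_samples = len(tracks[0])
    return [sum(g * t[i] for g, t in zip(gains, tracks)) for i in range(n_samples)]
```

For example, with gains stored at two locations, a frame centered midway between them would receive the average of the two gain sets under this assumed weighting, and a frame centered exactly on a stored location would receive that location's gains unchanged.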
The invention is also a method for operating a data processing system during the playback of a multi-media work comprising image data and sound data associated with the image data. In the method, an image derived from the image data is displayed. First and second audio tracks linked to the image are combined and played. Data are received from a user selecting a region of the displayed image. In response to the received data, the selected region of the displayed image is displayed centered and the combination of the first and second audio tracks is altered.
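The steps of the method can be illustrated with a minimal sketch, offered under stated assumptions rather than as the disclosed implementation. The PlaybackState class, the select_region function, and the simple left-to-right crossfade rule in crossfade_gains are all hypothetical; the specification does not prescribe a particular gain rule or data layout.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybackState:
    view_w: int                    # width of the display window in pixels
    view_h: int                    # height of the display window in pixels
    origin: tuple = (0, 0)         # top-left corner of the displayed region
    gains: list = field(default_factory=lambda: [1.0, 1.0])


def crossfade_gains(center, image_w):
    """Hypothetical rule: track 0 dominates at the left edge, track 1 at the right."""
    t = min(max(center[0] / image_w, 0.0), 1.0)
    return [1.0 - t, t]


def select_region(state, center, image_w):
    """Handle a user selection: recenter the display and alter the audio mix."""
    cx, cy = center
    # Place the window so that the selected point lies at its center.
    # The origin may fall outside the image near its edges; that case is
    # not handled in this sketch.
    state.origin = (cx - state.view_w // 2, cy - state.view_h // 2)
    # Alter the combination of the two audio tracks for the new region.
    state.gains = crossfade_gains(center, image_w)
    return state
```

In this sketch a single call to select_region performs both responses named in the method: the selected region is displayed centered, and the combination of the first and second audio tracks is altered.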