The present invention relates to electronic transformation of images. More particularly, the present invention is directed to electronic transformations of images in response to time varying signals, such as audio signals.
Audio-visual entertainment is one of the most enduring forms of entertainment dating back to the days of yore when history was passed down through a combination of song and dance. Modern television and motion pictures are the progeny of the early days of song and dance, using a combination of audio information and video information to provide an entertaining experience for audiences world-wide. Traditionally, the video information and the audio information were created independently of each other. In the shadows of the motion picture industry evolved a profession in which sound-tracks were tailored for a pre-existing motion picture. In the early 1980s, this process was reversed when recording companies introduced a new marketing technique, the rock-video. In the rock-video, a short story, or collage of visual effects, would be associated with a pre-existing sound-track. In both of the aforementioned audio-video products, the audio and video information is synchronized to maximize the enjoyment of the audio-visual experience, requiring a great deal of human labor.
With the prevalence of the personal computer, audio-visual entertainment has been revolutionized with the introduction of interactive games, MPEG algorithms, and MP-3 algorithms and the like. More recently, the flexibility provided by the personal computer in creating audio-visual entertainment has been enhanced with development of computationally efficient algorithms, such as geometric transformation algorithms described in U.S. Pat. No. 5,204,944 to Wolberg et al. These algorithms have facilitated computer-generated animations that employ image transformations to enhance the enjoyment of personal computing as a portion of an interactive game or as a stand-alone application. Image transformations involve varying the visual representation of a two-dimensional (2-D) image using either 2-D or 3-D techniques. Transformations associated with 2-D images include image translation, scaling and rotation, and transformations associated with three-dimensional images include the aforementioned transformations, as well as bending, twisting and other more complicated modifications.
More recently, image transformations have been described as being desirable to synchronize with music in an automated fashion. To that end, a system was developed to generate movies in non-real-time. As a first step, a piece of music is analyzed in non-real-time by a computer program to extract a constant tempo and the energy of the associated beats within multiple frequency bands. Control signals are then triggered at these computed beat-times, and the amplitude of a control signal is proportional to the energy in the associated frequency band. These control signals vary smoothly over time and are used as inputs to a rendering program which varies the geometry of a pre-defined graphics scene. These geometry variations include changing an object's position, applying deformations to an object's surface, moving the camera viewpoint, and changing the lighting. For each frame of the animation, the rendering program generates a single image which is stored on the computer. After all of the images are generated, they are combined with the original piece of music into a movie. The final result is an animation sequence that is synchronized to the music in a smooth and visually appealing manner. The system does not, however, run in real-time, and it is not interactive.
Various programs (Winamp visualization plugins, for example) have been developed that respond to audio in real-time to deform an animation sequence. At a broad level, these programs perform the steps of: mapping different frequency ranges of the audio to different parts of the graphics scene and moving parts of the graphics scene in response to this audio. Real-time animation is achieved by a processing loop of:
1. audio feature extraction
2. geometry computation
3. geometry rendering
performed individually for each animation frame.
However, if such a mapping is done directly, without any conditioning of these time-varying energy signals, the visual results are not perceptually very pleasing. Additionally, because the auditory feature extraction and geometry computation must be performed for each frame, the audio and video outputs are not synchronized.
Accordingly, improved audio-driven graphics techniques are continually being developed.
Provided is an improved method and system to drive transformations of a visual representation, in real-time, with characteristics extracted from an audio signal. In this manner, transformations of the visual representation are displayed concurrently with extraction of the audio signal's characteristics, which facilitates pipeline processing and user interaction with the processing. According to one aspect of the invention, the method includes extracting characteristics of the audio signal; varying the representation in response to the characteristics, thereby defining a modified representation; and periodically providing a visual display of the modified representation synchronized to the audio signal, while extracting characteristics of the audio signal.
The system includes a processor; a buffer, in data communication with the processor, with the buffer holding digital data representing an audio signal; and a memory, in data communication with both the buffer and the processor, the memory storing a program to be operated on by the processor. The program includes information corresponding to an object; a first process for capturing a frequency domain representation of a time segment of the audio signal, extracting characteristics therefrom, and forming a conditioned control signal based on the extracted characteristics; and a second process to vary parameters associated with the object in response to the conditioned control signal while capturing an additional time segment of the audio signal, with the second process defining a modified representation having deformation magnitudes controlled by the conditioned control signal.
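As an illustration of the first process described above, the following is a minimal sketch of capturing a frequency domain representation of one time segment and extracting a per-band energy characteristic. The function name `band_energies`, the number of bands, and the use of a naive discrete Fourier transform are assumptions made for this example only; the source does not specify which transform or band layout is used.

```python
import cmath
import math


def band_energies(samples, num_bands=4):
    """Return the summed spectral power in each of num_bands equal-width bands.

    Hypothetical helper: uses a naive O(n^2) DFT over the first half of the
    spectrum. The segment must be long enough that each band holds at least
    one bin; bins left over when the half-spectrum does not divide evenly
    are ignored.
    """
    n = len(samples)
    half = n // 2
    power = []
    for k in range(half):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power.append(abs(acc) ** 2)
    band_size = half // num_bands
    if band_size < 1:
        raise ValueError("segment too short for the requested number of bands")
    return [
        sum(power[b * band_size:(b + 1) * band_size]) for b in range(num_bands)
    ]


# Usage: a pure tone that falls in the lowest band of a 32-sample segment.
segment = [math.sin(2 * math.pi * 2 * t / 32) for t in range(32)]
energies = band_energies(segment)
```

In a real-time setting a fast Fourier transform would normally replace the naive loop; the naive form is used here only to keep the sketch self-contained.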
According to another aspect of the invention, the characteristics extracted from the audio signal yield information often correlated with the musical beat, including the beat energy, i.e., amplitude, and delay between beats. A smoothed conditioned control signal responds quickly to an increase in magnitude of the energy characteristic of the audio signal. However, if the magnitude of the energy characteristic then decreases suddenly, the smoothed conditioned signal decreases slowly so that a controlled deformation appears more pleasing to a viewer. Specifically, one or more functions are stored in the aforementioned memory and remap vertices corresponding to the one or more objects in response to the magnitude and timing of the conditioned control signal, thereby varying the visual representation of the one or more objects. Characteristics other than the beats of the audio signal may be extracted, such as phase information, timbre information, frequency information and the like.
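The fast-rise, slow-fall behavior of the smoothed conditioned control signal can be sketched as an asymmetric one-pole smoother. This is one common way to obtain such behavior, not necessarily the conditioning used by the invention; the function name `condition_energy` and the `attack` and `release` coefficients are hypothetical values chosen for illustration.

```python
def condition_energy(energies, attack=0.9, release=0.1):
    """Smooth per-frame energy values with a fast rise and a slow fall.

    Assumed parameters: attack and release are fractions in (0, 1] giving
    how much of the gap to the new input is closed on each frame.
    """
    smoothed = []
    level = 0.0
    for energy in energies:
        # Track increases aggressively, decreases gently.
        coeff = attack if energy > level else release
        level += coeff * (energy - level)
        smoothed.append(level)
    return smoothed


# Usage: a single energy spike followed by silence decays gradually.
control = condition_energy([0.0, 1.0, 0.0, 0.0])
```

With these example coefficients the control signal reaches most of the spike's height in one frame and then loses only a tenth of its remaining value per frame.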
According to a further aspect, the system may include an audio system and a display. The sound is generated from a delayed audio signal. In this manner, the transformations of the objects are displayed in synchronization with the sound corresponding to the portion of the audio signal from which the characteristics were extracted to create the aforementioned transformations.
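A delayed audio path of this kind can be sketched as a fixed-length first-in, first-out queue of audio segments. The class name `AudioDelay`, the delay measured in whole segments, and the silence placeholder are assumptions for this example; the source does not state how the delay is implemented or how long it is.

```python
from collections import deque


class AudioDelay:
    """Hold audio segments back by a fixed number of segments.

    Hypothetical helper: the delay would be chosen to match the time
    needed to analyze a segment and render the corresponding frame.
    """

    def __init__(self, delay_segments, silence=0.0):
        # Pre-fill with silence so output starts immediately.
        self._queue = deque([silence] * delay_segments)

    def push(self, segment):
        """Accept the newest segment and return the one due for playback."""
        self._queue.append(segment)
        return self._queue.popleft()


# Usage: with a two-segment delay, segment "a" plays when "c" arrives.
delay = AudioDelay(2)
played = [delay.push(s) for s in ["a", "b", "c"]]
```

While a segment waits in the queue, its characteristics can be extracted and the modified representation rendered, so that the picture and the sound for that segment are presented together.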
According to an additional aspect of the invention, an animation function control signal is triggered when the conditioned control signal increases rapidly in magnitude and is set equal to the output of a function generator when the conditioned control signal begins decreasing in magnitude.
According to a further aspect, if a minimum time period has not expired before the conditioned control signal again increases in magnitude, a new animation control signal is not triggered.
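The triggering rule of the two preceding aspects can be sketched as follows: a trigger fires on a rapid rise of the conditioned control signal, unless the minimum period since the previous trigger has not yet elapsed. The function name `find_triggers`, the `rise_threshold` that defines a rapid increase, and the `min_gap` measured in frames are all assumptions for illustration. The function generator that takes over once the signal begins decreasing is omitted from this sketch.

```python
def find_triggers(control, rise_threshold=0.3, min_gap=5):
    """Return frame indices at which an animation trigger would fire.

    Assumed parameters: rise_threshold is the frame-to-frame increase
    treated as a rapid rise; min_gap is the minimum number of frames
    that must pass between two triggers.
    """
    triggers = []
    last = None
    for i in range(1, len(control)):
        rise = control[i] - control[i - 1]
        gap_ok = last is None or i - last >= min_gap
        if rise >= rise_threshold and gap_ok:
            triggers.append(i)
            last = i
    return triggers


# Usage: the rise at frame 3 comes too soon after frame 1 and is ignored.
frames = find_triggers([0, 1, 0, 1, 0, 0, 0, 1])
```

Suppressing triggers inside the minimum period keeps closely spaced beats from restarting an animation before the previous one has had time to play out.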
Other features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.