The present invention relates generally to audiovisual presentation methods and, more specifically, to a method for synchronizing computer-generated video displays with an independent audio source of unknown rate and commencement.
The market acceptance of low-cost personal computers with advanced graphics capabilities has resulted in the placement of versatile video display systems in millions of homes in the United States. Such systems typically offer a video signal generation system which includes redefinable character shapes, high resolution graphics, superimposed moveable objects, or a combination of these widely-recognized and well-understood techniques. For instance, the model VIC-20 computer manufactured by Commodore includes both redefinable character shapes and high-resolution color graphics. This model has sold over a million units in the United States and is currently available for less than $100.00. Other popular systems which include advanced graphics capabilities for home users include the Apple II family, the Radio Shack Color Computer family and the Atari family, all of which are widely available and have found broad acceptance.
These personal computers do not incorporate sound generation capabilities which can produce sounds of the quality commonly available from high-fidelity sound reproduction systems. Many of the computers identified above incorporate a tone generator with a frequency control and, in some cases, a selection of predetermined waveforms. However, these devices can only approximate the tone of a musical instrument, and fail to provide satisfactory imitation of a human voice. The ability to imitate multiple voices or instruments simultaneously is wholly absent. Some personal computers additionally include a digital-to-analog converter which can be utilized to reproduce a digitized human voice, but the cost of random-access memory limits the vocabulary available through this method and prohibits simultaneous graphics presentations. External voice-generation modules are available at a cost which, in some cases, exceeds the cost of the personal computer itself. Thus external modules have found only limited market acceptance.
Because consumer audio equipment has found a very wide acceptance in the past several years, and because a large majority of personal computer owners also own consumer audio equipment, it has also been suggested that the audio portion of an audiovisual presentation be provided from an independent audio source such as a tape or disk, and that the commencement and pace of computer-generated video be synchronized with the independent audio track. However, it is well-known that the transport speed of consumer tape and disk equipment, while constant for a given source of constant power, is not precisely the same as the standard speed for such equipment. It can be appreciated that a tape deck which operates only two percent faster or slower than the standard speed will lead or lag behind a standard-speed video presentation by an interval in excess of two seconds after less than two minutes of operation. Thus it can be appreciated that the common technique of commencing audio presentation, and then manually commencing video presentation at the sound of a cue tone, does not provide sufficient capability for continuous synchronization of computer-generated video with an independent audio track.
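The drift figure stated above can be checked with a short calculation. The following is a minimal sketch, assuming the video runs at exactly the standard rate and the deck runs at a constant fractional speed error; the function names are illustrative and do not appear in the disclosure.

```python
def drift_seconds(speed_error: float, elapsed_s: float) -> float:
    """Offset accumulated between an off-speed audio source and a
    standard-speed video presentation after elapsed_s seconds.

    speed_error is the fractional deviation from standard speed
    (0.02 for a deck running two percent fast or slow).
    """
    return abs(speed_error) * elapsed_s


def seconds_until_drift(speed_error: float, max_drift_s: float) -> float:
    """Running time after which the accumulated offset reaches max_drift_s."""
    if speed_error == 0:
        raise ValueError("a deck at exactly standard speed never drifts")
    return max_drift_s / abs(speed_error)


if __name__ == "__main__":
    # Two percent speed error, two minutes of operation.
    print(drift_seconds(0.02, 120.0))
    # Time needed to accumulate a two-second offset at two percent error.
    print(seconds_until_drift(0.02, 2.0))
```

Under these assumptions a two percent error accumulates a two-second offset in about one hundred seconds, consistent with the statement that the offset exceeds two seconds in less than two minutes.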
Other systems have in the past presented computer-generated video as a sequence of still images, with a cue tone provided at intervals in the audio track to prompt manual advancement of the video images. This is analogous to the well-known method of operating photographic slide projectors, and incorporates similar deficiencies of continuous manual operation and lack of the illusion of animation.
Atari has provided a stereo tape drive for its model 400 and 800 personal computers. On one track of this tape, voice and other human-understandable audio could be recorded in synchrony with timing pulses, software, or data information on a second track. The tape drive was coupled to the computer so that the computer supplied regulated power to the recorder, and sensed the data track. The sound track was coupled through the radio-frequency modulator of the computer system so that a television receiver could receive both the computer visual presentation and the independent audio track. This system essentially permitted elimination of the human in the loop but required a purchase of the Atari recorder unit, which duplicated the function and capital commitment represented by consumer audio gear already in the hands of the computer owner in most cases.
Other distinct problems arise where the visual presentation is made through means other than a personal computer. In the field of motion picture projection, many methods have been set forth for controlling the film transport and shutter mechanism to coordinate an independent sound source with a film projector. Where the image source is a television camera system, a variety of methods have been set forth for delaying an independent audio track to match video delayed through a frame storage unit or other means for synchronization with other video sources.
In this latter category, one example is U.S. Pat. No. 4,218,705 which discloses an audio synchronization system in a television relay arrangement along a line of television stations. Each station delays the video signal by a time less than or equal to the known frame interval and sufficient to synchronize the same with its own internal video operations, and thereafter relays the delayed signal to the next station in the chain. While the delay at an individual station may be less than or equal to the known interval, adjustment of an audio delay interval to equal a multiple of the known interval nearest to the cumulative video delay yields an error in synchronization which is imperceptible to the viewer. A similar system is revealed in U.S. Pat. No. 4,313,135 in which the input video arrives in synchrony with input audio at a station in the chain. A video processing device delays the output video by a fluctuating delay interval at the station. The system compares input video and output video to determine on a continuing basis the fluctuating delay interval, and then varies the performance of an audio delay system to delay output audio by an equal interval. Thus output video and output audio remain synchronized.
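The nearest-multiple adjustment described for the relay arrangement can be illustrated numerically. The following is a minimal sketch of the arithmetic only, not of the circuitry of the cited patent; the frame interval of 1/30 second and the per-station delays are assumed values chosen for illustration, and the function name is hypothetical.

```python
import math


def audio_delay_for(video_delays_s, frame_interval_s):
    """Return (audio_delay, residual_error) where audio_delay is the
    multiple of frame_interval_s nearest to the cumulative video delay.

    The residual error can never exceed half a frame interval.
    """
    cumulative = sum(video_delays_s)
    # floor(x + 0.5) rounds to the nearest whole number of frames.
    frames = math.floor(cumulative / frame_interval_s + 0.5)
    audio_delay = frames * frame_interval_s
    return audio_delay, cumulative - audio_delay


if __name__ == "__main__":
    FRAME_INTERVAL_S = 1.0 / 30.0            # assumed frame interval
    station_delays = [0.010, 0.025, 0.033]   # assumed per-station video delays
    delay, error = audio_delay_for(station_delays, FRAME_INTERVAL_S)
    print(f"audio delay {delay:.5f} s, residual error {error:.5f} s")
```

Because the audio delay is quantized to whole frame intervals, the residual error is bounded by half a frame interval regardless of the number of stations in the chain.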
U.S. Pat. No. 4,040,088 discloses a system for audio-visual presentations based on a single audio track carrying both audio and video information. While the audio information is a typical analog recording, the video information comprises a plurality of message packets, each of which has a recognizable identifier preamble, a screen address and a character shape or code recorded in binary frequency shift key form. The length and pattern of the preamble are chosen so that the preamble exceeds the maximum likely length of any similar tone in the analog audio information. A preamble detector recognizes each incoming message packet and decouples an audio amplifier from the audio information until the message packet reception is completed. A memory mapped video graphic subsystem is updated according to each received message packet, so that information comprising text or graphics can be displayed in synchrony with accompanying audio information. In a single-channel simultaneous signal mode, the system provides a demultiplexer which separates the message packets from audio information so that the audio amplifier may remain constantly coupled and the resulting presentation is more closely synchronized. However, this system is clearly limited by the bandwidth of audio gear so that normal screen image updating is too slow for detailed animation of the kind contemplated by the present invention.
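The packet-driven screen update described above can be sketched in software. The following is an illustration only and assumes the frequency shift keyed signal has already been demodulated into bytes; the preamble pattern, the two-byte address, the one-byte character code, and all names are hypothetical and are not taken from the cited patent.

```python
# Hypothetical packet layout: preamble, 2-byte screen address, 1-byte code.
PREAMBLE = b"\xAA\x55\xAA\x55"      # assumed identifier pattern
PACKET_LEN = len(PREAMBLE) + 3


def apply_packets(stream: bytes, screen: bytearray) -> int:
    """Scan a demodulated byte stream for message packets and write each
    packet's character code into memory-mapped screen storage.

    Returns the number of packets applied.
    """
    applied = 0
    pos = stream.find(PREAMBLE)
    while pos != -1 and pos + PACKET_LEN <= len(stream):
        body = pos + len(PREAMBLE)
        address = int.from_bytes(stream[body:body + 2], "big")
        code = stream[body + 2]
        if address < len(screen):   # ignore addresses outside the screen
            screen[address] = code
            applied += 1
        # Resume the search after the packet just consumed.
        pos = stream.find(PREAMBLE, pos + PACKET_LEN)
    return applied


if __name__ == "__main__":
    screen = bytearray(40 * 25)     # assumed 40 x 25 character display
    packet = PREAMBLE + (41).to_bytes(2, "big") + b"A"
    print(apply_packets(b"\x00\x01" + packet, screen), screen[41])
```

Each packet in this sketch updates a single character cell, which suggests why a channel limited to audio bandwidth updates a full screen too slowly for detailed animation.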
All of these prior art systems pose a serious financial disadvantage for the typical consumer who already owns both audio gear and a personal computer, because all of the prior art requires the acquisition of additional hardware at substantial expense. In many instances, the prior art deals only with correction of an interval delay, while presuming that the audio signal speed is precisely synchronized with an independent video frame rate. As stated above, this presumption does not hold for consumer audio equipment which is already in the hands of personal computer users. Therefore, it can readily be appreciated that there exists a need for an alternative low-cost method of synchronizing computer-generated video displays with an independent audio track of unknown speed and commencement.