The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.
Present digital cinema servers send compressed streams of video data in a defined format (e.g., JPEG 2000 video) to a media block along with multiple channels of digitized audio, for example 16 channels of PCM (pulse-code modulated) audio at a 48 kHz sample rate. The audio content is a packetized stream that may have different formats depending on the vendor of the cinema system. The audio and video signals may be encrypted prior to being input to the media block. The media block decrypts the JPEG 2000 video and decodes it into an uncompressed baseband signal, and transmits the audio to a cinema processor to be conditioned for the playback environment. The cinema processor performs functions such as equalization for the playback environment and routes the audio signals to the appropriate speakers in a surround sound array based on speaker channel labels provided in the audio content. The ultimate output comprises a video feed that goes out in HD-SDI (high definition serial digital interface) format to a projector, and analog audio that is sent to the amplifiers and speakers. For proper playback, the audio tracks must be properly synchronized to the video content.
In general, A/V synchronization is not particularly precise in theater environments, and theater technicians generally do not measure A/V synchronization today during installation or calibration. Film A/V synchronization is said to be accurate to within 1.5 frames (approximately 63 ms at 24 fps). Since sound travels at about 1 ft/ms, A/V synchronization can vary by up to 50 ms depending on the location of the listener in the theater. In present cinema systems, the timing of the audio and video signals is well known, so that audio and video are normally synchronized. The latencies of well-established components, such as processors and projectors, are also well known; for example, projector latency is typically specified at around two frames or 88 ms, so that the cinema server can usually be programmed to accommodate different timing characteristics to ensure proper synchronization. In typical applications, the media block has two real-time components: the HD-SDI interface and an AAS (audio amplifier system) interface. These are real-time interfaces and can be configured to provide A/V output that is synchronized or programmed with some delay as appropriate. Thus, despite a certain amount of imprecision in present systems, the timing between the audio and video content is fixed, so that when a digital audio sample is sent to the cinema processor, it will be followed at a fairly precise interval (e.g., 1/24 second later) by an analog audio signal sent to the amplifiers.
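The timing figures above follow from simple unit conversions. As an illustration only, the following Python sketch reproduces that arithmetic, assuming a 24 fps frame rate and the approximate 1 ft/ms speed of sound quoted above. The function names and the 50 ft seating distance are hypothetical, chosen for this example. Two frames at exactly 24 fps work out to roughly 83 ms, which is near the specified projector latency figure.

```python
FPS = 24.0  # frame rate quoted in the text


def frames_to_ms(frames: float, fps: float = FPS) -> float:
    """Convert a duration expressed in frames to milliseconds."""
    return frames * 1000.0 / fps


def acoustic_delay_ms(distance_ft: float, speed_ft_per_ms: float = 1.0) -> float:
    """Approximate time for sound to travel distance_ft.

    The default speed of 1 ft/ms is the rounded figure used in the text.
    """
    return distance_ft / speed_ft_per_ms


# Film synchronization tolerance of 1.5 frames at 24 fps.
film_tolerance_ms = frames_to_ms(1.5)

# Duration of two frames at 24 fps, for comparison with projector latency.
two_frames_ms = frames_to_ms(2)

# Extra acoustic delay for a listener seated 50 ft from the speakers
# (hypothetical distance, chosen to match the 50 ms spread in the text).
seat_delay_ms = acoustic_delay_ms(50)

print(f"1.5 frames  = {film_tolerance_ms:.1f} ms")
print(f"2 frames    = {two_frames_ms:.1f} ms")
print(f"50 ft delay = {seat_delay_ms:.1f} ms")
```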
A new adaptive audio processor and object-based audio format have been developed that allow audio to be transmitted over a side-band Ethernet connection. This Ethernet connection provides a high-bandwidth conduit to transmit multiple complex audio signals. Assuming that the bandwidth of a single channel of digital audio is 1.5 megabits per second (Mbps), the bandwidth for a present 16-channel system (e.g., AES8) is on the order of 24 Mbps (16×1.5 Mbps). In contrast, the bandwidth of an Ethernet connection in this application is on the order of 150 Mbps, which allows up to 128 discrete complex audio signals. This adaptive audio system sends audio content from a RAID array (or similar storage element) in non-real-time over Ethernet from a digital cinema server to an adaptive audio cinema processor. Ethernet is a bursty, non-real-time, and non-deterministic transmission medium. Thus, the inherent audio/video synchronization feature of present cinema processing systems is not applicable to this type of adaptive audio system. The audio that is provided via Ethernet must be synchronized to the video through an explicit synchronization function. To align the audio content, delivered via Ethernet, to the video signal, there must be a deterministic latency so that the audio and video content can be properly synchronized.
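The bandwidth figures above can be checked with a short calculation. The following Python sketch is illustrative only. It assumes a 48 kHz sample rate and two per-sample sizes that are not stated in the text: 32 bits per sample on the wire (which would give roughly the 1.5 Mbps per-channel figure) and 24 bits of raw PCM payload per sample (which would bring 128 signals close to the 150 Mbps figure). The function and constant names are hypothetical.

```python
SAMPLE_RATE_HZ = 48_000  # sample rate quoted for present cinema audio


def channel_rate_mbps(bits_per_sample: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Bit rate of one audio channel in megabits per second."""
    return bits_per_sample * sample_rate_hz / 1_000_000


def aggregate_rate_mbps(channels: int, bits_per_sample: int) -> float:
    """Total bit rate for a number of identical channels."""
    return channels * channel_rate_mbps(bits_per_sample)


# Assumption: 32 bits per sample on the wire, about 1.5 Mbps per channel.
per_channel = channel_rate_mbps(32)

# Present 16-channel system at that rate, on the order of 24 Mbps.
sixteen_channels = aggregate_rate_mbps(16, 32)

# Assumption: 128 signals carried as raw 24-bit PCM payload,
# which lands close to the 150 Mbps figure in the text.
adaptive_audio = aggregate_rate_mbps(128, 24)

print(f"per channel (32-bit): {per_channel:.3f} Mbps")
print(f"16 channels (32-bit): {sixteen_channels:.3f} Mbps")
print(f"128 signals (24-bit): {adaptive_audio:.3f} Mbps")
```

Under these assumptions the aggregate rates come out close to the 24 Mbps and 150 Mbps figures. Other sample sizes or packet overheads would shift the totals.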