Traditionally, the filming and presentation of motion picture film occur at 24 frames per second in the US and Canada and, typically, at 25 frames per second in Europe and Asia. Beginning with the release of Peter Jackson's “The Hobbit” by Warner Bros. toward the end of 2012, digital cinema exhibitors will begin facing the likelihood of receiving digital cinema compositions (i.e., content) at a variety of motion picture frame rates, for example, 48 fps or even 60 fps. The presentation rate of the corresponding sound in such higher frame rate digital cinema compositions, however, will likely not change. Presently, digital theater systems present sound (audio) at a rate of 48,000 or 96,000 samples per second. Typical digital theater sound processing systems exhibit a small audio latency, on the order of a few samples, and the latency remains either constant, or relatively so (i.e., the difference between a two-sample latency in a digital system at a 48 kHz sample rate and at 96 kHz is on the order of 1/48000 second).
Packet latencies in a digital network also vary relatively little, and the typical audience member does not perceive such latencies because each audio sample lasts between 1/2000 and 1/4000 of the duration of a single frame of the digital cinema composition. However, most digital cinema projectors exist as signal processing pipelines. Picture information flows into the projector on a pixel-by-pixel basis for storage in a frame buffer. The pixels in the frame buffer undergo color correction, and/or scaling, and/or cropping according to the projector settings. Finally, such processing places the digital image into a form suitable for driving the imagers in the digital projector. The processing associated with such digital cinema projection incurs a latency (the image processing latency), measured from the time the first pixel undergoes transmission to the projector to the time an image appears on-screen. Moreover, because most modern projectors employ frame-buffered pipeline architectures for image processing, the projector latency (tPICTURE_LATENCY) generally varies with the rate at which frames are displayed. The reason is that, in a frame-buffered pipeline architecture, image data advances frame-by-frame through the stages of processing, progressing by one stage every frame interval, until the frame is displayed. In the general case, for an individual projector or model of projector, the values for tPICTURE_LATENCY at different frame rates could be measured empirically, but in the case of a frame-buffered pipeline architecture, tPICTURE_LATENCY can be represented by this equation:
$$t_{\mathrm{PICTURE\_LATENCY}} = t_{\mathrm{FIXED}} - \frac{\mathrm{FRAMES}}{\mathrm{FRAME\_RATE}} \qquad \text{EQ. 1}$$
where:
FRAME_RATE is the rate at which the frames are provided in frames per second (which is not always an integer);
FRAMES is the number of frame-buffered stages (e.g., two) in the projector's image processing architecture, each of which may take up to a frame interval to complete; and,
tFIXED is a constant amount of time (e.g., −5 mS), which represents other transfer, processing, or setup time found in the architecture, that does not vary with frame rate and is not represented in the second term of EQ. 1. Herein, a convention is used so that latencies that increase picture delay are more negative, whereas latencies that increase audio delay are positive. This convention also applies in the comparative: Latencies that increase picture delay with respect to the audio are more negative, too. For the purpose of discussion herein and easy provision of plausible example latency values, and not by way of limitation, a projector architecture for which the latency is determinable with EQ. 1 is assumed. For other architectures, different equations (not shown) may apply, or empirical latency measurements can be made.
For the example case of a digital cinema projector having the example values of tFIXED (−5 mS) and FRAMES (2) above, at a rate of 24 frames per second, the projector will incur a latency of −0.005−2/24 seconds, or −88⅓ mS. However, at 48 frames per second, the projector will incur a latency of −0.005−2/48 seconds, or −46⅔ mS; and, at 60 fps, the latency becomes −0.005−2/60 seconds, or −38⅓ mS. (Recall from the convention above that a less negative value means the picture is less delayed.)
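The arithmetic above can be checked with a short sketch. It assumes the frame-buffered architecture of EQ. 1 and the example values tFIXED = −5 mS and FRAMES = 2; the function name `picture_latency` and its parameter names are illustrative only and do not come from any projector interface.

```python
from fractions import Fraction

def picture_latency(frame_rate, frames=2, t_fixed=Fraction(-5, 1000)):
    """EQ. 1 in seconds: t_FIXED - FRAMES / FRAME_RATE.

    Follows the text's sign convention: a more negative result
    means the picture is delayed more. Defaults are the example
    values from the text (FRAMES = 2, t_FIXED = -5 mS).
    """
    return t_fixed - Fraction(frames) / Fraction(frame_rate)

for rate in (24, 48, 60):
    latency_ms = picture_latency(rate) * 1000
    print(f"{rate} fps: {float(latency_ms):.2f} mS")
# 24 fps: -88.33 mS
# 48 fps: -46.67 mS
# 60 fps: -38.33 mS
```

Exact fractions are used so that the thirds in the example values (−88⅓, −46⅔, −38⅓) are reproduced without floating-point rounding until the final print.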
The frame rate of a digital cinema composition is determined by metadata within the composition itself. If the image assets of a composition are labeled as being 24 frames per second, the presentation is made at 24 frames per second. If the assets are labeled as being 48 frames per second, the presentation is made at that frame rate. The frame rate is represented as metadata within the composition, and commercial digital cinema devices do not offer the option to alter that frame rate. In a case where a digital cinema device is incapable of playing a composition at the specified frame rate, the device will refuse to play the composition.
With respect to sound in a digital cinema theatre, there are two substantial sources of latency in the presentation of audio (tSOUND_LATENCY) that offset some or all of the projector's intrinsic latency 145, as represented by this equation:
$$t_{\mathrm{SOUND\_LATENCY}} = t_{\mathrm{AUDIO\_DELAY}} + t_{\mathrm{TIME\_OF\_FLIGHT}} \qquad \text{EQ. 2}$$
where:
tTIME_OF_FLIGHT is the time for sound to travel through the air from a speaker to a member of the audience; and,
tAUDIO_DELAY is a property of a theatre sound processor, which may be constant or may be manually or automatically adjustable.
Delays caused by audio processing, audio signal transmission (which in some modern auditoriums is over a network), and amplification are small relative to these other values and may be either neglected, or considered as part of tAUDIO_DELAY.
With the speed of sound in air being about 1,126 feet/second (with variations depending on temperature and humidity), a rough approximation (and slight underestimation of the speed) is that sound travels 1 ft. in 1 mS. After audio signals have been emitted from the speaker, the distance through the free air of the auditorium to the audience determines the tTIME_OF_FLIGHT portion of the sound latency, and is greatest for those seated furthest from the speaker.
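As a rough illustration of the rule of thumb, the sketch below compares the time of flight computed from the quoted speed of sound with the 1 ft. per 1 mS approximation. The seat distances are hypothetical, and the function name `time_of_flight_ms` is introduced here for illustration only.

```python
# Speed of sound quoted in the text; it varies with temperature and humidity.
SPEED_OF_SOUND_FT_PER_S = 1126.0

def time_of_flight_ms(distance_ft, speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Time in milliseconds for sound to travel distance_ft through air."""
    return 1000.0 * distance_ft / speed_ft_per_s

# Hypothetical distances from the speaker to a seat, in feet.
for distance_ft in (20, 50, 100):
    computed = time_of_flight_ms(distance_ft)
    rule_of_thumb = float(distance_ft)  # 1 ft. of travel taken as 1 mS
    print(f"{distance_ft} ft: {computed:.1f} mS computed, "
          f"{rule_of_thumb:.1f} mS by rule of thumb")
```

Because the rule of thumb assumes a slightly slower speed (1,000 feet/second), it gives a slightly longer time of flight than the computed value at every distance.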
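The ideal sound and picture synchronization occurs when the picture latency and the sound latency, together with any differential delay, exactly offset one another, producing a synchronization error of zero, according to this equation:
$$t_{\mathrm{SYNC\_ERROR}} = t_{\mathrm{SOUND\_LATENCY}} + t_{\mathrm{PICTURE\_LATENCY}} + t_{\mathrm{DIFFERENTIAL\_DELAY}} \qquad \text{EQ. 3}$$
which expands by EQ. 2 to:
$$t_{\mathrm{SYNC\_ERROR}} = t_{\mathrm{AUDIO\_DELAY}} + t_{\mathrm{TIME\_OF\_FLIGHT}} + t_{\mathrm{PICTURE\_LATENCY}} + t_{\mathrm{DIFFERENTIAL\_DELAY}} \qquad \text{EQ. 4}$$
where: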
tSYNC_ERROR represents perfect sound and picture synchronization when zero and, per the convention described above, is negative to represent the picture being late with respect to the sound, and positive to represent the sound being late with respect to the picture; and,
tDIFFERENTIAL_DELAY (also herein called “differential latency” and abbreviated as “∂L”) is a setting available on some digital cinema media blocks, which allows a technician to modify the relative delay between when the picture is sent to a projector and when the audio is output to the theatre audio equipment. Such digital cinema media blocks are supplied with a factory default value for tDIFFERENTIAL_DELAY, which varies by manufacturer. Some manufacturers deliver systems with tDIFFERENTIAL_DELAY set to zero, such that a technician must manually set tAUDIO_DELAY to achieve synchronization by offsetting both tPICTURE_LATENCY and tTIME_OF_FLIGHT. Other manufacturers provide a factory default for tDIFFERENTIAL_DELAY that approximately offsets tPICTURE_LATENCY and a typical tTIME_OF_FLIGHT, so that tAUDIO_DELAY can be approximately zero, except for larger than typical auditoriums.
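To make the relationship in EQ. 4 concrete, the following sketch solves it for the tAUDIO_DELAY that yields a synchronization error of zero. It is an illustration under stated assumptions, not a description of any media block or sound processor interface: the function names are hypothetical, the picture latency is the −88⅓ mS example value for 24 fps, the 40 mS time of flight is an assumed seat distance of about 40 ft., and tDIFFERENTIAL_DELAY is taken as the zero factory default mentioned above.

```python
from fractions import Fraction

def sync_error(t_audio_delay, t_time_of_flight,
               t_picture_latency, t_differential_delay=0):
    """EQ. 4 in seconds. Positive: sound late. Negative: picture late."""
    return (t_audio_delay + t_time_of_flight
            + t_picture_latency + t_differential_delay)

def audio_delay_for_sync(t_time_of_flight, t_picture_latency,
                         t_differential_delay=0):
    """EQ. 4 rearranged for the t_AUDIO_DELAY that makes t_SYNC_ERROR zero."""
    return -(t_time_of_flight + t_picture_latency + t_differential_delay)

# Assumed example values (seconds).
t_picture_latency = Fraction(-265, 3000)   # -88 1/3 mS, the 24 fps example
t_time_of_flight = Fraction(40, 1000)      # hypothetical seat about 40 ft. away

t_audio_delay = audio_delay_for_sync(t_time_of_flight, t_picture_latency)
print(f"required audio delay: {float(t_audio_delay * 1000):.2f} mS")
# required audio delay: 48.33 mS

# Substituting the result back into EQ. 4 gives zero error.
print(sync_error(t_audio_delay, t_time_of_flight, t_picture_latency))
# 0
```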
In present-day cinemas, whether digital or film-based, there are two manual controls. One adjusts tAUDIO_DELAY and, if available, is a technician's setting generally not made by the everyday operator. The other adjusts tDIFFERENTIAL_DELAY and is implemented by different technologies in film and digital cinema equipment. For film projectors, tDIFFERENTIAL_DELAY is adjusted by physically repositioning the optical soundtrack reader to be closer to or further from the film gate along the film path; that is, by moving the optical soundtrack reader closer to the film gate, tDIFFERENTIAL_DELAY is made more positive (i.e., the sound is read later). For decades, the film's audio did not go through an external processor, so this setting was the only one available for offsetting tTIME_OF_FLIGHT (i.e., tAUDIO_DELAY and tPICTURE_LATENCY were both essentially zero). For digital cinema media blocks, tDIFFERENTIAL_DELAY is a configuration available to a technician, but again, generally not made available to the everyday operator. In digital cinema installations, both tDIFFERENTIAL_DELAY and tAUDIO_DELAY must be correctly set to offset tPICTURE_LATENCY and tTIME_OF_FLIGHT.
From the above, it is clear that a present-day (prior art) digital cinema auditorium system having a projector with the example properties described above and adjusted to present synchronized picture and sound at 24 fps (so that tSYNC_ERROR=0), when called on to present a digital cinema composition at 48 fps, will have a tSYNC_ERROR of tPICTURE_LATENCY@48−tPICTURE_LATENCY@24, that is, (−46⅔−(−88⅓))=+41⅔ mS, and will display the picture 41⅔ mS sooner than the sound, or about two frames early (i.e., the sound will seem to be two frames late). The value is positive because, per the convention above, a positive tSYNC_ERROR represents the sound being late with respect to the picture. Such error in the synchronization, in the opinion of some, is not drastic. Human audience members will accommodate a constant synchronization error of this size. However, when a digital cinema system executes a playlist with mixed frame rate digital cinema compositions, the tSYNC_ERROR will change among compositions having different frame rates. The frame rate variation will highlight the differences in picture/sound synchronization, making them apparent and annoying even to non-expert audiences.
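The effect of a mixed frame rate playlist can be sketched as follows. The sketch assumes the EQ. 1 projector with the example values (tFIXED = −5 mS, FRAMES = 2), audio settings calibrated once for zero error at 24 fps and then left unchanged, and the sign convention of EQ. 4 in which a positive error means the sound is late. The playlist and the function names are hypothetical.

```python
from fractions import Fraction

T_FIXED = Fraction(-5, 1000)  # example fixed latency from the text, seconds
FRAMES = 2                    # example number of frame-buffered stages

def picture_latency(frame_rate):
    """EQ. 1 in seconds for an integer frame rate."""
    return T_FIXED - Fraction(FRAMES, frame_rate)

def sync_error_after_calibration(frame_rate, calibrated_rate=24):
    """Sync error at frame_rate when the system was zeroed at calibrated_rate.

    With the audio delay, time of flight, and differential delay unchanged,
    EQ. 4 reduces to the change in picture latency between the two rates.
    """
    return picture_latency(frame_rate) - picture_latency(calibrated_rate)

playlist = [24, 48, 24, 60]   # hypothetical frame rates of successive compositions
for rate in playlist:
    error_ms = sync_error_after_calibration(rate) * 1000
    print(f"{rate} fps: sync error {float(error_ms):+.2f} mS")
# 24 fps: sync error +0.00 mS
# 48 fps: sync error +41.67 mS
# 24 fps: sync error +0.00 mS
# 60 fps: sync error +50.00 mS
```

Under these assumptions the error is not constant across the playlist but steps between 0, 41⅔, and 50 mS as the frame rate changes, which is the variation the audience would notice.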
Historically, virtually all digital cinema compositions have existed at the same frame rate (24 frames per second). Few, if any, content providers have provided digital cinema compositions at other frame rates, so few, if any, exhibitors have compiled digital cinema compositions of different frame rates in the same playlist. To the extent that the problem has existed at all, exhibitors have ignored it.
Thus, a need exists for a technique for synchronizing sound (audio) information to picture (video) information when the picture information has different frame rates.