A LAN 5 is illustrated in FIG. 1. The LAN 5 includes the segment 10. The LAN segment 10 comprises a shared transmission medium 12. A plurality of stations 14-1, 14-2, 14-3, 14-4, 14-5, . . . , 14-N are connected to the transmission medium 12. Illustratively, the LAN segment 10 is an Ethernet segment. A hub 20 (repeater) may also be connected to the transmission medium 12. The hub 20 connects the LAN segment 10 to other LAN segments 22, 24 and their associated stations 23, 25. The LAN segment 10 is connected to a Wide Area Network via the gateway 30. Note that one of the stations, 14-1, serves as a source of a live audio/video multicast. A subset of the stations 14-2, 14-3, . . . , 14-N receives the multicast.
Illustratively, a multicast is a communication in which data is broadcast from a source station to a plurality of receiving stations. However, each of the receiving stations individually decides if it wants to participate in the multicast.
It should be noted that, while in FIG. 1, the source of the live multicast is shown as being part of the network segment 10, this is not necessarily the case. The source of the live multicast may be a station in another segment 22, 24 and the multicast may be transmitted into the segment 10 via the hub 20. The source of the live multicast may also be entirely outside the LAN 5 and may be transmitted into the segment 10 via the gateway 30.
The source station 14-1 is shown in greater detail in FIG. 2. The source station 14-1 comprises a bus 40. Connected to the bus 40 are a CPU 42 and a main memory 44. Also connected to the bus 40 are a live audio/video interface 45 and a live audio/video source 46. The live audio/video source 46 may be a video camera for generating audio and video signals. The audio/video source 46 may also be an antenna for receiving a terrestrial audio/video broadcast signal or a satellite receiver for receiving an audio/video signal from a satellite. The audio/video interface 45 digitizes and compresses the received audio and video signals. The audio data is compressed using a conventional voice compression algorithm such as Pulse Code Modulation (PCM), to generate an audio data bit stream with a bit rate of 16-64 k bits/sec. The video data is compressed using a conventional digital compression algorithm such as motion JPEG or Indeo. Thus, the audio is encoded at a constant rate while the number of bits used to encode each frame of video varies. The compressed audio and video data is transmitted via the bus 40 from the interface 45 to the memory 44 for temporary storage. Under the control of software executed by the CPU 42, the compressed audio and video data is organized into messages. Illustratively, each video message contains data for one frame of video, and each audio message contains audio data associated with a fixed number of video frames. The CPU 42 then fragments the messages into packets of an appropriate size for transmission via the network 5.
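By way of illustration only, the fragmentation of a message into packets might be sketched as follows. The Packet fields, the function name fragment_message, and the 1400-byte payload limit are assumptions introduced for this sketch; the disclosure does not specify a packet format or size.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    """Hypothetical packet layout; not taken from the disclosure."""
    stream: str       # "audio" or "video"
    message_id: int   # identifies the message this fragment belongs to
    index: int        # position of this fragment within the message
    total: int        # number of fragments in the message
    payload: bytes


def fragment_message(stream: str, message_id: int, data: bytes,
                     max_payload: int = 1400) -> List[Packet]:
    """Split one audio or video message into packets of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # An empty message still yields one (empty) packet so the receiver sees it.
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)] or [b""]
    total = len(chunks)
    return [Packet(stream, message_id, i, total, chunk)
            for i, chunk in enumerate(chunks)]
```

Because a video message holds one variable-size frame, the number of packets per video message varies from frame to frame, whereas audio messages of a fixed duration yield a fixed packet count.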
A LAN interface 50 is connected to the bus 40. The LAN interface 50, which includes physical processing and the so-called Media Access Controller or MAC, interfaces the source station 14-1 to the transmission medium 12 (see FIG. 1). The LAN interface 50 receives from the memory 44 packets containing audio data and packets containing video data belonging to the live multicast. The LAN interface 50 performs physical layer processing on each packet and transmits the packets via the transmission medium according to the media access protocol. The audio and video packets are transmitted separately and independently. The source station 14-1 also comprises a display interface 60 for interfacing the bus 40 to a display system 62. The display interface 60 is described in greater detail below. The display system 62 includes a speaker for reconverting the audio signal portion of the multicast back into sound and a visual display for displaying the video portion of the multicast. The messages containing audio and video data may be transmitted from the memory 44 to the display interface 60 for display using the system 62.
One of the multicast receiving stations 14-N is shown in greater detail in FIG. 3. The station 14-N comprises a bus 80. Connected to the bus 80 are a CPU 82 and a main memory 84. The station 14-N also includes a LAN interface 86 connected to the bus 80. A display interface 60 is connected to the bus 80 and the display 62. If the station 14-N is a member of the live multicast, the packets transmitted from the source station 14-1 are received from the transmission medium at the LAN interface 86. The packets undergo physical processing in the LAN interface 86 and are transferred to the main memory 84 for temporary storage. The packets are then combined into the audio and video messages. These messages are transmitted to the display interface 60. In the display interface 60, the audio and video data are decompressed using a video decoder and audio decoder and transmitted to the display system 62 for display of the video and playing of the audio.
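As a non-limiting sketch, the step of combining received packets back into audio and video messages might look as follows. The class name Reassembler and the per-packet fields (stream, message_id, index, total) are hypothetical and are assumed only for illustration; they are not part of the disclosure.

```python
from typing import Dict, Optional, Tuple


class Reassembler:
    """Collects fragments per (stream, message_id) until a message is complete."""

    def __init__(self) -> None:
        # Maps (stream, message_id) -> {fragment index: payload}
        self._pending: Dict[Tuple[str, int], Dict[int, bytes]] = {}

    def add(self, stream: str, message_id: int, index: int,
            total: int, payload: bytes) -> Optional[bytes]:
        """Store one fragment; return the whole message once all fragments arrived."""
        key = (stream, message_id)
        parts = self._pending.setdefault(key, {})
        parts[index] = payload
        if len(parts) < total:
            return None  # still waiting for more fragments
        del self._pending[key]
        # Join in index order, regardless of arrival order.
        return b"".join(parts[i] for i in range(total))
```

Audio and video fragments are keyed separately, reflecting that the two streams are transmitted independently. In this sketch a message with a lost fragment simply remains pending; a practical receiver would presumably discard such messages after some interval.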
Techniques for establishing a multicast in a LAN environment and techniques for enabling a particular receiving station to join a multicast in a LAN are disclosed in "MULTICAST ROUTING TECHNIQUE", Ser. No. 08/417,067, filed Apr. 4, 1995 by Joseph M. Gang, Jr. This application is assigned to the assignee hereof and is incorporated herein by reference.
It should be noted that the multicast source station operates in the "push mode". This means that there is no feedback from the receiver stations to the source station for flow control, as there might be in the case of a point-to-point communication involving a single source and a single destination.
Because the source station's audio encoder and the receiving station's audio decoder are not driven by the same clock, the audio data "pushed out" by the source station can arrive faster or slower than the receiving station can play it.
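The size of this mismatch can be estimated with simple arithmetic. The following sketch assumes, purely for illustration, a nominal sampling rate of 8000 samples/sec and clock errors expressed in parts per million (ppm); neither the function name nor these figures come from the disclosure.

```python
def buffer_drift_samples(nominal_rate_hz: float, source_ppm: float,
                         receiver_ppm: float, seconds: float) -> float:
    """Samples produced by the source minus samples played by the receiver.

    A positive result means audio accumulates at the receiver; a negative
    result means the receiver runs out of audio to play.
    """
    produced = nominal_rate_hz * (1 + source_ppm * 1e-6) * seconds
    consumed = nominal_rate_hz * (1 + receiver_ppm * 1e-6) * seconds
    return produced - consumed


# Hypothetical example: source clock 50 ppm fast, receiver clock 50 ppm slow,
# over one hour at a nominal 8000 samples/sec.
surplus = buffer_drift_samples(8000, 50, -50, 3600)
```

Under these assumed figures the receiver accumulates about 2880 surplus samples per hour, i.e. roughly 0.36 seconds of audio, which would grow without bound if left uncorrected.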
Accordingly, it is an object of the invention to provide a method to maintain the receiver audio decoder and source audio encoder in synchronism.
In a live audio/video multicast in a LAN environment, wherein the source is operating in the "push mode", there is no guarantee of data delivery. In addition, the audio and video packets are received separately and independently at each receiver station. In this environment, it is important to synchronize the audio and video data received at each receiving station. This is important for achieving "lip sync", i.e., a synchronization between the movement of a person's lips seen on the video display and the words played by the speaker.
Accordingly, it is a further object of the invention to provide a method to synchronize live multicast audio and video data received at a participating receiving station in a LAN.