The present invention relates to decoding video and audio data which are multiplexed after compression coding, and more particularly to a receiver/decoder for receiving video and audio data compression encoded by high efficiency coding means and decoding the received encoded data.
In order to reduce the cost of transmitting or recording a large amount of video data, the video data is compression encoded by high efficiency coding means which removes redundancy, and the encoded data is thereafter transmitted or recorded. An example of high efficiency coding means is the well-known MPEG method standardized by ISO/IEC JTC1/SC29/WG11. In the MPEG method, an encoded data multiplexing method is standardized as MPEG/Systems, in addition to the video and audio data encoding methods.
In the fields of broadcasting and communications, in recent years, redundancy of moving picture video data is removed to compress the video data, which is then transmitted in the form of digital signals. For video data compression, discrete cosine transformation (DCT) and motion compensation prediction coding, such as those stipulated in the MPEG specifications, are generally performed. With a high compression ratio of video data, a plurality of broadcast programs can be transmitted in a multiplex manner over a single transmission channel. A program means a set of video data and its associated audio and/or text data.
Multiplexing of a plurality of programs according to the MPEG specifications is stipulated in ITU-T Rec. H.222.0, ISO/IEC13818-1, 1994 (E), "Information Technology - Generic Coding of Moving Pictures and Associated Audio - Part 1: Systems", pp. 9-21, which describes that a transport stream (hereinafter abbreviated as TS) packet having a fixed length of 188 bytes is used for such multiplexing. The structure of an apparatus called a set top box is shown in the block diagram of FIG. 2. This apparatus separates TS packets formed in accordance with the MPEG specifications and supplied from a broadcast station or the like, and supplies them to a video decoder and an audio decoder to output decoded video and audio signals. Conventional techniques will be described with reference to FIG. 2.
A tuner 1 selects data of one channel transmitted over a communications medium such as CATV or a communications satellite, and supplies the selected channel data to a demodulator 2. The demodulator 2 demodulates the channel data, which was channel encoded by QAM, QPSK, or the like, performs an error correction process by using redundancy codes, and supplies the processed data to a demultiplexer 3. The processed data is bit stream data in the TS packet format, which is classified into the two types shown in FIGS. 3A and 3B according to the contents of the TS packet.
The format shown in FIG. 3A is used for transmitting program elements, including video data, audio data, text data such as teletext, and other data. Each TS packet of 188 bytes is constituted by a transport stream header (abbreviated as TS header) and a payload containing the elements described above. The TS header always contains a packet ID (abbreviated as PID) representative of the attribute of the TS packet, and sometimes contains a program clock reference (abbreviated as PCR), which is time information used by a decoder to recover the system clock used as a time reference when each element was encoded. The payload is part of a packetized elementary stream (PES) packet. The PES packet is a variable length packet constituting an element unit determined by each element, the type of data storage media, and the like. The PES packet is constituted by data of each element and a PES header. The PES header includes a stream ID describing the element contents, a PES packet length, time stamp information (PTS) describing the time when the element is to be displayed, and other pieces of information. The element unit designated by PTS is called an access unit. For example, one picture is the unit for video data, and one frame is the unit for audio data.
The format shown in FIG. 3B is used for program specific information (abbreviated as PSI), which is additional information used for the system control. The payload of the TS packet is part of PSI described in units of sections. Each section is constituted by a section header, PSI, and a cyclic redundancy check (CRC) code used for error detection. The section header indicates the attribute of the succeeding PSI and a section length. PSI is structured hierarchically and contains information necessary for the system control, such as a program association table (PAT) describing program information (specifically, the PID of the PMT to be described later) contained in the bit stream data transmitted as TS, and a program map table (PMT) describing the correspondence between each element and its PID in each program.
The demultiplexer 3 shown in FIG. 2 receives a TS packet, supplies PSI data via a data bus to a system decode buffer allocated in a RAM 7, and supplies the video and audio data, which are the elements constituting a program selected by a user, to a video decoder 8 and an audio decoder 10, respectively. The demultiplexer 3 samples time information from the header of a TS packet containing PCR, and supplies control signals to a clock generator 4 to recover the system clock.
A microprocessor 12 decodes the contents of the PSI data sent to the system decode buffer in RAM 7 and stores the results in a system decode buffer 73 in RAM 7 in a data format usable by system control software programs. Software programs are generally stored in the same RAM or in a dedicated ROM. The work area of the software programs is generally allocated in RAM 72. In response to an instruction entered by a user via a user interface unit 13, the microprocessor 12 uses the decoded PSI data to supply the demultiplexer 3 with the PID used for deriving the TS packets of a program, and also supplies the tuner 1 with a control signal used for selecting the program.
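By way of illustration only, the CRC carried at the end of each PSI section may be checked in software as sketched below in Python. The sketch assumes the CRC-32 variant specified by ISO/IEC 13818-1 (generator polynomial 0x04C11DB7, initial value 0xFFFFFFFF, most significant bit first, no final inversion), under which a section processed together with its trailing 4-byte CRC yields zero when intact. The function names are hypothetical and do not appear in the apparatus described.

```python
def crc32_mpeg2(data: bytes) -> int:
    """Bitwise CRC-32 as assumed for PSI sections (MSB first, no final XOR)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def section_is_intact(section: bytes) -> bool:
    """A section that includes its own 4-byte CRC leaves a zero remainder."""
    return crc32_mpeg2(section) == 0


# Hypothetical section body (header plus table data), for illustration only.
body = bytes([0x00, 0xB0, 0x0D, 0x00, 0x01, 0xC1,
              0x00, 0x00, 0x00, 0x01, 0xE1, 0x00])
section = body + crc32_mpeg2(body).to_bytes(4, "big")
print(section_is_intact(section))
```

A section failing this check would typically be discarded and awaited again, since PSI is transmitted repeatedly.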
The video decoder 8 and audio decoder 10 decode the video and audio data in cooperation with a video decode buffer 9 and an audio decode buffer 11. Because a plurality of programs are multiplexed, the data transmission speed on the transport channel differs from the bit rate at which each element was encoded. Therefore, if data is supplied directly to the video decoder 8 and audio decoder 10 at the data transmission speed, the video decode buffer 9 and audio decode buffer 11 may temporarily overflow or underflow, and the video and audio outputs are disturbed. It is therefore necessary to insert packet transport buffers 5 and 6 between the demultiplexer 3 and the decoders to convert the bit rates in accordance with the capacities of the decoder buffers, and thereafter to supply element data to the video decoder 8 and audio decoder 10. Multiplexing according to the MPEG specifications assumes that a packet transport buffer having a capacity of 512 bytes is provided for each element.
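The rate conversion performed by such a transport buffer may be pictured with the following simplified Python model. It is only a sketch under stated assumptions: time is divided into abstract ticks, packet payloads arrive in bursts, and the buffer is drained toward the decoder at a constant number of bytes per tick. The function name and the tick model are hypothetical; only the 512-byte capacity is taken from the description above.

```python
def simulate_transport_buffer(arrivals, drain_per_tick, capacity=512):
    """Track buffer occupancy for bursty arrivals and a constant drain.

    arrivals: number of bytes arriving at each tick (assumed model).
    Returns (peak_occupancy, overflowed).
    """
    level = 0
    peak = 0
    overflowed = False
    for arriving in arrivals:
        level += arriving
        if level > capacity:
            overflowed = True      # bytes beyond the capacity would be lost
            level = capacity
        peak = max(peak, level)
        level = max(0, level - drain_per_tick)   # constant-rate output
    return peak, overflowed


# One 184-byte payload every third tick, drained at 64 bytes per tick.
print(simulate_transport_buffer([184, 0, 0, 184, 0, 0], drain_per_tick=64))
```

When the drain rate is matched to the average arrival rate of the element, the occupancy stays well inside the 512-byte capacity; a drain rate that is too low makes the model report an overflow.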
The present inventors have found that providing the packet transport buffers 5 and 6 independently of the system memory increases the number of system components and the cost. Even if the packet transport buffers are implemented in the demultiplexer 3, an increase in its circuit scale and cost is inevitable.
The MPEG specifications standardize not only a video and audio encoding method but also a coded data multiplexing method, as the MPEG/Systems.
FIG. 24 shows the structure of a transport packet (hereinafter called a TS packet) adopted by the transport system which represents one multiplexing method of the MPEG/Systems.
Each TS packet is a fixed length packet of 188 bytes including a 4-byte header and a 184-byte payload. The 4-byte header is constituted by a sync byte, an error flag, a unit start flag, a scramble control flag, a priority flag, a set of PID data, an adaptation field control flag, and a cyclic counter. The contents of PID data identify the payload of the TS packet.
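The header fields listed above may be extracted as in the following Python sketch. The bit positions and the sync byte value 0x47 follow the transport packet syntax of ISO/IEC 13818-1 rather than the present description, and the class and function names are hypothetical, chosen only to mirror the terms used here.

```python
from dataclasses import dataclass

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47  # value defined by ISO/IEC 13818-1


@dataclass
class TsHeader:
    error_flag: bool
    unit_start_flag: bool
    priority_flag: bool
    pid: int                        # 13 bits
    scramble_control: int           # 2 bits
    adaptation_field_control: int   # 2 bits
    cyclic_counter: int             # 4 bits


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the 4-byte header of one 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte 0x47 not found")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return TsHeader(
        error_flag=bool(b1 & 0x80),
        unit_start_flag=bool(b1 & 0x40),
        priority_flag=bool(b1 & 0x20),
        pid=((b1 & 0x1F) << 8) | b2,
        scramble_control=(b3 >> 6) & 0x03,
        adaptation_field_control=(b3 >> 4) & 0x03,
        cyclic_counter=b3 & 0x0F,
    )


# Hypothetical packet: unit start set, PID 0x100, payload only, counter 10.
example = bytes([0x47, 0x41, 0x00, 0x1A]) + bytes(184)
print(parse_ts_header(example))
```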
The 184 bytes following the header are constituted by an adaptation field (hereinafter abbreviated as AF), whose presence is designated by the adaptation field control flag, and a payload having video and audio encoded data and multiplexing information data. AF is constituted by AF length data, a discontinuity flag, a random access flag, a priority flag, flags indicating the presence or absence of AF optional data (a PCR flag, an OPCR flag, a splice point flag, and a private data flag), and the optional data designated by these flags. Of the optional data, PCR data is used for clock synchronization between the transmission and reception sides, and the reception side generates a reference clock of 27 MHz by using a phase locked loop.
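As a hedged illustration of how PCR data might be read out of the AF, the Python sketch below assumes the field layout of ISO/IEC 13818-1: a 33-bit base counted at 90 kHz, 6 reserved bits, and a 9-bit extension counted at 27 MHz, so that the PCR in 27 MHz ticks equals base x 300 + extension. The function name is hypothetical, and the phase locked loop itself is not modeled.

```python
def parse_pcr(packet: bytes):
    """Return the PCR in 27 MHz ticks, or None if the packet carries none."""
    adaptation_field_control = (packet[3] >> 4) & 0x03
    if adaptation_field_control not in (2, 3):   # no adaptation field
        return None
    af_length = packet[4]
    if af_length < 7:            # flags byte plus 6 PCR bytes do not fit
        return None
    flags = packet[5]
    if not flags & 0x10:         # PCR flag not set
        return None
    raw = int.from_bytes(packet[6:12], "big")    # 48 bits in total
    base = raw >> 15             # upper 33 bits
    extension = raw & 0x1FF      # lower 9 bits
    return base * 300 + extension


# Hypothetical packet carrying base=1000 and extension=5.
raw = (1000 << 15) | (0x3F << 9) | 5
example = bytes([0x47, 0x00, 0x64, 0x20, 7, 0x10]) + raw.to_bytes(6, "big") + bytes(176)
pcr = parse_pcr(example)
print(pcr, pcr / 27_000_000)     # ticks and the corresponding time in seconds
```

In a receiver, the difference between successive PCR values and the locally counted 27 MHz ticks would serve as the error signal of the phase locked loop.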
FIG. 25 illustrates selection of a particular channel of encoded video and audio data from a bit stream (TS packets) in which a plurality of programs (hereinafter called channels) are multiplexed. A program association table (hereinafter abbreviated as PAT) is always transmitted by a TS packet having PID="0". PAT contains multiplexing information of the plurality of channels and indicates the PID value of the TS packet by which the program map table (hereinafter abbreviated as PMT) of each channel is transmitted.
FIG. 25 shows an example of multiplexing of three channels. In order to select a j-th channel and obtain its PMT, the TS packet with PID=Mj is captured. This PMT has information that the video data is transmitted by the TS packet with PID=Vj and the audio data is transmitted by the TS packet with PID=Aj. Therefore, encoded video data can be received by capturing the TS packet with PID=Vj, and encoded audio data can be received by capturing the TS packet with PID=Aj.
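The two-step lookup described above (PAT to PMT, then PMT to the video and audio PIDs) can be sketched in Python as follows. The tables are represented as already decoded dictionaries, the packets as (PID, payload) pairs, and every PID value is hypothetical; parsing of the actual PAT and PMT sections is outside the scope of this sketch.

```python
PAT_PID = 0  # the PAT is always carried by packets with PID 0


def select_channel_pids(pat, pmts, channel):
    """Return (Mj, Vj, Aj) for the chosen channel from decoded PAT and PMTs."""
    pmt_pid = pat[channel]              # Mj: PID of the PMT of channel j
    pmt = pmts[pmt_pid]
    return pmt_pid, pmt["video_pid"], pmt["audio_pid"]


def capture(packets, video_pid, audio_pid):
    """Keep only the payloads whose PID matches the selected channel."""
    video, audio = [], []
    for pid, payload in packets:
        if pid == video_pid:
            video.append(payload)
        elif pid == audio_pid:
            audio.append(payload)
    return video, audio


# Three multiplexed channels with hypothetical PID assignments.
pat = {1: 0x100, 2: 0x200, 3: 0x300}
pmts = {
    0x100: {"video_pid": 0x101, "audio_pid": 0x102},
    0x200: {"video_pid": 0x201, "audio_pid": 0x202},
    0x300: {"video_pid": 0x301, "audio_pid": 0x302},
}
stream = [(PAT_PID, b"pat"), (0x201, b"v2a"), (0x101, b"v1"),
          (0x202, b"a2"), (0x201, b"v2b")]

_, video_pid, audio_pid = select_channel_pids(pat, pmts, channel=2)
print(capture(stream, video_pid, audio_pid))
```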
The MPEG/Systems described above, however, defines only the syntax of TS packets and the like and does not determine how a reception apparatus is actually configured. In addition, it does not particularly define multiplexing of value added services such as program guidance; such definition is left to the discretion of broadcasting companies or program vendors.
In the conventional technique disclosed in JP-A-4-229464 (corresponding to EP-0460751A2), each packet has time information which is used on the decode side as a reference of various timings including a decode timing.
It is an object of the present invention to provide a decoder for compressed and multiplexed video and audio data, wherein packet landing buffers are allocated in a RAM used by a CPU for the system control to thereby reduce the number of components and lower the cost of components.
According to one aspect of the present invention, there is provided a decoder for compressed and multiplexed video and audio data for receiving a group of packets having a plurality of multiplexed packet sets each containing video data and associated audio data compressed by compression codes and packetized and outputting a set of video and audio signals, comprising: a video decoder for decoding the compressed video data; an audio decoder for decoding the compressed audio data; a first memory for sequentially storing the group of packets; a processor for executing a process in accordance with a stored program, the process including sequentially reading a packet from the first memory, filtering packets containing a set of particular video and audio data and a control packet containing attribute information of the group of packets, and supplying the particular video and audio data to the video and audio decoders; a second memory having a work area for at least the stored program; a third memory for storing the attribute information contained in the control packet; and an interface unit for transferring an external control signal to the processor, the external control signal being used for filtering a set of video and audio data, wherein the first to third memories are provided in the same memory element.
A RAM used by a processor (CPU) as a main memory stores operating system software and is required to have a capacity of several megabits. It is therefore easy to allocate, in the RAM, the 512 bytes required for a packet landing buffer for each data element, and the number of components is not increased.
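The allocation of landing buffers within a single RAM may be pictured with the following Python sketch, in which one byte array stands in for the memory element and named regions are carved out of it. The class, the region names, the 4-megabit RAM size, and the sizes of the work area and attribute area are all assumptions made for illustration; only the 512-byte landing buffer size comes from the description.

```python
LANDING_BUFFER_BYTES = 512   # per data element, as stated above


class SharedRam:
    """One memory element holding several named regions (illustrative model)."""

    def __init__(self, size: int):
        self._mem = bytearray(size)
        self._next = 0
        self.regions = {}            # name -> (offset, length)

    def allocate(self, name: str, length: int) -> memoryview:
        if self._next + length > len(self._mem):
            raise MemoryError(f"no room left for region {name!r}")
        offset = self._next
        self.regions[name] = (offset, length)
        self._next += length
        return memoryview(self._mem)[offset:offset + length]

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._mem[offset:offset + length])


ram = SharedRam(4 * 1024 * 1024 // 8)          # assumed 4-megabit RAM
video = ram.allocate("video_landing", LANDING_BUFFER_BYTES)
audio = ram.allocate("audio_landing", LANDING_BUFFER_BYTES)
work = ram.allocate("work_area", 64 * 1024)    # assumed size
psi = ram.allocate("attribute_info", 8 * 1024) # assumed size
print(ram.regions)
```

Under these assumed sizes the two landing buffers occupy 1,024 bytes of a 524,288-byte memory, which illustrates why no separate buffer component is needed.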
It is another object of the present invention to provide a receiver for compressed video and audio data capable of receiving a bit stream with a plurality of multiplexed channels, decoding the video and audio data, and also receiving value added service data flexibly.
According to another aspect of the invention achieving the above object, there is provided an apparatus comprising: data selecting means for selecting an encoded stream of one channel (a group of packets), multiplexing information data, and additional information data, from a multiplexed stream (packets); bus access means for supplying the selected encoded stream and the like to a bus; a microprocessor connected to the bus; a random access memory; a program memory; means for processing the additional information data; means for decoding the video data; and means for decoding the audio data.
The apparatus may further comprise: means for generating a clock signal in accordance with clock information in the encoded stream; packet sync supply means for supplying a packet sync signal of the encoded stream to the microprocessor; means for adding, to the encoded stream, an error flag indicative of failure in correcting transmission errors; and data storage media interface means.
The encoded stream of one channel and other data selected by the data selecting means are stored into the random access memory via the bus access means and the bus. Data read from the random access memory is analyzed and discriminated, and, via the bus, the video data is supplied to the video data decoding means, the audio data is supplied to the audio data decoding means, and the additional information data or processed additional information data is supplied to the additional information data processing means.
The video data decoding means expands and decodes the encoded video data, and the audio data decoding means expands and decodes the encoded audio data, to reproduce video and audio signals. The additional information data processing means allows, for example, a program guide to be displayed over the video in an overlay manner.
The clock signal generating means supplies a clock signal to the video data decoding means, audio data decoding means, and microprocessor, to thereby facilitate transfer of data such as encoded data. The packet sync supply means facilitates recognition of packet sync by the microprocessor.
The error flag adding means allows the video data decoding means or the like to recognize the presence of any error without using a specific signal line. The data storage media interface means supplies the received multiplexed stream of one channel to data storage media so that video and audio data can be recorded and reproduced at a low bit rate.
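One possible way of carrying the error flag within the stream itself is sketched below in Python. The sketch assumes that the flag is written into the error flag bit of the TS header (the most significant bit of the second header byte under ISO/IEC 13818-1); the description does not fix where the flag is placed, and the function names are hypothetical.

```python
ERROR_FLAG_MASK = 0x80   # error flag bit in the second header byte (assumed)


def mark_uncorrectable(packet: bytes) -> bytes:
    """Return a copy of the packet with the error flag set in its header."""
    flagged = bytearray(packet)
    flagged[1] |= ERROR_FLAG_MASK
    return bytes(flagged)


def has_error(packet: bytes) -> bool:
    """Downstream check, needing no signal line beyond the stream itself."""
    return bool(packet[1] & ERROR_FLAG_MASK)


# Hypothetical packet for which error correction is taken to have failed.
received = bytes([0x47, 0x41, 0x00, 0x1A]) + bytes(184)
flagged = mark_uncorrectable(received)
print(has_error(received), has_error(flagged))
```

Because only one header bit changes, the PID and all other fields remain readable, so a decoder may still identify which element the damaged packet belonged to.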