The SDI (Serial Digital Interface) format is standardized as SMPTE-259M by the SMPTE (Society of Motion Picture and Television Engineers), which issues standards concerning television engineering and video engineering. The SDI format is fundamentally a signal standard for the D-1 format or the D-2 format, which are digital signal standards.
The SDI format can transmit only a limited range of media. Specifically, it can carry one channel of video data and about 8 channels of baseband audio data. For this reason, the SDI format is unsuitable for multimedia or multichannel use.
The SDTI (Serial Data Transport Interface) format is standardized as SMPTE-305M by the SMPTE. The SDTI format is suitable for multimedia or multichannel use while retaining the advantages of the SDI format and remaining partly compatible with it. The SDTI format is a standard for transmitting a baseband signal, and it transmits an end synchronizing code (EAV: End of Active Video) and a start synchronizing code (SAV: Start of Active Video) along with that signal.
That is, the SDTI format transmits an SDTI transmission packet in which the interval of each line of a video frame comprises an area into which an EAV is inserted, an ancillary data area into which ancillary data is inserted (ancillary data portion ANC), an area into which an SAV is inserted, and a payload area into which video data and audio data are inserted.
When the above-mentioned SDTI format is in use, a stream may be edited by switching. In this case, the NTSC system may use the 10th line as the switching timing, and the PAL system may use the 6th line. Because the switching timing is determined mainly with reference to the video data, there is a risk that the audio data will contain an audio gap, for the following reasons.
For example, when audio data having a sampling frequency of 48 kHz is divided evenly among the frames of an NTSC video signal having 525 lines/frame and 59.94 fields/second, the number of audio data samples per video frame becomes 1601.6 samples (= (48,000 samples/second ÷ 59.94 fields/second) × 2 fields/frame), which is not an integer. Therefore, when the audio data samples are divided into blocks corresponding to the video frames, either 1601 samples or 1602 samples are allocated to each video frame.
In this case, since the number of audio data samples in 5 video frames is 8008 samples (= 5 × 1601.6 samples), the above-mentioned block division may be executed in a 5-frame sequence as shown in FIG. 22A. That is, a sequence of 5 frames comprising a first frame of 1602 samples, a second frame of 1601 samples, a third frame of 1602 samples, a fourth frame of 1601 samples and a fifth frame of 1602 samples is repeated.
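The arithmetic above can be checked with a short sketch. It assumes that the nominal 59.94 fields/second stands for the exact NTSC rate of 60000/1001 fields/second (30000/1001 frames/second), which is what makes the result exactly 1601.6. The names `SAMPLE_RATE_HZ`, `FRAME_RATE`, `FIVE_FRAME_SEQUENCE` and `samples_in_frame` are illustrative only and do not come from the text.

```python
from fractions import Fraction

# Audio sampling frequency stated in the text.
SAMPLE_RATE_HZ = 48_000

# Assumption: nominal 59.94 fields/s means 60000/1001 fields/s,
# i.e. 30000/1001 frames/s at 2 fields per frame.
FRAME_RATE = Fraction(30_000, 1_001)

# Exact number of audio samples per video frame (8008/5 = 1601.6).
samples_per_frame = Fraction(SAMPLE_RATE_HZ) / FRAME_RATE

# The 5-frame block sequence described in the text.
FIVE_FRAME_SEQUENCE = (1602, 1601, 1602, 1601, 1602)


def samples_in_frame(frame_index: int) -> int:
    """Return the block size allocated to a frame, counting frames from 0."""
    return FIVE_FRAME_SEQUENCE[frame_index % len(FIVE_FRAME_SEQUENCE)]


if __name__ == "__main__":
    print(samples_per_frame)         # exact fraction of samples per frame
    print(sum(FIVE_FRAME_SEQUENCE))  # total samples over one 5-frame sequence
```

Using exact fractions rather than floating point keeps the per-frame count and the 5-frame total free of rounding error.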
When the audio data is divided into blocks in the 5-frame sequence as described above and editing is executed by switching the stream, there is a risk that the continuity of the 5-frame sequence will be lost. For example, if the 5-frame sequence of a stream STMa is as shown in FIG. 22A and is used as a reference phase, then a stream STMb that replaces the stream STMa by switching may have any of the five phase patterns shown in FIGS. 22B to 22F.
When the phase pattern of the stream STMb is as shown in FIG. 22B, the continuity of the 5-frame sequence is maintained even though the switching is executed. However, when the phase pattern of the stream STMb is as shown in any of FIGS. 22C to 22F, the continuity of the 5-frame sequence is lost. Depending on the switching timing, there occurs an interval (audio gap) in which frames of 1601 samples follow one another, so that the audio data falls one sample short.
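The loss of continuity can be illustrated with a minimal sketch, which is a simulation for explanation and not the detection method of the invention. It assumes that both streams follow the 1602, 1601, 1602, 1601, 1602 sequence and that the stream STMb is shifted from the reference phase by a hypothetical `phase_offset` of 0 to 4 frames. How each offset maps onto FIGS. 22B to 22F is not stated here, so no such mapping is claimed. All function names are illustrative.

```python
# The 5-frame block sequence described in the text.
SEQ = (1602, 1601, 1602, 1601, 1602)


def spliced_frame_sizes(switch_frame, phase_offset, total_frames):
    """Block sizes of an output that takes STMa before switch_frame and
    STMb from switch_frame onward. STMa is the reference phase; STMb is
    assumed to be shifted by phase_offset frames."""
    sizes = []
    for n in range(total_frames):
        if n < switch_frame:
            position = n % 5                   # STMa, reference phase
        else:
            position = (n + phase_offset) % 5  # STMb, shifted phase
        sizes.append(SEQ[position])
    return sizes


def consecutive_short_frames(sizes):
    """Indices of frames where a 1601-sample frame directly follows another
    1601-sample frame, which never happens in an unbroken sequence."""
    return [i for i in range(1, len(sizes))
            if sizes[i] == 1601 and sizes[i - 1] == 1601]


if __name__ == "__main__":
    in_phase = spliced_frame_sizes(switch_frame=4, phase_offset=0, total_frames=10)
    shifted = spliced_frame_sizes(switch_frame=4, phase_offset=2, total_frames=10)
    print(consecutive_short_frames(in_phase))
    print(consecutive_short_frames(shifted))
    print(sum(shifted[0:5]))  # compare with 8008 for an unbroken sequence
```

In this sketch an in-phase switch leaves the sequence intact, while a shifted switch can place two 1601-sample frames side by side, so that a run of 5 frames holds fewer than 8008 samples. Not every shifted switch yields a shortfall; some combinations place two extra 1602-sample frames together instead, which is consistent with the gap depending on the switching timing.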
When this audio gap is reproduced as it is, the audio waveform becomes discontinuous, and there is a risk that a large, unexpected noise will occur. Therefore, an editing point at which the audio gap occurs should be detected, and the sound should be muted upon reproduction.
An object of this invention is to enable the receiving side to easily execute processing, such as muting the sound, that is necessary at an editing point. Another object of this invention is to enable a stream switching point generated in a transmission line to be easily detected as an editing point.