1. Field of the Invention
The present invention relates to a signal processing apparatus, and more particularly to an apparatus for transmitting image data and audio data after temporarily storing the image data and the audio data in a memory.
2. Related Background Art
Recently, the development of computer interfaces for connecting a personal computer (hereinafter referred to as a PC) and peripheral equipment has advanced, and Universal Serial Bus (USB), Institute of Electrical and Electronics Engineers (IEEE) 1394 and the like have been frequently used as typical bus standards.
These computer interfaces are used for transferring digital data of a still image, a moving image and the like, recorded by a digital camera or a digital video camera in a storage medium such as a memory card. Moreover, in addition to the transfers of a still image file and a moving image file stored in the storage medium, the above-mentioned interfaces have lately begun to be used for streaming, in which images and sound from an image pickup element such as a charge coupled device (CCD), or from a storage medium such as a tape, are reproduced while being transferred to the PC side.
A video class interface is generally used for streaming in accordance with USB. The video class interface is prescribed in the specification “Universal Serial Bus Device Class Definition for Video Devices”. The image formats whose transfer methods are prescribed include Motion Joint Photographic Experts Group (MJPEG), a digital video (DV) format, Moving Picture Experts Group (MPEG) and the like.
Moreover, when streaming is performed by means of the video class interface, either an isochronous transfer or a bulk transfer can be used. However, in order to keep the continuity of images and sounds and to allow the PC to easily identify the timing of a frame change of the images, the isochronous transfer is generally used.
In the case where the MJPEG format is selected as a subtype (that is, as the moving image transferring format in the video class interface), the transfer of sounds is not prescribed by the video class interface. Consequently, when streaming in which audio data is added to images is performed, an audio class interface is used independently of the video class interface. In the following, the data transfers of the video class and the audio class, both used for streaming in accordance with the MJPEG format, will be described.
First, an audio data transfer in the USB audio class is described.
In an isochronous transfer, a data transfer is performed in synchronization with a start of frame (SOF), which is transmitted from a USB host to a device at a fixed period. In the audio class, the camera side is required to reliably transmit a fixed amount of data at every reception of a data transmission request, which is transmitted from the PC at a fixed interval on the basis of the SOF.
The data transmission request from the PC is based on a clock on the PC side. On the other hand, the camera side produces audio data on the basis of a clock generated by the camera side itself, not the clock of the PC. If the frequencies of the two clocks were exactly the same, there would be no problem. In practice, however, an error always exists between them. Consequently, the amount of data generated per unit time on the camera side and the amount of data read per unit time by the PC differ from each other slightly.
The data to be transferred at the time of streaming is buffered in an audio storing memory for a fixed period of time. Owing to the error between the writing clock and the reading clock, the interval between the data write position and the data read-out position in the audio storing memory changes in proportion to the elapsed time of the streaming. When the interval falls outside a fixed range, a buffer overrun or a buffer underrun occurs, and the transfer data breaks down.
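By way of a non-limiting illustration, the linear growth of the interval between the write position and the read-out position can be estimated as in the following sketch. The function name, the parameter names and the linear drift model are assumptions introduced here for explanation only and do not appear in the USB audio class specification.

```python
def seconds_until_buffer_fault(nominal_rate_hz, write_error_ppm,
                               buffer_samples, initial_gap_samples):
    """Estimate when the write/read gap leaves the range [0, buffer_samples].

    write_error_ppm > 0: the camera writes faster than the PC reads (overrun).
    write_error_ppm < 0: the camera writes slower than the PC reads (underrun).
    All names are hypothetical; the drift is assumed to be constant.
    """
    if write_error_ppm == 0:
        return ("none", float("inf"))
    # Change of the gap per second, in samples (assumed linear model).
    drift = nominal_rate_hz * write_error_ppm / 1_000_000
    if drift > 0:
        return ("overrun", (buffer_samples - initial_gap_samples) / drift)
    return ("underrun", initial_gap_samples / -drift)


if __name__ == "__main__":
    # 48 kHz audio, 100 ppm clock error, 4800-sample buffer, gap starts at the middle.
    print(seconds_until_buffer_fault(48000, 100, 4800, 2400))
```

Under these assumed figures the gap changes by 4.8 samples per second, so the buffer fails after roughly 500 seconds regardless of the sign of the error.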
Next, a moving image data transfer in the USB video class is described.
In the USB video class as well, the PC transmits a data transmission request to the camera side at a fixed interval on the basis of the SOF, similarly to the case of the audio class. However, unlike the case of the audio class, the amount of the data to be transmitted is adjusted to the clock on the camera side.
FIG. 19 is a view showing the transfer timing of video data. Data transmission requests from the PC are based on the SOF, and are always transmitted to the camera side at a fixed period. On the other hand, the camera side that has received the transmission requests does not always transmit video data, but transmits video data of one frame upon receiving a clock generated at every frame by the camera side itself. It is sufficient for the PC side to update the display frame each time video data of one new frame is received. By this method, neither the buffer overrun nor the buffer underrun occurs in the video data storing memory of the camera side.
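The timing described with reference to FIG. 19 may be modeled, purely as a simplified sketch, as follows. The SOF period of 1000 microseconds, the frame period, the number of packets per frame and all names are assumptions chosen for illustration; an actual device divides a frame into payloads according to its own configuration.

```python
def simulate_video_requests(num_sofs, sof_period_us=1000,
                            frame_period_us=40000, packets_per_frame=10):
    """Return (data_responses, empty_responses, packets_left).

    The host requests data at every SOF. The camera answers with a payload
    only while a frame produced by its own frame clock is still pending.
    """
    pending = 0          # packets of the current frame not yet sent
    next_frame_us = 0    # next tick of the camera-side frame clock
    data = empty = 0
    for i in range(num_sofs):
        now_us = i * sof_period_us
        # Camera frame clock: queue one frame for every tick that has passed.
        while next_frame_us <= now_us:
            pending += packets_per_frame
            next_frame_us += frame_period_us
        if pending > 0:
            pending -= 1
            data += 1    # request answered with video data
        else:
            empty += 1   # request answered with no video data
    return data, empty, pending


if __name__ == "__main__":
    print(simulate_video_requests(1000))
```

Because the amount of data sent follows the camera-side frame clock, the pending count in this model returns to zero between frames for any small error between the two clocks.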
The transfers in accordance with the audio class and the video class of USB have been described separately above. In the following, the streaming of a sound and a moving image by the combination of the aforementioned two classes will be described.
At the time of performing the streaming of a moving image and a sound in a digital video camera or the like, a moving image pickup apparatus for generating moving image data and an audio pickup apparatus for generating audio data are respectively driven by clocks different from each other.
Generally, a digital video camera is equipped with a mechanism for preventing deviation in synchronization between an image and a sound, which deviation is caused by using different clocks to generate an image and a sound respectively.
However, in the case where the USB video class is used for the transfer of moving image data and the USB audio class is used for the transfer of audio data at the time of streaming, the moving image data and the audio data are transferred in accordance with different methods. Consequently, a problem of the deviation in synchronization occurs.
That is, as described above, the amount of the data to be transferred is determined on the basis of the clock on the PC side in the USB audio class. On the other hand, the amount of the data is determined on the basis of the clock on the camera side in the USB video class. Consequently, owing to the error between the two clocks, a phenomenon occurs in which the sound runs ahead of or lags behind the image. Moreover, the deviation becomes larger as time elapses, and eventually the above-mentioned overrun or underrun of the audio data storing buffer occurs and the streaming breaks down.
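As a rough numerical illustration of how the deviation grows, the following sketch computes the offset between sound and image under the assumption that the relative error between the PC clock and the camera clock is constant. The function name and the example figure of 100 ppm are hypothetical.

```python
def av_offset_ms(elapsed_s, clock_error_ppm):
    """Offset between sound and image, in milliseconds, after elapsed_s seconds.

    Assumes audio is paced by the PC clock, video by the camera clock, and
    that the two clocks differ by a constant clock_error_ppm (hypothetical).
    """
    # ppm is 1e-6; seconds to milliseconds is 1e3, hence a division by 1000.
    return elapsed_s * clock_error_ppm / 1000


if __name__ == "__main__":
    for minutes in (1, 10, 60):
        print(minutes, "min:", av_offset_ms(minutes * 60, 100), "ms")
```

With an assumed error of 100 ppm, the offset is only a few milliseconds after one minute but reaches several hundred milliseconds after one hour of continuous streaming.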
A technique related to the above-mentioned problem is described in Japanese Patent Application Laid-Open No. 2000-21081.
The invention described in the Japanese patent application adopts the following method. That is, a host (PC) and a device (camera) count the number of clocks generated between two consecutive synchronization signals (SOFs) and the number of received data, and feed back the counted numbers to determine the transmission amount of data.
However, such a method has the problem that the processing is complicated.
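The general idea of counting clocks between two consecutive SOFs can be illustrated, in a greatly simplified form, as follows. This sketch covers only the device-side count and is not the method of the cited application; the function name, the parameters and the 44.1 kHz example are assumptions introduced for explanation.

```python
def samples_per_sof(sample_rate_hz, num_sofs, sof_period_us=1000):
    """Number of device sample clocks falling between consecutive SOFs.

    Simplified, hypothetical model: integer arithmetic is used so that the
    counts add up exactly to the number of samples actually generated.
    """
    counts = []
    for k in range(num_sofs):
        # Sample clocks elapsed up to the start and the end of the k-th interval.
        start = (k * sample_rate_hz * sof_period_us) // 1_000_000
        end = ((k + 1) * sample_rate_hz * sof_period_us) // 1_000_000
        counts.append(end - start)
    return counts


if __name__ == "__main__":
    print(samples_per_sof(44100, 10))
```

Even in this reduced form, the amount of data per SOF interval is no longer fixed, which suggests why a scheme that additionally counts received data on the host side and feeds both counts back becomes complicated.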