As is known in the multimedia art, video data may be captured for real-time display on a computer graphics device coupled to a host central processing unit (CPU). A problem arises because the timing and format of the video input data may differ from the timing and display format of the host CPU, yet it is a goal to display the video image in real time with minimal visual artifacts. In order to increase the overall quality of the video display, it is important to ensure that the interface between the video capture device and the graphics device is optimized to provide the highest possible data transfer bandwidth.
The typical data flow path of video data is as follows. Video data is `captured` by a video capture unit and stored in a memory. The captured video data is received in block format, yet must be translated to interleaved format for display. The video data is then transferred from that memory to a graphics controller. The graphics controller feeds the video data to a frame buffer memory, which stores the image data that is to be displayed on the computer monitor. According to one known topology, the video capture unit, memory and graphics controller, as well as the host CPU, may all be coupled together via a high speed bus such as the Peripheral Component Interconnect (PCI™) bus. The PCI™ bus is capable of transferring data at a peak bandwidth of 133 Mbytes/second (a 32-bit data path clocked at 33 MHz).
All of the peripheral devices that are coupled to the PCI bus must operate according to the PCI protocol. As such, each coupled device includes interface logic for controlling reads and writes of data between the various components coupled to the bus. The video capture unit, which may be coupled to a video compression/decompression (CODEC) unit or may be an independent device such as a USB interface that receives digital data directly from a USB camera, must similarly include logic for transferring video data over the PCI bus to system memory. In addition, the interface logic of the video capture unit must include synchronization logic that allows coherent communication of data from the clock domain of the peripheral device to the PCI clock domain.
Thus there are at least three design considerations that must be dealt with when interfacing video capture units to a host computer system: optimizing PCI bus bandwidth, providing appropriate conversion of block format data to interleaved format data, and synchronizing the data transfer between the video and PCI clock domains. Historically, the above goals have been met using data buffer management techniques.
Typically, data buffer management is performed in hardware using dedicated RAM or register file arrays to buffer the data, in combination with synchronization logic for transferring ownership of the data from one clock domain to the next. A typical read sequence may include the following steps: 1) data is written to the dedicated buffer by control logic operating in the video clock domain; 2) a `data available` control signal is passed through a synchronizer into the PCI clock domain; and 3) the data is then read out of the dedicated buffer by control logic operating in the PCI clock domain.
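The read sequence above can be illustrated with a small cycle-level model. The following is only a sketch under stated assumptions: the class names `TwoFlopSynchronizer` and `HandshakeBuffer` are hypothetical, the synchronizer is assumed to be a two-stage design, and the return path that tells the video domain the buffer is free again is collapsed into a direct clear, which real hardware would not do.

```python
class TwoFlopSynchronizer:
    """Assumed two-stage synchronizer clocked in the PCI domain."""

    def __init__(self):
        self.stage1 = 0
        self.stage2 = 0

    def clock(self, level_in):
        # On each PCI clock edge the sampled level moves one stage along.
        self.stage2 = self.stage1
        self.stage1 = level_in
        return self.stage2

    def reset(self):
        self.stage1 = 0
        self.stage2 = 0


class HandshakeBuffer:
    """Single-entry buffer shared between the video and PCI clock domains."""

    def __init__(self):
        self.data = None
        self.data_available = 0  # driven from the video clock domain
        self.sync = TwoFlopSynchronizer()

    def video_write(self, word):
        # Step 1: video-domain logic writes the buffer and raises the flag.
        self.data = word
        self.data_available = 1

    def pci_clock(self):
        # Step 2: the flag is sampled through the synchronizer.
        if self.sync.clock(self.data_available):
            # Step 3: PCI-domain logic reads the buffer.
            word = self.data
            # Simplification: the flag and synchronizer are cleared directly
            # instead of through a second synchronizer back to the video domain.
            self.data_available = 0
            self.sync.reset()
            return word
        return None


buf = HandshakeBuffer()
buf.video_write(0xAB)
first = buf.pci_clock()   # flag still inside the synchronizer
second = buf.pci_clock()  # flag visible in the PCI domain, word is read
third = buf.pci_clock()   # nothing new to read
```

In this model the word only becomes readable on the second PCI clock edge after it was written, which is the synchronization delay that the buffer has to absorb.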
Data format conversion is typically performed either entirely in software or through a combination of software and hardware. The conversion techniques usually involve byte manipulation of data structures stored in memory under CPU/software control. One advantage of using software to provide format conversion is that no additional hardware cost, in terms of gates, is incurred. However, the software approach carries a performance penalty, since compute cycles are consumed by relatively elementary byte manipulation tasks.
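As an illustration of the kind of byte manipulation involved, the sketch below interleaves separately stored luminance and chrominance arrays into a packed stream. It rests on assumptions not stated in the text: that block format means one array each for Y, Cb and Cr with 4:2:2 sub-sampling, and that interleaved format means the byte order Y0 Cb Y1 Cr. The function name `block_to_interleaved_422` is hypothetical.

```python
def block_to_interleaved_422(y, cb, cr):
    """Interleave separate Y, Cb, Cr byte arrays as Y0 Cb Y1 Cr (assumed order)."""
    if len(cb) != len(cr) or len(y) != 2 * len(cb):
        raise ValueError("expected two luminance samples per Cb/Cr pair")
    out = bytearray()
    for i in range(len(cb)):
        # Each chrominance pair is shared by two neighbouring luminance samples.
        out += bytes((y[2 * i], cb[i], y[2 * i + 1], cr[i]))
    return bytes(out)


packed = block_to_interleaved_422(
    bytes([1, 2, 3, 4]),  # four luminance samples
    bytes([10, 20]),      # two Cb samples
    bytes([30, 40]),      # two Cr samples
)
```

Every output byte is produced by a separate indexed load and store, which is why doing this on the host CPU for every frame consumes cycles on work that is trivial in itself.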
A second problem arises with the above technique because of the delay associated with the data synchronization method. While the PCI bus is waiting for the data, other data is still being received by the video capture unit. In order to ensure that no data is lost, the data buffer may have to be quite large. One drawback of data buffering is that data buffers consume valuable real estate on an integrated circuit or board, thereby leaving less room for other logic and for the routing of data signals. Thus it is always desirable to minimize the size of the buffers that are used for transactions.
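The relationship between synchronization delay and buffer size can be made concrete with a rough lower bound. The sketch below assumes a constant input rate and counts only the bytes that arrive while the `data available` signal crosses into the PCI clock domain; the function name `min_buffer_bytes` and the figures in the usage line are illustrative and do not come from the text.

```python
def min_buffer_bytes(video_bytes_per_sec, sync_latency_pci_cycles, pci_clock_hz):
    """Bytes arriving from the video side during the synchronization delay.

    Integer ceiling division is used so the result is never rounded down.
    """
    numerator = video_bytes_per_sec * sync_latency_pci_cycles
    return (numerator + pci_clock_hz - 1) // pci_clock_hz


# Illustrative figures only: 27 Mbytes/second of video input,
# three PCI clocks of synchronization delay, 33 MHz PCI clock.
needed = min_buffer_bytes(27_000_000, 3, 33_000_000)
```

A practical buffer would be larger than this bound, since it must also cover the time spent waiting for bus access, which is not modeled here.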
Therefore it would be desirable to provide a method and apparatus for maximizing transfer bandwidth between a peripheral device operating at a peripheral clock frequency and an internal bus operating at an internal clock frequency, while minimizing the hardware expense.
A further design consideration is encountered when the buffered data is forwarded for storage in a memory on the host computer. The data is stored in a converted format. However, there are a variety of conversions that may be performed on video data, depending upon the sub-sampling criteria that are applied to the input data. For example, pixel data may be stored in either 4:2:2 format, with four luminance pixels stored with two pairs of chrominance data, or 4:2:0 format, with four luminance pixels stored with one pair of chrominance data. In order to maximize memory space usage, it would be desirable to optimize the storage for each of the conversion formats. However, because it may be desired to read the data out of memory in the unconverted, block format, historically multiplexers were added to the data path to accommodate the various conversion types. Such a design is often undesirable, since minimization of area is always a design goal.
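The effect of the two sub-sampling formats on storage can be worked through with a short calculation that follows the description above: four luminance samples plus two chrominance pairs for 4:2:2, and four luminance samples plus one chrominance pair for 4:2:0. The sketch assumes one byte per sample; the name `frame_bytes` and the 640x480 frame size are illustrative.

```python
# Samples stored for every four pixels, per the description above.
SAMPLES_PER_FOUR_PIXELS = {
    "4:2:2": 4 + 2 * 2,  # four luminance + two chrominance pairs
    "4:2:0": 4 + 1 * 2,  # four luminance + one chrominance pair
}


def frame_bytes(width, height, fmt, bytes_per_sample=1):
    """Storage needed for one frame in the given sub-sampling format."""
    pixels = width * height
    if pixels % 4:
        raise ValueError("pixel count must be a multiple of four")
    return pixels // 4 * SAMPLES_PER_FOUR_PIXELS[fmt] * bytes_per_sample


size_422 = frame_bytes(640, 480, "4:2:2")
size_420 = frame_bytes(640, 480, "4:2:0")
```

Under these assumptions a 4:2:0 frame occupies three quarters of the space of the same frame in 4:2:2, so a storage layout tuned for one format leaves memory unused or is too small when applied to the other.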