With the development of image processing techniques, image displays have moved from lower definition to higher definition. The amount of data to be transmitted increases significantly as the definition improves, for example from 1280×720 to 1920×1088 or 2560×1600. When the display controller (DC) reads pixels out of the frame buffer at a fixed rate, both the required transmission bandwidth and the power consumption increase significantly for high definition images. In addition, the required frame buffer size grows with the image size. Thus, frame buffer compression (FBC) is a trend in image coding and transmission. With frame buffer compression, the transmission bandwidth between the transmit buffer (TX) and the receive buffer (RX) can be reduced. Moreover, the frame buffer size inside an RX device can also be reduced.
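As a rough illustration of how the raw data rate scales with resolution, the following sketch computes the uncompressed read bandwidth for the three frame sizes mentioned above. The 24 bits per pixel and 60 frames per second are assumptions chosen for illustration only; the function name is hypothetical.

```python
def raw_bandwidth_bytes_per_second(width, height,
                                   bits_per_pixel=24,      # assumed pixel depth
                                   frames_per_second=60):  # assumed refresh rate
    """Uncompressed bytes per second read from the frame buffer."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * frames_per_second


for width, height in [(1280, 720), (1920, 1088), (2560, 1600)]:
    rate = raw_bandwidth_bytes_per_second(width, height)
    print(f"{width}x{height}: {rate / 1e6:.1f} MB/s")
```

Under these assumptions, going from 1280×720 to 2560×1600 multiplies the raw rate by more than four, which is the growth that frame buffer compression is intended to offset.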
The algorithm used for frame buffer compression is related to how the image frame is partitioned. To enhance the throughput for large images, the image frame is usually split into multiple slices for encoding. The image frame can be divided into vertical slices, horizontal slices or interleaved slices.
FIG. 1A illustrates an example of dividing an image frame into vertical slices. Frame 110 is split into two vertical tiles corresponding to slice 0 and slice 1. This type of partition is referred to as vertical partition in this disclosure. FIG. 1B illustrates an example of dividing an image frame into horizontal slices. Frame 120 is split into two horizontal tiles corresponding to slice 0 and slice 1. This type of partition is referred to as horizontal partition in this disclosure. When an image frame is split into interleaved slices, each slice comprises multiple units which are interleaved vertically or horizontally with the units of another slice or other slices. FIG. 1C illustrates an example of interleaving partition. Frame 130 is split into interleaved slice 0 and slice 1. The multiple units of slice 0 are interleaved with the multiple units of slice 1. In the method based on interleaving partition, both the encoder and the decoder follow the same interleaving algorithm.
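The three partition types can be sketched as a mapping from a block position to a slice index. This is a minimal sketch, not the method of any particular codec. It assumes, based on the block indices given later for FIG. 3, that horizontal partition places the slices side by side (each slice covers a range of columns) and that vertical partition stacks them (each slice covers a range of rows). For the interleaved case it assumes units of `unit_rows` rows of blocks that alternate between slices. All names are hypothetical.

```python
def slice_index(row, col, rows, cols, num_slices=2,
                partition="horizontal", unit_rows=1):
    """Return the slice that block (row, col) belongs to."""
    if partition == "horizontal":
        # Assumed: slices sit side by side, each owning a band of columns.
        return col * num_slices // cols
    if partition == "vertical":
        # Assumed: slices are stacked, each owning a band of rows.
        return row * num_slices // rows
    if partition == "interleaved":
        # Assumed: units of `unit_rows` rows alternate between the slices.
        return (row // unit_rows) % num_slices
    raise ValueError(f"unknown partition: {partition}")


def slice_map(rows, cols, **kwargs):
    """Slice index of every block in a rows x cols frame."""
    return [[slice_index(r, c, rows, cols, **kwargs) for c in range(cols)]
            for r in range(rows)]


for kind in ("horizontal", "vertical", "interleaved"):
    print(kind, slice_map(4, 4, partition=kind))
```

Because the mapping is a pure function of the block position, an encoder and a decoder that share it agree on the slice of every block, which is the requirement stated above for interleaving partition.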
In the conventional method, the encoder compresses each slice of the image frame to generate a bitstream, regardless of whether the partition is vertical or horizontal. The bitstreams from the slices may be packed in sequence from the first slice to the last slice, or they may be packed into interleaved segments.
In the case of non-interleaved streams, the bitstreams are received in sequence from the first slice to the last slice, and no part of the stream of the current slice is received before the stream of the previous slice finishes. Regardless of whether the image frame is divided into multiple slices by vertical or horizontal partition, the slices have to be decoded one by one, so the decoding of the next slice starts only after the current slice is finished. The throughput is therefore limited, because the multiple slices cannot be decoded in parallel by a multi-core decoder without increasing the frame buffer size. With non-interleaved streams, a larger frame buffer is required to decode the multiple slices in parallel and obtain high throughput.
Due to the cost associated with the frame buffer, it is important to avoid the need for a large frame buffer when decoding multiple slices in parallel. Therefore, it is preferred to pack the bitstreams from the multiple slices into interleaved segments. FIG. 2A illustrates an example of the streams generated by compressing an image frame which is split into slice 0 and slice 1. The compressor encodes the image frame to generate a slice 0 stream and a slice 1 stream. The two streams are buffered into fixed size packets to form interleaved segments, with each packet storing one segment. In the conventional approach, one or more stream buffers are used to store the complete slice 0 stream and slice 1 stream as shown in FIG. 2A. If an interleaved stream is desired, the slice 0 stream and the slice 1 stream are further segmented into smaller slice 0 packets and slice 1 packets. The slice 0 packets and slice 1 packets are then interleaved to form the desired interleaved stream.
Among the interleaved segments, at least one segment of one slice stream is inserted into another slice stream. FIGS. 2B to 2D show three examples of packing the streams into interleaved segments. Each fixed size packet contains one segment of the slice 0 stream or one segment of the slice 1 stream. In the example shown in FIG. 2B, each segment of one slice is interleaved with one or two segments of another slice. As shown in FIG. 2B, packet 212 stores one segment of slice 1, which is interleaved with two segments of slice 0 stored in packets 211 and 213. In the example shown in FIG. 2C, the buffer stores one or two segments of one slice and then buffers one or two segments of another slice in the following packet(s). All the packets, such as packets 221 and 222, have a fixed size. The buffer may also store the data streams in another pattern as shown in FIG. 2D. The first fixed size packet stores one segment of the slice 0 stream and the last fixed size packet stores the last segment of the slice 0 stream. Among the other fixed size packets, each pair of packets is used to store two segments of slice 0 or slice 1. As shown in FIG. 2D, packet 231 stores one segment of slice 0. Packets 232 and 233 are used to keep two segments of slice 1. The next two packets are filled with segments of slice 0.
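The simplest of these patterns, in which single segments of the two slices alternate as in FIG. 2B, can be sketched as follows. This is only an illustrative sketch: the function names, the zero padding of the last segment, and the slice index returned alongside each packet are assumptions made for readability and are not taken from the disclosure.

```python
from itertools import zip_longest


def split_into_segments(stream: bytes, packet_size: int):
    """Cut one slice stream into segments of at most packet_size bytes."""
    return [stream[i:i + packet_size]
            for i in range(0, len(stream), packet_size)]


def interleave_packets(slice_streams, packet_size, pad=b"\x00"):
    """Alternate one fixed size packet per slice until all streams are drained.

    Returns a list of (slice_id, packet) pairs. The slice_id is kept only
    to make the pattern visible (assumption); short segments are padded
    to packet_size with `pad` (assumption).
    """
    per_slice = [split_into_segments(s, packet_size) for s in slice_streams]
    packets = []
    for group in zip_longest(*per_slice):        # one round over all slices
        for slice_id, segment in enumerate(group):
            if segment is not None:              # this slice still has data
                packets.append((slice_id, segment.ljust(packet_size, pad)))
    return packets


packets = interleave_packets([b"A" * 10, b"B" * 6], packet_size=4)
print([slice_id for slice_id, _ in packets])
```

When one slice stream is longer than the other, the tail of the interleaved stream contains only packets of the longer stream. The paired patterns of FIG. 2C and FIG. 2D would differ only in how many consecutive packets each slice contributes per round.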
Within each slice, the image data is usually processed line by line or block by block in raster scan fashion. When the slice 0 and slice 1 streams are packed into interleaved segments, the stream data is received with the segments of the slice 0 stream interleaved with the segments of the slice 1 stream. When the image frame is divided into horizontal slices (i.e., horizontal partition), the multiple horizontal slices can be decoded in parallel by multiple de-compressors. In this case, multiple de-compressors can be used for decoding each scan line or each row of blocks of the image frame. However, for decoding each scan line based on vertical partition, only one de-compressor can be used, since slice 1 processing cannot be started until slice 0 is finished. Therefore, horizontal partition is preferred for providing higher decoding throughput on each scan line. Moreover, a larger reconstruction buffer may be required for vertical partition, since the slice 1 reconstructed data is not displayed immediately.
In conventional video slice encoding based on horizontal partition, the encoder completely encodes one slice and then starts to process the next slice. The image frame is encoded slice by slice, and in each slice the coding blocks are compressed row by row. FIG. 3 illustrates an example of the conventional video slice coding order based on horizontal partition. The image frame is coded based on 16 blocks ai,j, in which i represents the row number and j represents the column number. The 16 blocks are divided into two tiles, i.e., slice 0 and slice 1. The encoding of slice 0 starts from block a0,0 and follows the order shown by the arrows in slice 0. After finishing slice 0, the encoder processes slice 1 from block a0,2 to block a3,3, following the order illustrated by the arrows in slice 1. The slice coding order is different from the natural order of the display interface.
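The conventional coding order described for FIG. 3 can be written out and compared with a row by row display order over the whole frame. In this sketch a block ai,j is represented by the tuple (i, j); the function names are hypothetical, and the frame width is assumed to divide evenly among the slices.

```python
def slice_coding_order(rows, cols, num_slices=2):
    """Conventional order: finish one slice, row by row, then the next."""
    cols_per_slice = cols // num_slices   # assumes cols % num_slices == 0
    order = []
    for s in range(num_slices):
        first_col = s * cols_per_slice
        for i in range(rows):
            for j in range(first_col, first_col + cols_per_slice):
                order.append((i, j))
    return order


def display_order(rows, cols):
    """Display order: row by row across the whole frame."""
    return [(i, j) for i in range(rows) for j in range(cols)]


coding = slice_coding_order(4, 4)
display = display_order(4, 4)
print("coding :", coding)
print("display:", display)
print("a0,2 is coded at position", coding.index((0, 2)),
      "but displayed at position", display.index((0, 2)))
```

For the 4×4 example, block a0,2 is the third block in display order but is only encoded after all eight blocks of slice 0, which is the mismatch between the two orders.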
However, the tile-based coding order of horizontal slices is different from the natural order of the display interface. At the decoder side, the reconstructed image frame is displayed row by row across the whole frame. FIG. 4 illustrates an example of the display order. The reconstructed blocks are displayed in the order shown by the arrows. Because interleaved segments are required for delivery to the de-compressor side, the slice 0 stream has to be buffered in full until all the pixels in slice 0 are compressed by the encoder. Thus, the stream buffer has to be large enough to store the entire stream of slice 0.
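The buffering requirement can be illustrated with a small simulation. It is a simplified model under stated assumptions: every block produces exactly one segment, and the packer emits one segment of each slice per round, so it cannot emit anything until every slice has a segment waiting. The function name is hypothetical.

```python
from collections import deque


def peak_slice0_backlog(rows=4, cols=4, num_slices=2):
    """Largest number of slice 0 segments waiting in the stream buffer."""
    blocks_per_slice = rows * (cols // num_slices)
    # Conventional encoder: all segments of slice 0, then slice 1, and so on.
    # One segment per block is an assumption of this model.
    produced = [s for s in range(num_slices) for _ in range(blocks_per_slice)]

    queues = [deque() for _ in range(num_slices)]
    peak = 0
    for slice_id in produced:
        queues[slice_id].append(slice_id)
        peak = max(peak, len(queues[0]))
        # Emit one interleaved round while every slice has data waiting.
        while all(queues):
            for queue in queues:
                queue.popleft()
    return peak


print(peak_slice0_backlog(4, 4))  # backlog measured in segments
```

In this model the backlog reaches all eight segments of slice 0 for the 4×4 example, because the first slice 1 segment is produced only after slice 0 has been encoded completely.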
Therefore, it is desirable to develop a frame buffer compression technique in which the stream buffer size and/or the latency can be reduced without noticeable impact.