1. Field of the Invention
The invention proposes a novel method of video encoding for video signals captured by wide angle cameras. Hereinafter, video signals captured by wide angle cameras are referred to as “wide angle video”. The proposed video encoder can produce bit streams which are compatible with a wide range of video coding standards, including the MPEG family of video compression standards.
2. Description of Prior Art
U.S. provisional patent application Ser. No. 60/467,588, entitled “Multiple View Processing in Wide-Angle Video Camera,” by Yavuz Ahiska (which is hereby incorporated by reference in its entirety) is an example of a camera system producing wide-angle video. Such camera systems are widely used in CCTV surveillance systems. Ordinary video encoding standards cannot effectively compress the video produced by such a camera system because a typical wide angle video contains not only regions of high interest but also large regions corresponding to sky, walls, floor, etc., which carry very little information. A standard video encoding system cannot give automatic emphasis to regions of interest (RoIs) and cannot assign more bits per area to RoIs than to non-RoI regions of the wide-angle video.
Image and video compression is widely used in Internet, CCTV, and DVD systems to reduce the amount of data for transmission or storage. With the advances in computer technology it is possible to compress digital video in real time. Recent image and video coding standards include the JPEG (Joint Photographic Experts Group) standard, JPEG 2000 (ISO/IEC International Standard 15444-1, 2000), and the MPEG (Moving Pictures Expert Group) family of video coding standards (MPEG-1, MPEG-2, MPEG-4). The above standards, except JPEG 2000, are based on the discrete cosine transform (DCT) and on Huffman encoding of the quantized DCT coefficients. They compress the video data by coarsely quantizing the high-frequency portions of the image and subsampling the color difference (chrominance) signals. After compression and decompression the high-frequency content of the image is generally reduced. The human visual system (HVS) is not very sensitive to modifications in color difference signals or in texture details, which contribute to the high-frequency content of the image. In the MPEG-1 and MPEG-2 standards the concept of RoI is not defined. These video coding methods do not give any emphasis to parts of the image which may be more interesting than the rest of the image. Only the MPEG-4 standard has the capability of handling regions of interest, but the boundary of each RoI has to be specified as side information in the encoded video bitstream. This leads to a complex and expensive video coding system. Even for simple boundary shapes such as rectangles and circles, the receiver has to produce a 1 bit/pixel RoI mask. The size of the RoI mask can be as large as the entire image. This may be a significant overhead in compressed wide-angle video, which may contain large RoIs. The present invention does not require any side information to encode RoIs.
The invention provides not only MPEG-1 and MPEG-2 compatible bitstreams but also MPEG-4 compatible bitstreams, which can be decoded by all MPEG-4 decoders. The recent JPEG 2000 standard, which is based on the wavelet transform and bit-plane encoding of the quantized wavelet coefficients, allows multiple resolutions of an encoded image to be extracted from a given JPEG 2000 compatible bitstream. It also provides Region-of-Interest (RoI) encoding, which is an important feature of JPEG 2000. This allows more bits to be allocated to an RoI than to the rest of the image during coding. In this way, essential information in an image, e.g., humans and moving objects, can be stored more precisely than sky, clouds, etc. However, JPEG 2000 is basically an image coding standard. It is not a video coding standard and it cannot take advantage of the temporal redundancy in video. In non-RoI portions of a wide-angle video there is generally very little motion. Therefore, pixels in a non-RoI portion of the image frame at time instant n are highly correlated with the corresponding pixels of the image frame at time instant n+1. The present invention has a differential encoding scheme for non-RoI portions of the video which drastically reduces the number of bits assigned to such regions containing very little semantic information.
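The benefit of differential encoding in nearly static regions can be illustrated with a minimal sketch. It is not the encoder of the invention: it works on raw pixel values rather than transform coefficients, uses no motion compensation, and the names `block_residual` and `nonzero_after_quantization`, the sample pixel values, and the step size of 8 are all assumptions chosen for illustration.

```python
def block_residual(prev_block, cur_block):
    """Difference between co-located pixels of frame n and frame n+1."""
    return [c - p for p, c in zip(prev_block, cur_block)]


def nonzero_after_quantization(values, step):
    """Count values that survive a uniform quantizer (truncation toward zero)."""
    return sum(1 for v in values if int(v / step) != 0)


# Hypothetical co-located "sky" pixels in two consecutive frames.
prev_sky = [200, 201, 199, 200, 202, 200, 201, 200]
cur_sky = [201, 200, 200, 199, 202, 201, 200, 200]
step = 8  # assumed quantizer step size

# Coding the pixels directly leaves every value nonzero after quantization.
direct = nonzero_after_quantization(cur_sky, step)

# Coding only the frame-to-frame difference leaves nothing to transmit.
residual = nonzero_after_quantization(block_residual(prev_sky, cur_sky), step)
```

Because the two frames are highly correlated, every residual value falls below the quantizer step, so the differentially coded block needs essentially no bits, whereas the directly coded block still carries eight nonzero values.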
Motion JPEG and Motion JPEG 2000 are video coding versions of the JPEG and JPEG 2000 image compression standards, respectively. In these methods the plurality of image frames forming the video are encoded as independent images. They are called intraframe encoders because the correlation between consecutive image frames is not exploited. The compression capability of Motion JPEG and Motion JPEG 2000 is not as high as that of the MPEG family of compression standards, in which some of the image frames are estimated from intraframe coded frames by taking advantage of the correlation between the image frames of the video. In addition, RoI coding in JPEG 2000 requires a boundary shape encoder at the encoder side and a shape decoder at the receiver. The boundary information is preferably transmitted to the receiver as side information. The decoder has to produce the RoI mask defining the coefficients needed for the reconstruction of the RoI (see Charilaos Christopoulos (editor), ISO/IEC JTC1/SC29/WG1 N988, JPEG 2000 Verification Model Version 2.0/2.1, Oct. 5, 1998) (which is hereby incorporated by reference in its entirety). This increases the computational complexity and memory requirements of the receiver. It is desirable to have a decoder that is as simple as possible. The present invention does not require any side information to be transmitted to the receiver to encode RoIs.
U.S. Pat. No. 6,757,434 by Miled and Chebil, entitled “Region-of-interest tracking method and device for wavelet-based video coding,” (which is hereby incorporated by reference in its entirety) describes an RoI tracking device for wavelet-based video coding. This system cannot be used in DCT-based video compression systems. The present invention, however, can be used in both DCT-based and wavelet-based video coding systems. Also, in U.S. Pat. No. 6,757,434 (which is hereby incorporated by reference in its entirety) the RoI information is provided to the receiver as side information.
Another problem with ordinary video encoders is that when there is a buffer overflow or transmission channel congestion, they uniformly increase the quantization levels over the entire image to reduce the number of transmitted bits. However, this may degrade or even lose very important information in the RoIs of wide angle surveillance videos. The present invention first increases the quantization levels in non-RoI regions of the video. If the channel congestion gets worse, it then discards the AC coefficients of the non-RoI blocks and represents them using only their DC coefficients. If this bit rate reduction is not enough, it increases the quantization levels of the RoI blocks as a last resort. In other words, essential information in the RoIs of the image is kept as accurate as possible in the case of a buffer overflow or channel congestion.
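The staged response to congestion described above can be sketched as follows. This is an illustrative outline only: the integer `congestion_level` scale (0 to 3), the `BlockPlan` type, the function name `plan_blocks`, and the step sizes 8 and 16 are assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class BlockPlan:
    quant_step: int  # quantizer step size; a larger step means coarser coding
    keep_ac: bool    # False means only the DC coefficient is coded


def plan_blocks(roi_mask, congestion_level, base_step=8, coarse_step=16):
    """Return one BlockPlan per block.

    roi_mask: iterable of booleans, True where the block belongs to an RoI.
    congestion_level (assumed scale): 0 none, 1 mild, 2 worse, 3 severe.
    """
    plans = []
    for is_roi in roi_mask:
        if is_roi:
            # RoI blocks are coarsened only as the last step.
            step = coarse_step if congestion_level >= 3 else base_step
            plans.append(BlockPlan(step, True))
        else:
            # Non-RoI blocks are coarsened first, then reduced to DC only.
            step = coarse_step if congestion_level >= 1 else base_step
            plans.append(BlockPlan(step, congestion_level < 2))
    return plans


# Example: one RoI block and one background block under worsening congestion.
for level in range(4):
    print(level, plan_blocks([True, False], level))
```

The ordering of the thresholds is the point of the sketch: the RoI block keeps its original step size until every cheaper reduction in the background has already been applied.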
In U.S. Pat. No. 6,763,068, entitled “Method and apparatus for selecting macroblock quantization parameters in a video encoder,” dated Jul. 13, 2004 (which is hereby incorporated by reference in its entirety) and U.S. patent application number 20030128756, entitled “Method and apparatus for selecting macroblock quantization parameters in a video encoder,” dated Jul. 10, 2003 (which is hereby incorporated by reference in its entirety), L. Oktem describes a system that adaptively adjusts the quantization parameters in RoIs. In RoIs the quantization parameter is reduced to represent the RoI accurately. However, these documents do not cancel the interframe prediction process in RoIs to further increase the representation quality of the RoIs in a video. Furthermore, the system is not designed for wide angle video.
A number of video compression methods and standards allow the quantization parameters to be varied during compression of different portions of the video image frames in order to achieve a target bit rate independent of the content of the source video frame sequence. The disclosed invention differs from those methods because the compression rate is varied according to the content of the video to increase the quality of representation of the RoI in the compressed data domain. A region of interest (RoI) detection algorithm analyzes the image content and allocates more bits to regions containing useful information by increasing the number of quantization parameters and canceling the interframe coding in RoIs. It is possible to allocate more bits to certain parts of the image than to others by changing the quantization rules.
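A content-driven allocation of this kind might be outlined as below. The RoI detection algorithm is not specified in this section, so the frame-difference detector `detect_roi_blocks` is purely a hypothetical stand-in. The function `coding_decisions`, the threshold of 10.0, and the step sizes 4 and 16 are likewise assumptions made for illustration.

```python
def detect_roi_blocks(prev_blocks, cur_blocks, threshold=10.0):
    """Toy RoI detector (an assumption, not the disclosed algorithm):
    flag a block when its mean absolute change since the previous
    frame exceeds `threshold`."""
    mask = []
    for prev, cur in zip(prev_blocks, cur_blocks):
        mean_abs_diff = sum(abs(c - p) for p, c in zip(prev, cur)) / len(cur)
        mask.append(mean_abs_diff > threshold)
    return mask


def coding_decisions(roi_mask, roi_step=4, background_step=16):
    """RoI blocks: finer quantizer and no interframe prediction (intra).
    Other blocks: coarser quantizer and interframe coding."""
    return [
        {"mode": "intra", "quant_step": roi_step}
        if is_roi
        else {"mode": "inter", "quant_step": background_step}
        for is_roi in roi_mask
    ]


# Hypothetical data: a static background block and a block with large changes.
prev_blocks = [[200, 200, 200, 200], [90, 95, 100, 105]]
cur_blocks = [[201, 199, 200, 200], [30, 160, 40, 170]]

mask = detect_roi_blocks(prev_blocks, cur_blocks)
decisions = coding_decisions(mask)
```

Here the second block changes strongly between frames, so it is flagged as an RoI and receives the finer quantizer with interframe coding cancelled, while the static block keeps interframe coding with the coarser quantizer. No target bit rate appears anywhere in the decision, which reflects the contrast drawn above with rate-driven quantization control.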