The present invention relates to image encoding and decoding techniques, and more particularly, to a method and apparatus for generating a partial image from a larger compressed image (or a plurality of individual images).
Due to the limited bandwidth of transmission channels, there are a limited number of bits available for encoding image information, such as image information generated by a camera for transmission to one or more remote users. Thus, there are many image encoding techniques available which encode the image information with as few bits as possible using compression techniques, while still maintaining the quality and intelligibility that are required for a given application.
Remote cameras, such as those used for security applications, traffic monitoring or daycare monitoring, are typically panned by physically moving the camera. In addition to the possibility of a mechanical failure, the utility of such remote cameras is limited in that only one user can control the camera at a time. For multi-user applications, however, such limited user control of the camera view is not practical. A number of software techniques have been developed for permitting a number of users to view selected portions of a larger image (or a composite image generated from a plurality of individual images).
Permitting multiple selected views of a larger image, however, becomes more difficult if the larger image is compressed. Specifically, since image data following image compression is of variable length, pixel boundaries are not readily detectable in a compressed image. In addition, since many encoding techniques exhibit intra-frame pixel dependencies, such as encoding the difference values for adjacent DC coefficients under the JPEG standard, the pixel values must be modified when generating a selected portion of a larger image, to reflect the reordering of the subset of pixels in the selected image view.
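The DC-coefficient dependency described above can be illustrated with a minimal sketch. It assumes a simplified model in which each block's DC value is stored as the difference from the previous block's DC value, with the predictor starting at zero. The function names and sample values are hypothetical and serve only to illustrate the point.

```python
def dc_to_diffs(dc_values):
    """Differentially encode DC values (predictor starts at 0)."""
    diffs, prev = [], 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def diffs_to_dc(diffs):
    """Invert dc_to_diffs by accumulating the differences."""
    dc, prev = [], 0
    for d in diffs:
        prev += d
        dc.append(prev)
    return dc


# Hypothetical DC values for five consecutive blocks of a larger image.
full_dc = [100, 104, 98, 97, 110]
full_diffs = dc_to_diffs(full_dc)

# Naively copying the stored differences for blocks 2..3 decodes incorrectly,
# because the first copied difference is relative to a block that is no longer present.
naive_view = diffs_to_dc(full_diffs[2:4])

# The differences must be recomputed for the reordered subset of blocks.
correct_view = diffs_to_dc(dc_to_diffs(full_dc[2:4]))
```

In this sketch the naive copy decodes to values unrelated to the original blocks, while the recomputed differences reproduce them. This is why the stored values must be modified when a subset of blocks is extracted.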
Typically, when generating a selected portion of a larger compressed image, the larger image must be decompressed into the pixel domain, before the pixel values are reordered and assembled to create each of the selected image views. Thereafter, each of the selected image views is compressed to form the final images transmitted to each user.
The more popular image compression techniques, such as JPEG and MPEG, typically perform three steps to generate a compressed image, namely, (i) transformation, such as a discrete cosine transform (DCT); (ii) quantization; and (iii) run-length encoding (RLE). Likewise, to decompress images using these same image compression techniques, the receiver performs the inverse of each compression step on the compressed image, in reverse order, namely, (i) run-length decoding; (ii) dequantization; and (iii) inverse discrete cosine transform (IDCT).
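The three compression steps and their inverses can be sketched as follows. This is a deliberately simplified, one-dimensional toy and not the JPEG or MPEG algorithm: it uses a single 8-sample block, a uniform quantizer step q, and a plain (zero-run, value) run-length code, whereas the actual standards operate on two-dimensional 8x8 blocks and add zigzag ordering and entropy coding. All function names are hypothetical.

```python
import math

N = 8  # samples per block (a 1-D stand-in for an 8x8 block)


def _scale(k):
    # Orthonormal DCT-II scaling factor.
    return math.sqrt((1 if k == 0 else 2) / N)


def dct(x):
    return [_scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                            for n in range(N)) for k in range(N)]


def idct(X):
    return [sum(_scale(k) * X[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N)) for n in range(N)]


def quantize(X, q):
    return [round(c / q) for c in X]


def dequantize(Q, q):
    return [c * q for c in Q]


def rle_encode(Q):
    """Emit (zero_run, value) pairs; trailing zeros are implied by the block length."""
    pairs, run = [], 0
    for c in Q:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs


def rle_decode(pairs):
    out = []
    for run, value in pairs:
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * (N - len(out)))  # restore implied trailing zeros
    return out


def compress(x, q):
    return rle_encode(quantize(dct(x), q))        # (i) transform, (ii) quantize, (iii) RLE


def decompress(pairs, q):
    return idct(dequantize(rle_decode(pairs), q))  # inverses, in reverse order


pixels = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical sample values
code = compress(pixels, q=10)
restored = decompress(code, q=10)
```

The run-length stage is lossless, so the only loss in this sketch comes from quantization; the restored samples approximate the originals to within a bound set by q.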
Thus, to create N selected image views from a larger compressed image, conventional techniques require one image decompression, N pixel reorderings, and N compressions. In addition, the processing capacity and bandwidth required to support a unique view for each user in a multi-user application often exceed the capacities available with current technologies.
Generally, a selected image view generator for generating unlimited selected portions of a larger compressed image is disclosed. A virtually unlimited number of users is supported by generating selected image views that have a wider angle (in a panoramic view) or a larger area (for conventional images), than that requested by the user. The wider angle (or larger area) images are referred to as “inflated images.” Preferably, only the portion of the inflated image that was requested by the user is displayed.
The larger compressed image includes a plurality of macroblocks of image data, encoded using an intraframe encoding technique that encodes the macroblocks independently. The selected image view generator includes a device for storing the larger image; an input for receiving an indication of the selected image view from a user; and a processor configured to (i) identify an inflated image including the selected image view and one or more additional macroblocks of image data; (ii) identify the macroblocks included in the inflated image view; and (iii) assemble said identified macroblocks to form said inflated image view. In addition, the selected image view generator optionally includes an output for transmitting the inflated image view to a user. In one embodiment, the larger image is comprised of a plurality of predefined overlapping inflated images, and the processor identifies the inflated image that includes the selected image view. Another inflated image view is selected for the user when the selected image view is not supported by the current inflated image.
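The three processor steps listed above can be sketched as follows. The sketch assumes, purely for illustration, that the larger image is stored as a raster-ordered list of independently encoded macroblocks, that a view is a rectangle expressed in macroblock units, and that the inflated image is obtained by growing that rectangle by a fixed margin. The names inflate_view, macroblocks_in and assemble, and the margin parameter, are hypothetical.

```python
def inflate_view(view, margin, cols, rows):
    """Step (i): grow the selected view by `margin` macroblocks, clamped to the image.

    `view` is (col0, row0, col1, row1) with exclusive upper bounds.
    """
    c0, r0, c1, r1 = view
    return (max(0, c0 - margin), max(0, r0 - margin),
            min(cols, c1 + margin), min(rows, r1 + margin))


def macroblocks_in(region, cols):
    """Step (ii): raster-order indices of the macroblocks inside `region`."""
    c0, r0, c1, r1 = region
    return [r * cols + c for r in range(r0, r1) for c in range(c0, c1)]


def assemble(stored, indices):
    """Step (iii): copy the compressed macroblocks without decoding them."""
    return [stored[i] for i in indices]


# Hypothetical larger image: 6 x 4 macroblocks with opaque compressed payloads.
COLS, ROWS = 6, 4
stored_image = [f"mb{i}".encode() for i in range(COLS * ROWS)]

selected = (2, 1, 4, 3)                          # view requested by the user
inflated = inflate_view(selected, 1, COLS, ROWS)
indices = macroblocks_in(inflated, COLS)
inflated_view = assemble(stored_image, indices)
```

Because each macroblock is encoded independently, the assembly step only copies compressed data; no decompression into the pixel domain is needed in this sketch.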
According to an aspect of the invention, multiple users can simultaneously control a selected view received from an image source. The overall image may be comprised of one or more static or real-time images. The selected image from a larger overall image may be used, for example, with a 360° panning camera, to permit each user to select a desired view. For example, different viewers of a tennis match can watch different players from the same video feed.
The larger compressed image may be encoded using a suitable intra-frame macroblock-based image encoder, provided that each macroblock is encoded independently to ensure that the correlation between DC coefficients is restricted to within a given macroblock. Independent macroblocks may be achieved within the JPEG standard, for example, by initiating the Restart interval for each macroblock. Each macroblock optionally contains a macroblock identifier. Each macroblock identifier initially indicates the position of the macroblock in a given image. The macroblock identifiers are optionally renumbered by the selected image view generator for each inflated image view to indicate the position of the macroblock in the inflated image view. Likewise, the macroblock identifiers are optionally renumbered by the selected image view generator for the selected image view within the received inflated image to indicate the position of the macroblocks in the selected image view. In addition, each transmitted overall image and inflated image view optionally includes a frame header indicating the number of macroblocks in the transmitted image.
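The identifier renumbering and frame header described above can be sketched as follows. The Macroblock record and the dictionary-style frame header are hypothetical stand-ins, since no concrete layout is specified here. The dri_segment helper builds a JPEG Define Restart Interval segment (marker 0xFFDD, length 4, 16-bit interval); an interval of 1 places a restart marker after every minimum coded unit, under the assumption that one macroblock corresponds to one minimum coded unit.

```python
from dataclasses import dataclass


@dataclass
class Macroblock:
    ident: int      # position of the block in the image it currently belongs to
    payload: bytes  # independently decodable compressed data (opaque here)


def dri_segment(interval=1):
    """Build a JPEG DRI segment: marker 0xFFDD, length 4, 16-bit restart interval."""
    return b"\xff\xdd\x00\x04" + interval.to_bytes(2, "big")


def renumber(blocks):
    """Rewrite identifiers to 0..n-1 in view order and build a frame header.

    The header layout (a dict with a block count) is an assumption for illustration.
    """
    renumbered = [Macroblock(i, mb.payload) for i, mb in enumerate(blocks)]
    header = {"num_macroblocks": len(renumbered)}
    return header, renumbered


# Hypothetical overall image of 24 macroblocks, identified by raster position.
overall = [Macroblock(i, bytes([i])) for i in range(24)]

# Macroblocks picked for a view keep their payloads but receive new positions.
picked = [overall[i] for i in (7, 8, 13, 14)]
header, view = renumber(picked)
```

The same renumber step could be applied a second time at the receiver, when the selected image view is cut out of the received inflated image.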
An illustrative panoramic image can be horizontally partitioned, for example, to create a plurality of overlapping inflated images. Thus, if the larger image is partitioned into a plurality of predefined 65° inflated images which repeat every 20°, for example, and if a user selects a 45° image view, then the 65° inflated image containing the selected image view can be transmitted to the user. When the user selects an image view of the overall image that is not supported by the current inflated image, the next adjacent inflated image in the direction in which the user is panning is transmitted to the user. Thus, only a limited number of inflated images needs to be supported by the selected image view generator. A given inflated image may be multicast to each user for which the selected image is within the inflated image. By supporting a limited number of inflated images, each user can select an image view distinct from those of all other users without requiring a dedicated image transmission for each selected view.
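The selection of an inflated image in the example above can be sketched as follows, assuming a 360° panorama, so that 65° inflated images repeating every 20° yield 18 predefined images starting at 0°, 20°, ..., 340°. The function names and the user table are hypothetical. The fallback choice assumes the view is no wider than 45° (the inflated width minus the repeat step), which guarantees that the chosen inflated image contains the view.

```python
PANORAMA = 360   # degrees in the overall image (assumed full panorama)
WIDTH = 65       # width of each predefined inflated image
STEP = 20        # angular spacing between inflated image start angles
NUM = PANORAMA // STEP


def contains(index, view_start, view_width):
    """True if inflated image `index` fully covers the view (with wrap-around)."""
    offset = (view_start - index * STEP) % PANORAMA
    return offset + view_width <= WIDTH


def select_inflated(view_start, view_width, current=None, direction=+1):
    """Keep the current inflated image while it supports the view; otherwise
    move to the adjacent one in the panning direction (+1 right, -1 left)."""
    if current is not None:
        if contains(current, view_start, view_width):
            return current
        adjacent = (current + direction) % NUM
        if contains(adjacent, view_start, view_width):
            return adjacent
    # Fallback: the last inflated image starting at or before the view start.
    return int((view_start % PANORAMA) // STEP)


# Group hypothetical users by inflated image so each image is multicast once.
requests = {"alice": 30, "bob": 35, "carol": 200}   # view start angles in degrees
groups = {}
for user, start in requests.items():
    groups.setdefault(select_inflated(start, 45), []).append(user)
```

In this sketch, users whose 45° views fall inside the same 65° inflated image share one transmission, and a user only switches to a different inflated image after panning past the edge of the current one.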