After images are acquired by an image sensor within a digital imaging system, the images are typically processed before display or storage on the device. A typical image processing chain or image processing pipeline, or IPP, is illustrated in FIG. 1. The example IPP shown in FIG. 1 includes an exposure and white balance module 2, a demosaic block 4, a color correction block 6, a gamma correction block 8, a color conversion block 10 and a downsampling module 12.
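The ordering of the blocks of FIG. 1 can be sketched in software as a chain of functions, each consuming the output of the previous one. The following is a minimal illustration only, not an implementation taken from any actual device. The function names, the RGGB Bayer layout, the unity gains, the identity color correction matrix, the gamma value of 2.2, the full-range BT.601 conversion coefficients and the downsampling factor of 2 are all assumptions chosen for the example. Samples are assumed to be floating-point values in the range 0 to 1, and the frame is assumed to have even dimensions.

```python
# Illustrative only: stage order follows FIG. 1; all numeric values are assumptions.

# Channel index (0=R, 1=G, 2=B) at each position of an assumed 2x2 RGGB Bayer cell.
BAYER_RGGB = ((0, 1), (1, 2))


def exposure_white_balance(raw, gains=(1.0, 1.0, 1.0)):
    """Module 2: scale each Bayer sample by the gain of its color channel."""
    return [[min(1.0, v * gains[BAYER_RGGB[y % 2][x % 2]])
             for x, v in enumerate(row)] for y, row in enumerate(raw)]


def demosaic(raw):
    """Block 4: crude demosaic; each 2x2 RGGB cell yields one shared RGB value."""
    out = [[None] * len(raw[0]) for _ in raw]
    for y in range(0, len(raw), 2):
        for x in range(0, len(raw[0]), 2):
            rgb = (raw[y][x],
                   (raw[y][x + 1] + raw[y + 1][x]) / 2.0,
                   raw[y + 1][x + 1])
            for dy in (0, 1):
                for dx in (0, 1):
                    out[y + dy][x + dx] = rgb
    return out


def color_correct(rgb, matrix=((1, 0, 0), (0, 1, 0), (0, 0, 1))):
    """Block 6: apply a 3x3 color correction matrix, clamping to [0, 1]."""
    return [[tuple(min(1.0, max(0.0, sum(m * c for m, c in zip(mrow, px))))
                   for mrow in matrix) for px in row] for row in rgb]


def gamma_correct(rgb, gamma=2.2):
    """Block 8: apply a simple power-law gamma curve."""
    return [[tuple(c ** (1.0 / gamma) for c in px) for px in row] for row in rgb]


def color_convert(rgb):
    """Block 10: RGB to YCbCr using full-range BT.601 coefficients."""
    def convert(px):
        r, g, b = px
        return (0.299 * r + 0.587 * g + 0.114 * b,
                0.5 - 0.168736 * r - 0.331264 * g + 0.5 * b,
                0.5 + 0.5 * r - 0.418688 * g - 0.081312 * b)
    return [[convert(px) for px in row] for row in rgb]


def downsample(frame, factor=2):
    """Module 12: keep every 'factor'-th pixel in each direction."""
    return [row[::factor] for row in frame[::factor]]


IPP_STAGES = (exposure_white_balance, demosaic, color_correct,
              gamma_correct, color_convert, downsample)


def run_ipp(raw):
    """Pass a raw Bayer frame through every stage in FIG. 1 order."""
    frame = raw
    for stage in IPP_STAGES:
        frame = stage(frame)
    return frame
```

In this sketch each stage materializes a complete intermediate frame before the next stage begins, which mirrors a software IPP in which image data passes through memory between stages.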
When it is desired to implement a real-time video imaging system, there are often significant constraints with such an IPP, because image data is typically read from memory at each stage of the IPP and then written back after some operations. For HD video, this repeated traffic places significant demands on memory bandwidth. Thus, it is desired to implement elements of the IPP directly in hardware embodiments in video acquisition devices. This would have the advantage that elements of the IPP avoid writing image data to memory after each stage of processing and reading the data back for each subsequent IPP operation. However, it implies that the methods applied at each stage of the IPP could be less adaptable, as the entire IPP chain would be configured prior to inputting data from a single image frame.
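The scale of the memory traffic can be estimated with simple arithmetic. The following sketch is a rough, order-of-magnitude illustration under stated assumptions, none of which are taken from a particular device: a 1920x1080 frame, 3 bytes per pixel at every stage, 30 frames per second, six IPP stages as in FIG. 1, and one full-frame read plus one full-frame write per stage. In practice raw sensor data and downsampled output are smaller than a full RGB frame, so the figure is an approximation rather than a measurement.

```python
def ipp_memory_traffic(width, height, bytes_per_pixel, fps, stages):
    """Return (bytes per frame, bytes per second moved through memory).

    Assumes every stage reads one full frame and writes one full frame.
    """
    frame_bytes = width * height * bytes_per_pixel
    bytes_per_second = 2 * frame_bytes * stages * fps  # read + write per stage
    return frame_bytes, bytes_per_second


# Assumed example values: full HD, 24-bit pixels, 30 fps, six stages.
frame_bytes, traffic = ipp_memory_traffic(1920, 1080, 3, 30, 6)
print(f"frame size: {frame_bytes / 1e6:.2f} MB")       # 6.22 MB
print(f"memory traffic: {traffic / 1e9:.2f} GB/s")     # 2.24 GB/s
```

Under these assumptions, the IPP alone moves on the order of 2 GB of image data per second through memory, before any scene analysis or compression is considered.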
Modern digital still cameras (DSC) implement more sophisticated image and scene analysis than can be provided by a basic IPP as illustrated with some example blocks at FIG. 1. In particular, image acquisition devices can detect and track face regions within an image scene (see U.S. Pat. Nos. 7,620,218, 7,460,695, 7,403,643, 7,466,866 and 7,315,631, and US published applications nos. 2009/0263022, 2010/0026833, 2008/0013798, 2009/0080713, 2009/0196466 and 2009/0303342 and U.S. Ser. Nos. 12/374,040 and 12/572,930, which are all assigned to the same assignee and hereby incorporated by reference), and these devices can analyze and detect blemishes and imperfections within such regions and correct such flaws on the fly (see the above and U.S. Pat. No. 7,565,030 and US published application no. 2009/0179998, incorporated by reference). Global imperfections such as dust blemishes or “pixies” can be detected and corrected (see, e.g., U.S. Ser. Nos. 12/710,271 and 12/558,227, and U.S. Pat. Nos. 7,206,461, 7,702,236, 7,295,233 and 7,551,800, which are all assigned to the same assignee and incorporated by reference). Facial enhancement can be applied. Image blur and image motion, translational and rotational, can be determined and compensated (see, e.g., U.S. Pat. No. 7,660,478 and US published applications nos. 2009/0303343, 2007/0296833, 2008/0309769, 2008/0231713 and 2007/0269108 and WO/2008/131438, which are all incorporated by reference). Facial regions can be recognized and associated with known persons (see, e.g., U.S. Pat. Nos. 7,567,068, 7,515,740 and 7,715,597 and US2010/0066822, US2008/0219517 and US2009/0238419 and U.S. Ser. No. 12/437,464, which are all incorporated by reference). All of these techniques and others (see, e.g., U.S. Pat. Nos. 6,407,777, 7,587,085, 7,599,577, 7,469,071, 7,336,821, 7,606,417 and 2009/0273685, 2007/0201725, 2008/0292193, 2008/0175481, 2008/0309770, 2009/0167893, 2009/0080796, 2009/0189998, 2009/0189997, 2009/0185753, 2009/0244296, 2009/0190803, 2009/0179999 and U.S. Ser. No. 12/636,647, which are assigned to the same assignee and hereby incorporated by reference) rely on an analysis of an image scene. Typically, this involves the reading of blocks of image data from a memory store followed by various processing stages of this data. Intermediate data structures may be stored temporarily within the image store to facilitate each scene analysis algorithm. In some cases, these data are specific to a single algorithm, while in others, data structures may persist across several different scene analysis algorithms. In these cases, image data is moved between image store memory and a CPU to perform various image processing operations. Where multiple algorithms are applied, image data is typically read several times to perform different image and scene processing operations on each image.
For most of the above techniques, analysis may involve a preview image stream, which is a stream of relatively low-resolution images captured by most digital cameras and used to provide a real-time display on the camera display. Thus, in order to properly analyze the main image scene, it is useful to have at least two images of substantially the same scene available. Where one or more preview images are also stored, these are also typically read on multiple occasions in combination with the main acquired (full resolution) image. In addition, processing may involve temporarily storing upsampled copies of preview images or downsampled copies of main acquired images to facilitate various scene analysis algorithms.
Within a digital camera, images are typically acquired individually and a substantial time interval, typically of the order of a second or more, is available between image acquisitions for scene analysis and post processing of individual images. Even where multiple images are acquired in close temporal proximity, e.g., in a burst mode of a professional DSC, only a limited number of images may be acquired due to limited memory. Furthermore, these images cannot be processed during the burst acquisition, and must often wait until it is completed before more sophisticated scene-based processing can be implemented.
Within a modern video appliance, data is often processed at frame rates of 30 fps or more, and due to memory constraints, the data is digitally compressed and written to a long-term memory store more or less immediately. Furthermore, a low-resolution preview stream is not generally available as in the case of a DSC. Finally, the requirements of handling a full-HD video stream place heavy demands on memory bandwidth within such an appliance.
In order to achieve, for an HD video acquisition device, the benefits of modern scene analysis techniques such as are presently available within a DSC, we can thus identify several key challenges. Firstly, it is difficult to store and perform complex scene analysis on a full HD frame within the time available between video frame acquisitions. This is not simply a matter of CPU power, but perhaps more importantly a matter of data bandwidth. The size of full HD images implies that it is very challenging simply to move such images through an IPP, into a video compression unit, and onto long-term storage. While some limited scene analysis may be possible through hardware additions to the IPP, this would likely involve many settings and configurations that are fixed prior to beginning real-time acquisition of the video stream, such that they would not be dynamically adaptable and responsive to ongoing scene analysis.
Secondly, there is no scope to share image processing data primitives between scene analysis algorithms without introducing very large shared memory buffers into the IPP. This would lead to hardware design requirements which are unreasonable and effectively mimic the existing state of the art, illustrated in FIG. 2, within a single IC. FIG. 2 illustrates conventional hardware to implement an IPP and other high level functions in software. A memory 14 is shown that includes an image and data cache 16 as well as a long term data store 18. The cache 16 can store raw data 20, RGB formatted data 22 and RGB processed data 24, while the long term data store 18 may hold MPEG images 26 and/or JPEG images 28. A sensor 32 communicates raw data to the memory 14 and to the IPP 34. The IPP 34 also receives data from the memory 14. The IPP 34 provides RGB data 22 to the memory 14, 16. RGB data 22, 24 is also retrieved by the CPU 36, which provides processed RGB data 24 to the memory 14 and RGB data 22 to a transcode module 38. The transcode module 38 provides data to and retrieves data from the memory 14, 18. The transcode module 38 also provides data to be shown on, e.g., an LCD/TFT display 40.
For various practical reasons, this does not provide an optimal image processing mechanism. An alternative is to have separate hardware implementations for each scene analysis algorithm, but this will also lead to very large hardware sizes, as each algorithm would buffer a full image frame in order to perform full scene analysis.
There are many additional engineering subtleties within each of these broad areas, but it is possible to identify a broadly scoped challenge wherein current scene analysis techniques and resulting image enhancement benefits are not sensibly applied to real-time video using current state-of-art techniques. An advantageous set of embodiments is therefore provided below.