1. Field of the Invention
The present invention relates generally to the application of a video compression concept of reference frames and daughter frames to optimizing transmission and/or storage of video images and to other applications of highly parallel processing of image-related data.
2. Background Art
One of the most critical techniques influencing video RF performance is lossy compression of video imagery data, or payload, under conditions of a noisy RF-wireless communication channel. Such a channel is usually defined by external network protocols that operate under certain prioritization, factors that define available bandwidth, BA, for the specific channel, which, in general, changes in time. Also, such bandwidth can be defined by available RF Tx/Rx-equipment. Sometimes, only part of BA is used, denoted as Bu, or used bandwidth, giving in general: Bu ≤ BA
Typically, as below, we assume the equality sign in this equation, i.e., Bu=BA, without loss of generality.
Typically, the video Compression Ratio, or (CR), is a high number; e.g., (CR)>100:1, but it can also be (CR)>1000:1, or even (CR)>4000:1 (see Reference Nos. 1-2). This is because human visual perception is very tolerant of motion errors. In contrast, for static, or still, imagery, the relevant lossy compression ratio is much lower, up to 20:1 for almost lossless compression, such as perceptually lossless wavelet compression for medical imagery. This method is an enhanced version of Mallat's wavelet method, heavily supported by “supercomputer-in-a-box” image processing hardware, called PUMA (Processing and Ultra-Memory Access), as in Reference Nos. 3-5. In general, PUMA is hardware adapted from motion-error analysis, or graphic-IC hardware (see Reference Nos. 3-6). Such hardware is of the FPGA or ASIC type, and is in the form of a highly parallel chipset, such as 256 processors in parallel, with a low-level voltage supply. Such a very compact (2″×3″) chipset performs high-speed operations, but only of a specific type, such as simple algebraic operations: subtraction, addition, squaring, etc. These simple algebraic operations, however, are sometimes sufficient for a more complex function, such as image compression or pattern recognition, called ATR (Automatic Target Recognition) in the military. For example, if we base our pattern recognition on template matching, then PUMA will be ideally suited to such an operation.
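The template-matching example above can be illustrated by a minimal sketch, which uses only subtraction, squaring, and addition per pixel. The sum-of-squared-differences score and the function names `ssd` and `best_match` are our assumptions for illustration; they are not taken from the PUMA hardware or from the cited references.

```python
# Hypothetical sketch: template matching by sum of squared differences (SSD).
# Each pixel contributes one subtraction, one squaring, and one addition.

def ssd(image, template, top, left):
    """SSD between the template and the image window whose corner is (top, left)."""
    total = 0
    for r, row in enumerate(template):
        for c, t in enumerate(row):
            d = image[top + r][left + c] - t  # subtraction
            total += d * d                    # squaring, then addition
    return total


def best_match(image, template):
    """Return (top, left, score) of the window with the smallest SSD."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = None
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            score = ssd(image, template, top, left)
            if best is None or score < best[2]:
                best = (top, left, score)
    return best


image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
print(best_match(image, template))  # (1, 1, 0)
```

Every window position is scored independently of the others, which is why such an operation would map naturally onto many identical processors working in parallel.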
The above background is helpful for understanding the present invention in the context of so-called soft computing. The term soft computing has a number of meanings (see Reference No. 14), but we choose here the following definition: Soft Computing and Soft Communication, or SC2, are techniques based on statistical still/video imagery analysis in real time, which means that many identical operations must be performed at the same time, i.e., in a highly parallel manner; see Reference Nos. 1-8.
Two basic video compression techniques are of major interest for the purpose of this invention: MPEG and wavelet. Both are known in the literature. MPEG (Moving Picture Experts Group) has defined a motion-picture compression standard, in the form of MPEG-1, MPEG-2, MPEG-4, and MPEG-7, to mention the most successful standard options. All of them, but especially MPEG-1 and its more standardized version MPEG-2, are based on the concept of the so-called I-frame, which is a form of reference frame used as a base to generate related frames, sometimes called daughter frames. In the case of MPEG-1, there are specifically fourteen (14) daughter frames, so a total cycle has 15 frames: one I-frame and 14 related frames, called here R-frames. The essence of MPEG compression is that it is not necessary to send full information about all frames, since they are essentially similar over short periods of time, such as 0.5 sec, or 500 msec, which is the total time of the MPEG-frame cycle of 15 frames, adjusted to the American standard of “full-motion,” which is 30 frames per second, or 30 fps. Based on that is the so-called VGA-video standard, which assumes such image formats as 640×480 pixels, or 740×480 pixels, or others, and RGB colors at 24 bits per pixel, or 24 bpp, with 8 bits per color. Therefore, the total VGA bandwidth is, for example, (640×480)×(24)×(30)≈221 Megabits per second, or 221 Mbps, which is the payload bandwidth. In the wireless or wired network cases, however, we also have bits other than payload bits, such as header bits, or redundant bits necessary for bit-error correction, such as Forward-Bit-Error Correction, or others. In the network case, such as ATM (Asynchronous Transfer Mode), or others, those bits are organized into packets, in the form of so-called streaming video. Each packet consists of payload and header. For example, for the ATM network we have a 53-byte packet with a 5-byte header (1 byte is equivalent to 8 bits).
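The arithmetic in this paragraph can be checked with a short sketch. The function and constant names are ours, introduced for illustration only. The on-the-wire estimate counts only the 5-byte ATM header per 53-byte packet and ignores, as a simplifying assumption, any error-correction bits and adaptation-layer overhead.

```python
# Hypothetical sketch of the payload-bandwidth and packet-overhead arithmetic.

def raw_bandwidth_bps(width, height, bits_per_pixel, fps):
    """Uncompressed payload bandwidth in bits per second."""
    return width * height * bits_per_pixel * fps


ATM_PACKET_BYTES = 53   # total packet size given in the text
ATM_HEADER_BYTES = 5    # header size given in the text
ATM_PAYLOAD_BYTES = ATM_PACKET_BYTES - ATM_HEADER_BYTES  # 48 bytes

# VGA example from the text: 640x480 pixels, 24 bpp, 30 fps.
payload_bps = raw_bandwidth_bps(640, 480, 24, 30)
print(payload_bps)  # 221184000, i.e. about 221 Mbps

# Duration of one 15-frame MPEG cycle at 30 fps.
cycle_seconds = 15 / 30
print(cycle_seconds)  # 0.5

# Assumption: header overhead only; every packet carries 48 payload bytes.
wire_bps = payload_bps * ATM_PACKET_BYTES // ATM_PAYLOAD_BYTES
print(wire_bps)  # 244224000
```

Under these assumptions, the header alone raises the required channel rate by a factor of 53/48, or roughly 10 percent, before any redundant bits are added.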
The video payload bits can be heavily compressed, thus reducing the original, or raw, bandwidth, Bo, by the (CR)-factor. For example, for a VGA original (payload) bandwidth of 221 Mbps and (CR)=1000:1, the compressed bandwidth is Bo/(CR)=221 kbps.
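As a worked illustration, the compressed bandwidth obtained by dividing the raw bandwidth Bo by (CR) can be tabulated for the compression ratios mentioned earlier. The function name `compressed_bandwidth_bps` is an assumption for this sketch, and Bo is taken as the 640×480, 24 bpp, 30 fps VGA example.

```python
# Hypothetical sketch: compressed payload bandwidth for several compression ratios.

def compressed_bandwidth_bps(raw_bps, compression_ratio):
    """Raw bandwidth divided by the compression ratio (CR)."""
    return raw_bps / compression_ratio


raw_bps = 640 * 480 * 24 * 30  # VGA example: 221,184,000 bps

for cr in (100, 1000, 4000):
    kbps = compressed_bandwidth_bps(raw_bps, cr) / 1000
    print(f"(CR)={cr}:1 -> {kbps:.1f} kbps")
# (CR)=100:1 -> 2211.8 kbps
# (CR)=1000:1 -> 221.2 kbps
# (CR)=4000:1 -> 55.3 kbps
```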
The second type of compression considered important here is wavelet compression, typically applied to still imagery. Here, however, it can also be applied either as real-time video compression or as I-frame compression.
The basic contribution of relevant prior art is found in two U.S. patents, U.S. Pat. No. 6,167,155 and U.S. Pat. No. 6,137,912, both assigned to the assignee of this application. They present the concept of a meaningful, or significant, I-frame, called here an M-frame, which is introduced whenever needed, in contrast to conventional MPEG, where I-frames are introduced periodically, at equal 0.5 sec intervals. This patented concept introduces I-frames, called M-frames, at any time, whenever the motion error, defining the difference between an M-frame and a given daughter frame, exceeds a predetermined threshold value. Such a meaningful, or significant, M-frame automatically defines a cycle of daughter frames following this specific M-frame, until a new significant M-frame is introduced. We call this cycle of daughter frames “belonging” to the specific M-frame an “M-scene,” since these daughter frames correspond to the single M-frame because of their small difference from that M-frame. At some moment, however, after about 6 sec on average, some daughter frame will become sufficiently different from the M-frame that it will become a new M-frame, defining the next M-scene, etc. Such a situation can happen in a movie when indeed “a new scene” occurs, or because something significantly new arrived in the video frame of a surveillance camera, such as a new moving object, or because a camera scan projected a new scene, or because of sudden motion of a camera. The reasons for changing a scene can be many, but we always have a new M-frame which is “a leader” of its following daughter frames. Their number is not always fourteen (14), as in the case of conventional MPEG, but can be more or less than 14, and this number changes dynamically.
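The threshold rule described above can be sketched as follows. This is only an illustration under stated assumptions: the motion error is taken here as the mean squared pixel difference between the current M-frame and a candidate frame, which is our choice and not necessarily the measure used in the cited patents, and the names `motion_error` and `segment_into_m_scenes` are hypothetical.

```python
# Hypothetical sketch: splitting a frame sequence into M-scenes by thresholding
# the motion error between the current M-frame and each following frame.

def motion_error(frame_a, frame_b):
    """Mean squared pixel difference (an assumed error measure)."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            d = a - b
            total += d * d
            count += 1
    return total / count


def segment_into_m_scenes(frames, threshold):
    """Return a list of (m_frame_index, daughter_frame_indices) pairs."""
    scenes = []
    m_index = None
    for i, frame in enumerate(frames):
        if m_index is None or motion_error(frames[m_index], frame) > threshold:
            m_index = i              # this frame becomes a new M-frame
            scenes.append((i, []))
        else:
            scenes[-1][1].append(i)  # daughter frame of the current M-frame
    return scenes


frames = [
    [[10, 10], [10, 10]],  # frame 0: first M-frame
    [[10, 11], [10, 10]],  # frame 1: error 0.25 against frame 0
    [[11, 11], [10, 10]],  # frame 2: error 0.5 against frame 0
    [[50, 50], [50, 50]],  # frame 3: error 1600 against frame 0, new M-frame
    [[50, 51], [50, 50]],  # frame 4: error 0.25 against frame 3
]
print(segment_into_m_scenes(frames, threshold=4.0))
# [(0, [1, 2]), (3, [4])]
```

Note that each frame is compared with the current M-frame rather than with its immediate predecessor, so the number of daughter frames per M-scene is not fixed and depends on the content.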
For slow motion, or for high elevation surveillance, where the objects on the ground seem to move more slowly, such M-scenes will be “longer” than for fast motion, or full motion situations, such as action movies, which, indeed, require a high video rate, such as 30 fps, or even higher. Also, a new type of surveillance by low elevation UAVs (Unmanned Aerial Vehicles) will be a fast motion situation, where video scenes, or cycles, will be rather short. It seems logical that such motion-adapted MPEG compression, as defined in U.S. Pat. No. 6,167,155, naturally organizes streaming video frames into cycles, or “scenes,” each with a single higher-hierarchy “Meaningful” I-frame (or M-frame), and some number n, where n is an integer, n=0, 1, 2, . . . , of lower-hierarchy daughter frames, that can be organized as MPEG-1 frames: I0, B00, B00′, P0, B01, B01′, P1, etc.; see Reference No. 1 for more details. Other organizations are also acceptable. More important is that we have a new hierarchy: “one vs. many.” We can consider these special I-frames, or M-frames, as spatial events, and daughter frames as temporal events. Such a distinction is justified by the fact that, by definition, “spatial events,” or static/still images, present a “frozen” situation, defined by such features as high resolution and complex color, but also low motion, or no motion. In contrast, “temporal events” are characterized by high motion and high color but low resolution. Therefore, in addition to continuous storage of streaming video as in typical surveillance camera systems, it would be convenient to provide extra storage of M-frames, by applying some higher quality compression technique, such as wavelet, for example. Such storage contains high quality M-frame images, including high-resolution images obtained from mega-pixel cameras, if available. Such extra images can be used later for advanced ATR.