The advent of modern computer systems spurred the development of imaging systems. Imaging systems are used in such diverse fields as military intelligence photography, astronomy, geology, agriculture and medical diagnostic imaging. This invention, while finding utility in other and related imaging systems, was particularly developed for that branch of diagnostic imaging generally known as digital fluorography. Although the following discussion of the background and the following description of the invention deal mainly with medical diagnostic imaging systems, the same problems occur in other systems with large amounts of data, and therefore the invention is applicable to such systems, whether the data is arranged one-dimensionally, two-dimensionally or multi-dimensionally.
As computers progressed in speed and capacity, so did the amounts of data used per image. Most images are arranged in rectangular (or even square) matrices and their size can be specified by their matrix dimensions. In medical imaging, for example, image size has grown from 32×32, ten years ago, to 512×512 or even 1024×1024 today. This thousandfold increase in data amounts is faster than the rate of decrease in the price of memories of all types, and data amounts keep growing. These amounts of data raise a number of problems.
In digital fluorography as in other branches of diagnostic imaging the data storage space and the time for transferring data to and from the temporary stores of the computer itself are critical factors in the imaging system's efficient operation.
For example the amount of data that can be stored in the Random Access Memories (RAMs) of the computer systems used is extremely limited. The RAMs are expensive and therefore increasing the data capacity by increasing the capacity of RAMs is an expensive proposition. In addition with the RAMs there is the ever present danger of losing data, since the RAMs require power to maintain the data. This characteristic makes the RAMs expensive to operate since they are drains on the power system.
Therefore imaging systems generally store the data on memory systems such as magnetic tapes or disks as soon as possible. Such memories hold more data than the RAM type memories and the retention does not require power. One drawback of such memories is that it takes longer to load and unload disks or tapes than to transfer data to and from RAMs. Also, while the storage space in disks and tapes is much greater than that of RAMs, it is nonetheless limited.
Low resolution imaging systems are systems with low resolution requirements that can store data in analog form. Such systems can usually store data directly as it is acquired. Slow imaging systems are systems with low speed requirements; that is, they acquire data at a low rate. Such systems can usually store data directly as it is acquired, even digitally, and thus do not need large internal RAMs. Fast, high resolution imaging systems either use very fast and very expensive disks or first acquire the data and store it in the RAMs, from where the data are transferred to the external memory, for example, for long term storage. The long time period required for loading the disks makes it necessary either to use buffer memory devices such as disclosed in the patent application entitled "Buffer Memory Systems" filed in the United States on Apr. 21, 1983, bearing Ser. No. 487,312 and assigned to the assignee of this invention, or to use more RAMs, which as noted is expensive. Alternatively, continuing the data acquisition processes without storing all of the data results in exposing the patient to unnecessary radiation.
To increase the always limited storage capacity and to speed the transfer of data to the permanent storage systems, data compression systems have been used. Compression as used herein means transforming the data to reduce the size of the storage needed for the amount of data to be placed, either in the temporary, in the short term or in the long term storage. Storage size is measured by the total number of bits (binary digits) necessary to store the data in its current representation. The efficiency of a compression depends on the ratio of the necessary storage size before compression to the necessary storage after compression (compression ratio). For an example of systems for reducing the necessary storage size see the patent application entitled "Super Interlacing System" filed in the United States on Nov. 14, 1983, bearing Ser. No. 551,698 and assigned to the assignee of this invention.
Other compression methods used in imaging systems in the past include circle cutting, delta modulation, Huffman codes and string length coding.
A short description follows of two prior art compression methods, to aid in obtaining a better understanding of this invention:
a. String length coding replaces strings of identical values by the value followed by the length of the string (or the length and then the value). This is effective if the length needed to write the coded value is shorter than the length of the average string. For example, if information is usually coded in 4-bit units and the number zero appears in strings the maximum length of which is 20,000, then 15 bits are needed to make sure that the number 20,000 can be written (14 bits reach only 16,383). Where units of 4 bits are to be retained, 4 units have to be reserved in the code for storing the string length. The total code is therefore 5 units of 4 bits. The code is efficient if the average string length is greater than 5. Every number can be coded in this way, or only given ones that are expected to come in long strings.
b. "Replacement" coding (of which "Tree" codes and specifically the Huffman code are examples) replaces every number with a code value.
In general, the more common a value is in the image, the shorter is its coded value. The commonest values have coded values shorter than the uncoded size, while the least common values have perforce, coded values longer than the uncoded size.
The code is built using the statistics, i.e., the distribution of the data values, in such a way that the storage size needed for the code of the total image is less than the original storage size. This code is effective if built separately for each image, according to its specific statistics but loses effectiveness rapidly if the statistics change, e.g., if used for a different image with different statistics. A replacement code that is universally effective is virtually impossible, and a standard code that is used for a given range of statistics is usually not very efficient. We shall use the term "efficient replacement code" to denote a replacement code that uses the statistics to obtain a compression ratio that is optimal or close to optimal, as described above. We shall use the term "quasi efficient replacement code" to denote a replacement code replacing the commonest value by a code value other than the shortest, i.e., by a code value longer than would be applied using an efficient replacement code, but otherwise following the general rule above.
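The two prior art methods described above can be illustrated with a minimal Python sketch. It is not taken from any cited system: the function names (`string_length_encode`, `code_units_per_run`, `huffman_code`, `coded_size_bits`) and the sample data are hypothetical, and the Huffman construction shown is the textbook one, used here only as one example of an efficient replacement code.

```python
import heapq
from collections import Counter
from itertools import groupby


def string_length_encode(values):
    """Replace each string of identical values by a (value, length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def string_length_decode(pairs):
    """Expand (value, length) pairs back into the original sequence."""
    decoded = []
    for value, length in pairs:
        decoded.extend([value] * length)
    return decoded


def code_units_per_run(max_run_length, unit_bits):
    """Units per coded string: one for the value plus enough for the length."""
    length_bits = max_run_length.bit_length()
    length_units = -(-length_bits // unit_bits)  # ceiling division
    return 1 + length_units


def huffman_code(values):
    """Build a replacement code {value: bit string} from the value statistics."""
    counts = Counter(values)
    if len(counts) == 1:
        # Degenerate case: a single distinct value still gets a one-bit code.
        return {value: "0" for value in counts}
    # Heap entries are (total count, tie-breaker, {value: partial code}).
    heap = [(count, index, {value: ""})
            for index, (value, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_index = len(heap)
    while len(heap) > 1:
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in codes_a.items()}
        merged.update({v: "1" + c for v, c in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_index, merged))
        next_index += 1
    return heap[0][2]


def coded_size_bits(values, code):
    """Total storage size, in bits, of the values under the given code."""
    return sum(len(code[v]) for v in values)


# Hypothetical 4-bit data: long strings of zeros around a few other values.
pixels = [0] * 12 + [3, 3, 5, 7] + [0] * 8
pairs = string_length_encode(pixels)
assert string_length_decode(pairs) == pixels  # the coding is lossless

print(code_units_per_run(20000, 4))  # 5 units of 4 bits, as in the example above

code = huffman_code(pixels)
uncoded_bits = 4 * len(pixels)
print(uncoded_bits / coded_size_bits(pixels, code))  # compression ratio
```

In this sample the commonest value (zero) receives the shortest code and the rarest values receive the longest, which is the general rule stated above. A code built from one image's statistics would have to be rebuilt for an image with different statistics.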
Certain operations or mechanisms can be used to enhance the efficiency of compression. Consider, for example, the "difference" method, or the "delta" modulation compression method (the different names apply to essentially the same method and arise from the different fields where the method was applied, in digital and analog signal processing). There, instead of looking at the values, one looks at the differences between the values in adjacent elements (pixels, if performed on an image). As the objects being imaged rarely change much within the resolution of the imaging equipment, these differences between the pixels are usually much smaller than the pixel values themselves. In many cases, such as in computerized-tomography (CT) images units of fewer bits may be used to store the differences than are needed for storing the original values (e.g. 8 instead of 12). However, noise reduces the efficiency of the method.
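The difference method can be sketched as follows. This is an illustrative Python example under stated assumptions, not the implementation of any particular system: the function names and the sample row of 12-bit values are hypothetical, and the bit count assumes differences are stored as two's-complement signed integers.

```python
def to_differences(values):
    """Keep the first value, then store each value minus its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def from_differences(diffs):
    """Rebuild the original values by cumulative summation."""
    values = []
    total = 0
    for d in diffs:
        total += d
        values.append(total)
    return values


def signed_bits_needed(values):
    """Smallest two's-complement width that holds every value in the list."""
    bits = 1
    while any(not -(1 << (bits - 1)) <= v < (1 << (bits - 1)) for v in values):
        bits += 1
    return bits


# Hypothetical row of 12-bit pixel values that change slowly along the row.
row = [2000, 2010, 2005, 2030, 1990, 1985]
diffs = to_differences(row)
assert from_differences(diffs) == row  # no information is lost

# The first entry is still a full 12-bit value; the rest are small differences.
print(diffs[1:], signed_bits_needed(diffs[1:]))
```

Only the first element of each row needs the full word length; noisier data widens the spread of the differences and so raises the number of bits each one needs.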
A variation of the difference method deals specifically with regions where there is no data. These regions may still contain noise. The knowledge that there is no data may be utilized to advantage. For example, these no-data regions may be ignored; alternatively, they may be filled with a constant value to replace the noise, thus making all differences between neighboring elements equal to zero.
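A small Python sketch of the constant-fill variation follows. It assumes a mask marking which elements hold real data is already available; how such a mask is obtained is outside the text, and the names `fill_no_data` and `neighbor_differences` as well as the sample values are hypothetical.

```python
def fill_no_data(values, has_data, fill_value=0):
    """Replace elements in no-data regions by a constant, discarding their noise."""
    return [v if keep else fill_value for v, keep in zip(values, has_data)]


def neighbor_differences(values):
    """Differences between each element and the one before it."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical row: noise on both sides of a small region holding real data.
row = [3, 1, 2, 0, 120, 125, 123, 2, 1, 3]
mask = [False] * 4 + [True] * 3 + [False] * 3

print(neighbor_differences(row))                      # noisy, non-zero everywhere
print(neighbor_differences(fill_no_data(row, mask)))  # zeros inside no-data regions
```

After filling, the differences inside the no-data regions form long strings of zeros, which a string length code can then store compactly.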
A problem that has to be carefully treated when using data compression systems is the loss of accuracy that often occurs when operating on the data to compress it for storage and expand it for use.
Some compression methods take into account the fact that the data is accurate only to some limit by intentionally discarding information within the prescribed limit of accuracy. Any changes under that limit of accuracy are due mostly to noise and, if real, are masked by, or buried in, the noise. The term "noise" here refers to random changes due to many causes, which introduce inaccuracies in data values if taken separately. For example, count data, used in nuclear medicine, is accurate only to its own square root. That is, if the value measured is n, the "true" value has a probability P(m) to be m that is Gaussian, centered around n and spread with standard deviation equaling the square root of n; so that:
$$P(m) = \frac{1}{\sqrt{2\pi n}} \exp\left(-\frac{(m-n)^{2}}{2n}\right)$$
Alternatively, if n is the "true" value, P(m) is the probability of sampling (or measuring) the value m instead of n. If the count is 100 then the standard deviation is 10, therefore there is little information lost if the value 100 is stored as, say 99 or 101.
Some compression aids and enhancement methods, such as the "difference" method described above, are susceptible to noise, which reduces the efficiency of the method. However, any reduction of the aforementioned noise to increase the method's efficiency incurs some information loss. Actually, compression methods allowing "information loss" may be very efficient, provided they sufficiently reduce the noise.
In the above nuclear medicine example, if the expected range of values is from 0 to 255 then 8-bit units are needed to store the data. The noise has a standard deviation in the range of 0-16 and, for Gaussian noise, there is about 99.7% probability of the noise in a particular element being within 3 standard deviations. The average standard deviation of the noise in the image is the square root of the average value, or about a factor of the square root of two smaller than 16. Applying the difference method to the image implies doing subtractions. The subtraction operation increases the standard deviation by the same factor of the square root of two, making the average standard deviation of the noise in the differences equal 16. Therefore, about 99.7% of the differences would be in the range -48 to +48, even when there is no change in the object imaged. This range is already the same as 0-96 and requires 7-bit units, which is no great gain over the original 8-bit unit storage. Smoothing this image, while theoretically causing some loss of information, does not adversely affect the image as to reliability of information (it may even improve detectability). At the same time, by reducing the standard deviation of the noise by a factor of, say 3, it brings the changes to a range that may be stored in 5-bit units, thus improving compression efficiency.
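The arithmetic of this example can be checked with a short Python sketch. The function names are hypothetical, and two assumptions are made that go beyond the text: the average pixel value is taken as half the maximum, and a range spanning 3 standard deviations on each side is counted as that many levels, which is the rounding convention that reproduces the 7-bit and 5-bit figures quoted above.

```python
import math


def unit_bits_for_width(width):
    """Bits needed to number `width` levels (0 to width - 1)."""
    return max(1, (width - 1).bit_length())


def difference_unit_bits(max_value, smoothing_factor=1.0, n_sigma=3):
    """Unit size for differences of count data with square-root noise."""
    # Assumption: the average value is half the maximum value.
    mean_sigma = math.sqrt(max_value / 2)
    # Subtracting two noisy values multiplies the noise by sqrt(2);
    # smoothing is modelled as dividing the noise by `smoothing_factor`.
    diff_sigma = mean_sigma * math.sqrt(2) / smoothing_factor
    width = round(2 * n_sigma * diff_sigma)
    return unit_bits_for_width(width)


print(unit_bits_for_width(256))                       # original 0-255 data
print(difference_unit_bits(255))                      # differences, no smoothing
print(difference_unit_bits(255, smoothing_factor=3))  # differences after smoothing
```

Under these assumptions the three results are 8, 7 and 5 bits per unit, matching the figures in the example.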
However, while information loss may be acceptable in many imaging fields, in the medical diagnostic imaging field the physicians object to any information loss. Also, in some diagnostic imaging modalities such as in digital fluoroscopy noise levels are so low that not much is gained by "information loss", unless of a specific nature, having other advantages.
The prior art compression systems used in the past do not provide compression ratios in the range of 3 to 1 without a serious loss of accuracy when applied to, say, digital fluorographic images with 512×512 matrices. There are several uses that require such compression ratios with digital fluorographic images having 512×512 matrices.
Accordingly there is a serious and pressing need for efficient data compression methods and systems for use in imaging systems.