The present invention relates to a process and device for quantifying, analyzing, interpreting, enhancing and representing in computer generated image format, medical ultrasound and other video images. These functions can be divided into two categories, image processing and image analysis. Image processing functions include image enhancement and presentation. Image analysis functions include quantifying, analyzing and interpreting the characteristics of the enhanced or unenhanced image.
Images generated by medical ultrasound scanning devices present unique problems for image processing and analysis systems. Ultrasonic scanning devices use sound transducers to introduce high frequency sonic waves into the body, either through a hand-held device pressed against the body or through a specially designed transducer inserted into a body cavity such as a rectal cavity. Elements in the body reflect the sonic waves back to the transducer according to the reflection coefficients of the elements. The measured time between emission and detection of the sonic waves is proportional to the depth of the sonic wave reflecting element within the body. A visually projectable image of the reflective elements in a plane of the body can be generated by assigning reflected wave signals a gray scale value in proportion to the reflected wave amplitude, passing the signals through a variable gain amplifier to compensate for attenuation losses as the wave reflecting elements increase in depth, and displaying the results in two dimensions. The two dimensional display corresponds to a plane in the body parallel to the direction of wave travel. Bodily elements in the display can be recognized by trained observers. The display can be a moving image by generating and displaying a series of repeated images on a video monitor. This process of generating and interpreting images using ultrasonic transducers and processing means is known as sonography.
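The relationship between echo time, depth, gain compensation and gray scale value described above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the speed of sound of 1540 m/s is a commonly used average for soft tissue and is not taken from this text, and the function names and the attenuation figure are hypothetical.

```python
# Assumed average speed of sound in soft tissue (not stated in the text).
SPEED_OF_SOUND_M_PER_S = 1540.0


def echo_depth_m(round_trip_time_s, speed=SPEED_OF_SOUND_M_PER_S):
    """Depth of a reflector: the wave travels down and back, so halve the path."""
    return speed * round_trip_time_s / 2.0


def compensate_gain(amplitude, depth_m, attenuation_db_per_m=50.0):
    """Boost an echo amplitude to offset round-trip attenuation at a given depth.

    attenuation_db_per_m is a hypothetical one-way loss figure.
    """
    loss_db = 2.0 * attenuation_db_per_m * depth_m  # loss on the way down and back
    return amplitude * 10.0 ** (loss_db / 20.0)


def to_gray(amplitude, max_amplitude, levels=256):
    """Map a compensated amplitude linearly onto a gray scale value."""
    fraction = min(max(amplitude / max_amplitude, 0.0), 1.0)
    return round(fraction * (levels - 1))
```

Under these assumptions, an echo detected 100 microseconds after emission would correspond to a reflector about 7.7 cm deep, and deeper echoes receive proportionally more gain before being mapped to a gray scale value.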
The reflective characteristics of wave reflecting elements in the body are referred to in sonography as the "echogenicity" of that area of the body. A highly reflective element would appear bright in the image and is called "hyperechoic," while an element with low reflectivity would appear dark and is called "anechoic." The mixture of hyperechoic and anechoic features in a localized area is termed the "echoic texture" of that area. A uniform set of features with similar reflection coefficients is called "isoechoic." A non-uniform set of features with a broad mix of reflection coefficients, which would appear as a speckled pattern in the image, is called "hypoechoic."
The primary cause of the speckled pattern in the image is that sonic waves do not always follow a direct path to and from the transducer, but instead may be reflected off several curved or angular reflecting surfaces, causing small variations in the amplitude of the reflected wave. Since the displayed gray scale value of each "pixel" (picture element) is derived from the amplitude of the reflected wave, this variation produces speckle similar in appearance to snow in a standard television image. Although speckle is not random as is snow in a standard television image, the exact form of speckle in an ultrasound image is virtually impossible to predict because of the extraordinarily complex configuration of body tissues.
Speckle accounts for over 90% of the contents of many ultrasound images, and has been considered a major cause of the poor quality of those images. Because speckle clouds the image and resembles snow in a television image, it is treated as noise. However, from the explanation above, it can be seen that the characteristics of speckle directly relate to the physical and echoic structure of the tissue being scanned. Thus, existing methods that suppress speckle also suppress valuable information regarding the tissue.
For example, it has been found that the several regions of cancerous tumors of the prostate gland have fairly characteristic echoic textures during the various growth stages. This phenomenon is discussed in an article entitled "The Use of Transrectal Ultrasound in the Diagnosis, Guided Biopsy, Staging and Screening of Prostate Cancer," published in Volume 7, Number 4 of RadioGraphics, July, 1987. However, prior to the present invention, ultrasonic images were of a quality and resolution too poor for reliable diagnosis based upon echoic textures. Further, no procedure had been devised for the accurate quantification of echoic texture. Instead, diagnosis relied mainly on the experience of the operator.
Apart from methods for analyzing speckle, there are many existing devices and methods aimed at suppressing noise in video images. These devices and methods are primarily for use in standard television images in which noise is manifested as discrete light or dark random spots a few pixels in diameter. Most of the existing methods and devices are not specifically directed toward the unusual problems encountered in ultrasound images. In fact, these methods often suppress speckle information critical to interpreting and analyzing ultrasound images.
Television noise reduction systems may be broadly classified into two types. One type of system relies on the fact that television noise is generally random and, moreover, is short-lived at any given pixel in the picture. This type of system stores in a machine memory the gray scale value (and chrominance in the case of a color picture) of each pixel in the picture, over a period of time comprising several sequential pictures. The system then compares the gray scale value of each pixel with its gray scale value in immediately preceding and succeeding pictures. In the event the pixel is transmitting noise, then it is likely that the comparison will show an abrupt change in gray scale value in that pixel between adjacent pictures. If the comparison does indeed show a change in excess of a predetermined amount, then the system will alter the gray scale value of the aberrant pixel by an arithmetic or weighted average with its gray scale value in adjacent pictures. Examples of such approaches are described in U.S. Pat. Nos. 4,064,530 by Kaiser, et al.; 4,504,864 by Anastassiou, et al.; 4,539,594 by Illetschko; and 4,485,399 by Schulz, et al.
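The frame-comparison approach described above might look roughly like the following sketch. It is not the method of any of the cited patents; the function name, the threshold value, the choice of a three-frame window, and the rule that a pixel must differ from both its preceding and succeeding values are all assumptions made for illustration.

```python
def temporal_denoise(prev_frame, cur_frame, next_frame, threshold=40):
    """Average out pixels that jump abruptly relative to adjacent frames.

    Frames are equal-sized lists of rows of gray scale values (0-255).
    """
    out = []
    for prev_row, cur_row, next_row in zip(prev_frame, cur_frame, next_frame):
        row = []
        for p, c, n in zip(prev_row, cur_row, next_row):
            # Treat a pixel as noise only if it differs sharply from its value
            # in both the preceding and the succeeding frame (assumed rule).
            if abs(c - p) > threshold and abs(c - n) > threshold:
                row.append(round((p + c + n) / 3))  # arithmetic average
            else:
                row.append(c)
        out.append(row)
    return out
```

A weighted average could be substituted for the arithmetic average on the marked line without changing the structure of the comparison.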
A problem with these systems is that they depend on the picture content remaining stable for at least a period of several frames. Since standard television video uses 30 frames per second, the picture content must be stable for a tenth of a second or more, depending on the number of frames used in the frame comparison. For example, consider a picture content with a dark object moving from one side of the screen to the other against a light background over a period of one second. If the screen is the standard 512 by 512 pixels, then the border of the dark object will move at the rate of 512 pixels per second. If the system compares 4 frames and each frame appears for the standard 1/30 of a second, then the object will have moved a number of pixels during the four-frame reference period calculated as follows:

$$\frac{1}{30} \times 4 \times 512 \approx 68$$
The object border will be blurred over those 68 pixels because the system will erroneously average the gray scale values of those pixels while they are displaying the object, with the gray scale value of those pixels while they are displaying the background.
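The blur width in this example generalizes to other object speeds, frame counts and frame rates. The short sketch below simply restates the arithmetic; the function and parameter names are hypothetical.

```python
def blur_width_pixels(speed_px_per_s, frames_compared, frame_rate_hz=30):
    """Pixels an object border travels during the frame-comparison window."""
    return speed_px_per_s * frames_compared / frame_rate_hz


# The example from the text: 512 pixels per second, 4 frames, 30 frames per second.
print(round(blur_width_pixels(512, 4)))  # 68
```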
Some systems attempt to avoid blurring the borders of objects moving in the picture by suspending the averaging process or altering the weighting of the average during periods of motion in the picture. Such a system is described in U.S. Pat. No. 4,242,705 by Ebihara. However, such approaches necessarily require a diminution in the noise reduction process during the period that the picture is showing motion.
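One way to picture the suspension of averaging during motion is the sketch below. It does not reproduce the Ebihara system; the frame-wide motion measure, the threshold and the blending weight are all assumptions chosen only to illustrate the trade-off.

```python
def motion_adaptive_average(prev_out, cur_frame, motion_threshold=30.0, still_weight=0.25):
    """Blend the current frame into a running average, except during motion.

    Motion is estimated here as the mean absolute change over the whole frame
    (an assumed, simplistic measure). When it exceeds the threshold, the
    current frame is passed through unfiltered, so no noise reduction occurs.
    """
    diffs = [
        abs(c - p)
        for prev_row, cur_row in zip(prev_out, cur_frame)
        for p, c in zip(prev_row, cur_row)
    ]
    motion = sum(diffs) / len(diffs)
    weight = 1.0 if motion > motion_threshold else still_weight
    return [
        [weight * c + (1.0 - weight) * p for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev_out, cur_frame)
    ]
```

In this sketch a weight of 1.0 disables averaging entirely, which shows why noise reduction is diminished whenever the picture is judged to be moving.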
A second approach of existing noise reduction systems relies on the fact that ordinary television noise is generally only a few pixels in diameter and has a high contrast with adjacent pixels. These systems seek to compare the gray scale value of each pixel with the gray scale value of adjacent pixels. If the difference is greater than a predetermined amount, then the system smooths the difference by adjusting the gray scale values of the pixels. The predetermined amount that triggers the adjustment may be varied depending on the overall or localized luminance of the picture, the degree of motion in the picture, viewer preference and other factors. For example, see U.S. Pat. No. 4,361,853 by Remy, et al.
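A neighbor-comparison smoother of this general kind could be sketched as follows. This is an illustration only and not the Remy system: the use of the four directly adjacent pixels, the replacement of an aberrant pixel by the neighbor mean, and the fixed threshold are assumptions.

```python
def spatial_smooth(image, threshold=40):
    """Smooth pixels that contrast sharply with the mean of their adjacent pixels.

    image is a list of rows of gray scale values; a new image is returned.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            # Gather the up, down, left and right neighbors that exist.
            neighbors = [
                image[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width
            ]
            if not neighbors:
                continue  # a 1 x 1 image has nothing to compare against
            mean = sum(neighbors) / len(neighbors)
            if abs(image[y][x] - mean) > threshold:
                out[y][x] = round(mean)
    return out
```

In a practical system the threshold would presumably be adjusted per region or per frame, in line with the factors listed above, rather than fixed as it is here.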
While these smoothing systems are effective in reducing noise, or at least in reducing high-contrast noise, they have several drawbacks. A serious drawback is that they reduce the contrast between actual objects in the picture as well as reducing the contrast of noise, since they are unable to distinguish between noise and objects. This produces a lack of sharpness in the picture which is particularly noticeable when the picture contains alphanumeric characters or other distinctive objects presented against contrasting backgrounds. A further drawback with smoothing routines is that they operate indiscriminately over the entire picture, regardless of the picture content. This indiscriminate operation requires long processing times and large memory capacities.