The present invention relates to the identification and characterization of boundaries in digital images and other digital data sets. More particularly, the invention relates to computer-implemented tools and techniques for detecting and characterizing boundaries such as the boundary between normal and abnormal tissue, between desired and undesired foliage, between original and airbrushed picture regions, and a wide variety of other boundaries.
The terms “edge detection” and “boundary detection” are used interchangeably herein to describe tools and techniques for locating boundaries of meaningful regions. A meaningful region is defined by relatively uniform color, gray level, texture, hue, or other characteristic(s). The edges between regions may be characterized by shape as delta edges, step edges, or crease edges, for example. Edges may also be characterized by magnitude, both as to the “height” of the intensity difference and as to the speed or acceleration of that change. Edges may also be defined in terms of various mixtures of these and/or other characteristics. The regions of interest may be separated spatially, temporally, or both.
At least the following tools and techniques are known, at least individually, for use in edge detection and/or image processing generally: gradient operators, such as Sobel operators; forward differencing; convolution; thresholding; histograms which graph the frequency of occurrence of intensity levels in an image; noise reduction; cluster analysis; spatial differentiation generally, and gradient computations using neighborhoods of variable size in particular; pattern matching for edge detection; edge detection in color images by thresholding the sum of the differences of each feature to produce a binary image indicating whether the total color difference was above the threshold, or in the alternative by computing the sum of binary images given from differentiation and thresholding of each color; tracking and other methods of linking edge elements into a longer contour; detection of multiple straight lines from a set of edge points with a clustering method such as the Hough transform; vectors; vector valued functions; metrics defined over function spaces; Gaussian and other distributions; and the use of analytic or numeric first and second derivatives to find edges.
It is also generally appreciated that boundary detection may be useful in many ways. Possible applications include robotic vision, medical image processing, military intelligence, satellite photo analysis, defect detection during manufacturing, and many others. However, the usefulness of a given tool or technique for boundary detection in a given context depends on many factors, not least of which are the reliability and sensitivity of the tool or technique in identifying and/or characterizing boundaries. Computational efficiency is also important, but it tends to become less of a limiting factor as computational devices grow increasingly powerful and less expensive.
Accordingly, it would be an advancement in the art to provide new tools and techniques for boundary detection. In particular, it would be useful to combine well-understood conventional tools and techniques in novel ways to detect and/or characterize boundaries which are not readily identified or analyzed using previously known approaches.
Such novel tools and techniques are described and claimed here.
The present invention provides improved tools and techniques for detecting and/or characterizing boundaries in digital data. One embodiment according to the invention, which is tailored for use with two-dimensional pixel image data and tailored in other ways as well, proceeds as follows.
A first image point p00 having coordinates (x0,y0) is selected. The coordinates (x0,y0) are in the domain of a function which maps spatial coordinates to pixel values. A first cluster c00 of sample image points distributed around point p00 is then chosen. The cluster c00 is in the range of a function which maps from {a spatial coordinate such as (x0,y0), a distribution such as Gaussian distribution, distribution parameters such as the variance, and the number of sample points to use} to a set of spatial coordinates which identifies the sample points in a cluster.
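By way of illustration only, the cluster selection described above might be sketched in Python as follows. The function name sample_cluster, its parameters, the use of an isotropic Gaussian parameterized by its standard deviation, and the policy of redrawing samples that fall outside the image are all assumptions made for this sketch and are not taken from the description.

```python
import random


def sample_cluster(center, sigma, n_points, width, height, rng=None):
    """Return n_points integer pixel coordinates scattered around center.

    Coordinates are drawn from an isotropic Gaussian centered on `center`
    with standard deviation `sigma`. Samples that land outside the image
    are discarded and redrawn (an assumption; clamping is another option).
    """
    rng = rng or random.Random()
    cx, cy = center
    points = []
    while len(points) < n_points:
        x = round(rng.gauss(cx, sigma))
        y = round(rng.gauss(cy, sigma))
        if 0 <= x < width and 0 <= y < height:
            points.append((x, y))
    return points


# Hypothetical example: cluster c00 around p00 = (20, 20) in a 64x64 image.
cluster_c00 = sample_cluster((20, 20), sigma=3.0, n_points=50,
                             width=64, height=64, rng=random.Random(0))
```

Passing a seeded random.Random instance makes the cluster reproducible, which is convenient when the same sampling must be repeated around a second point.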
The pixel values of the sample points in cluster c00 are determined, and are used to define a pixel value histogram which defines a first frequency function Fp00c00. The frequency function partitions or otherwise divides the range of possible pixel values. For instance, suppose the image is a grayscale image with pixel values in the range from 0 to 255. The partition could use 256 intervals, with each interval containing a single pixel value. Alternatively, the partition could include eight intervals, each of which contains thirty-two adjacent pixel values; a wide range of other interval definitions could also be used. The frequency function maps the intervals into a range of non-negative integers, with the integer for each interval representing the frequency of cluster sample points having a pixel value in that interval.
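A minimal sketch of such a frequency function is given below, assuming equal-width intervals over grayscale values 0 to 255 and an image stored as a list of rows. The name frequency_function, the row-major layout, and the small example image are assumptions for illustration only.

```python
def frequency_function(image, cluster, n_bins=8, max_value=255):
    """Count how many cluster points fall in each pixel-value interval.

    `image` is a list of rows (image[y][x]); `cluster` is a list of
    (x, y) coordinates. The value range 0..max_value is divided into
    n_bins equal-width intervals.
    """
    counts = [0] * n_bins
    for x, y in cluster:
        value = image[y][x]
        # Integer arithmetic maps value to its interval index 0..n_bins-1.
        counts[value * n_bins // (max_value + 1)] += 1
    return counts


# Hypothetical 4x2 grayscale image and a six-point cluster.
image = [
    [0, 31, 32, 255],
    [100, 100, 200, 200],
]
cluster = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (2, 1)]
freq = frequency_function(image, cluster)
```

With eight intervals over 0 to 255, each interval spans 32 adjacent pixel values, so the values 0 and 31 share the first interval while 32 falls in the second.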
A second point p10 having coordinates (x1,y0) is selected. In the simplest case, x1 equals x0 plus an increment dx, but the coordinate values xn may also be chosen according to a nonlinear function, chosen randomly, or chosen in some other manner. A second cluster c10 of sample points about p10 is determined using the same distribution function and variance used with the first cluster. The second cluster c10 defines a second frequency function Fp10c10 in a manner similar to that described above.
The distance between the two frequency functions, which may be denoted symbolically as ∥Fp10c10 − Fp00c00∥, is herein called a “density difference”. The density difference value depends on the points p10 and p00, on their respective sample point clusters c10 and c00, and on the metric used to measure the difference between the two frequency functions. One familiar metric which is suitable for use is defined as the square root of the integral over the partitioning intervals of the square of the absolute value of the difference between the two frequency function values on the corresponding intervals; this is the standard Lebesgue measure applied to frequency functions according to the invention. Other metrics may also be used; metrics generally and their properties such as positivity, symmetry, and satisfaction of the triangle inequality are familiar.
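For frequency functions represented as lists of per-interval counts, the square-root-of-summed-squares metric described above reduces to a Euclidean (L2-style) distance between the two lists. The following sketch assumes unit-width intervals, so the integral becomes a plain sum; the name density_difference is chosen here for illustration.

```python
import math


def density_difference(freq_a, freq_b):
    """Distance between two frequency functions over the same intervals.

    Computes the square root of the sum, over intervals, of the squared
    difference between the two counts (interval width assumed to be 1).
    """
    if len(freq_a) != len(freq_b):
        raise ValueError("frequency functions must use the same partition")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(freq_a, freq_b)))


# Hypothetical three-interval frequency functions.
d = density_difference([3, 0, 1], [0, 4, 1])
```

The result is zero exactly when the two histograms agree on every interval, and it is unchanged when the two arguments are swapped, consistent with the positivity and symmetry properties noted above.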
By way of analogy to numeric approximations of differentials obtained through techniques such as forward differencing, these density differences may be viewed as analogs of a differential or derivative of a frequency function. Given a two-dimensional data set such as an array of pixels, one may obtain density differences in both the X and Y directions; in N-dimensional spaces, one may obtain density differences in N directions. Using such frequency function partial derivative analogs, it is also possible to define a vector field whose elements are analogous to frequency function gradients. For convenience, these results are referred to hereafter simply as frequency derivatives and frequency gradients, respectively.
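The forward-difference analogy can be sketched end to end as follows. This is an illustrative sketch under several assumptions not stated in the description: the same Gaussian offsets are reused for every cluster center, coordinates outside the image are clamped to its border, eight equal-width intervals are used, and all function names are hypothetical.

```python
import math
import random


def cluster_offsets(sigma, n_points, seed=0):
    """Gaussian (dx, dy) offsets shared by every cluster (an assumption)."""
    rng = random.Random(seed)
    return [(round(rng.gauss(0, sigma)), round(rng.gauss(0, sigma)))
            for _ in range(n_points)]


def frequency_function(image, center, offsets, n_bins=8, max_value=255):
    """Histogram of pixel values sampled at center + each offset."""
    height, width = len(image), len(image[0])
    cx, cy = center
    counts = [0] * n_bins
    for ox, oy in offsets:
        x = min(max(cx + ox, 0), width - 1)   # clamp to image bounds
        y = min(max(cy + oy, 0), height - 1)
        counts[image[y][x] * n_bins // (max_value + 1)] += 1
    return counts


def density_difference(freq_a, freq_b):
    """Euclidean distance between two frequency functions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(freq_a, freq_b)))


def frequency_gradient(image, point, offsets, dx=1, dy=1):
    """Return (X, Y) frequency derivatives at point by forward differencing."""
    x, y = point
    f00 = frequency_function(image, (x, y), offsets)
    f10 = frequency_function(image, (x + dx, y), offsets)
    f01 = frequency_function(image, (x, y + dy), offsets)
    return density_difference(f10, f00), density_difference(f01, f00)


# Synthetic 32x32 image: dark left half, bright right half (vertical edge).
image = [[20 if x < 16 else 220 for x in range(32)] for _ in range(32)]
offsets = cluster_offsets(sigma=1.5, n_points=100)
flat = frequency_gradient(image, (5, 16), offsets)    # far from the edge
edge = frequency_gradient(image, (15, 16), offsets)   # straddling the edge
```

In this synthetic image the pixel value depends only on the X coordinate, so the Y component vanishes everywhere, while the X component is nonzero only where the clusters straddle the vertical edge.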
The behavior of frequency derivatives and/or frequency gradients can help indicate the presence and the nature of boundaries in the array of pixels or other underlying digital data set. For instance, zeros (or differences below a specified tolerance) in the frequency derivatives indicate there was relatively little change in the frequency function values, which in turn indicates relative uniformity of the corresponding pixel values. Alignment in gradient fields may also indicate boundaries.
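As a simple illustration of the tolerance test, the sketch below labels each frequency derivative in a sequence as indicating either a relatively uniform region or a boundary candidate. The function name, the labels, the tolerance value, and the example numbers are all hypothetical and chosen only to show the comparison.

```python
def classify_samples(frequency_derivatives, tolerance):
    """Label each derivative 'uniform' if it is zero or within tolerance,
    otherwise 'candidate' (a possible boundary location)."""
    return ["uniform" if d <= tolerance else "candidate"
            for d in frequency_derivatives]


# Made-up frequency derivatives along one row of sample points.
row = [0.0, 0.3, 0.0, 7.1, 6.8, 0.2, 0.0]
labels = classify_samples(row, tolerance=0.5)
```

Here the fourth and fifth entries exceed the tolerance and are flagged as boundary candidates; the remaining entries are treated as lying within a relatively uniform region.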
To help characterize boundaries, two or more frequency derivatives may be obtained and then compared. Each frequency derivative may use a different number of sample points in clusters, a different increment between cluster centers, or a different variance in the distribution of cluster points, for example. In the analysis of tissue changes, the same spatial coordinates and cluster points may be used while the pixel values change; the boundary may be viewed as temporal rather than spatial, but the invention may be used to advantage nonetheless. Regardless of whether the boundaries are spatial or temporal, the degree of correlation between the frequency derivatives may be empirically associated with particular boundary characterizations.
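One possible way to quantify the degree of correlation between two sets of frequency derivatives is the Pearson correlation coefficient, sketched below. The choice of Pearson correlation is an assumption of this sketch; the description does not prescribe a particular correlation measure. The two input sequences are made-up values standing in for derivatives obtained with two different cluster variances.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical frequency derivatives at the same five locations, obtained
# with a larger and a smaller cluster variance respectively.
coarse = [0.0, 0.5, 6.0, 0.4, 0.0]
fine = [0.0, 1.0, 12.0, 0.8, 0.0]
r = pearson_correlation(coarse, fine)
```

A coefficient near 1 indicates that the two frequency derivatives rise and fall together; how a given coefficient maps to a particular boundary characterization would have to be established empirically, as noted above.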
The invention provides a boundary detection and characterization mechanism in any space of interest. The invention is not limited to images and pixels, or to digital data which represents a single snapshot in time. Any digital data set which contains regions of relative uniformity may be analyzed, regardless of whether the data was obtained analytically, empirically, through survey or sampling, computationally, or otherwise.