The invention relates to digital signal processing, and particularly to digital filtering.
Digital signals are often filtered to enhance signal components of certain characteristics, and to attenuate components of other characteristics. For example, in digital image processing, a “low-pass” filter may be used to pass coarse image features, and to attenuate fine detail, texture, and noise. A “high-pass” filter may be used to enhance object boundaries, and to attenuate regions of nearly uniform signal intensity. Since signal features can occur over a wide range of characteristics, it is useful to be able to adjust the response of a filter over a wide range.
Digital filters are often implemented by a method in which each signal value is replaced by a weighted average of the signal value and a set of neighboring signal values. This method is known as “convolution”, the set of weights arranged in a particular pattern is known as a “kernel”, and the weights themselves, which can be positive, zero, or negative, are known as “kernel elements”. Convolution is particularly important in digital image processing, where symmetrical, non-causal filters such as Gaussian and Laplacian approximations are common.
The size of a kernel is defined to be the size of the smallest region containing all of the kernel's non-zero elements. The ability to adjust the response of a filter over a wide range is largely dependent on the ability to adjust the size of the corresponding kernel over a wide range, because adjusting the weights without adjusting the size has a relatively small effect on the response.
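As an illustration of the convolution just described, the following is a minimal sketch in Python. The function name `convolve_same`, the choice to treat samples beyond the signal ends as zero, and the example kernel are assumptions made for illustration only.

```python
def convolve_same(signal, kernel):
    """Replace each sample by a weighted sum of itself and its neighbors.

    Assumes an odd-length kernel centered on the current sample, and treats
    samples outside the signal as zero (one of several possible edge rules).
    """
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0
        for j, weight in enumerate(kernel):
            idx = i + radius - j          # neighbor paired with kernel element j
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


# A small low-pass kernel spreads an isolated spike over its neighbors.
print(convolve_same([0, 0, 4, 0, 0], [0.25, 0.5, 0.25]))
```

The inner loop runs once per kernel element, which is why the cost of direct convolution grows with kernel size.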
In the general case, convolution is expensive (i.e., requires excessive computational resources), particularly as kernel size increases, and particularly for 2D signals such as images, where computation cost is proportional to the square of the kernel size. The expensive nature of convolution-style filtering can be ameliorated by several methods used separately or in combination.
One method commonly used with image processors that support small, fixed kernel sizes involves performing the convolution multiple times, in effect cascading the corresponding filters to produce the effect of a filter having a much larger kernel. Although this method saves little or no time for 1D signals, significant time is saved for signals having two and higher dimensions. In general, this method works well only for filters that are approximately Gaussian, and is still too expensive for many practical applications.
Another method (i.e., Burt's method as described in William M. Wells, III, Efficient Synthesis of Gaussian Filters by Cascaded Uniform Filters, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Vol. PAMI-8, No. 2 (March 1986) (“Wells”); P. J. Burt, “Fast, hierarchical correlations with Gaussian-like kernels,” Comput. Vision Lab., Univ. Maryland, Tech. Rep. TR-860, January 1980; and P. J. Burt, “Fast algorithms for estimating local image properties,” Comput. Vision, Graphics, Image Processing, vol. 21, pp. 368-382, March 1983, incorporated herein by reference) involves increasing the kernel's size without increasing the computational cost, by inserting zero elements between a fixed number of non-zero elements. With this method, although the computational cost involved is reduced, in practice if a kernel is expanded over more than a small range, the quality of the filter's output becomes unacceptable.
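The zero-insertion idea can be sketched as follows. This shows only the kernel expansion step, not the full hierarchical method of the cited references; the function name `insert_zeros` and the example kernel are hypothetical.

```python
def insert_zeros(kernel, zeros):
    """Enlarge a kernel by placing `zeros` zero elements between its elements."""
    expanded = []
    for i, weight in enumerate(kernel):
        if i > 0:
            expanded.extend([0] * zeros)
        expanded.append(weight)
    return expanded


small = [1, 2, 1]
large = insert_zeros(small, 2)
print(large)                                  # size grows from 3 to 7
print(sum(1 for w in large if w != 0))        # still 3 non-zero elements
```

Because only the non-zero elements need to be multiplied, the work per output sample stays the same while the kernel size grows; the gaps between non-zero elements are what degrade output quality as the expansion increases.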
Many methods for implementing multidimensional filters take advantage of the fact that many important filters (including Gaussian and Laplacian approximations) are separable, which means that each dimension of the input signal can be processed separately with a 1D filter that corresponds to the multi-dimensional filter. This reduces the computational cost problem to one of finding inexpensive methods for 1D filtering. The methods described below are based on separable filters.
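A minimal sketch of separable filtering, assuming a hypothetical 1D kernel [1, 2, 1] applied along rows and then along columns, with zero padding at the edges:

```python
def convolve_same(signal, kernel):
    """1D convolution with an odd-length centered kernel and zero padding."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0
        for j, weight in enumerate(kernel):
            idx = i + radius - j
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def separable_filter(image, kernel):
    """Filter a 2D image (list of rows) one dimension at a time."""
    rows = [convolve_same(row, kernel) for row in image]
    # zip(*rows) yields the columns; filter each, then transpose back.
    cols = [convolve_same(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


impulse = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
for row in separable_filter(impulse, [1, 2, 1]):
    print(row)
```

Filtering a single-pixel impulse reproduces the equivalent 2D kernel, which is the outer product of the 1D kernel with itself. Two 1D passes with a kernel of size k cost on the order of 2k operations per pixel, rather than k squared for the direct 2D convolution.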
John F. Canny, Finding Edges and Lines in Images, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Catalog Number AD-A130824 (June 1983) (“Canny”), incorporated herein by reference, describes a method in which a Gaussian filter is approximated by using recursive filters. The response of a recursive filter can be adjusted over a wide range without affecting the computational cost. Since recursive filters are causal and asymmetric, the Canny method applies them in forwards and backwards directions and sums the results to approximate the symmetric Gaussian. The Canny method also applies the filters twice to improve the quality of the approximation.
Methods for convolving a signal with a uniform kernel (also known as a “boxcar” kernel) have also been developed. The elements of a uniform kernel have uniform values within a region, and have values of zero outside this region. Uniform multidimensional kernels are separable, allowing for uniform 1D convolution that can be performed with a small, fixed number of operations where the small fixed number is independent of the size of the kernel.
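The forward-and-backward idea can be illustrated with a much simpler recursive filter than the one Canny describes. The sketch below assumes a first-order exponential smoother with a hypothetical parameter `alpha`; it is not Canny's filter and is applied only once per direction.

```python
def recursive_pass(signal, alpha):
    """Causal first-order recursive filter: y[i] = alpha*x[i] + (1-alpha)*y[i-1]."""
    out, previous = [], 0.0
    for x in signal:
        previous = alpha * x + (1.0 - alpha) * previous
        out.append(previous)
    return out


def forward_backward_sum(signal, alpha):
    """Run the causal filter in both directions and sum the two results."""
    forward = recursive_pass(signal, alpha)
    backward = recursive_pass(signal[::-1], alpha)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# Each pass alone is one-sided; the sum is symmetric about the impulse.
print(forward_backward_sum([0, 0, 1, 0, 0], alpha=0.5))
```

Changing `alpha` widens or narrows the response without changing the number of operations per sample, which is the property noted above.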
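One common way to obtain a cost independent of kernel size is a running sum, sketched below. The function name `boxcar_sum` and the choice to return only fully overlapping window positions are assumptions for illustration; the text above does not specify a particular implementation.

```python
def boxcar_sum(signal, width):
    """Sum of every window of `width` consecutive samples.

    Assumes 1 <= width <= len(signal). After the first window, each output
    costs one addition and one subtraction regardless of `width`.
    """
    acc = sum(signal[:width])       # one-time setup for the first window
    out = [acc]
    for i in range(width, len(signal)):
        acc += signal[i] - signal[i - width]   # slide the window by one sample
        out.append(acc)
    return out


print(boxcar_sum([1, 2, 3, 4, 5], 3))
```

Dividing each output by `width` would turn the sums into averages.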
Wells describes a method for approximating a Gaussian filter by repeated convolution with uniform kernels. According to Wells, a cascade of uniform filters approximates a Gaussian due to the central limit theorem. It is noted that the approximation improves as the number of repeated convolutions increases, and is good after three such convolutions.
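The effect of cascading can be seen by convolving a uniform kernel with itself. The sketch below assumes a uniform kernel of width 3 and uses hypothetical helper names; it shows the equivalent kernel after each stage rather than filtering a signal.

```python
def convolve_full(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def boxcar_cascade(width, stages):
    """Equivalent kernel of `stages` uniform filters of the given width."""
    boxcar = [1] * width
    kernel = boxcar
    for _ in range(stages - 1):
        kernel = convolve_full(kernel, boxcar)
    return kernel


for stages in (1, 2, 3):
    print(stages, boxcar_cascade(3, stages))
```

One stage is flat, two stages give a triangle, and three stages give a bell-shaped, piecewise parabolic kernel.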
A limitation of the prior art methods is that in order to reduce computational cost, considerable flexibility in choosing the shape of the filter is sacrificed. All of the above described methods in which computational cost is independent of kernel size involve Gaussian approximations or boxcar filters. Although Gaussian filters are important in image processing, and different sized Gaussian filters can be combined to approximate band-pass and high-pass filters, there are often practical reasons to prefer other shapes not readily possible using the prior art methods.
Another limitation is that even when Gaussian filtering is desired, the prior art methods are not well-suited to widely-available, inexpensive digital hardware, including specialized circuits and general purpose computers. Although some researchers have taken computational cost into account, typically this has been only in terms of arithmetic operations such as addition and multiplication. With practical digital hardware, particularly modern general purpose computers that have effectively single-clock-cycle multipliers, the number and pattern of memory accesses have a more significant effect on computational cost than the number of arithmetic operations.
For cost and performance reasons, a digital signal processor (including a general purpose computer being used for that purpose) typically provides at least three levels of memory. The lowest level is a small (typically fewer than 128 bytes) register file having an access computational cost of essentially zero. The next level is a medium-sized bank (typically a few tens of kilobytes) of static random access memory (“SRAM”) having an access computational cost about the same as the computational cost of an addition or multiplication, independent of the pattern of access. The final level is a large, dynamic random access memory (“DRAM”) in which source and destination images are held. Data is copied from DRAM to SRAM to be accessed by the computer's processor. The access computational cost for this copying is typically equivalent to the SRAM access computational cost for long sequential access patterns, but is much higher for short or non-sequential patterns.
The Wells and Canny methods are computationally costly because they generate a significant quantity of intermediate data that is held in memory. For the 1D case, this data does not fit in the register file, so an access cost penalty equivalent to the arithmetic computational cost of an addition or multiplication operation is paid for each piece of data read or written. For separable 2D filtering, data along one dimension (i.e., rows) is sequential in DRAM, while data along the other dimension (i.e., columns) is not sequential, and therefore is slow to gain access to in sequence. To achieve high performance, it is necessary to gain access to and filter many neighboring columns at once, but the intermediate data generated quickly fills all available SRAM and results in significant copying between SRAM and DRAM.
Moreover, the Canny method requires many arithmetic operations, resulting in unacceptably high computational costs in some applications.
High speed industrial guidance and inspection applications, among others, would benefit from higher performance digital filtering executed on inexpensive digital hardware, and would also benefit from more flexibility in specifying the shape of digital filter kernels.
The invention provides an apparatus, a method, and computer software residing on a computer-readable storage medium, for digitally processing a one-dimensional digital signal. The invention includes convolving the one-dimensional digital signal with a function that is the (n+1)th difference (discrete derivative) of an nth order discrete piecewise polynomial kernel so as to provide a second one-dimensional digital signal. Further according to the invention, ‘n’ is at least 1, the polynomial kernel has a plurality of non-zero elements, the function has a plurality of non-zero elements and at least one zero element, and the function has fewer non-zero elements than the polynomial kernel has non-zero elements. Next, discrete integration is performed n+1 times on the second one-dimensional digital signal, thereby providing a digitally processed one-dimensional signal.
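The two steps above can be sketched for the simplest case, n = 1, using a triangular (piecewise linear) kernel. All function names, the example kernel [1, 2, 3, 2, 1], and the zero-padded edge handling are assumptions made for illustration, not details taken from the description.

```python
def difference(kernel, times):
    """Take the first difference `times` times, padding with zeros at both ends."""
    for _ in range(times):
        padded = [0] + list(kernel) + [0]
        kernel = [padded[i + 1] - padded[i] for i in range(len(padded) - 1)]
    return kernel


def sparse_convolve(signal, function):
    """Full convolution that touches only the non-zero elements of `function`."""
    taps = [(j, w) for j, w in enumerate(function) if w != 0]
    out = [0] * (len(signal) + len(function) - 1)
    for i, x in enumerate(signal):
        for j, w in taps:
            out[i + j] += x * w
    return out


def integrate(signal, times):
    """Discrete integration (running sum), applied `times` times."""
    for _ in range(times):
        total, summed = 0, []
        for x in signal:
            total += x
            summed.append(total)
        signal = summed
    return signal


def direct_convolve(signal, kernel):
    """Reference: ordinary full convolution with the kernel itself."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, x in enumerate(signal):
        for j, w in enumerate(kernel):
            out[i + j] += x * w
    return out


n = 1
triangle = [1, 2, 3, 2, 1]                 # piecewise linear kernel
function = difference(triangle, n + 1)     # [1, 0, 0, -2, 0, 0, 1]
signal = [3, 1, 4, 1, 5]

fast = integrate(sparse_convolve(signal, function), n + 1)
direct = direct_convolve(signal, triangle)
print(function)
print(fast[:len(direct)] == direct)        # same result; `fast` has n+1 trailing zeros
```

In this example the function has three non-zero elements against five in the kernel. A wider triangle would have more non-zero kernel elements, while its second difference would still have three, with more zeros in between.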
In preferred embodiments of the invention, convolving includes performing computations that involve only the non-zero elements of the function; the polynomial kernel includes at least one selectable parameter, each selectable parameter determining a respective property of the polynomial kernel; a selectable parameter determines size of the polynomial kernel; ‘n’ is a selectable parameter of the polynomial kernel; a selectable parameter determines the number of pieces of the polynomial kernel; all of the elements of the function are integers; ‘n’ has value 2; the function includes ‘s’ zero elements between each pair of neighboring non-zero elements, where ‘s’ is a positive integer; there are four non-zero elements of the function, these elements having respective values +1m, −3m, +3m, −1m, where ‘m’ is a non-zero integer; for at least one non-zero element of the function there are at least two zero elements; the at least one selectable parameter is a positive integer ‘s’, and the function includes first, second, third, and fourth non-zero elements having values +m(s+1), −m(s+3), +m(s+3), −m(s+1) respectively, where ‘m’ is a non-zero integer, and the function further includes ‘s’ zero elements between the second and third non-zero elements.
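For the embodiment with n = 2 and non-zero elements +1m, −3m, +3m, −1m separated by s zeros, the underlying kernel can be recovered by integrating the function three times. The sketch below does this and compares the result with three cascaded uniform kernels of width s+1; the function names are hypothetical, and the comparison is an arithmetic check rather than a statement taken from the description.

```python
def difference_function(s, m=1):
    """Four non-zero elements +m, -3m, +3m, -m with `s` zeros between neighbors."""
    gap = [0] * s
    return [m] + gap + [-3 * m] + gap + [3 * m] + gap + [-m]


def integrate(signal, times):
    """Discrete integration (running sum), applied `times` times."""
    for _ in range(times):
        total, summed = 0, []
        for x in signal:
            total += x
            summed.append(total)
        signal = summed
    return signal


def boxcar_cubed(width):
    """Uniform kernel of the given width convolved with itself three times."""
    kernel = [1]
    for _ in range(3):
        out = [0] * (len(kernel) + width - 1)
        for i, x in enumerate(kernel):
            for j in range(width):
                out[i + j] += x
        kernel = out
    return kernel


s = 2
function = difference_function(s)
kernel = integrate(function, 3)
print(function)                              # 4 non-zero elements for any s
print(kernel)                                # piecewise parabolic, 3 trailing zeros
print(kernel[:3 * s + 1] == boxcar_cubed(s + 1))
```

Increasing s enlarges the recovered kernel while the function keeps exactly four non-zero elements, so the convolution step costs the same for every kernel size.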
The invention also provides an apparatus, method, and computer software on a storage medium for digitally processing a multi-dimensional digital signal, such as an image. The invention includes, for at least one dimension of the multi-dimensional digital signal, convolving a corresponding one-dimensional digital signal with a corresponding function that is the (n+1)th difference of a corresponding nth order discrete piecewise polynomial kernel so as to provide a second multi-dimensional digital signal. Further according to the invention, ‘n’ is at least 1, each polynomial kernel has a plurality of non-zero elements, each function has a plurality of non-zero elements and a plurality of zero elements, and each function has fewer non-zero elements than the polynomial kernel has non-zero elements. Then, for each convolved dimension, the second multi-dimensional digital signal is discretely integrated n+1 times, where n is the order of the polynomial kernel corresponding to the dimension.
In preferred embodiments of this general aspect of the invention, convolving includes performing computations that involve only the non-zero elements of the corresponding function; each polynomial kernel includes at least one selectable parameter, each selectable parameter determining a respective property of the polynomial kernel; a selectable parameter determines size of each polynomial kernel; ‘n’ is a selectable parameter of each polynomial kernel corresponding to each dimension; a selectable parameter determines a number of pieces of each polynomial kernel along each dimension; all of the elements of each function are integers; ‘n’ has value 2; each function includes ‘s’ zero elements between each pair of neighboring non-zero elements, where ‘s’ is a positive integer corresponding to each dimension; there are four non-zero elements of each function, these elements having respective values +1m, −3m, +3m, −1m, where ‘m’ is a non-zero integer corresponding to each dimension; for at least one non-zero element of each function there are at least two zero elements; and the at least one selectable parameter is a positive integer ‘s’, and each function includes first, second, third, and fourth non-zero elements having values +m(s+1), −m(s+3), +m(s+3), −m(s+1) respectively, where ‘m’ is a non-zero integer, and each function further includes ‘s’ zero elements between the second and third non-zero elements.
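A minimal 2D sketch of this aspect follows, assuming the same function is used for both dimensions (the description allows a different kernel per dimension). The function [1, 0, -2, 0, 1] is the second difference of the hypothetical triangular kernel [1, 2, 1], so n = 1; names and edge handling are illustrative assumptions.

```python
def filter_1d(signal, function, n):
    """Convolve with the sparse difference function, then integrate n+1 times."""
    taps = [(j, w) for j, w in enumerate(function) if w != 0]
    out = [0] * (len(signal) + len(function) - 1)
    for i, x in enumerate(signal):
        for j, w in taps:
            out[i + j] += x * w
    for _ in range(n + 1):
        total = 0
        for i, x in enumerate(out):
            total += x
            out[i] = total
    # When `function` is the (n+1)th difference of a finite kernel,
    # the last n+1 samples are zero and can be dropped.
    return out[:len(out) - (n + 1)]


def filter_2d(image, function, n):
    """Apply the 1D procedure along rows, then along columns."""
    rows = [filter_1d(row, function, n) for row in image]
    cols = [filter_1d(list(col), function, n) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


TRIANGLE_SECOND_DIFFERENCE = [1, 0, -2, 0, 1]   # second difference of [1, 2, 1]

for row in filter_2d([[1]], TRIANGLE_SECOND_DIFFERENCE, n=1):
    print(row)
```

Filtering a single-pixel image returns the equivalent 2D kernel, here the outer product of [1, 2, 1] with itself. The output is larger than the input because the sketch keeps the full convolution extent.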
Among the advantages of the invention are one or more of the following. The invention allows one dimensional digital filtering and separable multidimensional digital filtering. A variety of basic kernel shapes can be used. For any such basic kernel shape, the kernel size, and consequently the response of the filter, can be adjusted over a wide range at a computational cost that is substantially independent of the size of the kernel. The amount of intermediate data generated by the filter is small and is substantially independent of the size of the kernel, resulting in highly efficient use of conventional memory architectures. Gaussian approximations equivalent to those resulting from cascaded uniform filters can be produced, but at a computational cost that is lower when the memory access cost is properly accounted for.
Other advantages and features will become apparent from the following descriptions, and from the claims.