The present invention is directed to methods and apparatus for rapid and efficient approximation of magnitude spectra, for uses such as speech and other pattern recognition and processing, as well as more generalized two-dimensional and three-dimensional Euclidean vector (magnitude and/or phase) approximation for coordinate transforms and other applications.
Magnitude spectra, including frequency domain intensity and power spectra, are useful for determining and processing the frequency components of a time or spatial domain signal, such as may be produced by Fourier or other complex number transformation of spatial and/or time-based data. Both the Fourier transformation and the magnitude spectrum calculation can cause a heavy computational load on a host microprocessor or digital signal processor (DSP). Fourier transformation is conventionally carried out by special or general purpose electronic computer or optical system in rectangular form, in which time-domain data is transformed to the frequency domain [x(t)]. The Fourier transform is a complex number transform producing an array of complex numbers each defined as Re+iIm, where Re is referred to as the real part of the frequency domain, i is the imaginary number, i={square root over (xe2x88x921)}, and Im is referred to as the imaginary part of the frequency domain, as follows:                     X        d            ⁡              (        k        )              =                  1        N            ⁢                                    ∑                          N              -              1                                            n            -            0                          ⁢                              x            ⁡                          (              n              )                                ⁢                      ⅇ                                          -                ⅈ2π                            ⁢                              xe2x80x83                            ⁢                              kn                /                N                                                          or                    X        d            ⁡              (        k        )              =                  1        N            ⁢                                    ∑                          N              -              1                                            n            -            0                          ⁢                              x            ⁡                          (              n              )                                ⁢                      (                                                            2                  ⁢                  π                  ⁢                                      xe2x80x83                                    ⁢                  kn                                N                            -                              ⅈsin                ⁢                                  xe2x80x83                                ⁢                                                      2                    ⁢                    π                    ⁢                                          xe2x80x83                                        ⁢                    kn                                    N                                                      )                              
Together, these two complex number parts form an array of complex-valued pairs [Re, Im], in which the complex quantities are typically represented as vectors in rectangular (Cartesian) coordinates. The rectangular form may be converted into a polar form also having a function of two parts, magnitude (M) and phase (xcex8), which can be represented as a rotating vector in polar coordinates. The magnitude component M is transformed from the real and imaginary parts of the rectangular form by calculating the square root of the sum of squares, such that M={square root over (Re2+Im2)}. The phase component xcex8 is similarly transformed as the arctangent of the imaginary part, Im, divided by the real part, Re, such that xcex8=Arctan (Im/Re). Magnitude calculation is also utilized in Euclidean distance determination and vector calculations, and generally in coordinate transformations of rectangular to polar, spherical and cylindrical coordinate systems. In such transformations, phase angle information is an important component of the complete coordinate transform, together with the magnitude or Euclidean distance.
Direct floating/fixed point implementation of the magnitude spectrum calculation is conventionally carried out in appropriately programmed general or special purpose computer systems. In such direct calculations, the real and imaginary parts of each spectral component may be squared and then summed, and finally xe2x80x9csquare rootedxe2x80x9d. However, direct calculation of the magnitude spectra is relatively slow, and computationally intense, for applications and systems requiring large data throughputs such as those for speech recognition and image analysis or generation. For example, in a typical general purpose computer chip such as a Pentium II(copyright) a squaring operation may take 3 times as long as any arithmetic or logical operation. On a Motorola M*Core(trademark) Risc microprocessor, 2 bits of a multiply are resolved per clock cycle, such that conventional 16 bit precision multiplication would typically be performed over 8 clock cycles, while an arithmetic or logical operation executes in a single clock cycle (xe2x80x9carithmeticxe2x80x9d as used herein means primitive register functions add/subtract/increment/decrement or the like and xe2x80x9clogicalxe2x80x9d means primitive logic operations not/or/and/shift_left/shift_right/xor or the like). Numerical squaring operations in general purpose or dedicated computer systems are computationally more time-consuming and require significantly higher hardware capability and capacity than addition operations. Squaring of data can produce relatively large numbers, requiring increased hardware system precision, and will require registers larger than would be needed for primitive addition or subtraction operations. Conventional square root extraction is significantly more computationally intensive for computer apparatus than multiplication (squaring) of numbers. Although a variety of computational methods may be used, square root determination using a conventional computer chip such as Pentium III(copyright) or the M*CORE(copyright) microprocessors may take 15 times as long as an arithmetic operation. In order to speed up the square root operation, direct computation may be replaced by table operation, in which pre-calculated, tableized, fixed-point data array is used to generate output. In this case a table of square roots is first generated for the anticipated range of the expression (Re2+Im2 ). Then the Re, Im number pair is used to address the values in the table, to select a precalculated result for that number pair. However, in order to provide significant accuracy for a range of Re, Im values, the table should be relatively large, which requires significant system memory. A 1024xc3x971024 table to produce 8 bit output precision may typically require about 1 MB of system memory, while 16 bit precision over the same table requires 2 MB memory. However if finer resolution is needed in the table, such as 12 bit by 12 bit lookup which is equivalent to a 4096 by 4096 table, 8 bit output precision requires 16 MB of memory, and 16 bit output precision requires about 32 MB of memory. A power series approximation, such as a Taylor expansion requiring numerous multiplications, may also be used to calculate magnitude spectra, but this is also computationally more time-consuming and hardware-intensive than simple addition and subtraction operations.
Methods and apparatus for effectively carrying out speech recognition and image processing or generation in xe2x80x9creal timexe2x80x9d typically require processing of large amounts of voice or image data, typically including magnitude transform determination. For example, U.S. Pat. No. 5,960,394 to Gould, et al. (Dragon Systems, Inc.), U.S. Pat. No. 5,749,066 to Nussbaum (Ericsson Messaging Systems Inc.), U.S. Pat. No. 5,890,103 to Carus (Lernout and Haupsie Speech Products N.C.), U.S. Pat. No. 5,640,485 to Ranta (Nokia Mobile Phones, Ltd.), U.S. Pat. No. 4,956,865 to Lennig, et al. (Northern Telecom Limited), U.S. Pat. No. 4,283,601 to Nakijima, et al. (Hitachi, Ltd.), U.S. Pat. No. 5,583,961 to Pawlewski, et al. (British Telecommunications), U.S. Pat. No. 5,465,318 to Sejnoba (Kurzweil Applied Intelligence, Inc.), U.S. Pat. No. 5,054,074 to Bakis (IBM) and the references cited therein (which are incorporated by reference herein), describe a wide variety of known speech recognition and speech processing systems which illustrate the computational intensity of such systems.
Improved computational equipment and methods would be desirable, particularly for low cost and portable systems which efficiently and effectively carry out magnitude transforms for such applications. Accordingly, particularly for portable, low power systems such as cellular telephones, and other hand held or portable devices and appliances which utilize voice and/or image processing, there is a need for efficient new, relatively inexpensive systems for approximating the magnitude of complex numbers.
It is an object of the present invention to provide methods and computational apparatus which can efficiently and effectively approximate magnitude spectra for voice recognition and similar uses.
Separate and alternative objects to provide methods and equipment for rapidly and efficiently approximating phase of complex numerical data, for transforming data from cartesian (rectilinear) coordinate system representation to polar, cylindrical or spherical coordinate systems representation, and/or for determining Euclidean distance between points in two or three dimensional cartesian coordinate space. These and other objects will be apparent from the following specification and the accompanying drawings.
In accordance with the present invention, magnitude spectra are efficiently determined by piecewise linear approximations of a quadratic function utilized for the approximation. The approximation of magnitude, or Euclidean distance, may also include the approximation of phase of the magnitude vector. Various aspects of the present invention can be implemented in either hardware or software, or both.
One aspect of the present invention is directed to methods for approximating the magnitude M of a number pair [Re, Im], which is defined as M={square root over (Re2+Im2)}. The number pair [Re, Im] can also be any Cartesian number pair for which it is desired to approximate the Euclidean distance or vector magnitude. A graphical representation showing the concave complexity of the magnitude M as a function of Re and Im is shown in FIG. 1, which is a perspective view of the function centered at Re=0, Im=0. The rotational symmetry may be used to simplify the approximation process. In order to approximate this magnitude function (or Euclidean distance) in accordance with the present invention, its symmetries are utilized to simplify the approximation, and to reduce its complexity. In this regard, a ratio of the Re and Im values may be selected which is always in the interval from 0 to 1. This can be done by defining a fraction xcex2 in which the largest absolute value of [Re, Im] is selected as the denominator, and the smallest absolute value of [Re, Im] is selected as the numerator. In this way, the symmetries are used to reduce complexity. In carrying out this step as defined herein, if Re and Im are equal, either may be selected as the maximum value Max, and as the minimum value, Min. When both Re and Im are zero, the magnitude, M, is zero.
A flow chart for an example of a magnitude calculation method is shown in FIG. 2 for processing arrays of Real and Imaginary input values 202. Typically, the Re, Im values will be stored in memory arrays representing a spectrum of values, such as those produced by a Fourier, Mellin, Laplace or other complex transform. The process illustrated in FIG. 2 may be repeated for each of the pairs of values in the arrays, until the magnitude of the entire spectrum of values in the arrays has been approximated. In this regard, as shown in FIG. 2, respective sets of Real number Re 204 and Imaginary numbers Im 206, are processed by absolute value determination to produce the respective absolute values 208, 210. Absolute value determination is relatively simple in binary computer equipment typically involving only dropping any negative number sign.
As shown in FIG. 2, the present disclosure also involves effectively selecting one or more intervals 212 for linear approximation. In this regard, an interval effectively determined by the value of the ratio xcex2=Min./Max is selected to determine which piecewise linear approximation of M to use. While the ratio xcex2 can be calculated conventionally, it is advantageous that intervals along xcex2 may be selected without such direct calculations saving the time and equipment requirements of such calculation. Intervals along the ratio xcex2 from 0 to 1 can be selected by binary (xe2x80x9cpower of 2xe2x80x9d) shifting and comparison.
Also in accordance with the example of FIG. 2, coefficients for a linear approximation over a curve substantially defined by the function {square root over (1+xcex22)}, where xcex2=Min/Max, or, xcex2=min[abs(Re, Im)]/max[abs(Re,Im)], are then selected in a range of xcex2 from 0 to 1, where xe2x80x9cMinxe2x80x9d and xe2x80x9cMaxxe2x80x9d respectively refer to selection of the minimum and maximum values from the specified list. These coefficients represent different piecewise approximations of the curve over each different interval along xcex2. The intervals may be equal or unequal in length along xcex2. Desirably, the intervals will be selected so that they may be readily determined in a binary number system, as will be described more fully in the following Detailed Description of the Invention. FIG. 3A is a graph showing as curve 302, the function {square root over (1+xcex22)}, with the abscissa 304 being the variable xcex2=Min/Max, and the ordinate 306 accordingly scaled to 1/Max. In FIG. 3A, two linear approximation intervals 308, 310 are selected. The first interval 308 extends from xcex2=0 to xcex2=0.5, while the second interval 310 extends from xcex2=0.5 to xcex2=1.0. FIG. 3A also includes an error plot showing the relatively high degree of accuracy of the 2-interval approximation. FIG. 3B, discussed subsequently, illustrates a 4-interval linear magnitude approximation, together with an error plot showing a higher accuracy than the 2-interval approximation of FIG. 3A.
A linear approximation of the curve {square root over (1+xcex22)} is established having a slope as a function of the minimum value, Min, over each respective interval. In this regard, a first linear approximation 312 may be selected which is preferably effectively a xe2x80x9cbest fitxe2x80x9d of the curve 302 along the first interval 308 which is defined by a slope S1 and an ordinate intercept, K1. A second linear approximation 314 may similarly be selected which is effectively a xe2x80x9cbest fitxe2x80x9d of the curve 302 along a second interval 310, where xcex2 ranges from 0.5 to 1.0. The second linear approximation 314 has a different slope S2, and a different ordinate intercept, K2. Preferably, the linear approximation will be selected to substantially minimize the difference between the actual value of {square root over (1+xcex22)}, and the linear approximation value, over the selected interval, and/or to minimize the calculation time. In this regard, extended precision of a xe2x80x9cprecisexe2x80x9d slope or intercept value may not warrant the increased number of additions/subtractions necessary to produce this accuracy. A xe2x80x9cleast squaresxe2x80x9d minimization is a preferred xe2x80x9cbest fitxe2x80x9d approach, but other error minimization approaches may also be used.
Because the ordinate of the curve 302 is expressed in units of 1/Max, and the abscissa xcex2 is expressed in units of Min/Max, having established the linear approximations in the respective selected intervals, both the ordinate and abscissa may be multiplied by Max to give the magnitude M directly. The specific linear approximation interval of xcex2 defined by the Min/Max pair ratio is then selected, and the linear approximation of the curve {square root over (1+xcex22)} using the selected linear approximation in the selected interval is calculated using the slope S defined predominantly by the value Min and using the intercept K, defined predominantly by the value of Max to obtain an approximation of the magnitude M directly. Having established a linear approximation of the curve {square root over (1+xcex22)} over a selected interval in xcex2, that linear approximation may be used to directly calculate the magnitude M according to the formula M=S*Min+K*Max (where xe2x80x9c*xe2x80x9d is an effective multiply symbol). For computational purposes, it may be desirable to separate the ordinate intercept K in two constants, or shift the computational origin, which is mathematically equivalent, but can have practical computational benefits. Accordingly, as shown on FIG. 2, the computational constants 214 are represented by alpha0, alpha 1 and alpha2, with M={square root over (Re2+Im2)}. The process can be repeated for each of the number pairs 202, to produce corresponding magnitude approximation outputs 216. Accordingly, as shown in FIG. 2, the appropriate coefficients alpha0, alpha1 and alpha2 for the linear approximation over the selected internal in xcex2 may be selected and used to rapidly and efficiently calculate an approximation of the magnitude M of the number pairs, Re, Im. This calculation can be carried out as a conventional multiplication, but has an important advantage that it may also readily and efficiently be carried out by a relative few, simple, one clock cycle binary shifts and addition/subtraction operations (hereinafter xe2x80x9csimple arithlogicxe2x80x9d operations), as will be described in more detail in the Detailed Description of the Invention.
The present invention is also generally directed to methods and apparatus for efficiently approximating the phase angle xcex8, or the Arctangent, of a number pair ratio, such as Im/Re. Like the calculation of the square root of the sum of squared numbers, the Arctangent of a number ratio is a nonlinear function which can vary widely (from zero to plus or minus infinity) over broad ranges of the ratio, and its exact calculation is time-consuming and equipment-intensive. A graphical illustration of the value xcex8=Arctan Im/Re centered at Re=0, Im=0, is shown in FIG. 6 where xcex8 is in radians, which illustrates in part the complexity of the function. However, by transformation of variables, and careful use of symmetry, rapid and efficient approximation can be carried out using linear approximation methods similar to those previously described for efficient magnitude approximations. In this regard, the trigonometric calculation of the Arctangent function can be simplified by transformation of variables to ratios xcex2 of [Re, Im] which are in the range of 0 to 1 by selecting the largest absolute value Max of the number pair, [Re, Im], as the denominator, and the smallest absolute value Min of the number pair Re, Im as the numerator of the fraction xcex2 for the arctangent calculation, where xcex2=min[abs(Re, Im)]/max[abs(Re,Im)], or xcex2=Min/Max. This limits the arctangent phase calculation to the 0-45xc2x0 zone of the Arctangent function, as shown in FIG. 7, where xcex2 is necessarily in the range 0 to 1, as previously discussed. For the zone 45xc2x0-90xc2x0, the trigonometric relationship Arctangent (Min/Max)=Arccotangent (Max/Min)=90xc2x0xe2x88x92Arctangent (Min/Max) may be utilized. Thus, by tracking the sign (+ or xe2x88x92) and relative size of the Re and Im variables, the full 360xc2x0 range of Arctangent (Im, Re) may be rapidly and efficiently approximated.
To approximate the phase xcex8, one or more linear approximation intervals may be selected over a curve substantially defined by the function Arctangent xcex2, where xcex2=Min/Max, or, xcex2=min[abs(Re, Im)]/max[abs(Re,Im)], in a range of xcex2 from 0 to 1. The intervals may be equal or unequal in length, as previously described. Desirably, the intervals will be selected so that they may be readily determined in a binary number system, as will be disclosed in more detail in the following Detailed Description of the Invention. In this regard, FIG. 8A is a graph showing as curve 802, the function Arctangent (xcex2) with the abscissa 804 being the variable xcex2=Min/Max, and the ordinate 806 scaled to either degrees, or radians, as may be desired over the range 0-45xc2x0, or 0-xcfx80/4 radians. In the example of FIG. 8A, two linear approximation intervals 808, 810 are selected. The first interval 808 extends from xcex2=0 to xcex2=0.5, while the second interval 810 extends from xcex2=0.5 to xcex2=1.0. A linear approximation having a slope as a function of the minimum value, Min, of the curve Arctangent xcex2 is established over each respective interval 808, 810. In this regard, a first linear approximation 812 may preferably be selected which is an effective xe2x80x9cbest fitxe2x80x9d of the curve 802 along the first interval 808 where xcex2 ranges from 0 to 0.5. The first linear approximation interval is defined by a slope S1 and an ordinate intercept K1. A second linear approximation 814 is selected which is effectively a xe2x80x9cbest fitxe2x80x9d of the curve 802 along the second interval 810 where xcex2 ranges from 0.5 to 1.0. The second linear approximation has a different slope S2, and a different ordinate intercept K2, as shown in FIG. 8A. Preferably, the linear approximations will be selected to substantially minimize the difference between the actual value of Arctangent xcex2, and the linear approximation value, over the selected interval, and/or to minimize the calculation time. Again, a substantially least-squares best fit may be made over the interval, but other best-fit approximations may also be used.
As discussed previously, the ordinate of the curve 802 is expressed in units of degrees or radians, and the abscissa xcex2 is expressed in units of xcex2=Min/Max. Having established the linear approximation in the selected intervals, both the ordinate and the abscissa may be multiplied by Max to simplify the slope and intercept calculation. The specific linear approximation interval of xcex2 defined by the Min/Max pair is selected, and the selected linear approximation 812 or 814 is calculated to obtain an approximation of the phase xcex8, according to the formula Max*xcex8=S*Min+K*Max, where S is a slope and K is an ordinate intercept constant. This calculation can be readily and efficiently carried out by simple binary shifts and addition/subtraction operations. In order to obtain the phase angle xcex8 value, the function Maxxc3x97xcex8 may be xe2x80x9cdividedxe2x80x9d by Max (the largest value of the Re, Im pair). Because divide operations are time intensive, such division may be more effectively carried out in binary form by multiplying by the binary inverse, 1/Max, in accordance with reciprocal multiplication practice. The value xcex8*Max/Max may accordingly be calculated by determining the reciprocal 2R/Max where R is the position of the binary decimal point (radix point), multiplying (xcex8*Max) by the reciprocal 1/Max, and, shifting right R binary positions. Partial binary inverse multiplication may be used to select intervals over the range xcex2, as previously discussed. Fixed integer multiplication may be carried out with a combination of conventional, simple, fast instructions such as SHL, ADD, SUB and LEA. This division approximation method may introduce rounding errors (which can be minimized or eliminated in accordance with conventional practice).
A four-interval linear approximation of the Arctangent phase function is shown in FIG. 8B, normalized to xcex2[0.00-1.0]. The intervals are 0-0.25, 0.25-0.5, 0.5-0.75, and 0.75-1.0, and the respective linear approximations so closely approximate the arctan function 860 that the difference 862 are difficult to represent in the FIGURE, as confirmed by the actual plot of difference shown in registration therewith below the arctan curve.
This invention accordingly allows for efficient calculation of magnitude (power) spectra by eliminating the need for conventional square root operation, and extensive conventional multiplies (which typically involve extended precision) used during the conventional implementation of a magnitude spectrum or coordinate transform calculation. In preferred embodiments of magnitude approximation, the removal of explicit multiplication also has a number of important advantages. Improved computational efficiency can be provided on some processors while the second is the lack of need for extended precision. A multiply of N bits by N bits generates a result of 2N bits which must be temporarily stored in a register, whereas preferred embodiments of the present invention generally do not need more than N+1 bits of precision at any stage of calculation. The present methods can be implemented in hardware or software, as indicated, and in either fixed or floating-point numeric formats.
In accordance with the present invention, power spectra can be rapidly and efficiently calculated for uses in portable speech recognition systems, and other uses which may particularly benefit from small, portable or low power equipment.