1. Field of the Invention
The current invention relates to digital signal processors (DSPs), and in particular, DSPs having multiple arithmetic logic units (ALUs).
2. Description of the Related Art
Conventional telephone keypads generate Dual Tone Multi Frequency (DTMF) signals when pressed. Pressing any particular key on a telephone keypad produces a unique combination of two tones, where a low tone represents the row of the key on the keypad and a high tone represents the column of the key on the keypad. The frequencies of the tones range from about 697 Hz to about 1633 Hz. Note that the 1633 Hz column frequency represents keys A, B, C, and D, which, although part of DTMF signaling, are absent from many conventional telephone keypads. Communication systems may also produce other tone combinations comprising the same or a different number of tones at different frequencies. Corresponding communication equipment, such as equipment at a telephone company's central office that connects with the telephone, may need to be able to detect the presence of particular tones in a sample of audio content from the telephone.
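The row/column encoding described above can be sketched as a small lookup. The text gives only the endpoints of the frequency range (about 697 Hz and about 1633 Hz); the intermediate values below are the standard DTMF frequencies and are an addition here, as are the names `ROW_HZ`, `COL_HZ`, `KEYPAD`, and `dtmf_frequencies`, which are purely illustrative.

```python
# Standard DTMF tone groups in Hz (intermediate values are not stated in the
# text above; they are the commonly published DTMF frequencies).
ROW_HZ = (697, 770, 852, 941)        # low tones, one per keypad row
COL_HZ = (1209, 1336, 1477, 1633)    # high tones, one per keypad column

# Keypad layout; the fourth column (A-D) is absent from many handsets.
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) tone pair in Hz for a single keypad symbol."""
    if len(key) != 1:
        raise ValueError(f"expected a single symbol, got {key!r}")
    for row, symbols in enumerate(KEYPAD):
        col = symbols.find(key)
        if col >= 0:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")
```

For example, `dtmf_frequencies("5")` yields the second row tone and the second column tone.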
Audio content in conventional modern telephone systems is usually digitized at a sampling rate of 8 kHz for processing and transmission by the telephone service provider(s). It should be noted that other sampling rates are possible. The digitized audio content is typically processed as data frames where each frame represents a window of time. Typical data frame lengths are 5 ms and 10 ms, which, at an 8 kHz sampling rate, are equivalent to 40 and 80 samples, respectively. A tone-detection decision is typically made once per frame. Several methods are known in the prior art for determining whether a frame contains audio content at a particular frequency.
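The frame-length arithmetic above (5 ms and 10 ms frames at 8 kHz giving 40 and 80 samples) can be checked with a one-line calculation. The helper name `samples_per_frame` is hypothetical and used only for illustration.

```python
def samples_per_frame(sample_rate_hz, frame_ms):
    """Number of samples in a frame of frame_ms milliseconds."""
    samples, remainder = divmod(sample_rate_hz * frame_ms, 1000)
    if remainder:
        # The frame would not contain a whole number of samples.
        raise ValueError("frame length is not a whole number of samples")
    return samples
```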
The power P(F) of the input signal at a given frequency F in an N-sample data frame can be determined using the formula of Equation (1) below:
$$P(F) = \left| \sum_{k=1}^{N} x_k \, e^{-j \frac{2\pi F}{F_s}(N-k)} \right|^2, \tag{1}$$
where $x_1, x_2, \ldots, x_N$ are the samples of the frame, $j$ is the square root of −1, and $F_s$ is the sampling frequency for the frame. The samples $x_1, x_2, \ldots, x_N$ represent voltage values of an electrical signal that represents corresponding sound pressure levels of an audio signal.
Once the value of power P(F) is determined for the particular frame, that value is compared to a threshold and the result of the comparison is used in determining whether a tone at the given frequency has been detected for that frame.
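As a minimal floating-point sketch, Equation (1) and the threshold comparison might be written as follows. The function names and the rule "power above threshold means detected" are assumptions made for illustration; the text does not specify the threshold value or the exact decision logic.

```python
import cmath
import math


def tone_power(frame, freq_hz, sample_rate_hz):
    """Evaluate Equation (1) directly: |sum_k x_k * exp(-j*w*(N-k))|^2."""
    n = len(frame)
    w = 2 * math.pi * freq_hz / sample_rate_hz
    # enumerate(..., start=1) makes k run from 1 to N, as in Equation (1).
    total = sum(x * cmath.exp(-1j * w * (n - k))
                for k, x in enumerate(frame, start=1))
    return abs(total) ** 2


def tone_detected(frame, freq_hz, sample_rate_hz, threshold):
    """Assumed decision rule: the tone is present if P(F) exceeds threshold."""
    return tone_power(frame, freq_hz, sample_rate_hz) > threshold
```

For a unit-amplitude sinusoid that completes a whole number of cycles in the frame, the power at the tone frequency is (N/2)^2, which gives a convenient sanity check.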
Power P(F) for an N-sample frame can be calculated iteratively or recursively using the algorithm of Equation (2) below:
$$\mathrm{Re}(0) = \mathrm{Im}(0) = 0$$
$$\mathrm{Re}(n) = x_n + \cos\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Re}(n-1) - \sin\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Im}(n-1), \text{ and} \tag{2.1}$$
$$\mathrm{Im}(n) = \sin\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Re}(n-1) + \cos\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Im}(n-1) \quad \text{for } n = 1, 2, \ldots, N \tag{2.2}$$
$$P(F) = \mathrm{Re}(N)^2 + \mathrm{Im}(N)^2 \tag{2.3}$$
Equation (2) calls for calculating values for Re(n) and Im(n) for each sample of the frame. The calculations for each sample are based on the values of $x_n$, Re(n−1), and Im(n−1) (where Re(0) and Im(0) are 0). Power P(F) is then calculated for the N-sample frame based on Re(N) and Im(N). It should be noted that, generally, a recursive calculation would involve implementing the calculation using a procedure that calls itself repeatedly (e.g., N times) until some condition is met, while an iterative calculation would involve implementing the calculation using a procedure that includes an explicit instruction loop that is executed a certain number (e.g., N) of times. Thus, to illustrate a recursive function, a recursive pseudo-code implementation for a factorial function could be factorial(n) where if n=0 then return 1 else return n·factorial(n−1). Similarly, to illustrate an iterative function, an iterative pseudo-code implementation for a factorial function could be factorial(n) where temp=1; for i=2 to n, temp=temp·i; return temp. Since, generally, recursive algorithms can be transformed into corresponding iterative algorithms and vice-versa, the terms, as used herein, unless otherwise indicated, are interchangeable.
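The iterative and recursive readings of Equation (2) can be compared side by side. The following is a floating-point sketch only; the function names are hypothetical, and no fixed-point effects are modeled.

```python
import math


def power_iterative(frame, freq_hz, sample_rate_hz):
    """Equation (2) with an explicit loop from n = 1 up to N."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    c, s = math.cos(w), math.sin(w)    # constants, computed once
    re = im = 0.0                      # Re(0) = Im(0) = 0
    for x in frame:
        # Tuple assignment uses the old re and im on the right-hand side.
        re, im = x + c * re - s * im, s * re + c * im
    return re * re + im * im           # Equation (2.3)


def power_recursive(frame, freq_hz, sample_rate_hz):
    """Equation (2) with a procedure that calls itself down to n = 0."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    c, s = math.cos(w), math.sin(w)

    def state(n):
        if n == 0:
            return 0.0, 0.0
        re, im = state(n - 1)
        return frame[n - 1] + c * re - s * im, s * re + c * im

    re, im = state(len(frame))
    return re * re + im * im
```

Both functions perform the same arithmetic in the same order, so they agree for any frame short enough to stay within the interpreter's recursion limit.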
Another iterative or recursive way to calculate power P(F) for an N-sample frame involves using the Goertzel algorithm, as shown in Equation (3) below:
$$S(0) = S(-1) = 0 \tag{3.1}$$
$$S(n) = x_n + 2\cos\!\left(\frac{2\pi F}{F_s}\right) \cdot S(n-1) - S(n-2) \quad \text{for } n = 1, 2, \ldots, N \tag{3.2}$$
$$P(F) = S(N)^2 - 2\cos\!\left(\frac{2\pi F}{F_s}\right) \cdot S(N) \cdot S(N-1) + S(N-1)^2 \tag{3.3}$$
The Goertzel algorithm involves calculating an N-item series of values from S(1) to S(N) for the samples $x_1, x_2, \ldots, x_N$ of the frame, where each S(n) value is based on $x_n$, S(n−1), and S(n−2), and where S(0) and S(−1) are 0.
Power P(F) for the N-sample frame is then calculated based on the last two values of the series, i.e., S(N) and S(N−1).
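A minimal floating-point sketch of the Goertzel recurrence of Equation (3) follows. The name `goertzel_power` and the variable names are illustrative; only one real multiplication by the precomputed coefficient is needed per sample.

```python
import math


def goertzel_power(frame, freq_hz, sample_rate_hz):
    """Compute P(F) with the recurrence of Equations (3.1) to (3.3)."""
    coeff = 2 * math.cos(2 * math.pi * freq_hz / sample_rate_hz)
    s1 = s2 = 0.0                      # S(n-1) and S(n-2); S(0) = S(-1) = 0
    for x in frame:
        # Equation (3.2); the old s1 becomes the new s2.
        s1, s2 = x + coeff * s1 - s2, s1
    # After the loop, s1 holds S(N) and s2 holds S(N-1): Equation (3.3).
    return s1 * s1 - coeff * s1 * s2 + s2 * s2
```

For the same input frame this returns the same power as Equation (1), up to floating-point round-off.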
When any of the above calculations are performed by a processor, such as an Application-Specific Integrated Circuit (ASIC) or a Digital Signal Processor (DSP), slight modifications may be made to the formulas to account for the limitations of the fixed-point arithmetic that may be used by those processors. For example, a saturation function may be used to implement saturation arithmetic where results of arithmetic operations, which may otherwise overflow, are clamped between a maximum value and a minimum value. Saturation may also be used in rounding off numbers, such as, for example, in converting a 32-bit fixed-point number into a 16-bit fixed-point number. 32-bit fixed-point numbers are also known as Q31- or Q1.31-format numbers, where the 31 represents the number of bits after the binary point (i.e., the binary equivalent of a decimal point) and the 1, when present, represents the number of bits before the binary point. It should be noted that, in general, if no number is present before the binary point (e.g., Q31), it is assumed that “1” is intended there (i.e., Q1.31). Similarly, 16-bit fixed-point numbers are known as Q15-format or Q1.15-format numbers. As used herein, unless otherwise noted, references to Qc.15 and Qc.31 formats indicate generic format references that include formats with zero or more bits before the binary point. Thus, for example, the term Qc.31 format includes Q0.31, Q1.31, Q2.31, etc. formats.
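The Q-format and saturation ideas above can be illustrated with plain integers. This sketch assumes a Q1.15 value is stored as a signed 16-bit integer whose single bit before the binary point is the sign bit, so the representable range is [−1, 1 − 2^−15]; the helper names are hypothetical.

```python
Q15_MIN, Q15_MAX = -(1 << 15), (1 << 15) - 1   # -32768 .. 32767


def saturate(value, lo, hi):
    """Clamp value to [lo, hi] instead of letting it wrap around."""
    return max(lo, min(hi, value))


def float_to_q15(x):
    """Scale by 2**15, round, and saturate to the signed 16-bit range."""
    return saturate(round(x * (1 << 15)), Q15_MIN, Q15_MAX)


def q15_to_float(q):
    """Interpret a Q15 integer as a real number."""
    return q / (1 << 15)


def add_q15(a, b):
    """Saturating addition of two Q15 integers."""
    return saturate(a + b, Q15_MIN, Q15_MAX)
```

Note that +1.0 is not representable in this format and saturates to the largest Q15 value, whereas −1.0 is representable exactly.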
In some implementations of a saturation function, the saturation function merely discards the least significant bits of the saturated number. This can cause round-off errors, which may be corrected using an additive correction. Thus, if, for example, a saturation function SAT[a] discards the 16 least significant bits when saturating 32-bit number a to 16-bit number a′, then an additive correction of $2^{-16}$ may be used so that $\mathrm{SAT}[a + 2^{-16}]$ functions like a rounding-off function round[a] for rounding off a to a 16-bit number. An illustrative decimal example may be helpful to understand how this works. Suppose that the decimal-number function SAT[a] discards the digits after the decimal point of a. Thus, SAT[5.5] would result in 5, while SAT[4.99] would result in 4. Using an additive correction of 0.5, SAT[a+0.5] can be used as a rounding-off function where, for example, (a) SAT[5.5+0.5]=SAT[6.0]=6=round[5.5] and (b) SAT[4.99+0.5]=SAT[5.49]=5=round[4.99].
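The binary counterpart of the decimal example can be sketched with integers. This assumes a Q31 value is held as a signed 32-bit integer, so that the additive correction of 2^−16 corresponds to the integer 1 << 15 (2^−16 divided by the Q31 step of 2^−31); the function names are hypothetical.

```python
Q31_MIN, Q31_MAX = -(1 << 31), (1 << 31) - 1
CORRECTION = 1 << 15   # 2**-16 expressed in Q31 integer units


def sat_q31_to_q15(a):
    """Clamp to the 32-bit range, then discard the 16 least significant bits."""
    a = max(Q31_MIN, min(Q31_MAX, a))
    return a >> 16


def round_q31_to_q15(a):
    """SAT[a + 2**-16]: truncation after the correction acts like rounding."""
    return sat_q31_to_q15(a + CORRECTION)
```

With an input of 0x00018000, which is 1.5 steps of the 16-bit result, plain truncation gives 1 while the corrected version gives 2, mirroring SAT[5.5]=5 versus SAT[5.5+0.5]=6 in the decimal example.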
Equation (2) can be modified to accommodate the above-described saturation and additive correction, and also incorporate a normalization factor, as shown in Equation (4) below:
$$\mathrm{Re}(0) = \mathrm{Im}(0) = 0$$
$$\mathrm{Re}(n) = \mathrm{SAT}\!\left[ 2^{-16} + M \cdot x_n + \cos\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Re}(n-1) - \sin\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Im}(n-1) \right], \text{ and} \tag{4.1}$$
$$\mathrm{Im}(n) = \mathrm{SAT}\!\left[ 2^{-16} + \sin\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Re}(n-1) + \cos\!\left(\frac{2\pi F}{F_s}\right) \cdot \mathrm{Im}(n-1) \right] \quad \text{for } n = 1, 2, \ldots, N \tag{4.2}$$
$$P(F) = \mathrm{SAT}\!\left[ 2^{-16} + \mathrm{Re}(N)^2 + \mathrm{Im}(N)^2 \right] \tag{4.3}$$
where M is a pre-calculated normalization factor, SAT[ ] represents a saturation function for truncating Qc.31 numbers to Qc.15 format, and $2^{-16}$ is an additive correction factor to make saturation function SAT[ ] operate like a rounding-off function. Normalization is a process of adjusting data points in order to have them fit some particular rule, and is commonly used in signal processing. Note that, in a recursive implementation, the calculation would start with trying to determine Re(N) and Im(N), which would involve determining Re(N−1) and Im(N−1), which would in turn involve determining Re(N−2) and Im(N−2), and so forth down to Re(0) and Im(0). In contrast, an iterative implementation would involve first calculating Re(1) and Im(1) and then using the results to calculate Re(2) and Im(2), and so forth up to Re(N) and Im(N).
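An iterative integer model of Equation (4) might look like the sketch below. Several details are assumptions not stated in the text: samples, state, coefficients, and M are all taken to be Q15 integers; each Q15 by Q15 product is shifted left by one bit to land in Q31; and the accumulator is allowed to exceed 32 bits before SAT[ ] clamps it (Python integers do not overflow). All helper names are hypothetical.

```python
import math

Q31_MIN, Q31_MAX = -(1 << 31), (1 << 31) - 1
CORRECTION = 1 << 15   # 2**-16 in Q31 integer units


def to_q15(x):
    """Quantize a real number in [-1, 1) to a saturated Q15 integer."""
    return max(-(1 << 15), min((1 << 15) - 1, round(x * (1 << 15))))


def sat_round(acc_q31):
    """SAT[2**-16 + acc]: add correction, clamp to 32 bits, drop 16 LSBs."""
    acc = max(Q31_MIN, min(Q31_MAX, acc_q31 + CORRECTION))
    return acc >> 16


def mul_q15(a, b):
    """Q15 x Q15 product (Q30) shifted into Q31."""
    return (a * b) << 1


def power_fixed(frame_q15, freq_hz, sample_rate_hz, m_q15):
    """Equations (4.1) to (4.3) on Q15 data; returns P(F) as a Q15 integer."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    c, s = to_q15(math.cos(w)), to_q15(math.sin(w))   # pre-calculated once
    re = im = 0
    for x in frame_q15:
        acc_re = mul_q15(m_q15, x) + mul_q15(c, re) - mul_q15(s, im)
        acc_im = mul_q15(s, re) + mul_q15(c, im)
        re, im = sat_round(acc_re), sat_round(acc_im)
    return sat_round(mul_q15(re, re) + mul_q15(im, im))
```

In this model the normalization factor M keeps Re(n) and Im(n) inside the Q15 range; a value that is too large would cause the state to hit the saturation limits and distort the result.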
A conventional DSP would require 3N+O(1) clock cycles to calculate power P(F) using Equation (4). Here, O(1) is in “big O” notation and represents a function that is bounded by a constant and does not depend on N. Thus, for a 40-sample data frame (i.e., N=40), assuming, for example, O(1)=10, a conventional DSP would require 130 clock cycles to calculate power P(F) for the data frame using Equation (4). The DSP requires 2 clock cycles to perform multiplication operations, including multiply-and-accumulate (MAC) operations and multiply-and-subtract (MSU) operations. A MAC instruction operates such that MAC(a, b, c) adds the product of a and b to c, i.e., c=c+a·b. An MSU instruction operates such that MSU(a, b, c) subtracts the product of a and b from c, i.e., c=c−a·b. Since cos(2πF/Fs) and sin(2πF/Fs) are constants, they can be pre-calculated once and stored for use by each iteration. Each iteration of Equation (4.2) requires several MAC and MSU operations that take 2 clock cycles per iteration. Saturation requires another clock cycle per iteration. Equation (4.2) could be modified to remove the saturation function. This would make the procedure unstable and therefore liable to overflow and provide erroneous results, but would also reduce the number of cycles required for processing an N-sample frame to 2N+O(1) clock cycles.
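The MAC and MSU semantics and the cycle estimate above can be modeled in a few lines. These are behavioral models only, not DSP instructions; the function names are hypothetical, and the default overhead of 10 is simply the example value used above for O(1).

```python
def mac(a, b, c):
    """Multiply-and-accumulate: returns c + a*b."""
    return c + a * b


def msu(a, b, c):
    """Multiply-and-subtract: returns c - a*b."""
    return c - a * b


def estimated_cycles(n_samples, cycles_per_sample=3, overhead=10):
    """Cycle estimate of the form cycles_per_sample * N + O(1)."""
    return cycles_per_sample * n_samples + overhead
```

With N=40, the default arguments reproduce the 130-cycle figure, and `cycles_per_sample=2` gives the 2N+O(1) estimate for the variant without saturation.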
Equation (3) can also be modified to accommodate the above-described saturation, additive correction, and normalization, as shown in Equation (5) below:
$$S(0) = S(-1) = 0 \tag{5.1}$$
$$S(n) = \mathrm{SAT}\!\left[ 2^{-16} + M \cdot x_n + 2\cos\!\left(\frac{2\pi F}{F_s}\right) \cdot S(n-1) - S(n-2) \right] \quad \text{for } n = 1, 2, \ldots, N \tag{5.2}$$
P(F) = SAT[2^−16 + S(N)^2 − 2·cos(2πF/Fs)·S(N)·S(N−1) + S(N−1)
                          2                                            ]                                                          (        5.3        )            where M is a pre-calculated normalization factor, SAT[ ] represents a saturation function for truncating Q c.31 numbers to Q c.15 format, and 2−16 is an additive correction.
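The saturated recursion of Equation (5.2) and the power formula of Equation (5.3) can be illustrated with a minimal floating-point sketch. It rests on several assumptions that are not taken from the equations themselves: the recursion is assumed to have the form S(n) = SAT[M·x_n + 2 cos(2πF/Fs)·S(n−1) − S(n−2)] with S(0) = S(−1) = 0; SAT[ ] is modeled only as a clamp to the range [−1, 1 − 2^(−15)], with no quantization; and the names `sat`, `tone_power`, and `make_tone`, as well as the choice M = 1/N, are hypothetical and chosen only for illustration.

```python
import math

Q15_MAX = 1.0 - 2.0 ** -15  # largest value of the assumed 16-bit fractional range
Q15_MIN = -1.0              # smallest value of the assumed 16-bit fractional range


def sat(value):
    """Clamp a value to the assumed fractional range (quantization is not modeled)."""
    return max(Q15_MIN, min(Q15_MAX, value))


def tone_power(samples, freq_hz, sample_rate_hz, m):
    """Floating-point model of the saturated recursion and the power formula.

    Assumed recursion (illustrative form only):
        S(n) = SAT[m * x_n + 2cos(2*pi*F/Fs) * S(n-1) - S(n-2)],  S(0) = S(-1) = 0
    """
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s_prev, s_prev2 = 0.0, 0.0  # S(n-1) and S(n-2)
    for x in samples:
        s_prev, s_prev2 = sat(m * x + coeff * s_prev - s_prev2), s_prev
    # After the loop, s_prev holds S(N) and s_prev2 holds S(N-1).
    return sat(2.0 ** -16 + s_prev ** 2 - coeff * s_prev * s_prev2 + s_prev2 ** 2)


def make_tone(freq_hz, sample_rate_hz, n, amplitude=0.5):
    """Generate samples x_1..x_N of a cosine test tone (hypothetical helper)."""
    return [amplitude * math.cos(2.0 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(1, n + 1)]


if __name__ == "__main__":
    n, fs = 80, 8000.0  # one 10 ms frame at an 8 kHz sampling rate
    on_target = tone_power(make_tone(697.0, fs, n), 697.0, fs, 1.0 / n)
    off_target = tone_power(make_tone(1209.0, fs, n), 697.0, fs, 1.0 / n)
    print("power at 697 Hz for a 697 Hz tone: ", on_target)
    print("power at 697 Hz for a 1209 Hz tone:", off_target)
```

In this sketch, a frame containing a tone at the tested frequency yields a markedly larger power value than a frame containing a tone at a different DTMF frequency, which is the property a per-frame tone-detection decision would rely on.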
A conventional DSP would require at least 4N+O(1) clock cycles to calculate power P(F) in accordance with Equation (5). It should be noted that, in some implementations, the DSP can store 2 cos(2πF/Fs) as a constant only if cos(2πF/Fs) is substantially between −0.5 and 0.5; otherwise, the DSP stores cos(2πF/Fs) as a constant and multiplies it by 2 in every iteration. In other implementations, the DSP can store 2 cos(2πF/Fs) as a constant regardless of the value of cos(2πF/Fs). Assuming 2 cos(2πF/Fs) is stored as a constant, each iteration of Equation (5.2) requires MAC, multiplication, subtraction, and saturation operations. The MAC, multiplication, and subtraction operations take 3 clock cycles per sample, and saturation requires another clock cycle per iteration, for a total of 4 clock cycles per sample. Equation (5.2) could be modified to remove the saturation function, which would reduce the number of cycles required for processing an N-sample frame to 3N+O(1) clock cycles. However, that modification would make the procedure unstable and therefore liable to overflow and provide erroneous results.
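The trade-off between cycle count and overflow safety can be illustrated with a short sketch. It assumes, for illustration only, 16-bit two's-complement integer arithmetic in which an unsaturated result wraps around; the function names `saturate16` and `wrap16` are hypothetical, and actual DSP overflow behavior depends on the device.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def saturate16(value):
    """Clamp a result to the 16-bit range, as a saturation step would."""
    return max(INT16_MIN, min(INT16_MAX, value))


def wrap16(value):
    """Model two's-complement wrap-around, assumed here for arithmetic without saturation."""
    return ((value + 32768) % 65536) - 32768


if __name__ == "__main__":
    total = 30000 + 5000  # exceeds the largest 16-bit value
    print("saturated:", saturate16(total))  # stays at the positive limit
    print("wrapped:  ", wrap16(total))      # flips to a large negative value
```

With saturation, an out-of-range intermediate value stays pinned at the limit of the range; without it, under the wrap-around assumption, the value changes sign and corrupts every later iteration that depends on it. For an 80-sample frame, the leading terms of the two cycle counts are 4N = 320 and 3N = 240 clock cycles, respectively.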
Some DSPs have multiple arithmetic logic units (ALUs) and multiple input/output (I/O) units that can process multiple instructions in a single clock cycle utilizing the multiple ALUs, multiple I/O units, and a pipeline architecture. A pipeline architecture allows the preparation of variables for a next iteration during a current iteration. Thus, after the first few clock cycles of a data set during which the pipeline is loaded and before the last few clock cycles of the data set during which the pipeline is unloaded, the DSP operates with a loaded pipeline, where extra clock cycles are not needed to load data for use by the ALUs since the pipeline is continually loaded as the ALUs perform their arithmetic operations. Conventional tone-power calculation systems might not make optimal use of these features of such DSPs.