As the demand for multi-channel, high-quality audio has recently increased, interest in digital multi-channel audio compression algorithms has also increased. In order to research compression technologies for digital audio and video, ISO/IEC (International Organization for Standardization/International Electrotechnical Commission) founded ISO/MPEG (Moving Picture Experts Group) in 1988. In 1994, ISO/MPEG started standardization work on a new compression method for application fields in which compatibility with the MPEG-1 stereo format was dispensable, and in the course of that work the standard was designated MPEG-2 NBC (Non-Backward Compatible). Before starting the standardization work, ISO/MPEG had carried out comparative tests of MPEG-2 BC (Backward Compatible), which is compatible with MPEG-1, against Dolby's AC-3 and AT&T's MPAC, and reached the conclusion that removing backward compatibility improved the performance of the coder. The goal of MPEG-2 NBC was for 5-channel full-bandwidth audio signals at a bit rate under 384 kbit/s to reach the “aurally indistinguishable” quality level defined by ITU-R (International Telecommunication Union, Radiocommunication Sector). Thereafter, MPEG-2 NBC was announced as a new international standard for multi-channel audio coding in April 1997, at which time its name was changed to MPEG-2 AAC (Advanced Audio Coding, ISO/IEC 13818-7). MPEG-2 AAC, standardized through the above process, is an audio coding method that encodes 5-channel audio signals into high-quality audio data at a bit rate of 320 kbps (64 kbps per channel).
FIG. 1 is a block diagram showing a prior-art MPEG-2 AAC audio decoding algorithm. With reference to FIG. 1, the MPEG-2 AAC audio algorithm combines a high-resolution filter bank, prediction coding, intensity stereo coding, TNS (Temporal Noise Shaping), and Huffman coding in order to provide sound quality that is “aurally indistinguishable” from that of the original sound at a bit rate under 384 kbit/s. The MPEG-2 AAC audio compression algorithm is a kind of transform coding method using the MDCT (Modified Discrete Cosine Transform), and a bit allocation method based on a psychoacoustic model is used to compress the transformed signal.
Further, considering the trade-off among sound quality, memory usage, and power demand, the MPEG-2 AAC audio system supports three types of profile: the main profile, the LC (Low Complexity) profile, and the SSR (Scalable Sampling Rate) profile.
First, the main profile provides the best sound quality at a given bit rate, and uses all the tools of AAC except the gain control tool. The main profile is capable of decoding the bit stream of the LC profile described below.
Second, the LC profile is the most frequently used profile in general. It uses neither the prediction tool nor the gain control tool, and the order of the TNS is limited. The LC profile is characterized by lower memory usage and power demand than those of the main profile, while its sound quality remains acceptable.
Last, the SSR profile consists of the LC profile and the gain control tool. The prediction tool is not used, and both the bandwidth and the order of the TNS are limited. The advantage of the SSR profile is that it provides a variable-frequency signal even though it has lower complexity than the main profile or the LC profile.
The most essential part of a high-quality audio compression encoding and decoding system is the transformation of a time-domain signal into an internal time-frequency representation, or the corresponding inverse transformation. In MPEG-2 or MPEG-4 AAC, this transformation is carried out by the MDCT and the IMDCT (Inverse MDCT), to which the so-called TDAC (Time Domain Aliasing Cancellation) method is applied.
This transform coding process accounts for approximately 48 percent of the total operations of the LC profile, as shown in FIG. 2. The IMDCT used in an AAC audio decoder is given by the following Formula 1.
$$x(i) = \sum_{k=0}^{\frac{N}{2}-1} X(k)\,\cos\left[\frac{\pi}{2N}\left(2i+1+\frac{N}{2}\right)(2k+1)\right], \quad \text{for } 0 \le i \le N-1 \tag{Formula 1}$$
Herein, N, i, and k denote the number of operation points of the IMDCT, the sample index in the time domain, and the sample index in the frequency domain, respectively. As shown in Formula 1, X(k)cos(·) must be accumulated N/2 times to obtain one output sample x(i) of the IMDCT. Implementing the IMDCT directly from its definition in Formula 1 in order to run the above transform coding process is called a direct implementation of the IMDCT. In AAC, the number of operation points of the IMDCT is 2048 for a long block and 256 for a short block.
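The direct implementation described above can be sketched in a few lines. The following is only an illustrative Python transcription of Formula 1, not code from any AAC decoder; the function name `imdct_direct` is hypothetical, and no windowing or scaling is applied.

```python
import math

def imdct_direct(X):
    """Direct IMDCT per Formula 1 (hypothetical helper name).

    X holds the N/2 frequency-domain samples X(0)..X(N/2-1);
    the result holds the N time-domain samples x(0)..x(N-1).
    """
    half = len(X)          # N/2 input coefficients
    N = 2 * half           # number of IMDCT operation points
    x = []
    for i in range(N):
        acc = 0.0
        # X(k)cos(.) is accumulated N/2 times for each output sample x(i)
        for k in range(half):
            angle = math.pi / (2 * N) * (2 * i + 1 + N / 2) * (2 * k + 1)
            acc += X[k] * math.cos(angle)
        x.append(acc)
    return x
```

Because the inner loop runs N/2 times for each of the N outputs, the cost of this form grows with the square of N, which is why it is rarely used for the 2048-point long block.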
Although the direct implementation by Formula 1 can be used for IMDCT operations, a high-speed IMDCT algorithm using an N/4-point IFFT (Inverse Fast Fourier Transform) is commonly used as the IMDCT implementation algorithm, because it is the simplest in terms of hardware implementation and requires a small amount of operations for a 2N-point IMDCT. This high-speed IMDCT algorithm consists of two steps, given by the following Formula 2 and Formula 3.
$$y(n) = \left[\sum_{k=0}^{\frac{N}{4}-1}\left\{\left(X\!\left(\frac{N}{2}-2k-1\right) + j\cdot X(2k)\right)e^{j\frac{2\pi}{N}\left(k+\frac{1}{8}\right)}\right\}e^{j\frac{2\pi}{N/4}nk}\right]e^{j\frac{2\pi}{N}\left(n+\frac{1}{8}\right)} \tag{Formula 2}$$

$$\begin{aligned}
x(2n) &= -y_i\!\left(\frac{N}{8}+n\right), & x(2n+1) &= y_r\!\left(\frac{N}{8}-n-1\right)\\
x\!\left(\frac{N}{4}+2n\right) &= -y_r(n), & x\!\left(\frac{N}{4}+2n+1\right) &= y_i\!\left(\frac{N}{4}-n-1\right)\\
x\!\left(\frac{N}{2}+2n\right) &= -y_r\!\left(\frac{N}{8}+n\right), & x\!\left(\frac{N}{2}+2n+1\right) &= y_i\!\left(\frac{N}{8}-n-1\right)\\
x\!\left(\frac{3N}{4}+2n\right) &= y_i(n), & x\!\left(\frac{3N}{4}+2n+1\right) &= -y_r\!\left(\frac{N}{4}-n-1\right)
\end{aligned} \qquad \text{for } 0 \le n < \frac{N}{8} \tag{Formula 3}$$
In Formula 2, $\sum_{k=0}^{\frac{N}{4}-1}\{\cdot\}\,e^{j\frac{2\pi}{N/4}nk}$ is the N/4-point IFFT operation. Furthermore, $(\cdot)\,e^{j\frac{2\pi}{N}\left(k+\frac{1}{8}\right)}$ and $(\cdot)\,e^{j\frac{2\pi}{N}\left(n+\frac{1}{8}\right)}$ represent the pre-processing and the post-processing of the IFFT operation, respectively. Formula 3 is a de-interleaving process, in which $y_r$ and $y_i$ denote real{y(n)} and imag{y(n)}, respectively.
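The three stages of Formula 2 (pre-processing, N/4-point IFFT, post-processing) and the de-interleaving of Formula 3 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the IFFT is a simple recursive radix-2 routine without 1/M scaling (so N/4 must be a power of two), and `formula3` merely transcribes the index mapping as written, without any scaling or windowing and without being checked here against Formula 1.

```python
import cmath

def ifft_unscaled(a):
    """Radix-2 inverse DFT without 1/M scaling: A[n] = sum_k a[k] e^{+j 2*pi*n*k/M}."""
    M = len(a)
    if M == 1:
        return list(a)
    even = ifft_unscaled(a[0::2])
    odd = ifft_unscaled(a[1::2])
    out = [0j] * M
    for n in range(M // 2):
        t = cmath.exp(2j * cmath.pi * n / M) * odd[n]
        out[n] = even[n] + t
        out[n + M // 2] = even[n] - t
    return out

def formula2(X, N):
    """y(n), n = 0..N/4-1: pre-twiddle, N/4-point IFFT, post-twiddle."""
    M = N // 4
    pre = [(X[N // 2 - 2 * k - 1] + 1j * X[2 * k])
           * cmath.exp(2j * cmath.pi / N * (k + 0.125)) for k in range(M)]
    spec = ifft_unscaled(pre)
    return [spec[n] * cmath.exp(2j * cmath.pi / N * (n + 0.125)) for n in range(M)]

def formula2_literal(X, N):
    """Formula 2 evaluated term by term, used only as a reference for checking."""
    M = N // 4
    y = []
    for n in range(M):
        s = 0j
        for k in range(M):
            z = X[N // 2 - 2 * k - 1] + 1j * X[2 * k]
            s += (z * cmath.exp(2j * cmath.pi / N * (k + 0.125))
                  * cmath.exp(2j * cmath.pi / M * n * k))
        y.append(s * cmath.exp(2j * cmath.pi / N * (n + 0.125)))
    return y

def formula3(y, N):
    """De-interleave y(n) into x(0)..x(N-1), index by index as in Formula 3."""
    x = [0.0] * N
    for n in range(N // 8):
        x[2 * n] = -y[N // 8 + n].imag
        x[2 * n + 1] = y[N // 8 - n - 1].real
        x[N // 4 + 2 * n] = -y[n].real
        x[N // 4 + 2 * n + 1] = y[N // 4 - n - 1].imag
        x[N // 2 + 2 * n] = -y[N // 8 + n].real
        x[N // 2 + 2 * n + 1] = y[N // 8 - n - 1].imag
        x[3 * N // 4 + 2 * n] = y[n].imag
        x[3 * N // 4 + 2 * n + 1] = -y[N // 4 - n - 1].real
    return x
```

For example, `formula3(formula2(X, 256), 256)` takes the 128 real coefficients of a short block and returns 256 values, while `formula2_literal` gives the same y(n) as `formula2` at the cost of a double loop.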
On the whole, most general-purpose DSPs use the high-speed IMDCT algorithm based on the N/4-point IFFT in order to handle a 2N-point IMDCT with a small amount of operations.
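The saving in operations can be illustrated with a rough count. The sketch below is an estimate under explicit assumptions that are not taken from this description: the IFFT is assumed to be a plain radix-2 FFT with one complex multiplication per butterfly, trivial twiddle factors are not discounted, and additions and data transfers are ignored. Both function names are hypothetical.

```python
def direct_multiplies(N):
    """Real multiplications for Formula 1: N outputs, N/2 products each."""
    return N * (N // 2)

def fast_complex_multiplies(N):
    """Complex multiplications for the IFFT-based form (assumed radix-2 FFT)."""
    M = N // 4                       # IFFT length
    stages = M.bit_length() - 1      # log2(M) for a power-of-two M
    butterflies = (M // 2) * stages  # one complex multiply per butterfly (assumption)
    return M + butterflies + M       # pre-processing + IFFT + post-processing

for N in (256, 2048):
    print(N, direct_multiplies(N), fast_complex_multiplies(N))
```

Under these assumptions the long block (N = 2048) needs 2,097,152 real multiplications in the direct form against 3,328 complex multiplications in the IFFT-based form; the short block (N = 256) needs 32,768 against 320.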
Referring to FIG. 3, which is a block diagram showing each step of the IMDCT operation process in AAC, a complex number X(N/2−2k−1)+jX(2k) is built from the frequency-domain input signal X(k) by using X(N/2−2k−1) and X(2k), so that the pre-processing of the high-speed IMDCT can be carried out. That is, for the pre-processing, the real-valued input signal X(k) is converted into the complex number X(N/2−2k−1)+jX(2k) through a specific address generating method.
General-purpose DSP chips do not provide a dedicated instruction or hardware architecture by which X(k) stored in memory can be directly expressed as the complex number X(N/2−2k−1)+jX(2k). Accordingly, data transfer cycles, that is, the sets of instructions that transfer the real-valued data X(k) stored in memory into this specific address form for the pre-processing of the high-speed IMDCT operation, account for a large part of the total operations.
As shown in Formula 2, when a 256-point IMDCT is performed by the high-speed algorithm, the complex number X(N/2−2k−1)+jX(2k) built from the input signal samples is multiplied by $e^{j\frac{2\pi}{N}\left(k+\frac{1}{8}\right)}$ during the pre-processing, in accordance with the IMDCT algorithm shown in FIG. 3. Herein, N is 256, the number of points of the IMDCT, and k is the input index, an integer from 0 to 63. These parameters can change, because the number of points of the IMDCT used in the MPEG-2 or MPEG-4 AAC audio compression algorithm is 2048 for a long block and 256 for a short block.
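The pre-processing factors depend only on N and k, so they can be tabulated in advance. The sketch below builds such a table in Python; the function name `pre_twiddle_table` is hypothetical, and holding the table as a list of Python complex numbers is an assumption made for illustration, not a statement about how any DSP stores it.

```python
import cmath

def pre_twiddle_table(N):
    """Pre-processing factors e^{j(2*pi/N)(k + 1/8)} for k = 0..N/4-1."""
    return [cmath.exp(2j * cmath.pi / N * (k + 0.125)) for k in range(N // 4)]

short_block = pre_twiddle_table(256)    # 64 entries, k = 0..63
long_block = pre_twiddle_table(2048)    # 512 entries, k = 0..511
```

Each entry has unit magnitude, so multiplying a complex input sample by it only rotates the sample in the complex plane.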
The X(k) data stored in the DSP chip must be transferred to a data processing device of the core in the order of k, so that the input samples can be converted into complex numbers during the pre-processing of the 256-point IMDCT, such as X(127)+jX(0) when k=0, X(125)+jX(2) when k=1, X(123)+jX(4) when k=2, and so on, after which the complex-number operation is performed. When a general-purpose DSP chip is used, two address registers may be allocated to transfer the input samples: one register uses a post-decrement-by-2 addressing mode and the other uses a post-increment-by-2 addressing mode to step to the data for the next cycle. That is, in order to fetch the audio data (excluding the ROM data) for one butterfly operation, at least two cycles must be consumed with the two address registers. Because almost all commercial DSP chips support post-decrement and post-increment addressing modes, address generation itself can be performed efficiently. However, there is the disadvantage that the two data needed to form a complex number cannot be transferred simultaneously.
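The two-register addressing scheme described above can be modelled as follows. This Python sketch only imitates the address sequence and counts one transfer per memory read; the names `preprocess_addresses` and `gather_complex` are hypothetical, and the one-read-per-cycle accounting is a simplifying assumption rather than a model of any particular DSP.

```python
def preprocess_addresses(N):
    """Yield (k, real_addr, imag_addr) for X(N/2-2k-1) + jX(2k), k = 0..N/4-1."""
    down = N // 2 - 1   # first address register: post-decrement by 2
    up = 0              # second address register: post-increment by 2
    for k in range(N // 4):
        yield k, down, up
        down -= 2
        up += 2

def gather_complex(X, N):
    """Build the complex inputs and count the separate data transfers."""
    samples = []
    transfers = 0
    for _, re_addr, im_addr in preprocess_addresses(N):
        re = X[re_addr]      # first transfer: real part
        transfers += 1
        im = X[im_addr]      # second transfer: imaginary part
        transfers += 1
        samples.append(complex(re, im))
    return samples, transfers

# With X(k) = k, the first inputs are X(127)+jX(0), X(125)+jX(2), X(123)+jX(4).
samples, transfers = gather_complex(list(range(128)), 256)
print(samples[:3], transfers)
```

For the 256-point case the loop produces 64 complex inputs from 128 separate reads, which reflects the two transfers per complex sample noted above.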
At present, commercial DSP chips for multi-channel high-quality audio processing include SHARC DSP's ASDSP-21065L; Cirrus Logic's CS49300 and CS49500; TI's (Texas Instruments) TMSc55x, TMSc64x, and TMSc67x series; LSI Logic's ZSP40x; CLARKSPUR's CD2450 and CD2480; Philips TriMedia's TM-1300 and PNX1500; and Tensilica's Xtensa. Further, ARM's ARM9M and ARM9E are also capable of AAC processing. Most of these commercial DSP chips or processors support the LC profile for multi-channel or stereo audio; moreover, TI's TMSc67x, LSI Logic's ZSP series, and SHARC DSP's ASDSP-21065L can support the main profile of AAC.
In general, commercial DSP chips for audio processing assign 24 or 32 bits to data representation, and they are designed to hold sufficient memory space and to facilitate I/O with external audio signals so that multi-channel audio processing can be accomplished. Further, in almost every DSP for a multi-channel audio system, many hardware resources run in parallel so as to handle audio data of 5.1 channels or more in real time. For example, SHARC DSP's ASDSP-21065L processor has a Super-Harvard architecture capable of both SIMD (Single Instruction Multiple Data) and SISD (Single Instruction Single Data) operation, so that many hardware resources can run in parallel. In addition, the TMS320c64x, TMS320c67x, TM-1300, and PNX1500 are VLIW (Very Long Instruction Word) processors, which run a large number of hardware resources in parallel under program control by means of a compiler, that is, by software. In other words, in most of the audio-only DSPs released by commercial DSP chip developers, the DSP operation core has a Super-Harvard or VLIW architecture, and in many cases the DSP has many ALUs (Arithmetic and Logic Units) and other hardware resources so that various audio algorithms can run at high speed. Moreover, compared with the DSP core, the peripheral devices are dedicated more exclusively to audio I/O operations, so in many cases there are dedicated instructions not for audio signal processing operations but for controlling the peripheral devices related to the I/O of the audio signals.
However, most of these commercial DSP cores have the disadvantages that their size and power consumption are relatively large due to their architectural characteristics and that, as a result, implementation efficiency is lowered when the chips are implemented as an SoC (System on a Chip).