1. Field of the Invention
The present invention relates to encoding and decoding of an audio signal, and more particularly, to an apparatus and method for transforming an audio signal by selecting a frame from among frames of various lengths according to a change in the audio signal, and transforming, encoding, and decoding the audio signal in units of the selected frame using a window coefficient other than 0; an apparatus and method for encoding an audio signal adaptively to a change in the audio signal; an apparatus and method for inversely transforming an audio signal; and an apparatus and method for decoding an audio signal adaptively to a change in the audio signal.
2. Description of Related Art
Conventionally, an audio signal is encoded by transforming it in units of a predetermined frame, and generating a bit stream by quantizing the transformed audio signal, which changes its bit rate. The length of a frame of an audio signal must be determined according to the degree to which the audio signal changes. Specifically, the frame length of an audio signal that changes fast in a time domain must be determined to be smaller so that the audio signal can be processed into a frequency domain over a broad band of frequency, thereby generating a more precise bit stream. In contrast, the frame length of an audio signal that changes slowly in the time domain must be determined to be larger so that the audio signal can be processed into the frequency domain over a narrow band of frequency, thereby reducing consumption of frequency resources.
Conventionally, the types of frames are limited; for example, frames are categorized into a long frame and a short frame. Therefore, an audio signal that rapidly changes to a large extent is encoded using an oversampled transform, thereby causing distortion of the encoded audio signal.
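By way of illustration only, the trade-off between frame length and resolution described above can be put into numbers. The following sketch assumes a sampling rate of 48 kHz, a long frame of 2048 samples, and a short frame of 256 samples; the sampling rate and the short frame length are hypothetical values chosen for the example. It further assumes that a frame of N samples yields N/2 spectral coefficients spanning 0 to half the sampling rate.

```python
def frame_resolution(frame_len, sample_rate_hz):
    """Return (time span in ms, spacing between spectral coefficients in Hz)."""
    time_span_ms = 1000.0 * frame_len / sample_rate_hz
    # Assumption: N/2 coefficients cover 0 .. sample_rate/2, i.e. sample_rate/N each.
    bin_spacing_hz = (sample_rate_hz / 2.0) / (frame_len / 2.0)
    return time_span_ms, bin_spacing_hz


SAMPLE_RATE_HZ = 48000  # assumed for illustration
LONG_FRAME = 2048       # assumed long frame length
SHORT_FRAME = 256       # hypothetical short frame length

for name, n in (("long", LONG_FRAME), ("short", SHORT_FRAME)):
    span_ms, spacing_hz = frame_resolution(n, SAMPLE_RATE_HZ)
    print(f"{name:5s} N={n:4d}: {span_ms:6.2f} ms per frame, {spacing_hz:7.2f} Hz per coefficient")
```

Under these assumptions the short frame localizes a change in time eight times more finely than the long frame, while each of its coefficients covers a band eight times wider.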
FIG. 1 is a table illustrating conventional frame types and related window coefficients. Referring to FIG. 1, there are a long frame and a short frame, and a long start frame and a long stop frame that are obtained by transforming the long and short frames, respectively. When a windowing operation is performed on the long start frame and the long stop frame, a window coefficient of 0 is used.
FIG. 2 is a graph illustrating transforming of an audio signal, which has a window coefficient of 0, into a frequency domain using the windowing operation.
A method of transforming and inversely transforming an audio signal will now be described briefly. Typically, an audio signal is transformed into a frequency domain using a Modified Discrete Cosine Transform (MDCT). According to the MDCT, a z signal is obtained by multiplying input data on a time axis by a window coefficient illustrated in FIG. 2. Next, a final frequency-domain spectrum is computed by substituting the value of the z signal into the following equation:
$$X_{i,k} = 2 \cdot \sum_{n=0}^{N-1} z_{i,n} \cos\left( \frac{2\pi}{N} \left( n + n_0 \right) \left( k + \frac{1}{2} \right) \right) \quad \text{for } 0 \le k < N/2, \qquad (1)$$
wherein $X_{i,k}$ denotes the value of a frequency domain, $z_{i,n}$ denotes a windowed input sequence, n denotes the index of a sample unit, k denotes the index of a spectral coefficient, i denotes a frame index, N denotes the length of a frame, and $n_0$ denotes (N/2+1)/2.
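By way of illustration only, Equation (1) can be evaluated directly as in the following sketch. The function names `sine_window` and `mdct`, the frame length of 16, and the test input are hypothetical; the sine window is an assumed example of a window and is not the window of FIG. 2. The direct double loop is chosen for readability and not for speed.

```python
import math


def sine_window(n_len):
    """Sine window of length n_len (an assumed example window)."""
    return [math.sin(math.pi * (n + 0.5) / n_len) for n in range(n_len)]


def mdct(z):
    """Direct O(N^2) evaluation of Equation (1) for one windowed frame z."""
    n_len = len(z)
    n0 = (n_len / 2 + 1) / 2          # n0 = (N/2 + 1) / 2
    scale = 2 * math.pi / n_len       # 2*pi / N
    return [
        2.0 * sum(z[n] * math.cos(scale * (n + n0) * (k + 0.5)) for n in range(n_len))
        for k in range(n_len // 2)    # 0 <= k < N/2
    ]


N = 16                                            # hypothetical frame length
x = [math.sin(0.4 * n) for n in range(N)]         # hypothetical time-domain input
z = [w * s for w, s in zip(sine_window(N), x)]    # windowed input sequence z
spectrum = mdct(z)
print(len(spectrum))  # 8, i.e. N/2 spectral coefficients
```

A frame of N input samples yields N/2 spectral coefficients, which is the property relied upon in the description below.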
The encoded audio signal is inversely transformed into a time domain using the following equation:
$$x_{i,n} = \frac{2}{N} \sum_{k=0}^{\frac{N}{2}-1} \mathrm{spec}[i][k] \cos\left( \frac{2\pi}{N} \left( n + n_0 \right) \left( k + \frac{1}{2} \right) \right) \quad \text{for } 0 \le n < N, \qquad (2)$$
wherein $x_{i,n}$ denotes the value obtained by inversely transforming the encoded audio signal.
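By way of illustration only, the following sketch evaluates Equation (2) and checks that, with the scale factors of Equations (1) and (2), overlapping consecutive frames by half a frame and adding them restores the input. The function names, the frame length of 16, the test signal, and the sine window applied both before the transform and after the inverse transform are assumptions made for the example; quantization is omitted, so spec[i][k] is taken here to be the unmodified output of Equation (1).

```python
import math


def _phase(n, k, n_len):
    """Cosine argument shared by Equations (1) and (2)."""
    n0 = (n_len / 2 + 1) / 2
    return 2 * math.pi / n_len * (n + n0) * (k + 0.5)


def mdct(z):
    """Equation (1): N windowed samples -> N/2 spectral coefficients."""
    n_len = len(z)
    return [2.0 * sum(z[n] * math.cos(_phase(n, k, n_len)) for n in range(n_len))
            for k in range(n_len // 2)]


def imdct(spec):
    """Equation (2): N/2 spectral coefficients -> N time-domain values."""
    n_len = 2 * len(spec)
    return [(2.0 / n_len) * sum(spec[k] * math.cos(_phase(n, k, n_len))
                                for k in range(len(spec)))
            for n in range(n_len)]


def overlap_add_roundtrip(signal, n_len):
    """Window, transform, inversely transform, window again, and overlap-add."""
    hop = n_len // 2
    window = [math.sin(math.pi * (n + 0.5) / n_len) for n in range(n_len)]
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - n_len + 1, hop):
        z = [w * s for w, s in zip(window, signal[start:start + n_len])]
        y = imdct(mdct(z))  # still contains time-domain aliasing
        for n in range(n_len):
            out[start + n] += window[n] * y[n]  # aliasing cancels between frames
    return out


N = 16                                                      # hypothetical frame length
signal = [math.sin(0.3 * n) + 0.1 * n for n in range(4 * N)]
rebuilt = overlap_add_roundtrip(signal, N)
hop = N // 2
# Only samples covered by two overlapping frames are expected to be restored.
max_err = max(abs(a - b) for a, b in zip(signal[hop:-hop], rebuilt[hop:-hop]))
print(f"max interior error: {max_err:.2e}")
```

The first and last half frames are excluded from the comparison because each is covered by only one frame in this sketch.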
As described above, conventionally, when the MDCT is used to transform an audio signal into a frequency domain, a portion of a first frame unit of the audio signal ranging from 1538+128 to 2048 on the time axis is transformed using a window coefficient of 0. The frame samples in this portion are multiplied by the window coefficient of 0, and thus the results of the multiplication are neglected. Although 1024 spectrum values are obtained from the first frame unit according to the characteristics of the MDCT, the effectiveness of the MDCT is reduced when the window coefficient is 0.
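By way of illustration only, the following sketch shows why samples multiplied by a window coefficient of 0 contribute nothing to the spectrum. The frame length of 16, the zero-valued tail of 4 coefficients, and the rectangular shape of the remaining coefficients are hypothetical and do not reproduce the window of FIG. 1 or FIG. 2. Because the spectrum depends on the input only through the windowed sequence, two frames that differ only under the zero coefficients yield the same windowed sequence and therefore the same spectrum.

```python
def apply_window(frame, window):
    """Multiply each input sample by its window coefficient."""
    return [w * s for w, s in zip(window, frame)]


N = 16         # hypothetical frame length
ZERO_TAIL = 4  # hypothetical number of trailing window coefficients equal to 0
window = [1.0] * (N - ZERO_TAIL) + [0.0] * ZERO_TAIL  # toy window, not that of FIG. 1

frame_a = [float(n + 1) for n in range(N)]
# frame_b differs from frame_a only in the samples under the zero coefficients.
frame_b = frame_a[:N - ZERO_TAIL] + [99.0] * ZERO_TAIL

z_a = apply_window(frame_a, window)
z_b = apply_window(frame_b, window)
print(z_a == z_b)  # True: the differing samples are discarded by the window

effective = sum(1 for w in window if w != 0.0)
print(effective, N // 2)  # 12 8: 8 coefficients are produced from 12 effective samples
```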