1. Field of the Invention
The present general inventive concept relates to a method and apparatus of searching a codebook, and more particularly, to a method and apparatus to search a codebook including pulses that model a predetermined component included in a speech signal.
2. Description of the Related Art
Vocoder techniques, which encode speech using compression and decompression, are important in information technology applications such as mobile and satellite communications, multimedia communications, personal portable communications, and Internet phones. There are various types of vocoders. Code excited linear predictive (CELP) coding, which is based on an analysis-by-synthesis structure, is the most prevalently used in multimedia and wireless communications systems. In CELP coding, a residual signal of a vocal tract and characteristics of a glottis (i.e., a space between vocal cords or folds) are modeled by an adaptive codebook and a fixed codebook. CELP coding is implemented with different degrees of complexity and provides different qualities of synthesized sounds according to the structures of the codebooks and the processes used to search them. Hence, a variety of implementations of CELP coding and associated CELP variations have been proposed.
One example of CELP is algebraic CELP (ACELP) coding, which obtains a code vector using a simple algebraic method. An ACELP coding method is based on an algebraic sign structure including a combination of several pulses of amplitude +1 or −1 for each frame and uses a limited number of amplitude pulses in a codebook. Accordingly, the ACELP coding method performs well in the presence of channel noise. A method of searching for a code vector using the ACELP coding method is referred to as a fixed codebook search.
An adaptive multi-rate (AMR) wideband speech coder, which was selected as a wideband speech coder standard by an international consortium called the 3rd Generation Partnership Project (3GPP), has 9 fixed bitrate transmission modes, namely, 23.85 kbps, 23.05 kbps, 19.85 kbps, 18.25 kbps, 15.85 kbps, 14.25 kbps, 12.65 kbps, 8.85 kbps, and 6.60 kbps. A fixed codebook search is based on an algebraic codebook structure and is implemented differently depending on the transmission mode.
FIG. 1 is a flowchart illustrating a fixed codebook searching method applied to an 8.85 kbps mode of an AMR wideband speech coder. The fixed codebook searching method of FIG. 1 is based on an algebraic codebook. A fixed codebook $c_k$ that minimizes a mean squared error (MSE) of a target signal is the same as a fixed codebook that maximizes the following Equation 1:
$$Q_k = \frac{\left(x_2^t H c_k\right)^2}{c_k^t H^t H c_k} = \frac{\left(d^t c_k\right)^2}{c_k^t \Phi c_k} \qquad (1)$$

wherein $d$ denotes a correlation between the target signal and an impulse response $h(n)$, and $\Phi$ denotes a correlation of the impulse response $h(n)$. When each subframe is comprised of M samples, $d(n)$ and $\Phi(i,j)$ are calculated by the following Equations 2 and 3, respectively:
$$d(n) = \sum_{i=n}^{M-1} x_2(i)\, h(i-n), \quad n = 0, \ldots, M-1 \qquad (2)$$

$$\Phi(i,j) = \sum_{n=j}^{M-1} h(n-i)\, h(n-j), \quad i = 0, \ldots, M-1, \quad j = i, \ldots, M-1 \qquad (3)$$
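As a rough illustration of Equations 1 through 3, the following Python sketch computes d(n) and Φ(i,j) for one subframe and evaluates the right-hand form of Equation 1. The function names and the use of NumPy are assumptions made for illustration only, and the sketch takes all indices to run from 0 to M−1.

```python
import numpy as np

def correlation_d(x2, h):
    """d(n) = sum_{i=n}^{M-1} x2(i) * h(i - n)  (Equation 2)."""
    M = len(x2)
    return np.array([np.dot(x2[n:], h[:M - n]) for n in range(M)])

def correlation_phi(h):
    """Phi(i, j) = sum_{n=j}^{M-1} h(n - i) * h(n - j) for j >= i (Equation 3).

    The lower triangle is filled by symmetry so the result equals H^t H,
    where H is the lower-triangular convolution matrix built from h(n).
    """
    M = len(h)
    phi = np.zeros((M, M))
    for i in range(M):
        for j in range(i, M):
            # n runs from j to M-1: h(n-i) spans h[j-i:M-i], h(n-j) spans h[0:M-j]
            phi[i, j] = phi[j, i] = np.dot(h[j - i:M - i], h[:M - j])
    return phi

def criterion_q(c, d, phi):
    """Q_k = (d^t c_k)^2 / (c_k^t Phi c_k)  (right-hand form of Equation 1)."""
    return np.dot(d, c) ** 2 / (c @ phi @ c)
```

Because d(n) and Φ(i,j) depend only on the target signal and the impulse response, they can be computed once per subframe and reused for every candidate code vector.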
The algebraic codebook of the 8.85 kbps mode of the AMR wideband speech coder has the structure illustrated in Table 1. One pulse is searched for in each of a total of 4 tracks, and a total of 20 bits is allocated to encode the locations and signs of the found pulses.
TABLE 1
Tracks   Pulses   Locations of pulses
T1       i0       0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60
T2       i1       1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61
T3       i2       2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58, 62
T4       i3       3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63
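The track layout of Table 1 follows a simple interleaving rule, which the Python sketch below reproduces. The constant and function names are hypothetical. The split of the 20 bits into 4 location bits plus 1 sign bit per track is an inference from the 16 locations per track, since the text above states only the total.

```python
NUM_TRACKS = 4      # tracks T1..T4 of Table 1 (0-based indices 0..3 here)
SUBFRAME_LEN = 64   # pulse locations 0..63

def track_positions(track):
    """Return the pulse locations of one track, e.g. 0, 4, 8, ... for track 0."""
    return list(range(track, SUBFRAME_LEN, NUM_TRACKS))

def bits_per_subframe():
    """Bits needed to encode one signed pulse per track (assumed bit split)."""
    positions_per_track = SUBFRAME_LEN // NUM_TRACKS        # 16 locations
    position_bits = (positions_per_track - 1).bit_length()  # 4 bits address 16 locations
    sign_bits = 1                                           # one bit for +1 or -1
    return NUM_TRACKS * (position_bits + sign_bits)
```

Under these assumptions, `track_positions(1)` returns the T2 row of Table 1 and `bits_per_subframe()` reproduces the 20-bit total.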
The fixed codebook searching method of FIG. 1 will now be described with reference to Table 1. The fixed codebook $c_k$ includes only four nonzero pulses, so that a fast codebook search is possible. The correlation in the numerator of Equation 1 and the energy in its denominator are expressed in the following Equations 4 and 5, respectively:
$$C = \sum_{i=0}^{N_p-1} s_i\, d(m_i) \qquad (4)$$

wherein $m_i$ denotes a location of an i-th pulse, $s_i$ denotes a sign of the i-th pulse, and $N_p$ denotes a number of pulses.
$$E = \sum_{i=0}^{N_p-1} \phi'(m_i, m_i) + 2 \sum_{i=0}^{N_p-2} \sum_{j=i+1}^{N_p-1} \phi'(m_i, m_j) \qquad (5)$$
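The sparse sums of Equations 4 and 5 can be sketched in Python as follows. This section does not define φ′, so the sketch assumes φ′(m_i, m_j) = s_i s_j Φ(m_i, m_j), that is, the correlation matrix with the pulse signs folded in. That definition and the function names are assumptions made for illustration.

```python
import numpy as np

def sparse_correlation(d, positions, signs):
    """C = sum_i s_i * d(m_i)  (Equation 4)."""
    return sum(s * d[m] for m, s in zip(positions, signs))

def sparse_energy(phi, positions, signs):
    """E = sum_i phi'(m_i, m_i) + 2 * sum_{i<j} phi'(m_i, m_j)  (Equation 5).

    Assumption: phi'(a, b) = s_a * s_b * Phi(a, b); phi' is not defined in the text.
    """
    n_p = len(positions)
    energy = 0.0
    for i in range(n_p):
        # Diagonal terms: s_i * s_i is always 1, so the sign drops out.
        energy += phi[positions[i], positions[i]]
    for i in range(n_p - 1):
        for j in range(i + 1, n_p):
            # Off-diagonal terms counted once and doubled, as in Equation 5.
            energy += 2.0 * signs[i] * signs[j] * phi[positions[i], positions[j]]
    return energy
```

With four pulses, C needs 4 terms and E needs 4 diagonal plus 6 cross terms, instead of the full matrix products of Equation 1.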
Referring to FIG. 1, in operation 11, the terms of Equations 4 and 5 are calculated in advance so that a fast codebook search is possible. In addition, a value b(n), which is used to select candidate pulse vectors that reduce the number of calculations, is calculated by the following Equation 6:
$$b(n) = \sqrt{\frac{E_d}{E_r}}\; r_{LTP}(n) + \alpha\, d(n) \qquad (6)$$

wherein $E_d$ denotes an energy of a correlation $d(n)$, $r_{LTP}(n)$ denotes a residual signal generated after pitch prediction, and $E_r$ denotes an energy of a residual signal $r_{LTP}(n)$.
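A minimal Python sketch of Equation 6 is given below. It assumes that each energy is the sum of squared samples and that the ratio of E_d to E_r is applied as a square-root gain. The value of α is not given in this section, so it is left as a caller-supplied parameter, and the zero-energy guard is an addition of this sketch.

```python
import numpy as np

def pulse_selection_signal(d, r_ltp, alpha):
    """b(n) = sqrt(E_d / E_r) * r_LTP(n) + alpha * d(n)  (Equation 6)."""
    d = np.asarray(d, dtype=float)
    r_ltp = np.asarray(r_ltp, dtype=float)
    e_d = np.dot(d, d)          # energy of d(n), assumed to be the sum of squares
    e_r = np.dot(r_ltp, r_ltp)  # energy of r_LTP(n), same assumption
    if e_r == 0.0:
        # Guard added for this sketch only: an all-zero residual contributes nothing.
        return alpha * d
    return np.sqrt(e_d / e_r) * r_ltp + alpha * d
```

The gain scales the residual to the same energy as d(n), so that α alone sets the relative weight of the two terms.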
In operation 12, candidate vectors of pulse locations in the first and third tracks are selected using the value b(n) calculated in operation 11.
In sub-operations 13a, 13b, and 13c of operation 13, the optimal locations of two pulses that maximize the value of Equation 1 are searched for using two nested loops over a track t, to which the candidate vectors belong, and the next track (t+1). With the two found pulses fixed, the optimal locations of another two pulses that maximize the value of Equation 1 are searched for using two nested loops over a track (t+2), to which the candidate vectors belong, and the next track (t+3). In sub-operations 13d through 13f of operation 13, sub-operations 13a through 13c are repeated four times, and the four optimal pulse locations and pulse signs that maximize the value of Equation 1 are finally determined from the results of the four iterations. As described above, the fixed codebook searching method of FIG. 1 selects several candidate pulses from the pulses of a track according to a correlation value and then searches the next track. Thus, the fixed codebook searching method of FIG. 1 requires fewer calculations than a method of searching all of the tracks simultaneously. However, even the reduced number of calculations is considered somewhat large relative to the sound quality produced by the fixed codebook searching method of FIG. 1.
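The search of operations 12 and 13 can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not stated in the text: the candidates of a track are the locations with the largest |b(n)|, each pulse takes the sign of b(n) at its location, each of the four iterations starts from a different track, and the number of candidates per track (`num_candidates`) is a hypothetical parameter. The inputs `d` and `phi` are taken to be NumPy arrays holding d(n) and the symmetric matrix Φ(i,j).

```python
import numpy as np

NUM_TRACKS, SUBFRAME_LEN = 4, 64

def track_positions(track):
    """Pulse locations of a track; the track index wraps around modulo 4."""
    return range(track % NUM_TRACKS, SUBFRAME_LEN, NUM_TRACKS)

def criterion(d, phi, positions, signs):
    """Q = C^2 / E for a sparse pulse set (Equations 1, 4 and 5)."""
    c = sum(s * d[m] for m, s in zip(positions, signs))
    # Summing over all ordered pairs equals the diagonal plus twice the upper triangle.
    e = sum(si * sj * phi[mi, mj]
            for mi, si in zip(positions, signs)
            for mj, sj in zip(positions, signs))
    return c * c / e

def search_pair(d, phi, b, signs, fixed, first_track, num_candidates):
    """Two nested loops: candidates of one track, then every location of the next."""
    candidates = sorted(track_positions(first_track),
                        key=lambda n: -abs(b[n]))[:num_candidates]
    best_q, best_pair = -1.0, None
    for m0 in candidates:
        for m1 in track_positions(first_track + 1):
            trial = fixed + [m0, m1]
            q = criterion(d, phi, trial, [signs[m] for m in trial])
            if q > best_q:
                best_q, best_pair = q, [m0, m1]
    return best_pair

def fixed_codebook_search(d, phi, b, num_candidates=4):
    """Return (positions, signs, Q) of the best of four two-stage searches."""
    signs = np.where(np.asarray(b) >= 0, 1, -1)  # assumed sign rule
    best_q, best_positions = -1.0, None
    for start in range(NUM_TRACKS):              # assumed: one start track per iteration
        first = search_pair(d, phi, b, signs, [], start, num_candidates)
        second = search_pair(d, phi, b, signs, first, start + 2, num_candidates)
        positions = first + second
        q = criterion(d, phi, positions, [signs[m] for m in positions])
        if q > best_q:
            best_q, best_positions = q, positions
    return best_positions, [int(signs[m]) for m in best_positions], best_q
```

With `num_candidates=4`, each pair search evaluates 4 × 16 = 64 combinations, so the four iterations evaluate 4 × 2 × 64 = 512 combinations, compared with 16^4 = 65,536 for a joint search over all four tracks.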