1. Technical Field
The present invention relates to a video and audio output system including an audio output device and a video and audio output device placed above the audio output device, for outputting an acoustic signal so as to localize a sound image of the acoustic signal at a predetermined position.
2. Background Art
In recent years, screens of flat-screen televisions such as plasma televisions and liquid crystal televisions have become larger, which makes it possible to enjoy strongly appealing video images at home. Further thinning and weight reduction, as well as screen enlargement, are expected to be demanded in the future in order to realize, for example, a wall-hung television. Accordingly, loudspeakers installed in flat-screen televisions must also be downsized and thinned. This narrows the frequency range of the amplitude-frequency characteristics of a sound outputted from the loudspeaker and also impairs the flatness of those characteristics. In this manner, audio performance is sacrificed for improving video performance.
Thus proposed is an AV rack loudspeaker apparatus having a high-sound-quality loudspeaker installed in a television stand. This AV rack loudspeaker apparatus makes it possible to easily enjoy high-quality sounds without any need to separately provide an external high-sound-quality loudspeaker. In addition, this AV rack loudspeaker apparatus is equipped with a sound image localization control function for localizing a front channel sound in a direction beyond the location of the loudspeaker so that a viewer can enjoy more powerful sound effects.
However, in general use, the AV rack loudspeaker apparatus is placed on a floor and a television is mounted thereon. This causes a new problem that a sound image of a center channel or a sound image of the front channel subjected to a sound image localization control is localized near the floor so that a video image and a sound image appear at different heights, which causes a sense of incongruity.
As a technique for localizing a sound image at a desired position, a sound image localization control technique that corrects a head-related acoustic transfer function (hereinafter referred to as an HR transfer function) is conventionally in wide practical use. FIG. 23 shows diagrams illustrating the conventional sound image localization control technique by which an R-channel signal is processed and localized on the right side of a video display 105 at the same height as that of the video display 105. The diagram (a) of FIG. 23 shows a signal processing configuration, and the diagram (b) of FIG. 23 shows localization positions of sound images.
FIR filters 101a and 101b process the R-channel signal so as to correct the amplitude-phase characteristics to desired characteristics. Loudspeakers 102a and 102b convert electric signals outputted from the FIR filters 101a and 101b respectively into acoustic signals, and then output the signals. In order to localize a sound image at the position of a target sound image 103a with respect to a viewer 104 in FIG. 23, transfer functions G1 and G2 that satisfy the following Equation 1 are calculated, and coefficients whose processing characteristics are the transfer functions G1 and G2 are provided to the FIR filters 101a and 101b. Here, HR transfer functions for transfer from the loudspeaker 102a to the left and right ears of the viewer 104 are defined as C1 and C2. HR transfer functions for transfer from the loudspeaker 102b to the left and right ears of the viewer 104 are defined as C3 and C4. HR transfer functions for transfer from a loudspeaker that is supposedly placed at the position of the target sound image 103a to the left and right ears of the viewer 104 are defined as H1 and H2.
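$$\begin{bmatrix} C_1 & C_3 \\ C_2 & C_4 \end{bmatrix} \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} = \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \qquad \text{[Equation 1]}$$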
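As an illustration only, Equation 1 can be solved for G1 and G2 independently at each frequency: the first row states that the left-ear signal C1·G1 + C3·G2 equals H1, and the second row that the right-ear signal C2·G1 + C4·G2 equals H2. The following Python sketch assumes that the HR transfer functions are available as complex frequency responses sampled on a common frequency grid. The function name `solve_filters` and the synthetic values are hypothetical and are not taken from the conventional technique itself.

```python
import numpy as np

def solve_filters(C1, C2, C3, C4, H1, H2):
    """Solve [[C1, C3], [C2, C4]] @ [G1, G2] = [H1, H2] for every frequency bin.

    All arguments are 1-D complex arrays of equal length (one value per bin).
    """
    # Build one 2x2 matrix per bin: row 0 = left ear, row 1 = right ear.
    C = np.stack([np.stack([C1, C3], axis=-1),
                  np.stack([C2, C4], axis=-1)], axis=-2)   # shape (n_bins, 2, 2)
    H = np.stack([H1, H2], axis=-1)[..., None]             # shape (n_bins, 2, 1)
    G = np.linalg.solve(C, H)[..., 0]                      # shape (n_bins, 2)
    return G[:, 0], G[:, 1]

# Synthetic (hypothetical) transfer functions, used only to exercise the solver.
rng = np.random.default_rng(0)
demo = [rng.standard_normal(4) + 1j * rng.standard_normal(4) for _ in range(6)]
G1_demo, G2_demo = solve_filters(*demo)
# Residual of the left-ear row of Equation 1; it should be close to zero.
print(np.max(np.abs(demo[0] * G1_demo + demo[2] * G2_demo - demo[4])))
```

The resulting G1 and G2 would still have to be converted into coefficients for the FIR filters 101a and 101b, a step that is not shown here.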
However, when the transfer functions G1 and G2 of the FIR filters 101a and 101b are fixed, a change in the HR transfer functions C1 to C4 resulting from a change in a viewing position of the viewer 104 causes a localization position of a sound image to shift from the position of the target sound image 103a. In particular, due to a change in phase-frequency characteristics of the HR transfer functions C1 to C4, a composite sound made up of audio outputs from both loudspeakers shows an extreme change in amplitude-frequency characteristics at both ears. Such a change in amplitude-frequency characteristics appears prominently in a high-frequency component having a short wavelength.
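The frequency dependence of this effect can be illustrated with a deliberately simplified model, sketched below in Python. It assumes two equal-amplitude arrivals at one ear in a free field, ignores the head and the filters G1 and G2, and takes the speed of sound as 343 m/s. The function names are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def composite_magnitude_db(freq_hz, path_difference_m):
    """Level (dB) of the sum of two equal-amplitude arrivals at one ear.

    path_difference_m is the difference between the two propagation paths.
    """
    delay = path_difference_m / SPEED_OF_SOUND
    total = 1.0 + np.exp(-2j * np.pi * np.asarray(freq_hz) * delay)
    # Clamp to avoid log10(0) at exact cancellation.
    return 20.0 * np.log10(np.maximum(np.abs(total), 1e-12))

def first_notch_hz(path_difference_m):
    """Lowest frequency at which the two arrivals cancel (half-wavelength offset)."""
    return SPEED_OF_SOUND / (2.0 * path_difference_m)

for d in (0.01, 0.05):
    print(d, first_notch_hz(d), composite_magnitude_db(1000.0, d))
```

Under these assumptions, a path-length difference of 1 cm places the first cancellation near 17 kHz, whereas 5 cm brings it down to about 3.4 kHz, so a small shift of the viewing position disturbs the short-wavelength components first.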
Meanwhile, extensive studies have been conventionally conducted on the causal relationship between sound image recognition and an HR transfer function. These studies have found that human beings perceive the lateral-direction angle of a sound image based on the differences in level and phase of the HR transfer function between both ears, and perceive the height-direction angle of the sound image based on the shape of the amplitude-frequency characteristics of the HR transfer function.
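The two lateral-direction cues named above can be expressed numerically. The Python sketch below assumes that the HR transfer functions for the left and right ears are given as complex frequency responses on the same frequency grid. The function name `interaural_differences` is hypothetical.

```python
import numpy as np

def interaural_differences(h_left, h_right):
    """Return (level difference in dB, phase difference in radians) per bin."""
    h_left = np.asarray(h_left, dtype=complex)
    h_right = np.asarray(h_right, dtype=complex)
    level_db = 20.0 * np.log10(np.abs(h_left) / np.abs(h_right))
    # Phase of left relative to right, wrapped to (-pi, pi].
    phase_rad = np.angle(h_left * np.conj(h_right))
    return level_db, phase_rad
```

The height-direction cue, by contrast, is not a difference between the ears but the shape of the amplitude-frequency characteristics, that is, of `np.abs(h_left)` and `np.abs(h_right)` themselves.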
FIG. 24 shows an example of HR transfer functions of sound sources positioned at different heights. A sound source A is positioned 60 degrees horizontally right relative to a front direction of the viewer 104, and a sound source B is positioned 30 degrees vertically down from the sound source A (a diagram (a) of FIG. 24). A comparison of the amplitude-frequency characteristics of the sound source A and the sound source B shows that their shapes differ largely in the frequency band higher than 1 kHz (a diagram (b) of FIG. 24). In particular, unique peak characteristics and notch characteristics appearing in the high frequency band of 4 kHz or higher are widely known as an important clue for recognition of a sound image height. In addition, Non-Patent Document 1 reports that two notch characteristics appearing in the frequency band of 4 kHz to 16 kHz are important clues for height localization or front-and-back localization of the front-direction sound.
Thus, a high-frequency component in the HR transfer function serves as the clue for recognition of the sound image height, but there is a problem in that an error is likely to occur in Equation 1, so that the sound image is not localized at a desired height. As shown in the diagram (b) of FIG. 23, for the viewer 104, a sound image is localized at a desired position with respect to the lateral direction. However, the viewer 104 cannot hear the unique characteristics of a high-frequency component, which serves as the clue for recognition of the sound image height, in the desired HR transfer functions H1 and H2. Instead, the viewer 104 hears the characteristics of the HR transfer functions of the loudspeakers 102a and 102b that are actually outputting audio. Therefore, with respect to the height direction, the sound image is undesirably localized at the height of the loudspeakers 102a and 102b (a sound image 103b).
As described above, according to the conventional sound image localization control technique, a sound image localization in the lateral direction can be realized, but the sound image cannot actually be localized at a height different from the height of the loudspeaker that outputs audio.
Patent Document 1 discloses a processing circuit that localizes a sound image at a position of a video monitor by using loudspeakers located at different heights. FIG. 25 is a diagram showing a conventional processing circuit 106 described in Patent Document 1. FIG. 25 illustrates an example in which a C-channel signal is localized at the position of the video monitor.
In FIG. 25, an equalizer 107 corrects the amplitude-frequency characteristics of a C-channel signal Cin. A band-pass filter 108 extracts, from an output from the equalizer 107, only the components belonging to a predetermined frequency band. A band-elimination filter 109 extracts, from an output from the equalizer 107, the components other than those belonging to a predetermined frequency band. Amplifiers 110a to 110d amplify an L-channel signal Lin, an output from the band-pass filter 108, an output from the band-elimination filter 109, and an R-channel signal Rin at predetermined gains, respectively. An adder 111a adds together an output from the amplifier 110a and an output from the amplifier 110b. An adder 111b adds together an output from the amplifier 110b and an output from the amplifier 110d.
An output from the adder 111a is, as an L-channel sound, outputted from a loudspeaker 102c placed on the left side of a video monitor 105. An output from the adder 111b is, as an R-channel sound, outputted from a loudspeaker 102e placed on the right side of the video monitor 105. An output from the amplifier 110c is, as a C-channel sound, outputted from a loudspeaker 102d placed on the upper side of the video monitor 105 (or a loudspeaker 102f placed on the lower side of the video monitor 105). A predetermined processing coefficient is provided to the equalizer 107 so as to make the viewer feel as if the front-direction C-channel loudspeaker 102d (or 102f) outputs sounds that are actually outputted from the L-channel loudspeaker 102c and the R-channel loudspeaker 102e located approximately at ±30 degrees. This processing coefficient is a coefficient for performing a process with the same amplitude characteristics as those of a transfer function obtained by dividing an HR transfer function for transfer from the C-channel loudspeaker 102d (or 102f) to the viewer by an HR transfer function for transfer from the L/R-channel loudspeakers 102c and 102e to the viewer.
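The signal flow of the processing circuit 106 described above can be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the circuit of Patent Document 1 itself: the filters are approximated by masking FFT bins, the band-pass filter 108 and the band-elimination filter 109 are assumed to share one band, and the band limits, gains, and function names are all hypothetical because the description gives no concrete values.

```python
import numpy as np

def equalizer_amplitude(h_center, h_side):
    """Amplitude characteristic of equalizer 107: |HRTF(C speaker) / HRTF(L/R speakers)|."""
    return np.abs(np.asarray(h_center) / np.asarray(h_side))

def processing_circuit(l_in, c_in, r_in, eq_gain, band_mask,
                       gains=(1.0, 1.0, 1.0, 1.0)):
    """Frequency-domain stand-in for the FIG. 25 signal flow.

    eq_gain   : real gain per rfft bin (equalizer 107)
    band_mask : boolean per rfft bin, True inside the assumed predetermined band
    gains     : assumed gains of amplifiers 110a, 110b, 110c, 110d
    """
    n = len(c_in)
    spectrum = np.fft.rfft(c_in) * eq_gain                           # equalizer 107
    band_pass = np.fft.irfft(np.where(band_mask, spectrum, 0), n)    # filter 108
    band_stop = np.fft.irfft(np.where(band_mask, 0, spectrum), n)    # filter 109
    g_a, g_b, g_c, g_d = gains
    l_out = g_a * np.asarray(l_in) + g_b * band_pass   # adder 111a -> loudspeaker 102c
    r_out = g_d * np.asarray(r_in) + g_b * band_pass   # adder 111b -> loudspeaker 102e
    c_out = g_c * band_stop                            # -> loudspeaker 102d (or 102f)
    return l_out, c_out, r_out
```

Note that the output from the amplifier 110b feeds both adders, so the band-passed C-channel components are reproduced equally from the L-channel and R-channel loudspeakers, while the remaining components go to the C-channel loudspeaker.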
Patent Document 1: Japanese Laid-Open Patent Publication No. 2004-266604
Non-Patent Document 1: Iida et al., “A novel head-related transfer function model based spectral and interaural difference cues”, WESPAC9, September 2006