1. Field of the Invention
The present invention relates to an apparatus for processing sound and video in which adjustment is performed in accordance with turning of the head of a user by using a sound image localization process, a process for adjusting a video clipping angle or the like, and also to a method for use in the apparatus.
2. Description of the Related Art
Sound signals accompanying a video such as a movie are recorded on the assumption that the sound signals are to be reproduced by speakers installed on both sides of a screen. In such a setting, the positions of sound sources in the video coincide with the positions of sound images actually heard, forming a natural sound field.
When the sound signals are reproduced using headphones or earphones, however, the sound images are localized in the head and the directions of the visual images do not coincide with the localized positions of the sound images, making the localization of the sound images extremely unnatural.
The same applies when music unaccompanied by video is listened to. The music is heard from inside the head, unlike when it is reproduced by speakers, which likewise makes the sound field unnatural.
As a scheme for preventing reproduced sound from being localized in the head, a method for producing a virtual sound image using head-related transfer functions (HRTFs) is known.
FIGS. 8 to 11 illustrate the outline of a virtual sound image localization process performed using the HRTFs. The following describes a case where the virtual sound image localization process is applied to a headphone system with two left and right channels.
As shown in FIG. 8, the headphone system of this example includes a left-channel sound input terminal 101L and a right-channel sound input terminal 101R.
As stages subsequent to the sound input terminals 101L, 101R, a signal processing section 102, a left-channel digital/analog (D/A) converter 103L, a right-channel D/A converter 103R, a left-channel amplifier 104L, a right-channel amplifier 104R, a left headphone speaker 105L, and a right headphone speaker 105R are provided.
Digital sound signals input through the sound input terminals 101L, 101R are supplied to the signal processing section 102, which performs a virtual sound image localization process for localizing a sound image produced from the sound signals at an arbitrary position.
After being subjected to the virtual sound image localization process in the signal processing section 102, the left and right digital sound signals are converted into analog sound signals in the D/A converters 103L, 103R. After being converted into analog sound signals, the left and right sound signals are amplified in the amplifiers 104L, 104R, and thereafter supplied to the headphone speakers 105L, 105R. Consequently, the headphone speakers 105L, 105R emit sound in accordance with the sound signals in the two left and right channels that have been subjected to the virtual sound image localization process.
A head band 110 for allowing the left and right headphone speakers 105L, 105R to be placed over the head of a user is provided with a gyro sensor 106 for detecting turning of the head of the user as described later.
A detection output from the gyro sensor 106 is supplied to a detection section 107, which detects an angular speed when the user turns his/her head. The angular speed from the detection section 107 is converted by an analog/digital (A/D) converter 108 into a digital signal, which is thereafter supplied to a calculation section 109. The calculation section 109 calculates a correction value for the HRTFs in accordance with the angular speed during the turning of the head of the user. The correction value is supplied to the signal processing section 102 to correct the localization of the virtual sound image.
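The conversion from angular speed to a correction value can be sketched as follows. This is only an illustration under stated assumptions, not the actual operation of the calculation section 109: it assumes that the gyro output is sampled at a fixed interval, that the head angle is obtained by simple rectangular integration of the angular speed, and that the correction is expressed as the azimuth of the virtual sound source relative to the head. The function names `integrate_yaw` and `relative_azimuth`, the sign convention (azimuth and head angle both increase in the same rotational direction), and the sample values are hypothetical.

```python
def integrate_yaw(angular_speeds_dps, dt_s, initial_yaw_deg=0.0):
    """Accumulate angular-speed samples (degrees/second) into a head angle (degrees).

    Assumption: rectangular integration at a fixed sample interval dt_s.
    """
    yaw = initial_yaw_deg
    for w in angular_speeds_dps:
        yaw += w * dt_s
    return yaw


def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Azimuth of a fixed virtual source as seen from the turned head.

    The result is wrapped to the range [-180, 180) degrees.
    """
    rel = source_azimuth_deg - head_yaw_deg
    return (rel + 180.0) % 360.0 - 180.0


# Hypothetical example: the head turns at 90 deg/s for 0.5 s, sampled every 10 ms,
# while the virtual sound image is to stay at an azimuth of 30 degrees.
samples = [90.0] * 50
yaw = integrate_yaw(samples, dt_s=0.01)
correction = relative_azimuth(30.0, yaw)
```

In this sketch the head ends up turned by about 45 degrees, so the virtual source has to be rendered about 15 degrees to the other side of the head's new forward direction in order to appear fixed in space. A practical system would also have to deal with gyro drift, which this sketch ignores.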
By detecting turning of the head of the user using the gyro sensor 106 in this way, it is possible to localize the virtual sound image at a predetermined position at all times in accordance with the orientation of the head of the user.
That is, the virtual sound image does not follow the head so as to stay in front of the user, but remains localized at its original position even if the user turns his/her head.
The signal processing section 102 shown in FIG. 8 applies transfer characteristics equivalent to transfer functions HLL, HLR, HRR, HRL from two speakers SL, SR installed in front of a listener M to both ears YL, YR of the listener M as shown in FIG. 9.
The transfer function HLL corresponds to transfer characteristics from the speaker SL to the left ear YL of the listener M. The transfer function HLR corresponds to transfer characteristics from the speaker SL to the right ear YR of the listener M. The transfer function HRR corresponds to transfer characteristics from the speaker SR to the right ear YR of the listener M. The transfer function HRL corresponds to transfer characteristics from the speaker SR to the left ear YL of the listener M.
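Written out, the four relationships above amount to the following pair of equations. The symbols are introduced here only for illustration and do not appear in the figures: X_L and X_R denote the spectra of the signals emitted by the speakers SL and SR, and Y_L and Y_R denote the spectra of the signals arriving at the ears YL and YR. It is assumed that the speaker SL reproduces the left channel and the speaker SR reproduces the right channel.

```latex
\begin{aligned}
Y_L(\omega) &= H_{LL}(\omega)\,X_L(\omega) + H_{RL}(\omega)\,X_R(\omega) \\
Y_R(\omega) &= H_{LR}(\omega)\,X_L(\omega) + H_{RR}(\omega)\,X_R(\omega)
\end{aligned}
```

Each ear thus receives a direct component from the speaker on its own side and a crosstalk component from the speaker on the opposite side.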
The transfer functions HLL, HLR, HRR, HRL may each be obtained as an impulse response on the time axis. By implementing these impulse responses in the signal processing section 102 shown in FIG. 8, it is possible, when reproduced sound is heard with headphones, to reproduce a sound image equivalent to the sound image produced by the speakers SL, SR installed in front of the listener M as shown in FIG. 9.
The process for applying the transfer functions HLL, HLR, HRR, HRL to the sound signals to be processed is implemented by finite impulse response (FIR) filters provided in the signal processing section 102 of the headphone system shown in FIG. 8.
The signal processing section 102 shown in FIG. 8 is specifically configured as shown in FIG. 10. For the sound signal input through the left-channel sound input terminal 101L, an FIR filter 1021 for implementing the transfer function HLL and an FIR filter 1022 for implementing the transfer function HLR are provided.
Meanwhile, for the sound signal input through the right-channel sound input terminal 101R, an FIR filter 1023 for implementing the transfer function HRL and an FIR filter 1024 for implementing the transfer function HRR are provided.
An output signal from the FIR filter 1021 and an output signal from the FIR filter 1023 are added by an adder 1025, and supplied to the left headphone speaker 105L. Meanwhile, an output signal from the FIR filter 1024 and an output signal from the FIR filter 1022 are added by an adder 1026, and supplied to the right headphone speaker 105R.
The thus configured signal processing section 102 applies the transfer functions HLL, HLR to the left-channel sound signal, and applies the transfer functions HRL, HRR to the right-channel sound signal.
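The filter-and-add structure of FIG. 10 can be sketched as follows. This is a minimal illustration rather than the implementation of the signal processing section 102: the function names `fir_filter` and `localize` are hypothetical, the impulse responses are short placeholder values rather than measured HRTFs, and each filter output is truncated to the input length for simplicity.

```python
def fir_filter(signal, impulse_response):
    """Direct-form FIR filter: convolve the signal with the impulse response.

    The output is truncated to the length of the input signal.
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def localize(left_in, right_in, h_ll, h_lr, h_rl, h_rr):
    """Apply the four transfer functions and sum the results per ear.

    left_out corresponds to adder 1025 (filters 1021 and 1023);
    right_out corresponds to adder 1026 (filters 1024 and 1022).
    Both inputs are assumed to have the same length.
    """
    left_out = [a + b for a, b in zip(fir_filter(left_in, h_ll),
                                      fir_filter(right_in, h_rl))]
    right_out = [a + b for a, b in zip(fir_filter(right_in, h_rr),
                                       fir_filter(left_in, h_lr))]
    return left_out, right_out


# Placeholder impulse responses (assumed values, not measured data):
# the direct paths pass the signal unchanged, and the crosstalk paths
# delay it by one sample and halve its amplitude.
h_ll = [1.0, 0.0]
h_lr = [0.0, 0.5]
h_rl = [0.0, 0.5]
h_rr = [1.0, 0.0]

# A single impulse on the left channel only.
left_out, right_out = localize([1.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                               h_ll, h_lr, h_rl, h_rr)
```

With these placeholder values, an impulse on the left channel reaches the left output unchanged and reaches the right output one sample later at half amplitude, mirroring the direct and crosstalk paths from the speaker SL to the ears YL and YR.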
By using the detection output from the gyro sensor 106 provided in the head band 110, it is possible to keep the virtual sound image localized at a fixed position even if the user turns his/her head, allowing produced sound to form a natural sound field.
In the foregoing, a description has been made of a case where the virtual sound image localization process is performed on the sound signals in the two left and right channels. However, the sound signals to be processed are not limited to sound signals in the two left and right channels. Japanese Unexamined Patent Application Publication No. Hei 11-205892 describes in detail an audio reproduction apparatus adapted to perform a virtual sound image localization process on sound signals in a multiplicity of channels.