Conventionally, in acoustic systems such as video conference systems, sound pickup apparatuses using multiple microphones have been used to pick up sound clearly from a sound source (for example, a speaker). Such a sound pickup apparatus uses the multiple microphones to generate a multichannel signal for reproducing, at a communication partner site, the position of a sound source at a user site (achieving sound image localization).
In such a sound pickup apparatus, multiple microphones are provided corresponding to respective channels. Also, the multiple microphones are fixedly installed with respective main axis directions of directivity toward the directions according to the corresponding channels. The sound pickup apparatus then can generate each picked-up sound signal as a multichannel signal for achieving sound image localization. The generated multichannel signal is transmitted to multiple loudspeakers at the communication partner site via a communication network. Accordingly, multichannel sound is reproduced at the communication partner site, and the position of a speaker at the user site is reproduced at the communication partner site.
In order to generate a multichannel signal for achieving sound image localization, multiple microphones need to be fixedly installed with respective main axis directions of directivity toward the directions according to the corresponding channels. Thus, in the above-mentioned sound pickup apparatus, the speaker cannot freely change the arrangement positions of the multiple microphones.
Now, in order to solve the above-mentioned problem, a sound pickup apparatus 190 as shown in FIGS. 16 and 17 has been proposed (for example, see PTL 1). FIG. 16 is a schematic view of a conventional sound pickup system. FIG. 17 is a block diagram showing the functional configuration of a conventional video conference system. As shown in FIG. 17, the video conference system includes a first sound pickup system 1000 installed at a user site, and a second sound pickup system 2000 installed at a communication partner site. Because the second sound pickup system 2000 has a configuration similar to that of the first sound pickup system 1000, a schematic view of the second sound pickup system 2000 is omitted in FIG. 16.
In the examples of FIGS. 16 and 17, a right channel (hereinafter referred to as an “R channel” or “Rch”) signal and a left channel (hereinafter referred to as an “L channel” or “Lch”) signal are generated as multichannel signals, and stereo reproduction is achieved at the communication partner site.
A microphone 90a is installed on a table 103 so as to be placed in the front vicinity of a speaker 102a. A microphone 90b is installed on the table 103 so as to be placed in the front vicinity of a speaker 102b. A monitor 104 is a device for displaying an image captured by a camera 205 at the communication partner site, and is installed in front of the speakers 102a, 102b. The image of the communication partner site is inputted to the monitor 104 via a communication network 107.
A camera 105 is installed on the upper portion of the monitor 104, and captures the speakers 102a, 102b at the user site. The image of the user site is transmitted to a monitor 204 of the communication partner site via the communication network 107.
The first and second loudspeakers 106a, 106b reproduce an L channel signal or an R channel signal inputted from a sound pickup apparatus 290 of the communication partner site via the communication network 107. The first and second loudspeakers 106a, 106b are installed one on each side of the monitor 104. Similarly, a first loudspeaker 206a of the communication partner site is installed on the front left as viewed from the communication partner, and a second loudspeaker 206b of the communication partner site is installed on the front right as viewed from the communication partner.
The sound pickup apparatus 190 is installed at the user site, and the sound pickup apparatus 290 is installed at the communication partner site. Because the internal configuration of the sound pickup apparatus 290 is similar to that of the sound pickup apparatus 190, the drawing and description for the sound pickup apparatus 290 are omitted herein.
The sound pickup apparatus 190 includes the microphones 90a and 90b, a microphone position measuring unit 91, a coefficient calculating unit 92, a microphone detecting unit 93, and a signal calculating unit 94. In the following, each component of the sound pickup apparatus 190 is specifically described.
The microphone position measuring unit 91 outputs a measurement signal to the first and second loudspeakers 106a, 106b. The microphone position measuring unit 91 then calculates, as a delay time, the time period from the output of the measurement signal until the measurement signal is picked up by the microphones 90a, 90b. The microphone position measuring unit 91 measures the current position of the microphones 90a, 90b using the calculated delay time.
In the example of FIG. 16, because the microphone 90a is placed on the right side position as viewed from the monitor 104, the right side position is measured as the current position of the microphone 90a. Also, because the microphone 90b is placed on the left side position as viewed from the monitor 104, the left side position is measured as the current position of the microphone 90b. The microphone position measuring unit 91 measures those current positions after every movement of the microphones 90a, 90b so that a speaker can freely move the microphones 90a, 90b.
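The delay-based measurement described above can be sketched as follows. This is only an illustrative sketch: the text does not state how the delay time is converted into a position, so the sketch assumes that each delay is converted into a distance using the speed of sound, and that the microphone is assigned to the side of the loudspeaker with the shorter delay. The function names, the speed-of-sound constant, and the "center" result for equal delays are all hypothetical.

```python
# Illustrative sketch only; the conversion from delay to position is an assumption.
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about room temperature


def delay_to_distance(delay_s):
    """Convert a measured delay (seconds) into a loudspeaker-to-microphone distance (meters)."""
    return delay_s * SPEED_OF_SOUND_M_S


def classify_side(delay_from_left_s, delay_from_right_s):
    """Return the side of the microphone as viewed from the monitor.

    delay_from_left_s:  delay of the measurement signal from the loudspeaker
                        on the left as viewed from the monitor (hypothetical input)
    delay_from_right_s: delay from the loudspeaker on the right as viewed
                        from the monitor (hypothetical input)
    """
    if delay_from_left_s < delay_from_right_s:
        return "left"
    if delay_from_right_s < delay_from_left_s:
        return "right"
    return "center"  # equal delays; not covered by the text, added for completeness
```

Under these assumptions, a microphone whose delay from the right-side loudspeaker is shorter, such as the microphone 90a in FIG. 16, would be classified as "right".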
The coefficient calculating unit 92 calculates the ratio (coefficient ratio) between the level assigned to the R channel signal and the level assigned to the L channel signal based on the measured current positions of the microphones 90a, 90b so that multichannel signals for achieving sound image localization are generated.
In the example of FIG. 16, the measured current position of the microphone 90a is on the right as viewed from the monitor 104. Thus, the coefficient calculating unit 92 determines, for example, (R channel signal:L channel signal)=(1:0) as the coefficient ratio of the microphone 90a. On the other hand, the measured current position of the microphone 90b is on the left as viewed from the monitor 104. Thus, the coefficient calculating unit 92 determines, for example, (R channel signal:L channel signal)=(0:1) as the coefficient ratio of the microphone 90b.
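The mapping from a measured position to a coefficient ratio can be sketched as a simple lookup. The sketch covers only the two example cases given above (right and left as viewed from the monitor 104); the function name and the string labels for the positions are hypothetical, and ratios for intermediate positions are not specified in the text and are therefore not modeled.

```python
def coefficient_ratio(side):
    """Return the (R channel, L channel) coefficient ratio for a measured position.

    side: "right" or "left" as viewed from the monitor (hypothetical labels).
    """
    ratios = {
        "right": (1.0, 0.0),  # e.g. microphone 90a: (R:L) = (1:0)
        "left": (0.0, 1.0),   # e.g. microphone 90b: (R:L) = (0:1)
    }
    return ratios[side]
```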
When either one of the speakers 102a or 102b speaks, the microphone detecting unit 93 detects a microphone nearest to the speaker based on the levels of the picked-up sound signals from the microphones 90a, 90b. For example, when the speaker 102a speaks, the level of the picked-up sound signal from the microphone 90a becomes greater than that of the picked-up sound signal from the microphone 90b. In this case, the microphone detecting unit 93 detects the microphone 90a as the microphone nearest to the speaker. Subsequently, the coefficient calculating unit 92 determines the coefficient ratio for the microphone 90a, (R channel signal:L channel signal)=(1:0) as the coefficient ratio to be outputted to the signal calculating unit 94 based on the microphone 90a detected by the microphone detecting unit 93.
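The level comparison performed by the microphone detecting unit 93 can be sketched as follows. The text says only that the detection is based on the levels of the picked-up sound signals; the use of the root-mean-square (RMS) value as the level, and the function names, are assumptions made for illustration.

```python
import math


def rms_level(samples):
    """Root-mean-square level of one block of samples (assumed level measure)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def nearest_microphone(signals):
    """Return the name of the microphone whose picked-up sound signal has the greatest level.

    signals: dict mapping a microphone name to a non-empty list of samples.
    """
    return max(signals, key=lambda name: rms_level(signals[name]))
```

For example, when the signal from the microphone 90a is louder than that from the microphone 90b, `nearest_microphone({"90a": [0.5, -0.4, 0.6], "90b": [0.1, -0.1, 0.05]})` selects "90a", and the coefficient ratio of the microphone 90a would then be passed on.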
The signal calculating unit 94 calculates the R channel signal and L channel signal according to the determined coefficient ratio. For example, in the case where the coefficient ratio for microphone 90a is (R channel signal:L channel signal)=(1:0), the signal calculating unit 94 calculates the R channel signal by multiplying respective picked-up sound signals of the microphones 90a, 90b by a coefficient 1 and adding the multiplied picked-up sound signals. On the other hand, the signal calculating unit 94 calculates the L channel signal by multiplying respective picked-up sound signals of the microphones 90a, 90b by a coefficient 0 and adding the multiplied picked-up sound signals.
Accordingly, the R channel signal is a signal in which all the picked-up sound signals from the microphones 90a, 90b are added, and the L channel signal has no output. Multichannel signals for achieving sound image localization are thus generated. The L channel signal (Lch) and R channel signal (Rch) which are calculated in the signal calculating unit 94 are transmitted to the loudspeakers 206a, 206b of the communication partner site via the communication network 107. Accordingly, at the communication partner site, sound is reproduced as if the speaker 102a were speaking from the right position as viewed from a speaker of the communication partner site.
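The calculation in the signal calculating unit 94 can be sketched as follows: every picked-up sound signal is multiplied by the R channel coefficient and the products are added to form the R channel signal, and likewise with the L channel coefficient for the L channel signal. The function name and the representation of signals as lists of samples are assumptions made for illustration.

```python
def mix_channels(mic_signals, ratio):
    """Compute the R and L channel signals from the picked-up sound signals.

    mic_signals: list of equal-length sample lists, one per microphone.
    ratio: (R channel coefficient, L channel coefficient) of the detected microphone.
    """
    r_coef, l_coef = ratio
    # For each sample instant, multiply every microphone sample by the
    # channel coefficient and add the products.
    r_channel = [sum(r_coef * x for x in frame) for frame in zip(*mic_signals)]
    l_channel = [sum(l_coef * x for x in frame) for frame in zip(*mic_signals)]
    return r_channel, l_channel


# Example with the coefficient ratio (R:L) = (1:0) of the microphone 90a.
signal_90a = [0.5, -0.25]
signal_90b = [0.25, 0.25]
rch, lch = mix_channels([signal_90a, signal_90b], (1.0, 0.0))
```

With the ratio (1:0), `rch` is the sum of both picked-up sound signals and `lch` is all zeros, which corresponds to the case described above where the L channel signal has no output.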
In this manner, the sound pickup apparatus 190 shown in FIGS. 16 and 17 measures the position (current position) of each microphone after every movement of the microphone, and generates multichannel signals for achieving sound image localization by using information on the measured current position of each microphone. Consequently, the speaker can freely change the arrangement positions of the microphones.