Speech recognition is a core technology of the human-computer interaction interface of current intelligent information systems. To improve the success rate of speech recognition, a solution is generally used in which a sound signal is collected by using a sound collection sensor, and collection and speech recognition of the sound signal are performed according to a sound emitting position.
Currently, in the solution for improving the success rate of speech recognition, a sound signal emitted from only one position can be extracted. A sound signal emitted from any other position is treated as noise and filtered out. As a result, the sound signal emitted from the other position cannot be accurately extracted, its emitting position cannot be located, and speech recognition cannot be performed on it.
An in-vehicle system installed in a car is used as an example. Currently, a sound signal in the ambient environment may be collected by using a sound collection sensor installed in the in-vehicle system, a sound signal emitted from the driver compartment is extracted, and speech recognition is performed on the extracted sound signal. The in-vehicle system may then respond to the sound signal emitted from the driver compartment. However, a sound signal emitted from the front passenger compartment or from a back seat in the car is determined as noise and filtered out by the in-vehicle system. For example, the in-vehicle system may extract and perform speech recognition on a speech command “Open the sunroof” emitted from the driver compartment, but the same speech command emitted from another position, such as the front passenger compartment or a back seat, cannot be extracted, and the emitting position of that sound signal in the car cannot be located. Therefore, in an application scenario of an in-vehicle system in a car, the in-vehicle system cannot efficiently and accurately locate the emitting position of a sound signal that is emitted from a position other than the driver compartment. Consequently, efficiency of locating an emitting position of a sound signal is reduced, and user experience is poor.