With the development and popularity of electronic products, such products offer more and more functions, more powerful performance, and a richer experience, bringing considerable convenience to people's lives. At the same time, user requirements for electronic products, such as convenience, are becoming higher and higher. To meet users' higher demands, intelligent electronic devices can provide automatic operation functions based on a user's voice input.
However, different users may speak different languages, have different regional accents, and/or have different speaking habits. Further, different voice recognition servers may perform differently on the same voice input. Existing voice recognition equipment generally uses a single voice recognition server, which may cause semantic parsing errors. Such semantic parsing errors can result in corresponding operation errors, leading to inefficient work and a poor user experience.