Artificial Intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. AI is a branch of computer science that seeks to understand the essence of intelligence and to produce intelligent machines that can respond in a manner similar to human intelligence. Research in this area includes robotics, speech recognition, image recognition, natural language processing, expert systems, and the like.
There are currently two main approaches to audio processing. The first processes the audio by changing its fundamental frequency or formants. The second is speech synthesis: labeled data of the template audio is first collected, and a model is then trained on the collected data. Because the model takes text as input, speech recognition must first be performed on the to-be-synthesized audio, and the recognized text is then input into the trained model to generate the audio. The first approach lacks flexibility in audio processing and cannot achieve a good processing effect. The second approach requires a large amount of sample audio data; furthermore, the to-be-processed audio must first be converted to text during processing, resulting in low processing efficiency.
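As a rough illustration of the first approach, the sketch below raises the base (fundamental) frequency of a test tone by naive resampling. It is an assumption-laden toy and not a method described in this document: the function names (`make_tone`, `resample_linear`, `estimate_frequency`), the 16 kHz sample rate, and the choice of plain resampling are all hypothetical. It shows one way such direct signal manipulation can be rigid: pitch and duration change together.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate in Hz, chosen for illustration


def make_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Generate a pure sine tone as a list of float samples."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def resample_linear(samples, factor):
    """Read the signal `factor` times faster, using linear interpolation.

    Played back at the original sample rate, the result has its frequency
    multiplied by `factor` and its duration divided by `factor`.
    """
    out_len = int((len(samples) - 1) / factor) + 1
    out = []
    for k in range(out_len):
        pos = k * factor
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1 - frac) * samples[i] + frac * nxt)
    return out


def estimate_frequency(samples, sample_rate=SAMPLE_RATE):
    """Estimate frequency by counting upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


tone = make_tone(220.0, 1.0)          # 220 Hz, 1 second
shifted = resample_linear(tone, 2.0)  # one octave up

print("original:", round(estimate_frequency(tone)), "Hz,",
      len(tone) / SAMPLE_RATE, "s")
print("shifted: ", round(estimate_frequency(shifted)), "Hz,",
      len(shifted) / SAMPLE_RATE, "s")
```

In this sketch the shifted tone is roughly twice as high but only half as long, so the speaking rate would change along with the pitch. Practical voice-changing tools use more elaborate techniques to decouple the two, which this example does not attempt to reproduce.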