1. Field
Methods and apparatuses consistent with exemplary embodiments relate to speech signal processing, and more particularly, to speech signal processing methods and speech signal processing apparatuses, which are capable of protecting personal information while using a personalized model.
2. Description of the Related Art
Speech recognition (SR) is a technology for converting a user's speech into text. Since this process is performed automatically, speech recognition is also referred to as automatic speech recognition (ASR). In smartphones and televisions (TVs), speech recognition is widely used as an interface technology that replaces keyboard input. Natural language understanding (NLU) is a technology that extracts the meaning of a user's speech from the recognition result of the speech recognition. Instead of simply recognizing the user's speech, NLU performs a higher-level analysis of the speech so that its meaning may be determined more accurately.
An ASR/NLU system may be divided into two modules, that is, a client that receives a speech signal and an ASR/NLU engine that performs ASR and NLU on the speech signal. In order to increase speech signal processing speed, the two modules may be designed to be separate from each other. In this case, a device such as a smartphone or a TV, which has limited processing capacity and data storage capacity, may be configured as the client, and the ASR/NLU engine may be configured as an independent server having high computational capacity. The two modules may be connected to each other via a network. The device is located close to the user and serves to receive the speech signal. The server, which has a high data processing speed, serves to perform ASR and NLU. In another configuration, an ASR/NLU engine may be mounted in the device as well as in the server, so that the two ASR/NLU engines perform ASR and NLU in cooperation with each other.
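The division into a client and an ASR/NLU engine described above may be illustrated by the following minimal Python sketch. It is an illustrative assumption rather than an implementation of any particular embodiment. The names `Client`, `AsrNluEngine`, and `NluResult` are hypothetical, the network connection is replaced by a plain function call, and the recognition and understanding steps are placeholders (the "speech signal" is simply UTF-8 text and the meaning is derived by a keyword rule).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class NluResult:
    text: str    # ASR output: the recognized text
    intent: str  # NLU output: a coarse label for the meaning of the speech


class AsrNluEngine:
    """Server-side module (hypothetical stand-in for real ASR and NLU models)."""

    def recognize(self, speech_signal: bytes) -> str:
        # Placeholder ASR: a real engine would decode audio samples.
        # Here the "signal" is assumed to be UTF-8 encoded text.
        return speech_signal.decode("utf-8").strip().lower()

    def understand(self, text: str) -> str:
        # Placeholder NLU: a simple keyword rule instead of a trained model.
        if "weather" in text:
            return "get_weather"
        if text.startswith("call "):
            return "make_call"
        return "unknown"

    def process(self, speech_signal: bytes) -> NluResult:
        text = self.recognize(speech_signal)
        return NluResult(text=text, intent=self.understand(text))


class Client:
    """Device-side module: receives the speech signal and forwards it."""

    def __init__(self, send: Callable[[bytes], NluResult]):
        # `send` stands in for the network link to the server.
        self._send = send

    def on_speech(self, speech_signal: bytes) -> NluResult:
        return self._send(speech_signal)


engine = AsrNluEngine()
client = Client(send=engine.process)
result = client.on_speech(b"What is the weather today")
```

In this sketch the client performs no recognition itself, which mirrors the case in which a device with limited processing capacity delegates ASR and NLU to a server. The cooperative configuration, in which the device also contains an engine, is not shown.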
One method of increasing the performance of the ASR/NLU system is to collect data for each user and generate a model for each user. Such a model is referred to as a personalized model, and such a method is referred to as personalized modeling. Since a personalized model can be customized for a specific individual, it usually has higher performance than a general model generated for many unspecified persons. However, personalized modeling requires the use of a user's personal information in order to generate the personalized model. A problem of information protection may therefore occur in the process of transmitting and processing the personal information. When an encryption technology is applied to solve this problem, the processing speed may be reduced.