1. Field of the Invention
The invention relates to a voice-controlled data system and to a method for the voice-controlled selection, generation, or compilation of media files.
2. Related Art
In many applications, e.g., multimedia systems including audio/video players, users select audio or video files from a large list of files, e.g., music titles. Furthermore, the use of media files made available to many users through a centralized database has become well known. Downloading audio or video files from a communication network, e.g., the Internet, has become widespread because systems have been developed that store audio and video data files compactly by using different compression techniques. In the art, many different formats for storing media data have been developed, e.g., the MP3 format, the AAC format, the WMA format, the MOV format, and the WMV format. As a result, a user can compile a selection of different audio or video files and store it on a single storage medium.
Additionally, many formats allow the storing of meta-data corresponding to the media file. Such meta-data or meta-information contains information about the file itself or any other information relating to the file, e.g., the title of the file (allowing the identification of the data), the artist, the year of recording, the genre, the tracks, etc.
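By way of illustration only, the meta-data fields listed above can be pictured as a simple record. The following Python sketch is not taken from any particular file format; the class `MediaMetadata`, the helper `matches`, and the sample entries are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaMetadata:
    """Hypothetical container for the identification data of one media file."""
    title: str
    artist: str
    year: Optional[int] = None   # year of recording, if stored
    genre: Optional[str] = None
    track: Optional[int] = None


def matches(meta: MediaMetadata, query: str) -> bool:
    """Case-insensitive text match of a query against title and artist."""
    q = query.casefold()
    return q in meta.title.casefold() or q in meta.artist.casefold()


# Illustrative library whose entries carry names in different languages.
library = [
    MediaMetadata(title="La Vie en rose", artist="Édith Piaf", genre="Chanson"),
    MediaMetadata(title="Yesterday", artist="The Beatles", year=1965, genre="Pop"),
]

hits = [m.title for m in library if matches(m, "piaf")]
```

A text query can be matched against such a record directly. A spoken query cannot, because it must first be compared with a phonetic form of each entry.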
Additionally, the voice-controlled operation of multimedia systems is well known in the art. Especially in vehicles, the voice-controlled operation of electronic systems comprising an audio module, a navigation module, a telecommunication module, and/or a radio module is a useful feature for the driver because it helps the driver focus on the traffic. To this end, speech recognition units are used in which a voice command from the user of the electronic system is detected and phonetic transcriptions of the detected voice command are used for executing the command of the user.
Often, the identification data allowing the identification of the media files includes data in different languages. If an entry is to be selected by speech recognition, the problem arises that neither the language of the intended entry nor the language in which the name of the intended entry is pronounced is known. This complicates the speech recognition process. The phonetic transcriptions can either be generated automatically or be looked up in large look-up tables containing examples of phonetic transcriptions. With automatically generated phonetic transcriptions, the recognition rate of the control command is low, and the use of look-up tables containing phonetic transcriptions is hardly possible when the control command comprises proper names from different languages.
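The two ways of obtaining phonetic transcriptions described above, and the difficulty caused by an unknown language, can be sketched as follows. This is a toy example under stated assumptions: the table `LOOKUP`, the rule set `RULES`, the functions `auto_transcribe` and `transcribe`, and the phone symbols are hypothetical and merely illustrative; they do not represent an actual speech recognition unit or a complete set of letter-to-sound rules.

```python
from typing import Dict, List, Tuple

# Hypothetical look-up table keyed by (language, lower-cased word).
# The phone strings are rough illustrative symbols, not authoritative.
LOOKUP: Dict[Tuple[str, str], str] = {
    ("fr", "michel"): "m i S E l",
}

# Toy letter-to-sound rules per language (illustrative, far from complete).
RULES: Dict[str, List[Tuple[str, str]]] = {
    "en": [("ch", "tS")],
    "fr": [("ch", "S")],
    "de": [("ch", "C")],
}


def auto_transcribe(word: str, language: str) -> str:
    """Rule-based transcription; letters without a rule are passed through."""
    word = word.lower()
    phones: List[str] = []
    i = 0
    while i < len(word):
        for grapheme, phone in RULES.get(language, []):
            if word.startswith(grapheme, i):
                phones.append(phone)
                i += len(grapheme)
                break
        else:
            # No rule matched at this position: copy the letter unchanged.
            phones.append(word[i])
            i += 1
    return " ".join(phones)


def transcribe(word: str, language: str) -> str:
    """Prefer the look-up table and fall back to automatic generation."""
    return LOOKUP.get((language, word.lower())) or auto_transcribe(word, language)


# The same written name yields a different transcription per assumed language.
candidates = {lang: transcribe("Michel", lang) for lang in ("en", "fr", "de")}
```

In this sketch, the single written name produces three different candidate transcriptions, one per assumed language. Without knowing which language applies, a recognizer cannot tell which candidate to compare against the spoken command.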
In summary, the language of the speech command input into the speech recognition unit for selecting one of the media files is often not known. This complicates the speech recognition process, in particular when the user pronounces a foreign-language name for a file in his or her own (different) native language. Controlling an electronic system by voice to select one of its media files is a difficult task, because the speech recognition system has to recognize speech input from the user that may comprise variable vocabulary, e.g., the name or the title of the media file.
Therefore, a need exists for a system that allows the voice-controlled selection of a media file from a group of several media files containing data in different languages.