1. Field of the Invention
The present invention relates to an apparatus and a method for processing voices that can accept voice input of data, as well as to a storage medium for implementing the apparatus and the method, and more particularly to an apparatus, a method, and a storage medium usable in a data processing apparatus for managing schedules.
2. Description of the Related Art
Conventionally, so-called schedule books used to manage schedules have been made of paper. The user has written each date and time, as well as the other schedule items for that date and time, by hand with a writing implement. In recent years, however, data processing apparatuses such as electronic notebooks, personal computers, and portable data terminals, together with software programs for managing schedules on those apparatuses, have become widespread. Under such circumstances, schedule books have been replaced with those data processing apparatuses, on which schedule managing software programs are executed to manage personal schedules. There are also cases in which a plurality of persons manage the progress of group work carried out in cooperation, using a method conforming to the above personal schedule management to make the management easier.
Schedule data used for managing schedules basically consists of date and time data and other schedule items. For example, if a schedule is a meeting with a customer, the schedule items indicate the meeting place and the customer's name. If the customer notifies the user of this schedule by telephone, the user learns the schedule by voice in the course of the conversation with the customer. The user then divides the schedule data into the date and time and the other schedule items, and enters the data into his/her data processing apparatus manually. Manual input here means entering data, for example characters and symbols representing the date and time and the other schedule items, by manually operating an input device such as a keyboard. In the present specification, a voice includes not only a human voice but also a sound, for example, a synthesized sound.
If the user's telephone is an automatic phone-answering machine and the user is away, for example on a business trip, when a customer calls, the customer's voice is stored as is in the memory of the automatic phone-answering machine. On returning, the user listens to the stored voice to learn the schedule data. The user then divides the schedule data into the date and time and the other schedule items, and enters the schedule data into his/her data processing apparatus manually. The data processing apparatus regards the date and time as attribute data for classifying the entered schedule data, associates the date and time with the other schedule items, and stores them together. The apparatus then classifies and arranges the plurality of stored schedule data according to date and time.
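The storage scheme described above, in which the date and time act as attribute data tied to the other schedule items, can be pictured with the following minimal sketch. It is an illustration only, not the implementation of any particular apparatus: the names `ScheduleEntry`, `ScheduleStore`, `enter`, and `arranged`, as well as the sample dates, places, and customers, are assumptions introduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(order=True)
class ScheduleEntry:
    # Date and time serve as the attribute data used for classification.
    when: datetime
    # Other schedule items (e.g. place, customer name); excluded from ordering.
    items: dict = field(default_factory=dict, compare=False)


class ScheduleStore:
    """Hypothetical store that keeps entries and arranges them by date and time."""

    def __init__(self):
        self._entries = []

    def enter(self, when, **items):
        # Associate the date and time with the other schedule items and store them together.
        self._entries.append(ScheduleEntry(when, items))

    def arranged(self):
        # Classify and arrange the stored entries in order of date and time.
        return sorted(self._entries)


store = ScheduleStore()
# Illustrative sample data, entered out of chronological order.
store.enter(datetime(1998, 3, 5, 15, 0), place="head office", customer="Customer A")
store.enter(datetime(1998, 3, 2, 10, 0), place="branch office", customer="Customer B")
for entry in store.arranged():
    print(entry.when.isoformat(), entry.items)
```

In this sketch only the date and time take part in ordering, which mirrors their role as attribute data; the other schedule items are simply carried along with them.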
Because the user enters schedule data received as a voice into his/her data processing apparatus manually in this way, wrong schedule data is sometimes entered, for example because the user mishears the schedule data and/or mishandles the input device. To prevent such schedule data errors, a method has been proposed in which the data processing apparatus or the telephone itself recognizes a voice, and the voice output from the telephone is converted to a character string composed of a plurality of characters.
The applicant of the present invention has proposed a video telephone system, disclosed in Japanese Unexamined Patent Publication JP-A-3-88592, as a first related art voice recognition technology. When used as a receiver, this video telephone system converts voice data to character data and displays the character data visually. The intention of the user of the transmitting video telephone system is thereby conveyed to the user of the receiving video telephone system easily and accurately, regardless of the voice sensitivity of the video telephone system and the presence of noise around it.
As a second related art voice recognition technology, the inventor has proposed a voice recognition telephone set disclosed in Japanese Unexamined Patent Publication JP-A-3-32148. This voice recognition telephone set requires the user to hold down a recognition button only while the talker is giving his/her address to the user orally. This eliminates both the troublesome manual input and the checking of an address given by telephone. While the recognition button is held down, the telephone set recognizes the signals on the telephone line as a voice and stores the result of recognition.
Furthermore, as a third related art voice recognition technology, the inventor has proposed a message slip output device disclosed in Japanese Unexamined Patent Publication JP-A-3-38721. This message slip output device requires the user to hold down one of its plural buttons while the user or the talker is speaking the words to be written on a message slip, so that the message given by voice is written on the message slip automatically. While the button is held down, the voice entered into a microphone is recognized and converted to character codes, which are then written in the blank field on the message slip corresponding to that button.
The character data, the result of recognition, and the character codes described above each form a character string. This character string can be divided into a plurality of words. In the related art technologies described above, however, the relationship among the words of the character string, that is, the meaning of each word, is not analyzed. If any of those related art voice recognition technologies is used as is for entering the above schedule data, therefore, the user must read the character string, grasp the meaning of each word, and then specify from the character string the words that represent the date and time and the other schedule items. The user may consequently make mistakes both in recognizing the meaning of the words and in specifying them. When the voice recognition technologies of the second and third related arts are used in this way, the user must hold down the recognition button, or one of the plural buttons, exactly while the target voice is being spoken. This makes the operation difficult and prone to errors. In addition, the user must judge by himself/herself whether the voice should be recognized, and may make errors in that judgment.
In other words, with those voice recognition technologies the user must in any case perform some operation to select part of a voice and use it as, for example, attribute data, according to the meaning of each word, which serves as an index. Consequently, word input errors occur because of various operation errors made by the user in handling the apparatus.
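The burden this places on the user can be pictured with a short sketch. It assumes a hypothetical recognizer output (the sample sentence below is invented for illustration) and a hypothetical helper, `specify_manually`, that stands in for the user's manual selection of words by position. None of these names come from the related art publications.

```python
# Hypothetical output of a related-art recognizer: one flat character string.
recognized = "meeting with Customer A at the head office on March 5 at 3 pm"

# The string can be divided into words, but no word carries a meaning or role.
words = recognized.split()


def specify_manually(words, date_time_span, item_span):
    """Mimic the user's manual step: pick word ranges by position.

    Each span is a (start, stop) pair of word indices chosen by the user
    after reading the string; nothing in the recognizer output indicates
    which words express the date and time and which express other items.
    """
    date_time = " ".join(words[date_time_span[0]:date_time_span[1]])
    other_items = " ".join(words[item_span[0]:item_span[1]])
    return date_time, other_items


# A correct manual specification of the two word ranges.
print(specify_manually(words, (9, 14), (0, 8)))
# A one-word slip by the user yields wrong schedule data without any warning.
print(specify_manually(words, (8, 13), (0, 8)))
```

Because the word boundaries carry no meaning, the second call is accepted just as readily as the first, which is the kind of operation error described above.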
There are also cases in which both the voices and the character strings obtained by the first to third related art voice recognition technologies are used together in a data processing apparatus without considering the meaning of each word. Such data processing apparatuses, for example computers that execute application software, include those that associate a date and time with a voice and a character string as attribute data, as a data processing apparatus for schedule management does.
In those voice recognition technologies, neither the date nor the time is stored when a voice is obtained. No date and time are therefore associated with any of the voices and character strings. When a voice and a character string are used in the data processing apparatus, the data processing apparatus must obtain the corresponding date and time by itself, which increases the number of processing steps in the data processing apparatus. There are also cases in which an accumulated voice that has neither a date nor a time, such as a voice stored in the memory of an automatic phone-answering machine, that is, a voice stored on an unknown date and at an unknown time, is used in the data processing apparatus. In such cases, it is very difficult for the data processing apparatus to guess the date and time at which the voice was stored, and the voice is therefore difficult to use.
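The missing association can be pictured with the following minimal sketch. The record type `RecognizedVoice`, the helper `attach_date_time`, and the sample values are assumptions introduced for illustration; they do not describe the data format of any of the related art systems.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RecognizedVoice:
    # What the related-art technologies keep: the voice and its character string.
    audio: bytes
    text: str
    # No date or time is stored when the voice is obtained.
    obtained_at: Optional[datetime] = None


def attach_date_time(voice: RecognizedVoice, now: datetime) -> RecognizedVoice:
    """Extra step on the data processing apparatus side (hypothetical).

    The only date and time available here is the moment of processing,
    not the moment at which the voice was actually stored.
    """
    voice.obtained_at = now
    return voice


# A voice accumulated in an answering machine on an unknown date and time.
accumulated = RecognizedVoice(audio=b"", text="meeting at the head office")
# Processing it later can only stamp the (illustrative) processing time.
processed = attach_date_time(accumulated, now=datetime(1998, 3, 9, 9, 0))
print(processed.obtained_at)
```

The stamped value reflects when the apparatus handled the voice, so for an accumulated voice the date and time of storage remain unknown.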