When utilizing voice synthesis, a user must listen to actual voices in order to select a speaker (voice dictionary), which makes it difficult to select a speaker from among a large number of candidates. Conventionally, no more than about 10 speakers have been offered as options. In recent years, however, 800 or more voice dictionaries have been provided. A means to search for a speaker by designating an attribute (e.g., gender, age group, or a type such as cool, husky, or moe, which means “extremely adorable” in Japanese) has therefore been provided as a means to select a speaker. In another technique, when a voice dictionary of a speaker designated by metadata of a text does not exist in a reproduction environment, an alternative voice is selected based on an attribute (of the same kind as described above) described in the metadata, and the selected voice is reproduced.
With a method that searches for a speaker by designated attribute, however, it is difficult for a user to appropriately set the attributes of a speaker suitable for reading an input text. Moreover, when a large number of voice dictionaries exist and the attribute search still returns many candidates, it can be difficult to narrow these candidates down.
In order to solve the above-mentioned problems and achieve an object, an embodiment of the present invention includes: an acceptance unit that accepts input of a text; an analysis knowledge storage unit that stores therein text analysis knowledge to be used for characteristic analysis for the input text; an analysis unit that analyzes a characteristic of the text by referring to the text analysis knowledge; a voice attribute storage unit that stores therein a voice attribute of each voice dictionary; an evaluation unit that evaluates similarity between the voice attribute of the voice dictionary and the characteristic of the text; and a candidate presentation unit that presents, based on the similarity, a candidate for the voice dictionary suitable for the text.
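The flow through the units described above can be illustrated with a minimal sketch. It is not the embodiment's implementation: the keyword table standing in for the text analysis knowledge, the attribute axes ("formal", "cute"), the speaker names, and the use of cosine similarity as the evaluation measure are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass

# Hypothetical text analysis knowledge: keyword -> characteristic weights.
# (Assumed for illustration; the actual form of the knowledge is not given here.)
ANALYSIS_KNOWLEDGE = {
    "meeting": {"formal": 1.0},
    "report": {"formal": 1.0},
    "yay": {"cute": 1.0},
    "adorable": {"cute": 1.0},
}


@dataclass(frozen=True)
class VoiceDictionary:
    name: str
    attributes: dict  # voice attribute axis -> strength


# Hypothetical voice attribute storage: one entry per voice dictionary.
VOICE_ATTRIBUTES = [
    VoiceDictionary("speaker_a", {"formal": 0.9, "cute": 0.1}),
    VoiceDictionary("speaker_b", {"formal": 0.1, "cute": 0.9}),
    VoiceDictionary("speaker_c", {"formal": 0.5, "cute": 0.5}),
]


def analyze_text(text, knowledge=ANALYSIS_KNOWLEDGE):
    """Analysis unit: accumulate characteristic weights for known keywords."""
    characteristic = {}
    for token in text.lower().split():
        token = token.strip(".,!?\"'")
        for axis, weight in knowledge.get(token, {}).items():
            characteristic[axis] = characteristic.get(axis, 0.0) + weight
    return characteristic


def similarity(text_characteristic, voice_attribute):
    """Evaluation unit: cosine similarity (an assumed measure)."""
    dot = sum(v * voice_attribute.get(k, 0.0)
              for k, v in text_characteristic.items())
    norm_t = math.sqrt(sum(v * v for v in text_characteristic.values()))
    norm_v = math.sqrt(sum(v * v for v in voice_attribute.values()))
    if norm_t == 0.0 or norm_v == 0.0:
        return 0.0  # no evidence in the text, or an empty voice attribute
    return dot / (norm_t * norm_v)


def present_candidates(text, voices=VOICE_ATTRIBUTES, top_n=2):
    """Candidate presentation unit: names of the top_n most similar voices."""
    characteristic = analyze_text(text)
    scored = [(similarity(characteristic, v.attributes), v.name) for v in voices]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    print(present_candidates("The meeting report is ready."))
```

In this sketch the user supplies only the text; no attribute has to be set by hand, and the ranked list is cut to `top_n` entries so that the candidates remain few even when many voice dictionaries are stored.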