1. Technical Field
The present disclosure relates to creating voice profiles for specific demographics and, more specifically, to acquiring voice parameterizations by extracting acoustic features from human speech data found on the Internet, such as webcasts, videos, and podcasts, and then correlating those acoustic features with demographic data of the speaker for delivery to a user.
2. Introduction
Synthetic speech is often produced using a generic set of pre-recorded voices. However, a generic voice can be difficult for a user to understand when its accent differs from the one the user is accustomed to, and it may not match the user's preferences. For example, a British user might not understand words or accents used by an American-sounding synthetic voice. While certain systems allow the user to change the voice or accent produced, such preferences can be cumbersome to define, and the available options may not include the particular language, accent, or other characteristics the user desires in the synthetic voice.