The present invention relates generally to cognitive computing systems, and more specifically, to the automated cognitive recording and organization of speech as structured text using cognitive computing systems.
Programmed computers have been used to convert speech to readable text. In a known approach, the computer first uses an analog-to-digital converter (ADC) to translate the vibrations caused by speech into digital data that the computer can process. The computer divides the digital data into small segments (e.g., a few hundredths of a second long) and matches these segments to known phonemes in the appropriate language. In general, a phoneme is the smallest unit of sound that speakers combine to form meaningful expressions. There are roughly 40 phonemes in the English language, while other languages have more or fewer. The computer then examines each phoneme in the context of the phonemes around it, generates a contextual phoneme plot, runs the plot through a complex statistical model, and compares the results to a large library of known words, phrases, and sentences. From this comparison, the computer determines what the speaker was most likely saying and outputs the speech as readable text.
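The pipeline described above can be illustrated with a minimal sketch. This is not the claimed invention or any production recognizer; the sample values, the phoneme library, the nearest-neighbor matching, and the word dictionary are all synthetic stand-ins chosen only to show the flow from digital samples, to segments, to phonemes, to a word.

```python
# Toy illustration of the known speech-to-text approach described above.
# All data is synthetic; real systems use acoustic and statistical
# language models rather than exact lookups.

# Step 1: pretend the ADC has produced a stream of digital samples.
samples = [0.1, 0.9, 0.2, 0.8, 0.1, 0.2, 0.9, 0.8]

# Step 2: divide the stream into small fixed-size segments (frames).
FRAME = 2
frames = [samples[i:i + FRAME] for i in range(0, len(samples), FRAME)]

# Step 3: match each segment to the closest known "phoneme" by comparing
# it against a small reference library (a stand-in for acoustic matching).
phoneme_library = {
    "K":  [0.1, 0.9],
    "AE": [0.2, 0.8],
    "S":  [0.1, 0.2],
    "T":  [0.9, 0.8],
}

def closest_phoneme(frame):
    # Nearest neighbor by squared Euclidean distance to each reference.
    return min(
        phoneme_library,
        key=lambda p: sum((a - b) ** 2
                          for a, b in zip(frame, phoneme_library[p])),
    )

phonemes = [closest_phoneme(f) for f in frames]

# Step 4: compare the phoneme sequence against a library of known words
# and output the most likely match (here, an exact dictionary lookup
# stands in for the statistical model).
word_library = {("K", "AE", "S", "T"): "cast", ("K", "AE", "T"): "cat"}
text = word_library.get(tuple(phonemes), "<unknown>")
print(text)
```

In a real recognizer, step 3 would use trained acoustic models over many phonemes per language (roughly 40 for English, as noted above), and step 4 would use a statistical model over large word and phrase libraries rather than an exact table lookup.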