One purpose of the present invention is to enable small children, about 3 to 7 years of age, to communicate and interact with a computer in a substantially unstructured format. The child is not given detailed instructions on using the computer; instead, he or she is allowed to explore and discover how to operate it. For a young child to receive an appropriate level of feedback, the computer should talk to the child, providing spoken feedback in addition to its usual visual output.
The computers typically available to small children are small machines with limited memory. Thus, in order to store the needed voice responses in a small computer memory, a simple and efficient voice compression method is needed. For this application, so long as the quality of the voice reproduction is good, the primary considerations are efficient processing and the use of a minimum amount of memory for storing both the program and the data.
The present invention takes advantage of the smooth transitions and repetitive nature of most speech. The rate of change of most speech is relatively slow, so that when it is sampled at a frequency of 10,000 samples per second, the differences, or jumps, between successive samples are relatively small. In most cases, the voice data may be sampled and desensitized until the majority of the jumps have an absolute value of two or less, and yet the quality of the data will remain sufficiently high that the voice may be reconstructed and easily understood by a child. By measuring and storing the jumps between data points, the voice may be compressed. In the preferred embodiment, a single byte is used to store three jumps by calculating a single compression number from the three jumps. In the decompression mode, the single compression number is used to recover the values of the three jumps, which are in turn used to reconstruct the voice. In order for a single compression number to represent three jumps, each jump must be within a set range. If one or two jumps in a group of three fall outside the range, the compression number, when decoded (decompressed), will indicate which of the jumps were out of range. The values of these jumps are then stored immediately after the compression number in the compressed data.
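The packing of three jumps into one byte can be illustrated with a minimal sketch. The description does not give the formula for the compression number, so the base-6 layout below is an assumption: each jump from -2 to +2 becomes a digit from 0 to 4, and the digit 5 (the hypothetical `ESCAPE`) marks a jump that is out of range and stored separately. This layout happens to yield 56 H for three zero jumps, which agrees with the value quoted in the description, but it should not be read as the definitive encoding. The names `jumps_from_samples`, `pack_three`, and `unpack_three` are illustrative only.

```python
IN_RANGE = range(-2, 3)  # jumps a compression number can carry directly
ESCAPE = 5               # assumed digit meaning "this jump is stored separately"


def jumps_from_samples(samples):
    """Return the difference between each sample and the one before it."""
    return [b - a for a, b in zip(samples, samples[1:])]


def pack_three(j0, j1, j2):
    """Encode three jumps as (compression number, out-of-range jumps)."""
    number, escaped = 0, []
    for jump in (j0, j1, j2):
        if jump in IN_RANGE:
            digit = jump + 2      # map -2..+2 onto 0..4
        else:
            digit = ESCAPE        # flag the position as out of range
            escaped.append(jump)  # value to be stored after the number
        number = number * 6 + digit
    return number, escaped


def unpack_three(number, escaped):
    """Recover the three jumps from a compression number and its extras."""
    digits = [number // 36, (number // 6) % 6, number % 6]
    pending = iter(escaped)
    return [next(pending) if d == ESCAPE else d - 2 for d in digits]
```

Under these assumptions, `pack_three(0, 0, 0)` returns `(0x56, [])`, and `pack_three(1, 7, -2)` returns `(0x8A, [7])`, where the middle digit records that the second jump was out of range. Every compression number stays below 216 (6 cubed), so it fits in a single byte.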
If all three jumps in a group fall outside a first selected range but within a second selected range, a special code number is stored to indicate this fact. Instead of storing three numbers after the code number to reflect the actual values of the three jumps, a two-byte compression word is stored after it. This compression word requires two bytes, but jumps outside of the first selected range may be coded into it. For example, compression numbers can identify three jumps that fall within a range of +2 to -2, while a compression word can identify three jumps that fall within the ranges of +3 to +22 and -3 to -22. The range of +2 to -2 is not identified by the compression words, since this range is covered by compression numbers.
After compression using compression numbers and compression words, the compressed data is compressed again by looking for periods of time in which there was no change in the voice signal. In such cases, the jumps would equal zero, and a compression number of "56 H" (H indicates hexadecimal) would indicate three zero jumps. If two or more consecutive compression numbers of "56 H" are found, they are replaced by the code value "DA H" to indicate the repetition of the number "56 H", and the data following "DA H" indicates how many compression numbers of value "56 H" were found at that point in the compressed data string.
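Both steps can be sketched briefly, under stated assumptions. Each jump in the second range has 40 possible values (+3 to +22 and -3 to -22), so three of them give 40 × 40 × 40 = 64,000 combinations, which fits within the 65,536 values of a two-byte word. The ordering of those 40 values, the big-endian byte order, and the one-byte repeat count are assumptions that the description does not specify. The names `word_index`, `pack_word`, `unpack_word`, and `collapse_silence` are hypothetical. For simplicity, `collapse_silence` operates on a list holding only compression numbers, with no out-of-range values or compression words interleaved.

```python
ZERO_TRIPLE = 0x56  # compression number for three zero jumps (from the text)
REPEAT = 0xDA       # code value marking a run of ZERO_TRIPLE (from the text)


def word_index(jump):
    """Map a jump of magnitude 3..22 onto 0..39 (assumed ordering)."""
    magnitude = abs(jump)
    if not 3 <= magnitude <= 22:
        raise ValueError("jump outside the compression-word range")
    return magnitude - 3 + (20 if jump < 0 else 0)


def pack_word(j0, j1, j2):
    """Encode three second-range jumps as a two-byte compression word."""
    value = (word_index(j0) * 40 + word_index(j1)) * 40 + word_index(j2)
    return value.to_bytes(2, "big")  # byte order is an assumption


def unpack_word(two_bytes):
    """Recover the three jumps from a two-byte compression word."""
    value = int.from_bytes(two_bytes, "big")
    indices = [value // 1600, (value // 40) % 40, value % 40]
    return [-(i - 20 + 3) if i >= 20 else i + 3 for i in indices]


def collapse_silence(numbers):
    """Replace each run of two or more 56 H with DA H and a count."""
    out, i = [], 0
    while i < len(numbers):
        if numbers[i] != ZERO_TRIPLE:
            out.append(numbers[i])
            i += 1
            continue
        run = 1
        # Cap the run at 255 so the count fits in one byte (assumption).
        while (i + run < len(numbers) and run < 255
               and numbers[i + run] == ZERO_TRIPLE):
            run += 1
        out.extend([REPEAT, run] if run >= 2 else [ZERO_TRIPLE])
        i += run
    return out
```

With this sketch, `collapse_silence([0x8A, 0x56, 0x56, 0x56, 0x10, 0x56])` returns `[0x8A, 0xDA, 3, 0x10, 0x56]`: the run of three is replaced, while the isolated 56 H is left alone. If compression numbers are laid out in base 6, as one plausible reading suggests, they never exceed D7 H, which would leave DA H free to serve as a repeat marker.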