A low information content signal is one which, if divided into parts whose size is determined by the nature of the signal, can sustain an error in the content of a single part or group of parts without being corrupted so badly that it is no longer useful for its original purpose. This is in contrast to a high information content signal, such as a digital representation of a character, in which the change of one bit would alter the information of the representation in such a way that it would no longer be useful for its original purpose. A low information content signal, whether voice, sound or other, can be represented by a stream of samples. Each sample represents an amplitude value corresponding to the amplitude of the signal waveform over a known period of time. The time period represented by each sample is determined by the frequency at which the samples are taken. The value of each sample can then be stored and used to approximately reproduce the signal waveform. The smaller the sample period, the more samples are used to represent the signal over a given time period. This enables voice, sound or other information to be input and stored as a plurality of sample values which can be reproduced at a later time and used to reconstruct the original stream of information.
Generally, speech can be divided into two broad categories, voiced and unvoiced. The voiced sounds are products of larynx and vocal tract resonances which, interacting, form a series of frequency components called formants. FIG. 1 illustrates a plot of an energy versus frequency spectrum typical of the formants for a vowel. The important point to note in this figure is that the greatest energy is at the lowest or first formant. Therefore, when this component of speech is sampled the energy change between samples will be small compared to the energy per sample. The voiced sounds are vowels and voiced consonants such as `z` and `sh`. Most of speech energy and duration (excluding silences) is in the voiced components of speech.
The unvoiced part of speech consists of the combinations of plosives, stops, fricatives and silences that make up the unvoiced consonants. These unvoiced sounds, compared to the voiced sounds, are characterized by lower energy, higher frequency signals which are quasi-periodic to noisy and of short to moderate duration. The energy change between samples relative to the samples themselves is large in this component, but the samples themselves are generally small compared to those of the voiced components; therefore, the actual changes per sample are comparable to those of the voiced component. FIG. 2 illustrates the spectra of a stop consonant plus vowel combination such as `ba` or `ka`. The vertical axis is frequency and the horizontal axis is time, while the density of the spectra indicates energy at that point. The noisy, low energy section from the time point -0.1 to the time point 0, represented by the reference numeral 21, is the breathy onset to the consonant noise burst at the time point 0. After the burst, from the time point 0 to the time point 0.1, represented by the reference numeral 22, the consonant's energy falls off in a quasi-periodic way to a brief silence. The section from the time point 0.1 to the time point 0.5, represented by the reference numeral 23, is the vowel portion. This section is very periodic with most of the energy at the lower frequencies.
FIG. 3 illustrates the spectra of an unvoiced fricative plus vowel combination such as `sa`. The section from the time point -0.2 to the time point 0, represented by the reference numeral 31, is the consonant portion, characterized by a high frequency, low energy noise component over a lower frequency, higher energy quasi-periodic component. The vowel portion, from the time point 0 to the time point 0.3, represented by the reference numeral 32, is, as above, periodic with most of the energy in the lower frequency component. These spectra also illustrate that locally, within a time frame short in comparison to the speech component, the changes in energy or amplitude are similar.
A sampled stream of data is illustrated in FIG. 4. The waveform 1 which represents this stream of data is comprised of a number of samples, each having an amplitude value and representing a fixed period of time. Each sample 2 is an impulse containing an energy level which is represented by the amplitude of the sample. The amplitude of each sample is determined from its height above the zero or bottom line 3. The waveform is typically centered around the reference line 4, which usually represents an analog ground level, but can be determined to represent any level. The advantage of using analog ground as the reference level 4 is that if the waveform 1 travels both above and below ground, a positive amplitude can be used to represent amplitudes which are both above and below the ground or reference level 4. For example, if the reference level 4 is set to equal a sample having an amplitude of 100, any sample having an amplitude greater than 100 will be above ground level and any sample having an amplitude less than 100 will be below ground level. The reference level 4 can be determined and programmed to represent any level, depending on the application. The difference between the reference level 4 and the zero line 3 must be great enough to accommodate the amplitude level which will be farthest below the reference level 4 for the specific application.
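The reference-level convention described above can be sketched as follows. This is a minimal illustration only, assuming integer amplitudes and the example reference level of 100 given in the text; the function names `to_stored`, `to_signed` and `is_above_ground` are hypothetical and do not appear in the original description.

```python
REFERENCE_LEVEL = 100  # example value from the text; programmable per application


def to_stored(signed_amplitude, reference=REFERENCE_LEVEL):
    """Map an amplitude measured relative to ground to a non-negative stored value."""
    stored = signed_amplitude + reference
    if stored < 0:
        # The reference level must sit far enough above the zero line to
        # accommodate the largest excursion below ground.
        raise ValueError("reference level too low for this excursion")
    return stored


def to_signed(stored, reference=REFERENCE_LEVEL):
    """Recover the amplitude relative to ground from a stored value."""
    return stored - reference


def is_above_ground(stored, reference=REFERENCE_LEVEL):
    """True when the stored sample lies above the reference (ground) level."""
    return stored > reference
```

Under these assumptions, a sample 30 units below ground is stored as 70, and any stored value greater than 100 is read back as lying above ground.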
Low information content signals include signals or data which represent such things as voice, music, sound and handwriting, and which are sampled in such a way that the information content per sample is not critical to the information content of the overall sampled signal. The criterion used to determine whether or not the information content of a sample is critical is generally a function of the ratio of the sample period to the minimum amount of signal or data required to produce meaningful information.
A voice signal sampled according to the Nyquist criterion is one type of low information content signal. The Nyquist theorem provides that a signal must be sampled at a rate at least twice the signal's highest frequency to prevent aliasing. Thus, if the maximum voice frequency is limited to 4 kHz, the minimum sample frequency is 8 kHz and each sample represents a time period of 125 microseconds. To be generally recognizable as a voice segment, a signal of at least 100 milliseconds is required to constitute meaningful information. Therefore, the ratio of the sample period to the minimum amount of signal required for meaningful information is equal to 0.00125. In such a case, no isolated sample or small group of samples is critical to the information content of the signal segment.
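The arithmetic in the preceding paragraph can be checked with a short sketch. The figures (4 kHz maximum frequency, 100 millisecond minimum segment) are those given in the text; the function and variable names are illustrative assumptions.

```python
def min_sample_rate_hz(max_signal_hz):
    """Minimum sampling rate that avoids aliasing: twice the highest frequency."""
    return 2.0 * max_signal_hz


def sample_period_s(sample_rate_hz):
    """Time represented by one sample, in seconds."""
    return 1.0 / sample_rate_hz


def criticality_ratio(period_s, min_meaningful_s):
    """Ratio of the sample period to the shortest meaningful signal segment."""
    return period_s / min_meaningful_s


rate = min_sample_rate_hz(4000.0)         # 8 kHz
period = sample_period_s(rate)            # 125 microseconds
ratio = criticality_ratio(period, 0.100)  # approximately 0.00125
samples_per_segment = round(0.100 / period)  # samples in a 100 ms segment
```

Read the other way around, the ratio says that a minimally meaningful 100 millisecond segment spans roughly 800 samples, which is why the loss of any single sample is not critical.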
All meaningful aurally processed data can generally be represented by a low information content signal as described above. This is true because the quality of voice, music and sound reproductions is judged by the human ear, an imprecise instrument. Because of the limitations of the human ear, it is not necessary that each individual sample be reproduced at precisely the level of the original. Rather, all that is required is that the amplitudes of enough individual samples are produced or reproduced accurately that the human ear cannot detect a difference between the original and the produced or reproduced stream of sound, and that the audibility of any errors is reduced to within the requirements of the specific application.
Storing a representation of a stream of sound can be accomplished within an integrated circuit (IC) memory, including but not limited to an EEPROM, EPROM, ROM or RAM array, with each sample stored as an amplitude value within a cell or cells of the storage device. This stream of sound can be stored in an EEPROM or an EPROM as an analog amplitude. The reproduction quality of a stream of sound stored in a storage device is in part a function of the quality of the device used for storage. If the device used for storage contains bad or failing cells, when the stream of sound is reproduced from the stored data, the bad or failing cells will provide erroneous sample amplitudes which will unfavorably affect the reproduction of the stream of sound.
A typical application for such a storage device is recording a voice or sound message of a predetermined duration for playback at a later time. During record mode the storage device receives the voice message, samples it and stores the amplitudes of the samples so that the voice message can later be reproduced. During playback mode, when the user desires to listen to the voice message, the sample amplitudes are retrieved from the storage device and used to reconstruct the voice message. If any of the cells storing a sample amplitude have failed, then the voice message will not be reproduced accurately and may contain unwanted, extraneous noise.
Cell failure within a storage device can be caused by weak programming, leakage, shorts to a supply voltage level or a neighboring cell, a floating control gate, a shorted floating gate or other well known causes. Such a cell failure will keep a cell from programming to the amplitude of the sample taken. Not all causes of cell failure are equally catastrophic in an analog storage device, so determining what is to be considered a bad cell is typically done subjectively by correlating various listening and waveform tests with strobe levels in a test program. The number of failing cells allowed in a particular storage device is then a function of the yield necessary to ensure adequate margins versus the sound quality level required by the market for the specific application in which the storage device is to be used.
Depending on the yield necessary and the sound quality requirements for the specific application in which the storage device is to be utilized, the manufacturer can determine how many bad cells within a storage device can be tolerated. Any storage devices having more than the allowed number of bad cells are discarded. The manufacturer can also sort the storage devices by the number of bad cells that they contain, so that the storage devices with the fewest bad cells can be used for applications requiring the highest quality and the storage devices with a higher number of bad cells can be used for applications where a lower level of quality can be tolerated. Storage devices with a high enough number of failed cells will be discarded.
The sound quality requirements of the market will generally increase with an increase in recording time, but to maintain adequate yields the number of bad cells which are allowed must increase linearly with the size of the storage device. If three failed cells are allowed on a storage device used to record a fifteen second stream of sound, then twelve failed cells will be allowed on a storage device used to record a sixty second stream of sound. Short duration devices are often used in novelty applications where the market does not require as high a level of quality. Longer duration devices are generally intended for a high repeat usage market, such as a message collector for cellular phones, where the market demands a higher level of quality.
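The linear scaling described above can be written as a one-line rule. The baseline of three failed cells for a fifteen second device is the example from the text; the function name `allowed_failed_cells` and the choice to round down for durations that are not whole multiples of the baseline are assumptions made only for this sketch.

```python
def allowed_failed_cells(duration_s, base_duration_s=15, base_allowed=3):
    """Scale the allowed number of failed cells linearly with recording time.

    Uses integer arithmetic and rounds down (an assumption; the text gives
    only whole-multiple examples).
    """
    return (base_allowed * duration_s) // base_duration_s


fifteen_second_limit = allowed_failed_cells(15)  # 3
sixty_second_limit = allowed_failed_cells(60)    # 12
```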
A failing cell within a storage device can be determined by many test methods. One such test method is to program all of the cells in the storage device to a quiet level, represented by analog ground. The information in the storage device can then be listened to in the playback mode and any failing cells which produce an audible discrepancy can be detected. A failing cell will have a level different than the DC background by an amount determined to be unacceptably audible by the listening tests. This value can be considered the minimum audible change in amplitude level which is unacceptable. A diagram of such a failing cell is illustrated in FIG. 5. The amplitude of the failing cell 50 is much higher than the amplitude of the other cells, which have been programmed to a ground level. The change in level of the failing cell 50 is essentially an impulse and therefore it contains energy at all frequencies which is spread in time by output amplification and filtering. The larger the change in level, the greater the spread in time. If the change in level is high enough, an audible `pop` can be heard by a listener, introducing an error into the stream of sound output from the storage device.
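The quiet-level test can be approximated numerically as shown below. This is a sketch only: the text describes a listening test, whereas this example assumes that the cell levels can be read back as numbers and that the minimum unacceptable audible change has already been determined. The names `find_failing_cells`, `quiet_level` and `min_audible_change` are hypothetical.

```python
def find_failing_cells(readback, quiet_level, min_audible_change):
    """Return the indices of cells whose level deviates audibly from the quiet level.

    readback: sequence of levels read from cells programmed to quiet_level.
    min_audible_change: smallest deviation judged unacceptably audible.
    """
    return [
        index
        for index, level in enumerate(readback)
        if abs(level - quiet_level) >= min_audible_change
    ]


# Example: every cell was programmed to a quiet level of 100; one cell reads 140.
failing = find_failing_cells([100, 101, 100, 140, 99], quiet_level=100,
                             min_audible_change=10)
```

In this example only the cell at index 3 is flagged; the small deviations of the other cells fall below the assumed audibility threshold.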
Other tests are used to verify the functionality and quality of these storage devices. These tests determine whether each cell can be programmed to a sufficient level so that a voice or sound message can be reproduced to the level of quality demanded by the particular application in which the storage device is to be used, and that no extraneous noise is introduced by a failed cell.
What is needed is a method and apparatus for recognizing and correcting any errors appearing during reproduction of a low information content signal so that circuits containing defective cells can still be used acceptably. What is further needed is a method in which the quality of the voice or sound reproduction from a storage device with failing cells can be improved so that storage devices which contain a high number of failing cells can be saved from being discarded and can be used to record and play back a stream of sound. What is also needed is a method in which the quality of the voice or sound reproduction from a storage device which contains a previously acceptable number of failing cells can be improved.