The present invention relates generally to audio (voice) processing, and more particularly, to a synchronized-overlap-add technique using one bit correlation and windowing that may be used in audio processing and audio compression systems.
Changing the time scale of a voice signal can be done at the cost of changing the pitch by simply speeding up playback of the signal. For a digitized signal, speed-up involves increasing the sample rate on play-back. As the sample rate is increased, the pitch frequency of the voice signals increases. At the extreme, the pitch is high enough to have a "chipmunk" quality.
A technique for maintaining pitch while changing the time scale is a synchronized overlap-add technique. The voice signal is segmented into blocks. Overlapping the next block with a previous block and adding the new block to the old block reduces the time scale of the voice signal, speeding up the signal for a constant sample rate.
This simple approach has problems because the voice signal does not match with a random overlap. Hence, the new block is "synchronized" with the old block before it is added to the old signal. The new block is shifted in time until it has a high correlation with the existing block. With this displacement, the new block can be overlapped and added to the old block while maintaining the signal through the transition without harmful destructive interference. The two signals are added coherently, instead of randomly.
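The synchronization and overlap-add steps described above can be sketched as follows. This is a minimal illustration, not the implementation of any of the cited patents: the function names `best_shift` and `overlap_add`, the search range `max_shift`, and the linear cross-fade used in the overlap region are all assumptions introduced here.

```python
import math

def best_shift(old_tail, new_block, max_shift):
    """Return the shift of new_block that correlates best with old_tail.

    For each candidate shift k, new_block[k:k+n] is multiplied point by
    point with old_tail and the products are summed; the largest sum wins.
    """
    n = len(old_tail)
    best_k, best_score = 0, float("-inf")
    for k in range(max_shift + 1):
        segment = new_block[k:k + n]
        if len(segment) < n:
            break  # not enough samples left to test this shift
        score = sum(a * b for a, b in zip(old_tail, segment))
        if score > best_score:
            best_k, best_score = k, score
    return best_k

def overlap_add(old, new_block, overlap, max_shift):
    """Synchronize new_block with the tail of old, then overlap and add.

    Assumes new_block holds at least max_shift + overlap samples.
    The linear cross-fade is an illustrative choice (an assumption).
    """
    tail = old[-overlap:]
    k = best_shift(tail, new_block, max_shift)
    aligned = new_block[k:]
    mixed = []
    for j in range(overlap):
        w = j / overlap  # weight ramps from the old block to the new block
        mixed.append((1.0 - w) * tail[j] + w * aligned[j])
    return old[:-overlap] + mixed + aligned[overlap:]
```

With a periodic test signal, the search returns the shift that restores phase continuity, so the joined output carries the waveform through the transition without cancellation.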
One of the effects of synchronized overlap add processing is suppression of random noise. Noise that is not correlated with the voice signal is added incoherently and is suppressed. The larger the overlap, the more times the voice signal will be added and the more the noise is suppressed.
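As an illustrative sketch of why the suppression grows with the overlap, assume (these assumptions are introduced here and are not stated in the text above) that N perfectly synchronized copies of the voice signal s are added with equal weight, each carrying zero-mean noise n_i of variance sigma^2 that is uncorrelated between copies and with the signal:

```latex
y = \sum_{i=1}^{N} (s + n_i) = N s + \sum_{i=1}^{N} n_i,
\qquad
\frac{(N s)^2}{\operatorname{Var}\!\left(\sum_{i=1}^{N} n_i\right)}
  = \frac{N^2 s^2}{N \sigma^2}
  = N \cdot \frac{s^2}{\sigma^2}.
```

Under these idealized assumptions the signal-to-noise power ratio improves by a factor of N, or 10 log10(N) decibels. Weighted windows and imperfect synchronization would reduce that figure.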
The time scale may be expanded as well as contracted. Overlapped blocks of the voice signal may be shifted in time to be farther apart as well as closer together. Synchronization of the voice signal is necessary on expansion of the signal as well as on contraction of the signal. If a signal is first contracted, then expanded, the voice signal at its original time scale can be reconstructed. The reconstructed voice signal will have its noise suppressed, depending on the number of times that the voice signal has been added to a synchronous version of itself in the process of contraction and re-expansion.
A very simple voice compression technique uses the synchronized overlap-add technique to contract the signal, thereby compressing it. This is disclosed in U.S. Pat. No. 5,353,374 entitled "Low Bit Rate Voice Transmission for Use in a Noisy Environment", issued Oct. 4, 1994 and assigned to the assignee of the present invention. In accordance with the teachings of this patent, the compressed signal is transmitted, then re-expanded. Compression due to synchronized overlap-add processing of more than four to one has been demonstrated. With further compression using information coding techniques, compression of another factor of four is possible. The result can be a compressed voice signal with data rates less than 4 kilobits per second. With silence suppression, the average data rate can be less than 2 kilobits per second.
In the past, synchronized overlap-add processing has been accomplished by segmenting the voice signal into blocks, then performing the correlation of the blocks directly. The process requires that one block be shifted with respect to the other and the two signals multiplied point by point and the products added together. This is disclosed in an article by J. L. Wayman and D. L. Wilson entitled "Some improvements on the synchronized-overlapped method of time-domain modification for real-time speech compression and noise filtering", IEEE Journal on Acoust. Speech and Signal Proc., Vol. 36, 1988 pp. 139-140, and in U.S. Pat. No. 5,353,374 cited above. The number of required multiply-adds is the number of points that overlap times the number of different shifts in time that are to be tested. This number can be as many as 100 times the number of samples in a block.
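The operation count stated above is a simple product, worked through here with hypothetical numbers. The 256-sample overlap and the 100 candidate shifts are illustrative values chosen for this sketch, not figures taken from the cited references, and the function name `multiply_adds` is likewise an assumption.

```python
def multiply_adds(overlap_points, num_shifts):
    """Multiply-adds for direct correlation: one per overlapping point per tested shift."""
    return overlap_points * num_shifts

block_len = 256       # hypothetical block length; overlap assumed equal to the block
num_shifts = 100      # hypothetical number of time shifts tested
total = multiply_adds(block_len, num_shifts)
ratio = total / block_len   # multiply-adds per sample in the block
```

When the overlap spans the whole block, the cost per sample equals the number of shifts tested, which is how a figure on the order of 100 times the block length can arise.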
A computerized search was performed to investigate prior art patents relating to the present invention. A number of patents were uncovered and are discussed below.
U.S. Pat. No. 5,630,013 entitled "Method of and apparatus for performing timescale modification of speech signals", issued to Suzuki et al, and dated May 13, 1997 outlines a technique for time-scale modification that is part of the substance of my patent cited above. This patent discloses full correlation and time delayed windowing.
U.S. Pat. No. 5,175,769 entitled "Method for time-scale modification of signals", issued to Hejna, et al. and dated Dec. 29, 1992 discloses the same square windows and full correlation discussed in Suzuki's patent above and my original patent.
U.S. Pat. No. 5,479,564 entitled "Method and apparatus for manipulating-pitch and/or duration of a signal", issued to Vogten et al, and dated Dec. 26, 1995 discloses finding the peaks of the pitch period and using these times for placing the windows of the overlap add.
U.S. Pat. No. 4,864,620 entitled "Method for performing the time-scale modification of speech information or speech signals", issued to Bialick, and dated Sept. 5, 1989 discloses a scheme similar to that of my patent using square windows or "frames". An "Average Magnitude Difference Function" is used in the correlation process such that no multiplication or division is required. Smooth transitions are achieved by applying a graduated weighting.
U.S. Pat. No. 5,355,363 entitled "Voice transmission method and apparatus in duplex radio system", issued to Takahashi, et al. and dated Oct. 11, 1994 discloses the use of time scale modification to compress a transmitted signal into segments that can be transmitted with gaps during which a receiver can receive the return side signal similarly compressed.
U.S. Pat. No. 4,064,481 entitled "Vibrator and processing systems for vibratory seismic operation", issued to Silverman, and dated Dec. 20, 1977 discloses the use of one bit correlation in processing of a chirped seismic signal.
The present invention relates to the use of one bit correlation to locate matching times between a signal and a synchronized overlap add signal that is being constructed. After correlation to find the matching time, the signal is windowed with a smooth window and added to the synchronized overlap add signal. The patents discussed above use windows that are typically applied before the synchronization is performed. The windows are typically square windows, although U.S. Pat. No. 4,864,620 discloses the use of some type of smooth windowing.
The only patent that mentions one-bit correlation is U.S. Pat. No. 4,064,481 which relates to an entirely different application, seismic signal processing, and does not teach using one-bit correlation for use in time-scale modification.
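The approach of correlating on one bit per sample, then windowing and adding, can be sketched as follows. This is a hedged illustration only: counting sign agreements is one common way to realize a one bit correlation, and the Hann-shaped window is one example of a smooth window; neither choice, nor the names `sign_bits`, `one_bit_best_shift`, `hann_window` and `window_and_add`, is specified in the text above.

```python
import math

def sign_bits(samples):
    """Reduce each sample to one bit: 1 if non-negative, else 0."""
    return [1 if s >= 0 else 0 for s in samples]

def one_bit_best_shift(reference, candidate, max_shift):
    """Return the shift of candidate whose sign bits best agree with reference.

    Each comparison is a bit-equality test, so no multiplications are needed.
    """
    ref_bits = sign_bits(reference)
    cand_bits = sign_bits(candidate)
    n = len(ref_bits)
    best_k, best_matches = 0, -1
    for k in range(max_shift + 1):
        if k + n > len(cand_bits):
            break  # not enough samples left to test this shift
        matches = sum(1 for a, b in zip(ref_bits, cand_bits[k:k + n]) if a == b)
        if matches > best_matches:
            best_k, best_matches = k, matches
    return best_k

def hann_window(n):
    """Smooth raised-cosine window of length n (an assumed window shape)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / n) for i in range(n)]

def window_and_add(output, segment, position):
    """Apply the smooth window to segment and add it into output in place."""
    weights = hann_window(len(segment))
    for i, (s, w) in enumerate(zip(segment, weights)):
        output[position + i] += s * w
```

In this sketch the window is applied after the matching time has been found, so the correlation operates on the unwindowed signal. With the window shape assumed here, segments placed half a window apart have weights that sum to one across the overlap.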
It would therefore be desirable to have an improved audio (voice) processing system and method that uses a synchronized-overlap-add technique with one bit correlation and windowing, and that overcomes limitations of conventional approaches. Accordingly, it is an objective of the present invention to provide for an audio processing system and method that uses an improved synchronized-overlap-add technique with one bit correlation and windowing.