1. Field of the Invention
The present invention relates to pitch shifters and, more specifically, to a pitch shifter for shifting an acoustic signal in pitch to an arbitrary level.
2. Description of the Background Art
Pitch is the perceived height of a sound and corresponds to its frequency. A pitch shifter is a device for shifting an acoustic signal in pitch to a desired level. One well-known example of such a pitch shifter is a key controller provided in a karaoke CD (compact disc) player or the like.
FIGS. 16a to 16c are diagrams for explaining the principle of shifting an acoustic signal in pitch to a desired level.
As shown in FIGS. 16a to 16c, an original acoustic signal shown in FIG. 16a is compressed to become an acoustic signal as shown in FIG. 16b of higher frequencies and pitch, and is extended to become an acoustic signal as shown in FIG. 16c of lower frequencies and pitch.
For example, if the acoustic signal is compressed to half along the time axis, the acoustic signal becomes double in frequency, and thus increases in pitch by one octave. On the other hand, if the acoustic signal is extended double along the time axis, the acoustic signal becomes half in frequency, and thus decreases in pitch by one octave.
In general, if the acoustic signal is compressed or extended by a factor of $k^{-1}$ along the time axis (where $k > 0$, with $k > 1$ for compression and $0 < k < 1$ for extension), the acoustic signal is shifted in frequency by a factor of $k$, and thus in pitch by $\log_2 k$ octaves.
Hereinafter, the above-stated k, which represents the ratio in pitch of the shifted acoustic signal to the original acoustic signal, is referred to as a "pitch shift ratio".
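The relation between the pitch shift ratio and the pitch change in octaves can be illustrated by the following minimal sketch. The function names are hypothetical, and the semitone helper assumes equal-tempered tuning, which is not stated in this description; under that assumption a ratio of about 1.26 corresponds to a shift of four semitones.

```python
import math


def pitch_shift_in_octaves(k: float) -> float:
    """Return the pitch change in octaves for a pitch shift ratio k > 0."""
    if k <= 0:
        raise ValueError("pitch shift ratio k must be positive")
    return math.log2(k)


def semitones_to_ratio(semitones: float) -> float:
    """Hypothetical helper: equal-tempered semitone offset to ratio k."""
    return 2.0 ** (semitones / 12.0)


# Halving the time axis (k = 2) raises the pitch by one octave,
# doubling it (k = 0.5) lowers the pitch by one octave.
octave_up = pitch_shift_in_octaves(2.0)
octave_down = pitch_shift_in_octaves(0.5)
```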
As described above, compressing or extending the acoustic signal along the time axis by a factor of $k^{-1}$ changes its frequency by a factor of $k$. Such compression or extension, however, also changes the time length (reproduction time) of the acoustic signal by a factor of $k^{-1}$ if no other measure is taken. Therefore, so-called "crossfading" is further carried out on the acoustic signal to prevent changes in time length.
FIG. 17 is a diagram for explaining the principle of a crossfading process for smoothly connecting two non-successive sound frames.
As shown in FIG. 17, consider a case in which a frame B is deleted, and a frame A and a frame C are connected together. In this case, if the frame A and the frame C are connected without any change, discontinuity occurs in signal value at their connecting point, and therefore noise may occur at signal reproduction.
Therefore, these frames are connected together with the frame A being faded out and the frame C being faded in. Continuity is thereby kept in signal value at their connecting point, and noise is prevented at signal reproduction.
However, if the frame A and the frame C are connected together by crossfading, reproduction time is shortened compared with the case where these frames are connected together without any change. Therefore, a combination of compression/extension along the time axis and crossfading enables shift in pitch of the acoustic signal without any other change.
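The connection of the frame A and the frame C by crossfading can be sketched as follows. This is only an illustration under assumptions: the function name is hypothetical, the frames are plain Python lists, and a linear fade is used, whereas the description does not fix the fade shape at this point.

```python
def crossfade_join(frame_a, frame_c, overlap):
    """Join two frames, fading frame_a out and frame_c in over `overlap` samples."""
    if not 0 < overlap <= min(len(frame_a), len(frame_c)):
        raise ValueError("overlap must fit inside both frames")
    head = list(frame_a[:-overlap])   # part of A played unchanged
    tail = list(frame_c[overlap:])    # part of C played unchanged
    mixed = []
    for i in range(overlap):
        w = i / overlap               # fade-in weight for C (linear ramp, assumed)
        a = frame_a[len(frame_a) - overlap + i]
        c = frame_c[i]
        mixed.append((1.0 - w) * a + w * c)
    return head + mixed + tail


joined = crossfade_join([1.0] * 4, [0.0] * 4, overlap=2)
```

The joined signal is shorter than the two frames placed end to end by the length of the overlap, which is the shortening of reproduction time mentioned above.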
FIGS. 18a and 18b are diagrams for explaining the principle of shifting the acoustic signal in pitch without any change in reproduction time. FIG. 18a shows a case in which a signal is increased in pitch, that is, compressed along the time axis (time-axis compression). FIG. 18b shows a case in which a signal is decreased in pitch, that is, extended along the time axis (time-axis extension).
In FIGS. 18a and 18b, a time length of a frame after time-axis compression/extension, that is, an output frame length, is first determined. Then, an input frame length is determined based on the pitch shift ratio. Here, assume that the pitch is multiplied by $k$, the output frame length is 2, and the input frame length is $2k$.
Next, input frames each of frame length $2k$ are sequentially extracted from the original signal such that two successive frames overlap each other. The length of each overlapping part is $(2k-1)$. In FIGS. 18a and 18b, three input frames are shown, represented by A1 and B2, A2 and B3, and A3 and B4, respectively.
Next, each extracted input frame is compressed or extended by a factor of $k^{-1}$ along the time axis with reference to the head of the frame (alternatively, with reference to its midpoint or end). Output frames each of frame length 2 are thus produced. Two successive output frames overlap each other over half of the frame length.
Specifically, in FIG. 18a, (A1H and B2H), (A2H and B3H) and (A3H and B4H) are the output frames, and (B2H and A2H), (B3H and A3H) are the overlapping parts. In FIG. 18b, (A1L and B2L), (A2L and B3L), (A3L and B4L) are the output frames, and (B2L and A2L), and (B3L and A3L) are the overlapping parts.
Next, all these output frames are connected together by crossfading. The crossfading process may be carried out over the whole or part of the overlapping parts.
In FIG. 18a, two cases are shown, one in which the crossfading process is carried out over the whole of the overlapping parts B2H and A2H, and B3H and A3H, and the other over 25% thereof. Also in FIG. 18b, two cases are shown, one in which the crossfading process is carried out over the whole (that is, 100%) of the overlapping parts B2L and A2L, and B3L and A3L, and the other over 25% thereof.
Thus, the acoustic signal can be changed in frequency by k times while being unchanged in reproduction time.
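The frame-based procedure of FIGS. 18a and 18b may be sketched as below, under several assumptions that are not taken from this description: the output frame length "2" is mapped to `2 * half_frame` samples, crossfading is carried out over the whole of each overlapping part with a triangular weight, and each input frame is compressed or extended by linear interpolation with reference to its head. All names are hypothetical.

```python
import math


def read_linear(x, pos):
    """Read x at a fractional position, clamping at the end of the data."""
    m = int(math.floor(pos))
    if m + 1 >= len(x):
        return x[-1]
    return x[m] + (pos - m) * (x[m + 1] - x[m])


def pitch_shift(x, k, half_frame):
    """Shift x in pitch by ratio k while keeping its length unchanged."""
    out = [0.0] * len(x)
    start = 0
    while start < len(x):
        # One output frame of 2*half_frame samples, read from an input
        # frame of about 2*k*half_frame samples beginning at `start`.
        for i in range(2 * half_frame):
            n = start + i
            if n >= len(x):
                break
            # Triangular crossfade weight: fade in, then fade out.
            if i < half_frame:
                w = i / half_frame
            else:
                w = (2 * half_frame - i) / half_frame
            out[n] += w * read_linear(x, start + k * i)
        start += half_frame   # successive frames overlap by half a frame
    return out


shifted = pitch_shift([1.0] * 64, k=1.26, half_frame=8)
```

Because the fade-out weight of one frame and the fade-in weight of the next always sum to 1, a constant input stays constant after the first half frame, and the output has the same number of samples as the input.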
Described below is a conventional pitch shifter for carrying out a pitch shifting process on discrete sound data through crossfading compression/extension.
FIG. 19 is a block diagram showing one example of structure of the conventional pitch shifter. FIG. 20 is a block diagram showing one example of structure of a conventional CD player equipped with the pitch shifter of FIG. 19.
In FIG. 20, a CD 20 has discrete sound data {x(0), x(1), x(2), x(3), . . . } recorded thereon in advance, the sound data having been produced by sampling an acoustic signal in every predetermined cycle T. The CD player includes a reader 21, a reproducer 22, a pitch shift ratio setting unit 23, a pitch control signal generator 24, a sound data output terminal 25, a pitch control signal output terminal 26, and a sound data input terminal 27.
The pitch shift ratio setting unit 23 includes a selector for selecting any of a plurality of predetermined pitch shift ratios or an adjustment control for specifying an arbitrary pitch shift ratio. The pitch shift ratio setting unit 23 sets the pitch shift ratio selected or arbitrarily specified by a user in the CD player. The pitch control signal generator 24 generates a pitch control signal indicating the pitch shift ratio set by the pitch shift ratio setting unit 23. The pitch control signal generated by the pitch control signal generator 24 is outputted from the pitch control signal output terminal 26.
The reader 21 sequentially reads the sound data from the CD 20. The sound data read by the reader 21 is sequentially outputted from the sound data output terminal 25 in every cycle T.
The pitch shifter receives the sound data {x(0), x(1), x(2), x(3), . . . } sequentially outputted from the sound data output terminal 25 and the pitch control signal outputted from the pitch control signal output terminal 26, and then sequentially produces pitch-shifted sound data {out(0), out(1), out(2), out(3), . . . } in the cycle T.
The pitch-shifted sound data sequentially produced by the pitch shifter is provided to the sound data input terminal 27. The reproducer 22 receives the pitch-shifted sound data {out(0), out(1), out(2), out(3), . . . } through the sound data input terminal 27, and reproduces the acoustic signal. The acoustic signal reproduced by the reproducer 22 is amplified by an amplifier not shown, and then provided to a speaker.
In FIG. 19, the conventional pitch shifter includes a memory unit 1, paired read address generators 4a and 4b that are identical in structure, paired interpolators 10a and 10b, a crossfader 3, a sound data input terminal 7, a sound data output terminal 8, and a pitch control signal input terminal 9.
The sound data {x(0), x(1), x(2), x(3), . . . } outputted from the sound data output terminal 25 of the CD player is provided to the sound data input terminal 7. The memory unit 1 temporarily stores the sound data.
The pitch control signal outputted from the pitch control signal output terminal 26 is provided to the pitch control signal input terminal 9. The read address generators 4a and 4b each generate, based on the pitch control signal, a read address for reading the sound data temporarily stored in the memory unit 1. That is, the pitch shift ratio indicated by the pitch control signal is accumulated as an address increment value, and the accumulation result is outputted as a read address.
FIG. 21 is a block diagram showing one example of structure of the read address generator 4a or 4b of FIG. 19.
In FIG. 21, each of the read address generators 4a and 4b includes an accumulator (ALU) 16 for accumulating the address increment value (=k). An example of an address generator structured in this way is disclosed in Japanese Patent Laid-Open Publication No. 9-212193 (1997-212193).
Thus, the address generator produces, for example, {0, 1, 2, 3, . . . } if the pitch shift ratio k is 1 (no pitch shift), and {0, 2, 4, 6, . . . } if k=2. Also, the address generator produces, for example, {0, 0.5, 1, 1.5, . . . } if k=0.5, and {0, 1.26, 2.52, 3.78, . . . } if k=1.26.
Note that the read address generators 4a and 4b generate addresses that differ from each other by a predetermined value.
For example, if {0, 1, 2, 3, 4, . . . } is generated by one address generator, {4, 5, 6, 7, 8, . . . } is generated by the other address generator. In other words, a set of read addresses (0, 4) is generated at a certain time; another set of read addresses (1, 5) is generated after the time T has elapsed from the certain time; still another set of read addresses (2, 6) is generated after the time T has elapsed, and the process continues in the same manner.
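The accumulation of the address increment value, and the pairing of two generators offset by a predetermined value, can be sketched as follows. The generator name and the `start` parameter are hypothetical, and floating-point accumulation is used only for brevity; an actual accumulator would presumably work in fixed-point arithmetic.

```python
from itertools import islice


def read_addresses(k, start=0.0):
    """Yield read addresses by accumulating the increment k once per cycle T."""
    address = start
    while True:
        yield address
        address += k


# Two generators with k = 1 whose addresses differ by a predetermined value of 4.
gen_a = read_addresses(1, start=0)
gen_b = read_addresses(1, start=4)
pairs = list(islice(zip(gen_a, gen_b), 3))

# Fractional increment, as for a pitch shift ratio of 1.26.
fractional = list(islice(read_addresses(1.26), 4))
```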
The difference between these two read addresses is determined based on the output frame length, pitch shift ratio (refer to FIGS. 18a and 18b), and other factors. How to determine the difference is not directly related to the present invention, and therefore is not described herein.
Referring back to FIG. 19, the memory unit 1 reads the sound data stored in advance, based on the read addresses generated by the read address generators 4a and 4b. For example, if the pitch shift ratio is 2, the read address generator 4a generates a read address {0, 2, 4, . . . }, and the memory unit 1 sequentially reads the sound data {x(0), x(2), x(4), . . . } in the cycle T. In this manner, compression to 1/2 along the time axis is carried out.
In other words, in the conventional pitch shifter, the memory unit 1 and the read address generators 4a and 4b achieve the above described compression/extension in time axis.
However, if the pitch shift ratio is 1.26, for example, a read address $\{0, 1.26 \times 1, 1.26 \times 2, \ldots\}$ is generated, but sound data such as $x(1.26 \times 1)$ and $x(1.26 \times 2)$ does not exist in the memory unit 1. Therefore, to achieve an arbitrary pitch shift ratio, interpolators 10a and 10b for calculating interpolation values from the sound data stored in the memory unit 1 are further required. The interpolator 10a generates an interpolation value based on the read address generated by the read address generator 4a and the sound data read from the memory unit 1 based on that address. The interpolator 10b generates an interpolation value based on the read address generated by the read address generator 4b and the sound data read from the memory unit 1 based on that address. Note that if the pitch shift ratio is an integer, that is, does not have any valid decimal part, no interpolation data is required.
With these interpolators 10a and 10b further provided, the pitch shifter can carry out compression/extension in time axis even if the pitch shift ratio has a decimal part. In other words, the acoustic signal can be shifted in pitch to an arbitrary level.
The crossfader 3 receives interpolated sound data outputted from the interpolator 10a and interpolated sound data outputted from the interpolator 10b, and carries out crossfading thereon. That is, each sound data is multiplied by a crossfading coefficient (which will be described later), and the results are then added together.
With such crossfader 3 further provided, the pitch shifter can shift an acoustic signal in pitch to an arbitrary level without any change in reproduction time.
From the sound data output terminal 8, sound data after subjected to crossfading compression/extension, that is, sound data shifted in pitch, is outputted.
The operations of the above-structured CD player and the conventional pitch shifter provided therein are described below.
In FIG. 20, the user first specifies, through an adjustment control not shown, a desired pitch shift ratio k, and then presses a PLAY button (not shown) provided thereon.
In response, in the CD player, the pitch shift ratio setting unit 23 first sets the pitch shift ratio k therein. Then, the reader 21 starts to read the sound data from the CD 20 in the cycle T. Also, the pitch control signal generator 24 starts to generate a pitch control signal indicating the pitch shift ratio k. Note that the pitch shift ratio k set in the above manner may be changed to another value after the start of reproduction.
The sound data thus read and the generated pitch control signal are provided to the conventional pitch shifter through the sound data input terminal 7 and the pitch control signal input terminal 9, respectively.
In FIG. 19, the provided input data is temporarily stored in the memory unit 1.
FIGS. 22a, 22b, and 22c are diagrams showing at a glance a pitch shifting process carried out by the pitch shifter of FIG. 19.
FIG. 22a is a diagram showing at a glance how the memory unit 1 of FIG. 19 stores sound data.
In FIG. 22a, x(0), x(1), x(2), . . . each are sound data. The horizontal axis represents real time t in units of the sampling cycle T, and also represents addresses on a buffer in the memory unit 1. A signal value of each sound data is represented by a distance from the horizontal axis.
As shown in FIG. 22a, the memory unit 1 stores the inputted sound data in sequence such as x(0) in address 0, x(1) in address 1, and x(2) in address 2.
On the other hand, the inputted pitch control signal is branched into two, and given to the read address generators 4a and 4b. Based on the given pitch control signal, the read address generators 4a and 4b each generate, in the cycle T, a read address, the two read addresses differing from each other by a predetermined value.
The generated paired read addresses are given to the memory unit 1 and the interpolators 10a and 10b. The memory unit 1 reads the sound data stored in advance (refer to FIG. 22a), based on the given paired read addresses.
FIG. 23 is a diagram showing a relation, on the buffer in the memory unit 1 of FIG. 19, between a write position of the inputted sound data and read positions of the sound data written in advance based on the addresses from the paired read address generators 4a and 4b, where the pitch is shifted to higher.
In FIG. 23, "w" is a write address pointer indicating a position on the buffer to which the sound data is written. "r1" and "r2" are read address pointers each indicating a position on the memory unit corresponding to the address from the address generator, that is, a position on the buffer from which the sound data is read based on the address.
Here, with reference to FIG. 23, described is how the memory unit 1 writes the inputted sound data in the buffer and reads the sound data from the buffer based on the given paired read addresses.
First, as shown in a top portion of FIG. 23, "r1" is located in a rearward position from "w" for a predetermined distance d, while "r2" is located in a rearward position from "r1" for the distance d. Here, a direction in which the pointer proceeds is a forward direction. After writing/reading starts, "r1" proceeds faster than "w", and "r2" proceeds as fast as "r1". Then, when "r1" catches up with "w", "r1" jumps to a rearward position from "r2" for the distance d.
The loci of "r1" and "r2" correspond to the area B2 and the area A2 shown in FIG. 18a, respectively.
Immediately after the jump of "r1", as shown in a middle portion of FIG. 23, "r2" is located in a rearward position from "w" for the distance d, while "r1" is located at a rearward position from "r2" for the distance d. Then, "r2" proceeds faster than "w", and "r1" proceeds as fast as "r2". Then, when "r2" catches up with "w", "r2" jumps to a rearward position from "r1" for the distance d.
The loci of "r2" and "r1" correspond to the area B3 and the area A3, respectively.
Immediately after the jump of "r2", as shown in a bottom portion of FIG. 23, "r1" is located at a rearward position from "w" for the distance d, while "r2" is located at a rearward position from "r1" for the distance d. Thereafter, "w", "r1", and "r2" each move in the same manner as described above.
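The movement of the three pointers can be illustrated by the following simplified simulation, which covers only the case where the pitch is shifted higher (k > 1), as in FIG. 23. It is a sketch under assumptions: the function name is hypothetical, "w" advances by 1 and both read pointers by k per cycle, the buffer is treated as unbounded rather than circular, and the starting value of "w" is chosen arbitrarily.

```python
def simulate_pointers(k, d, steps):
    """Trace w, the leading read pointer and the following read pointer."""
    w = 2.0 * d
    lead, follow = w - d, w - 2 * d        # r1 is d behind w, r2 is d behind r1
    lead_name, follow_name = "r1", "r2"
    history = []
    for _ in range(steps):
        history.append((w, lead_name, lead, follow))
        w += 1.0                            # one sound data written per cycle
        lead += k                           # read pointers advance by k per cycle
        follow += k
        if lead >= w:
            # The leading pointer caught up with w: it jumps to the position
            # d behind the other pointer, and the roles are exchanged.
            lead, follow = follow, follow - d
            lead_name, follow_name = follow_name, lead_name
    return history


trace = simulate_pointers(k=2, d=4, steps=9)
```

In this trace the leading pointer starts as "r1", is replaced by "r2" after four cycles, and returns to "r1" after eight cycles, matching the top, middle, and bottom portions described above.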
Referring back to FIG. 19, if the read address generated by the address generator does not represent an integer, in parallel with the above writing/reading, that is, compression/extension in time axis, an interpolation process is carried out by the memory unit 1 and the interpolators 10a and 10b. This interpolation process is described below.
If the read address represents an integer (that is, does not have any valid decimal part), the memory unit 1 reads the sound data stored in an address corresponding to the read address. However, if the read address has any valid decimal part, the memory unit 1 reads two pieces of sound data stored in addresses adjacent to the read address, that is, addresses immediately preceding and succeeding the read address.
Therefore, for example, if the read address represents 0, single sound data x(0) is read. If the read address represents 0.5, two pieces of sound data x(0) and x(1) are read. Similarly, if the read address represents 1.26, two pieces of sound data x(1) and x(2) are read.
The sound data read based on the address generated by the read address generator 4a is given to the interpolator 10a. The sound data based on the address generated by the read address generator 4b is given to the interpolator 10b.
The interpolators 10a and 10b each calculate an interpolation value based on the given sound data and read address, and produce interpolated sound data.
In other words, the interpolators 10a and 10b each output single sound data given by the memory unit 1 as the interpolated sound data if the read address does not have any decimal part. If the read address has any decimal part, the interpolators 10a and 10b each calculate an interpolation value based on that decimal part and the signal values of two pieces of sound data given by the memory unit 1, and then each produce the interpolation value as the interpolated sound data.
Calculation of the interpolation value is performed typically by so-called "linear interpolation".
FIG. 22b is a diagram showing, at a glance, linear interpolation performed by the interpolators 10a and 10b, where the pitch shift ratio k is 1.26.
In FIG. 22b, x(0), x(1), x(2), . . . are each the sound data stored in the memory unit 1, and $y(1.26)$, $y(1.26 \times 2)$, . . . are the interpolation values.
As shown in FIG. 22b, if the read address is 1.26, the interpolators 10a and 10b each calculate the interpolation value $y(1.26)$ from the decimal part 0.26 and the sound data x(1) and x(2) by using the following equation (1).
$$y(1.26) = x(1) + 0.26 \times \{x(2) - x(1)\} \tag{1}$$
Similarly, if the read address is $1.26 \times 2$, the interpolators 10a and 10b each calculate the interpolation value $y(1.26 \times 2)$ from the decimal part $(1.26 \times 2 - 2)$ and the sound data x(2) and x(3) by using the following equation (2).
$$y(1.26 \times 2) = x(2) + (1.26 \times 2 - 2) \times \{x(3) - x(2)\} \tag{2}$$
In general, if the read address is $k \times n$ ($k$ is the pitch shift ratio, and $n$ is an arbitrary integer), the interpolators 10a and 10b each calculate an interpolation value $y(k \times n)$ from the decimal part $(k \times n - m)$, where $m$ is the integer part of $k \times n$, and the sound data $x(m)$ and $x(m+1)$ by using the following equation (3).
$$y(k \times n) = x(m) + (k \times n - m) \times \{x(m+1) - x(m)\} \tag{3}$$
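Equation (3) can be written out as the short sketch below. The function name and the sample values are hypothetical; `x` stands for the sound data held in the memory unit 1, and the read address is assumed to lie inside the stored data.

```python
import math


def interpolate(x, address):
    """Linear interpolation of stored sound data x at a fractional read address."""
    m = math.floor(address)        # integer part: address of x(m)
    frac = address - m             # decimal part of the read address
    if frac == 0:
        return x[m]                # integer address: no interpolation needed
    return x[m] + frac * (x[m + 1] - x[m])


x = [0.0, 10.0, 20.0, 40.0]        # illustrative sample values only
y1 = interpolate(x, 1.26)          # uses x(1) and x(2), as in equation (1)
y2 = interpolate(x, 1.26 * 2)      # uses x(2) and x(3), as in equation (2)
```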
A pair of sound data is sequentially outputted in the cycle T from the interpolators 10a and 10b to the crossfader 3. The crossfader 3 carries out crossfading on the paired sound data.
The crossfader 3 stores in advance paired crossfading coefficients by which the paired sound data are multiplied.
FIG. 24 is a diagram showing one example of such paired crossfading coefficients by which the crossfader 3 of FIG. 19 multiplies the paired sound data.
In FIG. 24, $\alpha$ represents the position of sound data in a frame, counted from the head. $V(\alpha)$ is the crossfading coefficient by which the $\alpha$-th sound data from the head of the frame is multiplied. Assuming that the number of sound data included in one frame is $\alpha_0$, $V(\alpha) = 0$ if $\alpha = 0$, and $V(\alpha) = 1$ if $\alpha = \alpha_0/2$.
The crossfader 3 detects the position, from the head of the frame, of each sound data of the interpolated pair by counting the number of interpolated paired sound data provided thereto. For example, for the $n_1$-th and $n_2$-th interpolated sound data, the paired coefficients $V(\alpha)$ corresponding to $\alpha = n_1$ and $\alpha = n_2$ are obtained. Then, each sound data is multiplied by its corresponding $V(\alpha)$ and the multiplication results are added together.
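The multiplication by paired crossfading coefficients and the subsequent addition can be sketched as follows. The shape of V is an assumption: only V(0) = 0 and V at the mid-frame position = 1 are given above, and a triangular (piecewise-linear) coefficient is used here to connect them. The function names are hypothetical.

```python
def crossfade_coefficient(alpha, alpha0):
    """Assumed triangular coefficient: 0 at the frame head, 1 at mid-frame."""
    half = alpha0 / 2
    pos = alpha % alpha0
    if pos <= half:
        return pos / half
    return (alpha0 - pos) / half


def crossfade(sample_a, sample_b, n1, n2, alpha0):
    """Weight each interpolated sample by its coefficient and add the results."""
    return (crossfade_coefficient(n1, alpha0) * sample_a
            + crossfade_coefficient(n2, alpha0) * sample_b)


# Positions half a frame apart: the two coefficients sum to 1.
mixed = crossfade(1.0, 1.0, n1=2, n2=6, alpha0=8)
```

With positions that are half a frame apart, the two triangular coefficients always sum to 1, so the overall level is preserved across the crossfade.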
Then, the addition result, that is, the pitch-shifted sound data $\{y'(0), y'(k \times 1), y'(k \times 2), \ldots\}$, is outputted in the cycle T to the outside of the pitch shifter through the sound data output terminal 8.
The pitch-shifted sound data $\{y'(0), y'(k \times 1), y'(k \times 2), \ldots\}$ outputted from the pitch shifter is again provided to the CD player through the sound data input terminal 27.
In FIG. 20, the pitch-shifted sound data provided through the sound data input terminal 27 is given to the reproducer 22. The reproducer 22 reproduces the acoustic signal from the provided pitch-shifted sound data.
The acoustic signal reproduced in the above-described manner is amplified through an amplifier (not shown), and then provided to the speaker, and then converted into an acoustic wave.
FIG. 22c is a diagram showing at a glance the acoustic signal reproduced from the pitch-shifted sound data.
In FIG. 22c, {out(0), out(1), out(2), . . . } is an acoustic signal that corresponds to the pitch-shifted sound data $\{y'(0), y'(k \times 1), y'(k \times 2), \ldots\}$. The horizontal axis represents real time t in units of the cycle T.
As described above, in the conventional pitch shifter, the acoustic signal can be shifted in pitch through crossfading compression/extension without any change in reproduction time.
However, linear interpolation carried out at compression/extension could cause a large difference between an ideal value and the interpolation value, and thus signal distortion may occur at high frequencies.
Therefore, in order to reduce signal distortion at high frequencies, oversampling is suggested. In oversampling, the sampling frequency $T^{-1}$ of the sound data is raised to a higher frequency $N \times T^{-1}$ (where $N$ is a power of 2). $N$ is hereinafter referred to as an oversampling ratio.
FIG. 25 is a block diagram showing the structure of another conventional pitch shifter. Like the pitch shifter of FIG. 19, the pitch shifter of FIG. 25 is, for example, provided in the CD player of FIG. 20.
In FIG. 25, this pitch shifter includes the memory unit 1, the paired read address generators 4a and 4b, the paired interpolators 10a and 10b, the crossfader 3, the sound data input terminal 7, the sound data output terminal 8, the pitch control signal input terminal 9, an oversampler 11, and a downsampler 12.
In other words, the pitch shifter of FIG. 25 is similar in structure to that of FIG. 19 except that the oversampler 11 and the downsampler 12 are additionally provided.
The oversampler 11 receives the sound data {x(0), x(1), x(2), . . . } through the sound data input terminal 7, and carries out oversampling on the received sound data. Note that described hereinafter is a case in which the oversampling ratio is 2.
More specifically, the oversampler 11 includes an interpolator 13 and an anti-aliasing filter (low-pass filter 14a) for eliminating aliasing. First, the oversampler 11 inserts a value of 0 between each two successive pieces of sound data, that is, between x(0) and x(1), between x(1) and x(2), and so on. Then, the oversampler 11 carries out a filter operation in a cycle $(1/2) \times T$ on the 0-inserted sound data {x(0), 0, x(1), 0, x(2), 0, . . . } to calculate sound data {x'(0), x'(0.5), x'(1), x'(1.5), x'(2), x'(2.5), . . . }.
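Zero insertion followed by a filter operation can be sketched as below for an oversampling ratio of 2. The filter is an assumption: the characteristics of the low-pass filter 14a are not given here, so a 3-tap filter (0.5, 1, 0.5) is used purely for illustration, which makes each inserted value the average of its neighbours. A practical anti-aliasing filter would have many more taps. The function name is hypothetical.

```python
def oversample_by_two(x, taps=(0.5, 1.0, 0.5)):
    """Insert zeros between samples, then apply a short symmetric FIR filter."""
    stuffed = []
    for value in x:
        stuffed.extend([value, 0.0])       # {x(0), 0, x(1), 0, ...}
    center = len(taps) // 2
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = n + j - center
            if 0 <= idx < len(stuffed):    # treat data outside the range as 0
                acc += h * stuffed[idx]
        out.append(acc)
    return out


doubled = oversample_by_two([0.0, 2.0, 4.0])
```

The original values are kept at the even positions and new values appear at the half-cycle positions; the last value is attenuated only because data beyond the end is treated as 0.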
The downsampler 12 receives the pitch-shifted sound data $\{y'(0), y'(k \times 0.5), y'(k \times 1), y'(k \times 1.5), y'(k \times 2), y'(k \times 2.5), \ldots\}$ outputted from the crossfader 3, and carries out downsampling on the received sound data.
More specifically, the downsampler 12 includes an anti-aliasing filter (low-pass filter 14b) having a characteristic of eliminating aliasing, and a decimator 15. First, the downsampler 12 carries out a filter operation in the cycle $(1/2) \times T$ on the sound data $\{y'(0), y'(k \times 0.5), y'(k \times 1), y'(k \times 1.5), y'(k \times 2), y'(k \times 2.5), \ldots\}$ to calculate sound data $\{y''(0), y''(k \times 0.5), y''(k \times 1), y''(k \times 1.5), y''(k \times 2), y''(k \times 2.5), \ldots\}$. Then, the downsampler 12 decimates $\{y''(k \times 0.5), y''(k \times 1.5), y''(k \times 2.5), \ldots\}$ from the sound data $\{y''(0), y''(k \times 0.5), y''(k \times 1), y''(k \times 1.5), y''(k \times 2), y''(k \times 2.5), \ldots\}$.
Each of the components other than the oversampler 11 and the downsampler 12 basically carries out an operation similar to that carried out by the corresponding component of the pitch shifter shown in FIG. 19. The difference is that the operation cycle is halved to $(1/2) \times T$, and that the buffer in the memory unit 1 has to be doubled in capacity. In general, if the oversampling ratio is $N$, the operation cycle is $N^{-1} \times T$, and the buffer in the memory unit 1 has to be increased $N$-fold in capacity.
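The filter operation followed by decimation can be sketched as below for an oversampling ratio of 2. The 3-tap filter (0.25, 0.5, 0.25) is an assumption made for illustration only; the characteristics of the low-pass filter 14b are not specified here. The function name is hypothetical.

```python
def downsample_by_two(y, taps=(0.25, 0.5, 0.25)):
    """Low-pass filter the data, then discard every second sample."""
    center = len(taps) // 2
    filtered = []
    for n in range(len(y)):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = n + j - center
            if 0 <= idx < len(y):          # treat data outside the range as 0
                acc += h * y[idx]
        filtered.append(acc)
    # Keep the samples at whole-cycle positions and decimate the
    # half-cycle positions (0.5, 1.5, 2.5, ...).
    return filtered[::2]


halved = downsample_by_two([1.0] * 8)
```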
The pitch shifter of FIG. 25 is different in operation from that of FIG. 19 in the following two points.
First, in addition to the pitch shifting process, the oversampling process is carried out. More specifically, interpolation and filter operation are carried out before pitch shift, and filter operation and decimation are carried out after pitch shift.
Secondly, the number of sound data is increased by oversampling, and thus the amount of operation per unit time for a pitch shifting process is increased. More specifically, if the oversampling ratio is $N$, the operation cycle of the interpolators 10a and 10b and the crossfader 3 becomes $N^{-1} \times T$.
The sound data outputted from the pitch shifter of FIG. 25 is different from that from the pitch shifter of FIG. 19, which will be described below with reference to the drawings.
FIGS. 26a to 26c are diagrams showing at a glance the pitch shifting process carried out by the pitch shifter of FIG. 25.
As can be seen by comparing FIGS. 26a to 26c with FIGS. 22a to 22c, double oversampling halves the time interval between two successive sound data. In general, if the oversampling ratio is $N$, the time interval is multiplied by $N^{-1}$. Therefore, pieces of sound data closer to each other in time are used for calculating interpolation values when the read address has a decimal part. As a result, the calculated interpolation values are closer to the true values.
Therefore, the sound data $\{y''(0), y''(k \times 1), y''(k \times 2), \ldots\}$ outputted from the sound data output terminal 8 of the pitch shifter of FIG. 25 is reduced in signal distortion at high frequencies, compared with the sound data $\{y'(0), y'(k \times 1), y'(k \times 2), \ldots\}$ outputted from the sound data output terminal 8 of the pitch shifter of FIG. 19. The larger the oversampling ratio, the smaller the signal distortion at high frequencies.
As described above, the conventional pitch shifter operates based on the principle of crossfading compression/extension. Also, the conventional pitch shifter carries out linear interpolation if the pitch shift ratio has a decimal part. Therefore, an acoustic signal can be shifted in pitch to an arbitrary level with a high degree of accuracy. However, interpolation values produced through linear interpolation differ at high frequencies from true values. Thus, in the conventional pitch shifter, distortion in acoustic signal at high frequencies (hereinafter referred to as "high-frequency distortion") is a serious problem.
To solve the problem, it has been suggested that oversampling be further performed in the conventional pitch shifter. That is because oversampling can reduce the difference between the interpolation values produced through linear interpolation and the true values, and thus can reduce high-frequency distortion. The reduction in high-frequency distortion becomes more significant as the oversampling ratio becomes larger.
However, the above-structured conventional pitch shifter is further provided with not only the oversampler 11 but also the downsampler 12, and thus becomes greatly increased in size.
Moreover, in the above-structured conventional pitch shifter, the oversampler 11 and the downsampler 12 have to execute the filter operation in the cycle $T \times N^{-1}$ when carrying out $N$-fold oversampling. Then, as a result of $N$-fold oversampling, the number of sound data is increased $N$-fold, compared with the number of sound data when oversampling is not performed. Thus, the buffer in the memory unit 1 has to be increased $N$-fold in capacity. Also, the crossfader 3 and the interpolators 10a and 10b have to operate in the cycle $T \times N^{-1}$. In short, as the oversampling ratio becomes larger, the buffer in the memory unit 1 has to be larger in capacity, and the low-pass filters 14a and 14b of the oversampler 11 and the downsampler 12, respectively, the interpolators 10a and 10b, the crossfader 3, and other components have to operate faster. Therefore, the pitch shifter increases sharply in cost.
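The scaling of buffer capacity and operation cycle with the oversampling ratio can be made concrete with the small calculation below. The function name and the base buffer size are hypothetical, and the sampling frequency of 44.1 kHz is an assumption based on the usual CD format rather than a value given in this description.

```python
def oversampling_cost(n, base_buffer_samples, sampling_period):
    """Return buffer size and operation cycle for N-fold oversampling."""
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("oversampling ratio N must be a power of 2")
    return {
        "buffer_samples": n * base_buffer_samples,   # buffer grows N-fold
        "operation_cycle": sampling_period / n,      # cycle shrinks to T / N
    }


T = 1.0 / 44100.0                                    # assumed CD sampling cycle
for ratio in (1, 2, 4, 8):
    cost = oversampling_cost(ratio, base_buffer_samples=1024, sampling_period=T)
    print(ratio, cost["buffer_samples"], cost["operation_cycle"])
```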