Recently, with the development of digital technology, digital recording/storing media such as compact disks (CD's) have been put into practice and are used widely. The widespread use of digital recording/storing media is based on the development of A/D conversion technology for converting analog data into digital data. In particular, the technology for A/D converting audio signals such as voice data or music data is prevalent in the fields of domestic audio products and communication data equipment.
Further, because digitizing audio signals increases the amount of data, compressing/reproducing (coding/decoding) technology, which compresses digital data for transmission and reproduces the compressed and transmitted data upon reception, has been developed and put into practice in the field of communication data equipment for the purpose of reducing communication costs. The compressing/reproducing technology for recording and storing a larger amount of digital data has also been researched and developed in the fields of communication data equipment and domestic audio products, aiming not only at a higher compression rate but also at improved sound quality in the auditory sense. As one high-efficient coding technique intended to achieve a higher compression rate and improved sound quality in the auditory sense for audio signals in digital recording/storing media, there has been proposed a technique for audio signals using subband coding and auditory characteristics, as described, for example, in "Nikkei Electronics", Jul. 22, 1991 and Aug. 5, 1991. With this high-efficient coding technique, an input signal is sampled and divided into frames each comprising 384 samples, each frame is split by band-pass filters (bandwidth: 750 Hz) into bands, and bit allocation for each band is determined based on a threshold obtained from an auditory masking effect (using an amplitude spectrum computed with a 512-point FFT) and a minimum audible limit characteristic (with a simple tone of 1 kHz as reference), thereby compressing audio signals (sampling frequency: 48 kHz, quantization: 16 bits) comparable to those used in CD's into 128 kbits/sec.
FIG. 11 shows a block diagram based on the basic algorithm of the high-efficient coding technique for audio signals using subband coding and auditory characteristics. In FIG. 11, it is assumed that an input signal is processed in units of frames each corresponding to 8-20 ms (about 384-1000 samples) and that a group of 32 band-pass filters is used for the subband coding. A block denoted by 1 in the drawing is a subband coding block. The subband coding block 1 includes a group of band-pass filters (BPF1-BPF32), each comprising an FIR type polyphase filter with 64-512 taps. Output signals of the band-pass filter group are each converted into a base band signal through a thinning-out process in a thinning-out block, coded in a quantizing/coding block, and then output as coded data. In parallel to the above coding process, within an auditory characteristic block denoted by 2 in the drawing, the input signal is subjected to a 512- to 1024-point FFT (Fast Fourier Transform) in an FFT block for determining an amplitude spectrum. Subsequently, a threshold of the amplitude spectrum is obtained in a threshold calculating block within the auditory characteristic block 2 by applying models of the auditory masking effect and the minimum audible limit characteristic. Bit allocation for the output signals of the band-pass filters is determined from the threshold. Then, the coded data is combined with auxiliary data (such as the bit allocation and a scale factor for quantization) in a multiplexing block within the subband coding block, and the final output signal is delivered at 128 kbits/sec.
In other words, the subband coding is a coding technique with which an input signal is divided into a plurality of frequency bands by the use of band-pass filters, and auditory characteristics (the auditory masking effect and the minimum audible limit characteristic) are positively utilized in judging the amount of data (bits) to be allocated to each band so as to remove inaudible frequency components, thereby realizing the data compression. Utilizing such subband coding and the auditory masking effect requires many filter operations and hence operation processing using a plurality of DSP's (Digital Signal Processors).
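The bit-allocation idea described above can be illustrated with a short sketch. This is not the algorithm of the cited technique but a deliberately crude stand-in: the masking model (a fixed fraction of each band's mean level), the uniform band split, and the function names `masking_threshold` and `allocate_bits` are all simplified assumptions made purely for illustration.

```python
import numpy as np

def masking_threshold(frame, n_bands):
    """Toy stand-in for the auditory characteristic block: derive a
    per-band threshold from the FFT amplitude spectrum of the frame."""
    spectrum = np.abs(np.fft.rfft(frame, n=512))
    bands = np.array_split(spectrum, n_bands)
    # Crude assumption: each band masks components below 10% of its mean level.
    return np.array([0.1 * b.mean() for b in bands])

def allocate_bits(frame, n_bands=32, total_bits=256):
    """Give more bits to bands whose level exceeds the threshold;
    bands at or below the threshold (inaudible) receive no bits."""
    spectrum = np.abs(np.fft.rfft(frame, n=512))
    bands = np.array_split(spectrum, n_bands)
    level = np.array([b.mean() for b in bands])
    margin = np.maximum(level - masking_threshold(frame, n_bands), 0.0)
    if margin.sum() == 0.0:
        return np.zeros(n_bands, dtype=int)
    # Distribute the bit budget in proportion to the audible margin.
    return np.floor(total_bits * margin / margin.sum()).astype(int)

frame = np.random.randn(384)          # one 384-sample frame
bits = allocate_bits(frame)           # bits per band, within the budget
```

A real implementation would replace `masking_threshold` with a psychoacoustic model combining tone masking and the minimum audible limit curve, as the text describes.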
In the field of the quantizing technology for digitizing audio signals, vector quantization is also known as a technique which has been developed since the 1980's. Vector quantization is a quantizing method with which a set (vector) of plural values is expressed together by one code, with the intention of directly utilizing the redundancy between sample values for data compression.
The vector quantization is carried out as follows. The same quantization representative vectors are stored on both the transmission and reception sides. On the transmission side, a plurality of sample values are quantized together as a vector. For example, after sampling an input waveform, a series of two-dimensional vectors, each comprising two successive sample values as elements, is produced. For each input two-dimensional vector, the one of the two-dimensional quantization representative vectors stored on both the transmission and reception sides which is closest to the input vector is selected, and an index of the selected vector is coded and transmitted to the reception side. The reception side issues the quantization representative vector in accordance with the transmitted index. In other words, vector quantization divides an input waveform into waveform segments of a constant length in units of sample values, selects the one of a finite set of waveform patterns, expressed in the form of quantization representative vectors, which is closest to the waveform segment, and successively replaces the waveform segments with such waveform patterns, thereby compressing the input data.
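A minimal sketch of the transmission/reception procedure just described, assuming a hypothetical two-dimensional code book of four representative vectors (the code book values themselves are illustrative only):

```python
import numpy as np

# Shared code book: the same quantization representative vectors are
# stored on both the transmission and reception sides.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]])

def encode(samples):
    """Transmission side: group successive samples into two-dimensional
    vectors and transmit the index of the closest representative vector."""
    vectors = np.asarray(samples).reshape(-1, 2)
    # Squared Euclidean distance from each input vector to every codeword.
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def decode(indices):
    """Reception side: look up the representative vector for each index."""
    return codebook[indices].reshape(-1)

indices = encode([0.9, 1.1, -0.1, 0.2])   # two 2-D vectors -> two indices
reconstructed = decode(indices)           # approximation of the input
```

Only the indices cross the channel; each index costs log2(4) = 2 bits here instead of two full sample values, which is the source of the compression.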
Accordingly, the conditions affecting the performance of the quantization lie in how to search for the quantization representative vector which provides a minimum strain with respect to the input vector. To express such optimizing conditions, when the K-dimensional signal space in which the input vector exists is divided into N areas which do not overlap with each other, the N quantization representative vectors assigned to the N areas in one-to-one relation are each given as the centroid of the corresponding area. Each area satisfying the above optimizing conditions is called a "Voronoi area" and the manner of the division is called "Voronoi division". In the quantizing operations to determine the Voronoi areas, the number of levels N (the size of the set of quantization representative vectors which can replace the input vector: the code book size) is given by 2.sup.K.multidot.R if a coding rate R per sample and the number of dimensions K are specified. From that calculation, the memory capacity required for storing the quantization representative vectors is given by N.times.K.
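The relations N = 2^(K·R) and memory = N×K above can be checked numerically; the helper name `vq_cost` is hypothetical and introduced only for this illustration:

```python
def vq_cost(K, R):
    """Number of levels N = 2**(K*R) and the memory capacity N*K
    needed to store the quantization representative vectors."""
    N = 2 ** (K * R)
    return N, N * K

# At a coding rate of R = 2 bits per sample:
N2, mem2 = vq_cost(K=2, R=2)   # small code book for 2 dimensions
N8, mem8 = vq_cost(K=8, R=2)   # code book explodes at 8 dimensions
```

Going from K = 2 to K = 8 at the same rate multiplies the code book size from 16 to 65536 vectors, which is exactly the exponential growth the text identifies as the practical limit of the dimensionality.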
For designing a vector quantizer to carry out such a vector quantizing process, there is known a design method, called an LBG algorithm, using a learning series. In the LBG algorithm, starting from an appropriate initial code book, dividing conditions and representative point conditions are repeatedly applied to the learning series so that convergence into a satisfactory code book is achieved. A process flow of the LBG algorithm is shown in FIG. 12 and described below with reference to the drawing.
Note that expressions denoted by "X, Y" in the following description represent vectors.
In FIG. 12, initialization is first performed in step P1. It is assumed that the initialization sets the number of dimensions K, the number of levels N, an initial code book C.sub.N.sup.(0) comprising N initial quantization representative vectors Y.sub.1.sup.(0), Y.sub.2.sup.(0), . . . , Y.sub.N.sup.(0), a learning series T comprising L K-dimensional learning vectors X.sub.1, X.sub.2, . . . , X.sub.L, and a convergence determining threshold .epsilon.. Additionally, m=0 and an initial strain D.sup.(-1) =.infin. are also set. Then, a process of applying dividing conditions and calculating an average strain is performed in step P2. In this calculation process, the division P.sub.N.sup.(m) of the learning series T into N areas P.sub.1.sup.(m), P.sub.2.sup.(m), . . . , P.sub.N.sup.(m), such that the average strain becomes minimum under the code book C.sub.N.sup.(m) comprising quantization representative vectors Y.sub.1.sup.(m), Y.sub.2.sup.(m), . . . , Y.sub.N.sup.(m), is determined by applying the following dividing conditions (a learning vector X belongs to the area P.sub.i.sup.(m) if equation (1) holds for all j=1, . . . , N): EQU d(X, Y.sub.i).ltoreq.d(X, Y.sub.j) (1)
In other words, the area P.sub.i.sup.(m) corresponding to the quantization representative vector Y.sub.i.sup.(m) is given by the set of those learning vectors for which Y.sub.i.sup.(m), among the N quantization representative vectors, provides the minimum strain. Thus, the L learning vectors are divided into the N areas. The average strain D.sup.(m) is also calculated, which is produced when the learning vectors belonging to each area are replaced with the quantization representative vector of that area.
Thereafter, a convergence determining process is performed in step P3. If (D.sup.(m-1) -D.sup.(m))/D.sup.(m) <.epsilon. is satisfied in the convergence determining process, then the process is stopped and the code book C.sub.N.sup.(m) is output in step P4 as the finally designed code book of N levels. If convergence is not determined, then the process goes to step P5. Step P5 carries out a process of applying representative point conditions. In this process, a code book CC.sub.N comprising N quantization representative vectors Y.sub.1, Y.sub.2, . . . , Y.sub.N, which provide a minimum average strain with respect to the learning series T divided into the N areas P.sub.1.sup.(m), P.sub.2.sup.(m), . . . , P.sub.N.sup.(m), is determined by applying the representative point conditions given by the following equation (2): EQU Y.sub.i =(.intg..sub.Pi X.multidot.p(X)dX)/(.intg..sub.Pi p(X)dX) (2)
In other words, the centroid provided as the average vector of the learning vectors belonging to the area P.sub.i.sup.(m) is set as the quantization representative vector Y.sub.i. Subsequently, the replacements m.rarw.m+1 and C.sub.N.sup.(m) .rarw.CC.sub.N are executed in step P6, followed by returning to step P2.
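The steps P1-P6 above can be sketched as follows. This is a simplified illustration, not the exact procedure of FIG. 12: it assumes squared Euclidean distance as the strain measure d, draws the initial code book from the learning series itself, and keeps an empty area's representative unchanged, all of which are choices made for the sketch.

```python
import numpy as np

def lbg(learning_series, N, eps=1e-3, seed=0):
    """LBG code book design: alternately apply the dividing conditions
    (step P2) and the representative point conditions (step P5) until
    the relative drop in average strain falls below eps (step P3)."""
    rng = np.random.default_rng(seed)
    T = np.asarray(learning_series, dtype=float)
    # Step P1: initial code book of N vectors drawn from the learning series.
    code_book = T[rng.choice(len(T), size=N, replace=False)].copy()
    prev_strain = np.inf                      # D^(-1) = infinity
    while True:
        # Step P2: assign each learning vector to its nearest representative.
        d = ((T[:, None, :] - code_book[None, :, :]) ** 2).sum(axis=2)
        nearest = d.argmin(axis=1)
        strain = d[np.arange(len(T)), nearest].mean()
        # Step P3: convergence test (D^(m-1) - D^(m)) / D^(m) < eps.
        if (prev_strain - strain) / strain < eps:
            return code_book, strain          # step P4: output the code book
        # Step P5: replace each representative with the centroid of its area.
        for i in range(N):
            members = T[nearest == i]
            if len(members):
                code_book[i] = members.mean(axis=0)
        prev_strain = strain                  # step P6: m <- m+1, C <- CC

samples = np.random.default_rng(1).normal(size=(2000, 2))
book, strain = lbg(samples, N=8)
```

As the text notes, each pass costs a nearest-neighbour search over all N codewords per learning vector, which is why the running time grows sharply with the number of dimensions and levels.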
Whether the code book designed by the above algorithm is satisfactory or not strongly depends on the initial code book C.sub.N.sup.(0) and the method of selecting the learning series T. It is desired that the initial code book C.sub.N.sup.(0) covers the distribution range of the supposed input vectors.
On the other hand, the learning series T must include enough learning vectors to represent the characteristics of the supposed input vectors. If the number of learning vectors is small, a desired result may not be obtained for an input vector series different from the learning series. In order to prevent such an event, it is empirically known that the number of learning vectors must be on the order of at least several tens to a hundred times the number of levels N.
However, the prior art high-efficient coding techniques mentioned above have the following problems. In the subband coding technique, because auditory characteristics (the auditory masking effect and the minimum audible limit characteristic) are positively utilized to remove inaudible frequency components and thereby realize the data compression, many filter operations are required, and hence a plurality of band-pass filters and DSP's are required. This complicates not only the circuit configuration but also the operation processing, and increases the production cost.
Further, in the above-mentioned vector quantization, the memory capacity necessary for storing the quantization representative vectors increases exponentially with the number of dimensions K and the coding rate R, and the quantizing operations require N.times.K multiplications for each input vector, so that the amount of quantizing operations increases rapidly. Therefore, the number of dimensions cannot be enlarged beyond a limit, and the feasible number of dimensions is about eight. It is thus impossible for any existing computer to determine the quantization representative vectors in a number of dimensions higher than eight within a practical finite range of time.
Stated otherwise, the above prior art LBG algorithm for designing the vector quantizer aims at convergence to a satisfactory code book by applying the foregoing equation (1) as the dividing conditions for the division P.sub.N.sup.(m) of the learning series T, and the foregoing equation (2) as the representative point conditions for the code book CC.sub.N comprising the N quantization representative vectors Y.sub.1, Y.sub.2, . . . , Y.sub.N which minimize the average strain, while alternately applying those equations in a repeated manner. However, convergence to an optimum code book is not always ensured. Also, that design method for the vector quantizer can determine the above-mentioned Voronoi division through quantizing operations within a practical finite range of time when the number of dimensions is as low as one or two, but it is not realistic when the number of dimensions K is set to a larger value in conformity with practical input vectors, because the period of time required to execute the quantizing operations for determining the Voronoi division increases exponentially, making the operations difficult to execute.
In view of the above, there has also been an attempt to approach, by using the neural network technology, the various problems experienced when the prior art vector quantizing method is used to determine the code book satisfying the Voronoi division.
A neural network is directed to solving those problems which are beyond the capability of the sequential processing type computer systems developed so far, by using parallel processing type computer systems in which a plurality of processors are closely coupled to each other for realization of parallel processing, modeled after the brain's nerve cells, and it exhibits a superior ability for, in particular, problems related to patterns, such as pattern mapping, pattern completion and pattern recognition. Voice synthesis and voice recognition are one example of the fields including many such problems related to pattern recognition, and various methods proposed in the neural network technology are applicable to voice recognition. Various neural network techniques can be grouped in terms of pattern sorters depending on whether an input signal takes a binary pattern or a continuous value, whether the coupling coefficients are learned with a teacher or without a teacher, and so on. When applying any neural network technique to voice recognition, a technique adapted for an input pattern taking a continuous value is employed because voices change continuously in time series. Of the neural network techniques adapted for continuous values, the techniques using a teacher include the perceptron, the multilayered perceptron, etc., and the techniques using no teacher include the self-organizing feature mapping. Among those neural network techniques, the one which realizes the Voronoi division of the above-mentioned vector quantizing method is the self-organizing feature mapping.
The self-organizing feature mapping is a mapping technique for neural networks proposed by Kohonen of Helsinki University of Technology, and its network structure is simple because learning is performed without a teacher. The network structure comprises neurons in the form of combined layers interconnected on a layer-by-layer basis and input synapses coupled to all the combined layers. In the self-organizing feature mapping, the summation of products of an input signal and synapse loads is calculated, and one particular neuron in the combined layers is brought into an output state. The synapse loads are then changed so that the particular neuron brought into the output state and other neurons in the vicinity thereof react most sensitively to a similar input signal. At this time, since the neuron first brought into the highest output state maximizes its output in competition with the other neurons for the same input signal, such a learning process is also called competitive learning. This competitive learning gives rise to a phenomenon in which, when the reactivity of a particular neuron is raised for an input signal, the reactivity of those neurons which are positioned at a short distance from that neuron is also raised concomitantly, but the reactivity of those neurons which are positioned outside the above neurons is conversely lowered. Those neurons whose reactivity is raised owing to such a phenomenon are together called a "bubble". In other words, the basic operation of the self-organizing feature mapping resides in generating a particular "bubble" on the two-dimensional plane of neurons in accordance with an input signal. Then, the synapse loads of the neural network are automatically modified so that the "bubble" reacts selectively to a particular input signal (stimulus) while gradually growing. As a result of the above operation, the particular input signal is eventually made to correspond to the "bubble", making it possible to automatically sort the input signal.
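The learning rule described above can be sketched as follows. This is a simplified illustration of Kohonen-style learning, not a faithful reproduction of any particular implementation: the grid size, the decaying learning rate and neighbourhood width, and the Gaussian neighbourhood function (standing in for the "bubble") are all illustrative assumptions.

```python
import numpy as np

def train_som(inputs, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Self-organizing feature mapping: for each input, the best-matching
    neuron 'wins', and the synapse loads of the winner and its neighbours
    (the 'bubble') are pulled toward the input signal."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    dim = inputs.shape[1]
    weights = rng.normal(size=(rows * cols, dim))      # synapse loads
    # Grid coordinates of each neuron, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                # learning rate decays
        sigma = max(sigma0 * (1 - epoch / epochs), 0.5)  # bubble shrinks
        for x in inputs:
            # Competition: the neuron with the closest weights wins.
            winner = ((weights - x) ** 2).sum(axis=1).argmin()
            # Gaussian neighbourhood around the winner on the 2-D plane.
            g = np.exp(-((coords - coords[winner]) ** 2).sum(axis=1)
                       / (2 * sigma ** 2))
            # Pull the bubble's synapse loads toward the input.
            weights += lr * g[:, None] * (x - weights)
    return weights

data = np.random.default_rng(2).uniform(-1, 1, size=(500, 2))
w = train_som(data)    # each row approximates one representative vector
```

After training, each neuron's weight vector plays the role of a quantization representative vector, which is the correspondence to the Voronoi centroid discussed next.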
Accordingly, the pattern data which is eventually sorted and output by virtue of the synapse loads in the self-organizing feature mapping corresponds to the quantization representative vector which is obtained by the above-mentioned vector quantizing method as the centroid in the Voronoi division.
However, the method of sorting an input signal by the use of the self-organizing feature mapping has in the past been tried only for a map configuration on the two-dimensional plane. This has raised the problem that, as illustrated in FIG. 13, for example, an input signal may not be thoroughly sorted in the two-dimensional plane owing to the occurrence of a "distortion", and the input signal corresponding to the "distortion" part will not reach a coincidence and hence cannot be sorted. In the case of sorting and coding sound data by the use of the two-dimensional self-organizing feature mapping, for example, if any "distortion" occurs in the map, the sorting of the coded sound data does not give a good result; i.e., the code book in the vector quantizing method does not converge satisfactorily, with the result that the reproduced sound is not clear to the ear, and high-efficient coding of sound data utilizing the self-organizing feature mapping is impeded from being put into practice.
Further, in order to correct the "distortion" caused by the self-organizing feature mapping in the two-dimensional plane and obtain a good sorting result, a great number of learning iterations is required, which also impedes practical use of the self-organizing feature mapping.