Chaos theory provides several chaos theoretical exponent values, such as the correlation dimension, the KS entropy, and the Lyapunov exponents. The Lyapunov exponents, which are relatively easy to calculate, are used to assess the chaoticity of a phenomenon that produces a time series signal. It is common to analyze a time series signal, in particular a signal having periodic characteristics such as a speech voice, by calculating the first Lyapunov exponent or the Lyapunov spectrum.
The maximum Lyapunov exponent, or the first Lyapunov exponent in the Lyapunov spectrum, is generally calculated in order to determine the chaoticity of a time series signal (here, chaoticity means the characteristics of fluctuations, or the characteristics due to fluctuations, specific to a system). Systems used to calculate the exponents follow various procedures, such as Wolf's algorithm, Kantz' algorithm, Rosenstein's algorithm, Orel's algorithm, and Sano/Sawada's algorithm. The system in accordance with Sano/Sawada's algorithm is a typical example among these.
Whichever of these algorithms is used, the system evaluates an attractor constructed in a phase space from the time series signal. The Lyapunov exponent is calculated with respect to a neighborhood points set constructed on the attractor, and its value depends on how that neighborhood points set is constituted. To calculate a correct Lyapunov exponent, it is very important that the manifold that includes the neighborhood points set (a sphere, a cube, or the like in three dimensions; a hypersphere, a hypercube, or the like in four dimensions) be set appropriately with respect to the size of the attractor. When the time series signal includes noise that disturbs its chaoticity, the appropriate range of the size of that manifold, relative to the size of the attractor, is known to become narrower. Consequently, the level of noise that is included in the time series signal and disturbs its chaoticity can be evaluated by varying the size of the manifold that includes the neighborhood points set and examining how the calculated Lyapunov exponents change.
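The construction described above can be sketched as follows. This is a minimal illustration, not the method of any cited algorithm: it assumes time-delay embedding as the way the attractor is constructed, a hypersphere as the manifold, and the largest distance from the centroid as the "size" of the attractor. The function names, the parameters `dim`, `tau`, and `ratio`, and the test signal (which is quasi-periodic rather than chaotic) are all hypothetical.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Build phase-space points from a scalar series by time-delay embedding (assumed method)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    # Column i holds the series delayed by i * tau samples.
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

def attractor_radius(points):
    """Radius of a hypersphere, centred on the centroid, that encloses every point."""
    centre = points.mean(axis=0)
    return np.linalg.norm(points - centre, axis=1).max()

def neighborhood_set(points, index, ratio):
    """Indices of points inside a hypersphere of radius ratio * attractor_radius around points[index]."""
    eps = ratio * attractor_radius(points)
    dist = np.linalg.norm(points - points[index], axis=1)
    mask = dist < eps
    mask[index] = False  # the reference point is not its own neighbour
    return np.flatnonzero(mask)

# Placeholder signal for illustration only (quasi-periodic, not chaotic).
t = np.arange(2000) * 0.05
signal = np.sin(t) + 0.5 * np.sin(2.2 * t)

pts = delay_embed(signal, dim=3, tau=10)
small = neighborhood_set(pts, index=500, ratio=0.05)
large = neighborhood_set(pts, index=500, ratio=0.20)
print(pts.shape, len(small), len(large))
```

Changing `ratio` changes which points enter the neighborhood points set, which is why an exponent calculated from that set depends on how the set is constituted.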
Examples of such prior art include “Nonlinear Time Series Analysis” by Holger Kantz and Thomas Schreiber, UK, Cambridge Nonlinear Science Series 7, 1997, and “Measurement of the Lyapunov Spectrum from a Chaotic Series” by Sano M. and Sawada Y., Physical Review Letters, vol. 55, No. 10, 1985, pp. 1082-1085, JP-A-H07-116119, JP-A-H09-259107, JP-A-H09-308614, JP-A-H11-212949, JP-A-2000-1133347 and JP-A-2002-306492.
The Lyapunov exponents are generally calculated to evaluate the chaoticity of a time series signal, and the signal thus evaluated is said to be chaotic when the maximum Lyapunov exponent, or the first Lyapunov exponent in the Lyapunov spectrum, is positive.
The Lyapunov exponents, or more generally the Lyapunov spectra, are calculated with respect to the strange attractor constructed from the time series signal in an embedding space whose dimension is set in advance. Whatever its kind, the calculation system calculates the maximum Lyapunov exponent, or the Lyapunov spectrum in the case of a system using Sano/Sawada's algorithm, from the relative positions of many or all of the points that constitute the strange attractor.
The maximum Lyapunov exponent, or the first Lyapunov exponent in the Lyapunov spectrum (hereinafter referred to simply as the first Lyapunov exponent or the Lyapunov exponent), is an exponent expressing the rate at which points neighboring one another on the strange attractor separate from one another with the passage of time.
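As an illustration in conventional notation, the separation rate described above can be written as follows. The symbols d(0), d(t), and λ1 are introduced here for illustration only and are not taken from the cited references.

```latex
d(t) \approx d(0)\, e^{\lambda_1 t},
\qquad
\lambda_1 = \lim_{t \to \infty} \, \lim_{d(0) \to 0} \frac{1}{t} \ln \frac{d(t)}{d(0)}
```

Here d(0) is the initial distance between two neighboring points on the attractor and d(t) is their distance after time t. A positive λ1 corresponds to exponential separation of initially neighboring points.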
In any of these systems, a neighborhood points set is constituted on the strange attractor constructed in the embedding space, according to a neighborhood condition that is set as a ratio with respect to the size of the attractor. The Lyapunov exponents are then calculated as the mean value of the rate at which the points constituting the neighborhood points set separate from one another.
The conventional chaos theoretical exponent value calculation system uses one of the systems mentioned above, and those systems presume that a system of stable dynamics is being analyzed. Here, the dynamics is the behavior limited by the physical form and the like of a system, or the property that provides that behavior. A system of stable dynamics means a system whose physical disposition or length is invariable; if such a system behaves chaotically, the strange attractors generated from the time series signal it provides have similar shapes. Thus, for a system with temporally changing dynamics, the temporally local first Lyapunov exponent, or the Lyapunov spectrum in Sano/Sawada's algorithm, cannot be calculated as a significant value. A system with temporally changing dynamics refers to a system, such as the human vocal organs, in which the physical disposition or length changes. For instance, when the phonemes /a/ and /o/ are pronounced, the shapes of the throat and oral cavity differ, and the strange attractors generated from the speech voice signal differ accordingly. The shape of a strange attractor for the phoneme /a/ is shown in FIG. 8, and that for the phoneme /o/ is shown in FIG. 9. The strange attractor of /o/ cannot be obtained even when the fluctuation of the strange attractor of the phoneme /a/ is enlarged or noise is added to it.
For instance, a generic speech voice signal is an exemplary system with temporally changing dynamics. Because a plurality of vowels change in a complex manner within a short period of time, its analysis is extremely difficult compared with the analysis of the systems that the conventional methodology presumes. It has so far been almost impossible to calculate a temporally local first Lyapunov exponent in a system with temporally changing dynamics, such as an ordinary speech voice signal.
Even when a system calculates the temporally local first Lyapunov exponent by combining the above-mentioned methods with a statistical procedure, as long as it uses any one of the conventional methods for calculating the first Lyapunov exponent, it is not easy to make the processing unit time sufficiently short relative to the required temporal resolution while still obtaining stable processing results.
For example, with a combination of conventional methods, it is difficult to obtain the chaos theoretical exponent value at an effective precision for a short period of not more than one second. The first Lyapunov exponent can be calculated only when the signal to be processed has a sufficient SNR (signal-to-noise ratio); when the time series signal is a speech voice signal, this means a signal consisting of a clear single vowel from a single speaker. More specifically, if the SNR of the signal to be processed is poor, or if a plurality of phonemes are mixed, the first Lyapunov exponent cannot be calculated.
Whichever conventional system is applied, processing a continuous speech voice (the signal to be analyzed, which comes from a system with temporally changing dynamics) first requires extracting from it the voice signal for one processing unit of time. Suppose the voice is input to the system from a tape recorder and the system mechanically chops the signal to be processed into segments of the processing unit time. A difference of tens of milliseconds in the timing of pressing the play button of the tape recorder will then change the first Lyapunov exponent calculated from the voice signal contained in each processing unit by several tens of percent.
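The mechanical chopping described above can be sketched as follows. This is a minimal illustration under assumed values: the function `cut_processing_units`, the 8000 Hz sampling rate, the 1 second unit, the 0.05 second offset, and the random placeholder signal are all hypothetical and are not taken from any conventional system.

```python
import numpy as np

def cut_processing_units(signal, rate_hz, unit_s, offset_s=0.0):
    """Chop a signal into consecutive fixed-length processing units, starting at offset_s."""
    unit = int(round(unit_s * rate_hz))      # samples per processing unit
    start = int(round(offset_s * rate_hz))   # samples skipped by the late start
    usable = signal[start:]
    n_units = len(usable) // unit            # the incomplete trailing unit is dropped
    return usable[: n_units * unit].reshape(n_units, unit)

rate_hz = 8000                               # assumed sampling rate
rng = np.random.default_rng(0)
voice = rng.standard_normal(10 * rate_hz)    # placeholder for 10 s of recorded voice

units_a = cut_processing_units(voice, rate_hz, unit_s=1.0, offset_s=0.0)
units_b = cut_processing_units(voice, rate_hz, unit_s=1.0, offset_s=0.05)

# The two cuts share most, but not all, of the samples in each unit.
shift = int(round(0.05 * rate_hz))
shared_fraction = (units_a.shape[1] - shift) / units_a.shape[1]
print(units_a.shape, units_b.shape, shared_fraction)
```

Every unit in the second cut drops some samples from the start of the corresponding unit in the first cut and gains the same number at the end, so any value calculated per processing unit is calculated from different data.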
For example, suppose a continuous speech voice is processed with a processing unit of one second. A difference of 0.1 second in the cutting-out timing will change the first Lyapunov exponent for each second by 10 to 30%, even with the system using Sano/Sawada's algorithm, which is reputed to be relatively precise among the conventional systems. Temporal averaging reduces this change, but even when the time slice width for calculating the temporal mean value is approximately 5 minutes, the 0.1 second difference in the cutting-out timing of the processing unit still causes a difference of not less than a few percent in the temporal mean value of the first Lyapunov exponents.
On the other hand, the temporal mean value of the first Lyapunov exponents of a speech voice is thought to correlate closely with the fatigue level accumulated in the speaker. If the time slice used to calculate the temporal mean value could be shortened further, it is thought that the stress associated with each item of the speaker's speech content could be evaluated from the speech voice. However, as is clear from the foregoing description, the conventional methods cannot make the temporal resolution finer than 5 minutes, and, with the index value reliable to only one significant digit, they are limited to quantifying mid- to long-term stress. Real time or quasi real time evaluation of stress in the speech of a speaker is therefore impossible. In other words, a stable processing result cannot be obtained from a continuous speech voice (i.e., from a system with temporally changing dynamics). Here, the term “stable” means that the processing result hardly varies with a minute change of parameters.
If the only purpose is to verify the chaoticity of the time series signal by confirming that the first Lyapunov exponent is positive, no particular care is needed over the neighborhood condition. It is sufficient to set the neighborhood condition so that a sufficient number of neighborhood points exist, for example so that the radius of the neighborhood sphere (or neighborhood hypersphere) is on the order of a few percent of the radius of the sphere (or hypersphere) that includes the strange attractor. If, however, an exact calculation of the Lyapunov exponent is desired, setting the neighborhood condition appropriately becomes quite important.
In particular, if noise such as white noise is superimposed on a chaotic time series signal for some reason, such as the limited precision of the measuring system, setting the neighborhood condition for accurate calculation of the Lyapunov exponent of the system that provides the time series signal becomes far more complex than when processing an ideally chaotic time series signal.
When the neighborhood condition is set as the radius (ε) of the neighborhood sphere, as in the above example, the first Lyapunov exponent is correctly calculated in the range ε0<ε<ε1 for an ideally chaotic time series signal, as schematically shown in FIG. 5. If, on the other hand, white noise is superimposed on the signal, the relationship between the radius of the neighborhood sphere and the first Lyapunov exponent calculated pro forma varies with the ratio of the noise, as schematically shown in FIG. 6, and if stronger noise is superimposed, the first Lyapunov exponent can no longer be calculated as it can in the case of FIG. 5.
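The dependence of the calculated exponent on the neighborhood radius can be explored with a sketch such as the following. It rests on several assumptions: it uses a simplified Kantz-style divergence estimate rather than Sano/Sawada's algorithm, a one-dimensional logistic map as the ideally chaotic signal, and half the range of the series as the "radius" of the attractor. The function names and the parameters `eps_ratio`, `steps`, and `min_sep` are hypothetical.

```python
import numpy as np

def logistic_series(n, x0=0.3, r=4.0):
    """Time series from the logistic map, used here as a stand-in chaotic signal."""
    x = np.empty(n)
    x[0] = x0
    for i in range(1, n):
        x[i] = r * x[i - 1] * (1.0 - x[i - 1])
    return x

def divergence_slope(x, eps_ratio, steps=5, min_sep=10):
    """Slope of the mean log neighbour distance against the time step (Kantz-style estimate)."""
    x = np.asarray(x, dtype=float)
    eps = eps_ratio * (x.max() - x.min()) / 2.0   # radius as a ratio of the attractor "radius"
    n = len(x) - steps
    idx = np.arange(n)
    logs = [[] for _ in range(steps + 1)]
    for i in range(n):
        # Neighbours in space that are not merely neighbours in time.
        near = (np.abs(x[:n] - x[i]) < eps) & (np.abs(idx - i) > min_sep)
        js = idx[near]
        if js.size == 0:
            continue
        for k in range(steps + 1):
            d = np.abs(x[js + k] - x[i + k]).mean()
            if d > 0:
                logs[k].append(np.log(d))
    s = np.array([np.mean(v) for v in logs])
    return np.polyfit(np.arange(steps + 1), s, 1)[0]

x = logistic_series(3000)
slope_small = divergence_slope(x, eps_ratio=0.002)
slope_large = divergence_slope(x, eps_ratio=0.5)
for ratio in (0.002, 0.01, 0.05, 0.5):
    print(ratio, divergence_slope(x, ratio))
```

When the radius is a large fraction of the attractor, neighbor distances saturate almost at once and the estimated slope falls. Adding white noise to `x` before calling `divergence_slope` is one way to examine how the usable range of radii changes, although this sketch makes no claim to reproduce FIG. 5 or FIG. 6.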
It is reasonable to assume that some noise is included even when the voice is a simple continuous phoneme /a/ in a speech voice. There is therefore a problem in that the Lyapunov exponent of the vocal system that produces the phoneme /a/ cannot be calculated as correctly as the Lyapunov exponent of a system can be calculated from a time series signal generated by a mathematical system.