The present invention relates to a parameter estimation apparatus and a parameter estimation method and, more particularly, to improvement of stability when estimating parameters by using a neural network which stores information and operates adaptively to an object or environment.
The invention also relates to a parameter estimation control device and a parameter estimation control method and, more particularly, to those estimating parameters relating to control of a control object by using a neural network, and controlling the object in accordance with the estimated parameters.
Furthermore, the invention relates to a learning control device and a learning control method and, more particularly, to learning control which enables highly precise follow-up control to a target value when calculating a learning control quantity by using an output from a neural network.
Current digital computers used for calculation or control are stored-program, sequential computers of the so-called “von Neumann architecture”. On the other hand, there have been many studies on “neural networks” based on models of connected neurons which manage the functions of the human brain. Applications of neural networks for estimation or control have been proposed in various fields, for example, fields requiring pattern processing, which is a weak subject for von Neumann computers, or fields where an object has strong non-linearity and so is hard to analyze. In some fields, neural networks have been put to practical use.
That is, even when it is difficult to theoretically derive a causal relation between an input and an output in a physical or chemical system, a neural network enables estimation of an output value from an input value according to its learning function. Taking this advantage, in recent years, neural networks have been applied to control devices for controlling complicated control systems, especially, control devices for controlling objects of strong non-linearity.
A neural network has a plurality of multi-input and multi-output elements called “units” which are neurons simplified as model systems, and generates or changes the interconnections of the units by learning. These units form a feed-forward type hierarchical network or a feed-back type interconnection network.
FIG. 52 is a diagram for explaining a hierarchical network in which units form a multi-layer structure. In such a neural network, a plurality of intermediate layers reside between an input layer to which an object to be processed by the neural network is input and an output layer from which the processing result is output. The units included in each layer form connections with the units in the adjacent layer, and these connections are represented by connection weights or connection coefficients. The construction of these connections is formed by learning to output a desired signal with respect to a specific input. As a learning method useful for a hierarchical neural network as shown in FIG. 52, there is a back propagation method. This method attracts attention as it is able to provide a neural network which can be constructed with a technologically realizable number of units. Furthermore, there are two learning methods, “supervised learning” and “unsupervised learning”. In the former, a desired output (teaching signal) is given by the user, and in the latter, the neural network forms its own construction according to the statistical characteristics of an input signal. One of these methods is selected according to the application of the neural network.
Generally, in the hierarchical neural network shown in FIG. 52, control using the neural network is performed as follows. A region where control is to be executed is defined as a learning domain. Parameters required for control are estimated by using the neural network which has learned within the learning domain, and control is performed using the estimated parameters.
FIG. 53 is a diagram for explaining a method of calculating an estimate value (parameter) in the conventional neural network. With reference to FIG. 53, 1501 is an object to be controlled (hereinafter referred to as a control object), and 1502 denotes a neural network (NN) operation unit. An input and an output to/from the control object 1501 are U and Y, respectively. An operation parameter Z including time series data of these input and output is input to the NN operation unit 1502, and the processing result is obtained as an output (estimate value) X. By using the estimate value X so obtained, the input control quantity U to the control object 1501 can be calculated so that the output Y from the control object 1501 becomes a target value. The neural network (NN) of the NN operation unit 1502 has a three-layer structure which has one intermediate layer in the hierarchical network shown in FIG. 52, and inter-layer outputs are obtained by function operation such as a sigmoid function.
As an example of a control system using a neural network, there is an air-to-fuel ratio controller for an internal combustion engine of a motorcar. “Air-to-fuel ratio” is the ratio of air to fuel in the intake gas of the engine. Examples of air-to-fuel ratio controllers are as follows: a motorcar control device disclosed in Japanese Published Patent Application No. Hei. 3-235723, an air-to-fuel ratio controller disclosed in Japanese Published Patent Application No. Hei. 8-74636, and a parameter estimation device disclosed in Japanese Published Patent Application No. Hei. 11-85719 (Application No. Hei. 9-238017).
The advantage of using a neural network in an air-to-fuel controller for an internal combustion engine of a motorcar is as follows.
With respect to NOx, CO, and HC which are noxious gases included in an exhaust gas from a motorcar, regulations in various countries must be cleared. So, there is adopted a method of reducing the noxious gases by using a catalyst. As a typical catalyst, a ternary catalyst is used.
FIG. 54 illustrates the outline of an air-to-fuel ratio controller. Air flowing into an engine according to the opening degree of a throttle (TL) is mixed with a fuel injected from a fuel injection unit (INJ), and the mixture flows through a valve V1 into a combustion chamber, wherein explosion occurs. Thereby, a downward pressure is applied to a piston (P), and an exhaust gas is discharged through a valve V2 and an exhaust pipe. At this time, the air-to-fuel ratio is detected by an air-to-fuel ratio sensor AFS, and the exhaust gas is purified by a ternary catalyst (TC). To make the catalyst purify the noxious gases effectively, it is necessary to keep the air-to-fuel ratio constant, i.e., at 14.7. For this purpose, an air-to-fuel ratio controller which can keep the air-to-fuel ratio constant regardless of the motorcar's operating state is required.
In the air-to-fuel ratio controller constructed as described above, usually, feed-forward control is carried out, that is, increase or decrease in the quantity of fuel to be injected is corrected according to change of the throttle's opening degree or the like and, further, feed-back control is carried out as well. These controls secure successful results in the steady operation state such as idling or constant-speed driving. However, in the transient state such as acceleration or deceleration, it is very difficult to keep the air-to-fuel ratio constant by only the simple feed-forward/feed-back operation because of factors which are difficult to analyze, for example, a delay in response of the air-to-fuel ratio sensor, and successive change in the quantity of fuel actually flowing into the cylinder according to the driving state or external environment.
So, in order to improve the precision of air-to-fuel ratio control, a neural network learns non-linear factors such as the above-described fuel injection, and correction of the fuel injection quantity is controlled by using this neural network to improve the response characteristics in the transient state.
Such air-to-fuel ratio controller has already been known and, as an example, an air-to-fuel ratio controller disclosed in Japanese Published Patent Application No. Hei. 8-74636 will be described hereinafter.
FIG. 56 illustrates the structure of the air-to-fuel ratio controller which is somewhat simplified. In FIG. 56, E denotes an engine, 210 denotes a state detection unit for detecting the state of the engine E, 220 denotes a neural network (NN) operation unit for estimating the air-to-fuel ratio according to the state detected by the state detection unit 210, and 230 denotes a fuel injection quantity calculation unit for calculating the fuel injection quantity according to the result of the operation of the NN operation unit 220.
A description is given of the operation. In the air-to-fuel ratio controller, the state detection unit 210 detects a plurality of physical quantities indicating the state of the engine E, i.e., the engine speed (Ne), the intake air pressure (Pb), the throttle opening degree (THL), the fuel injection quantity (Gf), the intake air temperature (Ta), the cooling water temperature (Tw), and the detected air-to-fuel ratio (A/Fk). The NN operation unit 220 receives these parameters detected by the state detection unit 210, and estimates, by the neural network, the behavior of the real air-to-fuel ratio (A/Fr), which cannot be followed in the transient state by the air-to-fuel ratio sensor in the state detection unit 210. Then, the fuel injection quantity calculation unit 230 performs feed-back control so as to minimize an error between the estimated air-to-fuel ratio (A/FNN) and the target air-to-fuel ratio (A/Fref), and calculates a fuel injection quantity (Gb) which realizes the target air-to-fuel ratio. In this way, the air-to-fuel ratio (A/Fr) in the transient state, which cannot be obtained by the ordinary sensor because of the sensor's response delay or the like, can be obtained by using the neural network, whereby appropriate control of the air-to-fuel ratio is realized.
FIG. 57 illustrates an example of a neural network used in the NN operation unit 220. As shown in FIG. 57, this neural network is composed of three layers: a first layer L1 as an input layer, a second layer L2 as an intermediate layer, and a third layer L3 as an output layer.
The following parameters are input to the first layer L1 from the state detection unit 210 shown in FIG. 56: the engine speed (Ne), the intake air pressure (Pb), the throttle opening degree (THL), the fuel injection quantity (Gf), the intake air temperature (Ta), the cooling water temperature (Tw), and the detected air-to-fuel ratio (A/Fk). This air-to-fuel ratio (A/Fk) is the latest air-to-fuel ratio detected by the air-to-fuel ratio sensor in the control cycle. However, this air-to-fuel ratio (A/Fk) is not the real air-to-fuel ratio (A/Fr) because of response delay of the sensor. Each of the parameters input to the first layer L1 is multiplied by a weight based on the learning result, according to its synapse SY. In the second layer L2, the sum is calculated at each neuron NR, a threshold is then added to the sum and, thereafter, the result is converted to an output value according to a non-linear transfer function. Each of the output values from the second layer L2 is multiplied by another weight W, and the sum is calculated at a neuron NR in the third layer L3. Then, another threshold is added to this sum at the neuron NR in the third layer L3, and the result is converted according to another transfer function, whereby an estimated air-to-fuel ratio (A/FNN) is obtained.
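By way of illustration only, the forward calculation described above for a three-layer network may be sketched in Python as follows. The function names, the choice of a sigmoid function for the intermediate layer, the identity transfer function at the output layer, and all numerical values are assumptions made for this sketch; they are not taken from the cited publication.

```python
import math


def sigmoid(x):
    """Non-linear transfer function assumed for the intermediate layer."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input layer L1 -> intermediate layer L2 -> output layer L3.

    inputs   : input parameters (e.g. Ne, Pb, THL, ...), already scaled
    w_hidden : one weight list per intermediate neuron (synapse weights)
    b_hidden : one threshold per intermediate neuron
    w_out    : one weight per intermediate neuron, feeding the output neuron
    b_out    : threshold of the output neuron
    """
    hidden = []
    for weights, threshold in zip(w_hidden, b_hidden):
        # Weighted sum at the neuron, threshold added, then transfer function.
        total = sum(w * x for w, x in zip(weights, inputs)) + threshold
        hidden.append(sigmoid(total))
    # Output neuron: weighted sum plus threshold; the output transfer
    # function is taken as the identity here (an assumption of this sketch).
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical two-input, two-neuron example with made-up coefficients.
estimate = forward([0.3, 0.8], [[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1],
                   [1.2, -0.7], 14.5)
print(estimate)
```

In an actual controller the weights and thresholds would be the values fixed by learning, and the inputs would normally be scaled to a common range before being applied.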
In the conventional air-to-fuel ratio controller so constructed, satisfactory air-to-fuel ratio control is achieved by the estimation process using the neural network, even in the transient state of the engine operation where a correct air-to-fuel ratio cannot be obtained from only the output of the ordinary sensor. However, in a control system using a neural network as described above, the stability of the control system cannot always be assured. That is, a general neural network (hereinafter referred to as NN) is a black box and so its internal structure is not clear. Therefore, it is impossible to theoretically assure the stability for all the inputs. To be specific, even when each input parameter is within the learning domain, if an input pattern different from the input patterns used for learning is input to the NN, it cannot be theoretically assured that the NN always outputs a correct value (within an allowable estimate error). This will be described in more detail by using FIG. 58. In FIG. 58, NN1 is a neural network having generalization ability, which has correctly learned according to obtained data. On the other hand, NN2 is a neural network having no generalization ability, which has not correctly learned, and so its output varies significantly if the input pattern differs even a little. As is evident from FIG. 58, it is impossible to assure the stability of the NN-based control system. Therefore, in development of NN-based control devices which can be put to practical use, a great number of verification tests must be repeated, resulting in an increased number of processes and increased cost and time for development.
Furthermore, although the NN output values corresponding to all the input patterns can in principle be checked by calculation, the calculation time becomes considerable when the number of inputs to the NN is large, and so it is practically impossible to check the NN output values by calculation. Therefore, there is no method for assuring the stability of an object which shows complicated behaviors under various conditions, i.e., an object which actually needs to be controlled by the NN.
In order to assure the stability of NN control, there is proposed a method of using an NN offline as a tool for setting parameters such as control gains, instead of directly using NN outputs. In this method, however, the robustness (the ability to constantly obtain stable outputs against variations) against parameter variations of the control object is reduced.
Meanwhile, in order to avoid the worst, there is proposed a method of providing the NN output with a limiter to nullify the output data. In this method, however, depending on the control object, it is sometimes necessary to design a control system which can clear the control target value even in the state where the NN output is limited. In this case, the advantage of using the NN is lost.
In the above-described examples, neural networks are employed to compensate for the limitations of sensors. Hereinafter, a description is given of application of a neural network as a so-called software sensor. A software sensor functions as a substitute for a hardware sensor, by performing arithmetic processing.
FIG. 59 is a block diagram illustrating an ordinary control system using a sensor. In this system, when performing control to bring an output from a control object close to a target value, initially, the output value or a relating state quantity which becomes an index of the output value is detected, and an error between the detected state quantity and the target value is obtained. Then, by using a controller which is designed to bring the error to 0 (ZERO), feed-back control based on the detected value (output value or the index value) is performed to decide an input (control quantity) to the control object.
Turning to FIG. 59, the control system comprises a control object 1701, a sensor 1702 (sensor 1), and a control quantity operation means 1703. The control object 1701 outputs a state quantity to be brought close to a target value. The sensor 1702 is a high-performance sensor having sufficient precision for appropriate control. The sensor 1702 detects the state quantity output from the control object 1701 to output a detected value Y1. The control quantity operation means 1703 performs arithmetic operation to obtain an input value (control quantity) to be input to the control object 1701 in accordance with the detected value Y1.
In this control system, the control precision greatly depends on the precision of the sensor 1702 which detects the state. So, in order to appropriately perform feed-back control, the precision of the sensor 1702 must be sufficiently high. However, high performance sensors are generally expensive, and it is difficult to provide mass-produced models with such expensive sensors. Hence, neural networks are used as software sensors in place of high performance sensors. That is, a behavior equivalent to a high performance sensor is realized in a neural network by subjecting the neural network to learning using an output from the high performance sensor as a teaching signal.
FIG. 60 is a block diagram illustrating a learning system for realizing a neural network which can serve as a substitute for the sensor 1702 shown in FIG. 59. As shown in FIG. 60, this learning system comprises a control quantity generator 1800, a control object 1801, a neural network 1802 (NN1), and a sensor 1804 (sensor 1).
The control quantity generator 1800 generates a control quantity to be input to the control object 1801. In this learning system, an assumed input region (operation domain) is defined as a learning domain, and the control quantity generator 1800 is set to generate a control quantity within the operation domain. The control object 1801 is an object to be subjected to control which uses a neural network as a substitute for the sensor 1804. The neural network 1802 is subjected to learning so that it becomes a software sensor to be used as a substitute for the sensor 1804. The sensor 1804 is a high performance sensor having a sufficient precision for appropriate control.
A description is now given of the operation of the neural network learning system so constructed. Initially, the control quantity generator 1800 generates a control quantity U and outputs it to the control object 1801 and the neural network 1802. The control object 1801 performs a predetermined operation to generate an output according to the control quantity U. The sensor 1804 detects the output of the control object 1801 and outputs a signal Y1 indicating the result of the detection, as a teaching signal, to the neural network 1802.
The neural network 1802 receives the control quantity U and a state quantity (usually, plural quantities) z indicating the state of the control object 1801, and outputs an estimate value Ynn according to these inputs. The estimate value Ynn is compared with the teaching signal Y1. Based on the result of the comparison, the connection coefficients of the neural network are learned so that the estimate value Ynn of the neural network 1802 becomes the teaching signal Y1 by, for example, the back propagation method.
In the learning system so constructed, the neural network 1802 can learn the behavior characteristics of the sensor 1804. Then, a control system is constituted by using the neural network 1802 as a software sensor, whereby a control quantity can be calculated so that the output of the control object becomes the target value by using the estimate value from the neural network 1802 instead of the detected value from the sensor 1804.
FIG. 61 is a block diagram illustrating a control system using a neural network which has learned as described above. As shown in FIG. 61, this control system comprises a control object 1901, a neural network 1902 (NN1), and a control quantity operation means 1903.
The control object 1901 is identical to the control object 1701 shown in FIG. 59, and this is an object to be controlled by the control system. The neural network 1902 is identical to the neural network 1802 which has learned in the learning system shown in FIG. 60. The control quantity operation means 1903 generates a control quantity to be input to the control object 1901 in accordance with an estimate value output from the neural network 1902.
The control system so constructed can perform control identical to that of the control system shown in FIG. 59, by using the estimate value Ynn output from the neural network 1902 instead of the quantity Y1 detected by the high performance sensor 1702 shown in FIG. 59.
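By way of illustration only, the equivalence of the control systems of FIG. 59 and FIG. 61 may be sketched in Python as follows. The static control object, the integral-type update law, the gain, and all function names are hypothetical choices made for this sketch; the source does not specify the form of the control quantity operation means.

```python
def run_control(target, plant, sensor, gain=0.2, steps=50, u0=0.0):
    """Feed-back loop that drives the sensed output toward the target.

    plant  : callable u -> y, the control object (hypothetical model)
    sensor : callable (u, y) -> value used for feed-back; either a
             hardware sensor reading y or a software sensor estimating it
    """
    u = u0
    for _ in range(steps):
        y = plant(u)                    # actual output of the control object
        error = target - sensor(u, y)   # error seen by the controller
        u += gain * error               # integral-type correction (assumed)
    return u, plant(u)


def plant(u):
    """Hypothetical static control object: output proportional to input."""
    return 2.0 * u


def hardware_sensor(u, y):
    """High performance sensor: reports the actual output (value Y1)."""
    return y


def software_sensor(u, y):
    """Stand-in for a learned network: estimates the output from the
    control quantity alone and never reads y (value Ynn)."""
    return 2.0 * u


def biased_software_sensor(u, y):
    """Stand-in for an imperfectly learned network with a constant offset."""
    return 2.0 * u + 0.3


for name, sensor in [("hardware", hardware_sensor),
                     ("software", software_sensor),
                     ("biased software", biased_software_sensor)]:
    u, y = run_control(14.7, plant, sensor)
    print(name, round(y, 6))
```

With the biased stand-in the loop settles where the estimate, not the actual output, equals the target, which is one way of seeing why the control precision depends on the estimation precision of the software sensor.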
For example, a calculator of von Neumann type architecture performs processing according to a calculation algorithm which has already been known, and an inevitable causal relation resides between an input and an output. Accordingly, as long as it is known that an output can be obtained from an input by using an algorithm, it is theoretically possible to estimate an output which can be obtained from another input. However, the situation differs when using a neural network.
The intermediate layer in the neural network constituting the hierarchical network shown in FIG. 52 is called “a hidden layer”, and the connection structure inside the neural network cannot be known, that is, it is in the black box state. Accordingly, only an input to the neural network and an output corresponding to the input can be known from the outside. So, even in a domain where learning has been performed, it is impossible to theoretically assure that stable outputs are obtained with respect to all inputs.
To be specific, in the case where a plurality of parameters are input to the neural network to obtain their estimate values, if patterns different from the input patterns used for learning are input even though each input parameter is within the learning domain, it is impossible to theoretically assure that the estimate values are appropriate (i.e., within an allowable estimation error). In this case, theoretically assured are only the estimate values of the input patterns used for learning.
Accordingly, in order to develop a control device using a neural network which has sufficiently high stability for practical use, it is necessary to repeat a great number of verification tests, resulting in increased number of processes for development.
On the other hand, to check the estimate values (outputs from the neural network) for all the input patterns by calculation is practically possible if the neural network is small in scale. However, in a neural network having a large number of inputs, such calculation takes a lot of time. Therefore, when a neural network is used for a control object showing complicated behaviors (i.e., a control object which actually needs to be controlled by the neural network), such calculation is impossible in fact and, consequently, the stability cannot be assured.
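By way of illustration only, the growth of the checking effort with the number of inputs may be estimated in Python as follows. The grid resolution of 100 levels per input and the evaluation time of one microsecond per pattern are assumptions chosen for this sketch, not values from the source.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365


def exhaustive_check_count(n_inputs, levels_per_input):
    """Number of input patterns when every input is quantized to a grid."""
    return levels_per_input ** n_inputs


def check_time_years(n_inputs, levels_per_input, seconds_per_pattern=1e-6):
    """Time to evaluate the network once for every pattern, in years."""
    patterns = exhaustive_check_count(n_inputs, levels_per_input)
    return patterns * seconds_per_pattern / SECONDS_PER_YEAR


# Two inputs are checked quickly; seven inputs (as in the air-to-fuel
# ratio example) at the same resolution are not.
for n in (2, 4, 7):
    print(n, exhaustive_check_count(n, 100), check_time_years(n, 100))
```

Under these assumptions, a network with seven inputs has 100^7 patterns and would need on the order of years of computation, whereas a network with two inputs needs a fraction of a second.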
Hereinafter, a learning process of a conventional neural network will be described with reference to FIG. 62. FIG. 62 is a schematic diagram illustrating a learning process of a neural network which receives input parameters such as the engine speed (Ne) and outputs an estimated air-to-fuel ratio (A/FNN).
Initially, an engine E of a motorcar is provided with a state detection unit 210 which is identical to that shown in FIG. 56, and the state detection unit 210 collects data for learning when the motorcar is driven. The data so collected is input to a learning data generation unit 240a, wherein the data is converted to learning data in which the sensor's response delay and the like are adjusted. The learning data is composed of an air-to-fuel ratio as teaching data (A/Ft) and input parameters (e.g., the engine speed (Ne)) corresponding to the air-to-fuel ratio. The air-to-fuel ratio as the teaching data (A/Ft) can be obtained from an air-to-fuel ratio (A/Fk) detected by the air-to-fuel ratio sensor, considering the detection delay. Although it is desirable that past data of the respective parameters be included, the past data are omitted here to simplify the description.
The learning data generated by the learning data generation unit 240a is stored in the learning data storage unit 240b. The learning execution unit 240c performs learning of the neural network (NN) by using the stored learning data.
To be specific, the learning execution unit 240c inputs the parameters such as the engine speed (Ne) to the neural network NN. In response to the input, the neural network NN outputs an estimated air-to-fuel ratio (A/FNN). Then, the differentiator 240d detects an error e between the estimated air-to-fuel ratio A/FNN and the teaching air-to-fuel ratio A/Ft. The construction of the neural network NN (i.e., the weights of the synapses SY, the transfer functions, etc.) is varied so that the error e becomes smaller than an allowable value, for example, 0.1 (average) when calculated in A/F ratio equivalent. When the error e becomes smaller than the allowable value or when the learning operation reaches a specified number of times, the learning process is completed.
Thereby, the construction of the neural network NN is defined by the weights of the synapses SY, the transfer functions, and the like at the time when the learning has been completed. When an input is given to the neural network which has completed the learning, the neural network can output an air-to-fuel ratio (estimate value) having an error smaller than the allowable value, i.e., 0.1 (average) when calculated in A/F ratio equivalent.
The conventional neuro learning will be described by using a flowchart shown in FIG. 63. Initially, a learning data set (a set of a neuro input data sequence INP and the corresponding teaching signal Yt) is formed by using real data (step 201). Then, a neuro estimate value Ynn(i) is calculated by using the neuro input data sequence INP (step 202), and a performance function E (= Σe²/2) based on an error e(i) between the teaching signal Yt(i) and the neuro estimate value Ynn(i) is calculated (step 203). Then, the neuro connection coefficients are updated so as to decrease the performance function E (step 204). Thereafter, a performance function Enew is calculated using the updated connection coefficients, and it is decided whether the error goal is achieved or not (step 205). When it is not achieved, the control returns to step 202 to advance the learning. The learning is ended when the error goal is achieved.
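By way of illustration only, the learning flow of steps 201 to 205 may be sketched in Python as follows, using a network with one scalar input, a sigmoid intermediate layer, and a linear output. The network size, the plain gradient-descent update, the learning rate, the error goal, the iteration limit, and the learning data are all assumptions made for this sketch; the source states only that the connection coefficients are updated so as to decrease E.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def estimate(params, x):
    """Return (Ynn, intermediate-layer outputs) for a scalar input x."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2, hidden


def performance(params, data):
    """Performance function E = sum of e(i)^2 / 2 over the data set."""
    return sum((yt - estimate(params, x)[0]) ** 2 for x, yt in data) / 2.0


def learn(data, n_hidden=3, rate=0.01, goal=1e-4, max_iter=5000, seed=0):
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b2 = 0.0
    history = [performance((w1, b1, w2, b2), data)]
    for _ in range(max_iter):
        if history[-1] <= goal:        # step 205: error goal achieved
            break
        g_w1 = [0.0] * n_hidden
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, yt in data:
            ynn, hidden = estimate((w1, b1, w2, b2), x)   # step 202
            e = yt - ynn                                  # step 203
            g_b2 -= e                                     # dE/db2 = -e
            for j, h in enumerate(hidden):
                g_w2[j] -= e * h                          # dE/dw2_j
                back = -e * w2[j] * h * (1.0 - h)         # error propagated back
                g_w1[j] += back * x
                g_b1[j] += back
        # Step 204: update the connection coefficients to decrease E.
        w1 = [w - rate * g for w, g in zip(w1, g_w1)]
        b1 = [b - rate * g for b, g in zip(b1, g_b1)]
        w2 = [w - rate * g for w, g in zip(w2, g_w2)]
        b2 -= rate * g_b2
        history.append(performance((w1, b1, w2, b2), data))  # Enew
    return (w1, b1, w2, b2), history


# Step 201: a hypothetical learning data set of (INP, Yt) pairs.
data = [(0.0, 0.0), (0.25, 0.0625), (0.5, 0.25), (0.75, 0.5625), (1.0, 1.0)]
params, history = learn(data)
print(history[0], history[-1])
```

The loop ends either when the error goal is reached or when the iteration limit is reached, corresponding to the two completion conditions mentioned for the learning execution unit 240c.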
In the conventional learning control device so constructed, it is possible to make the neural network perform learning so that the error between the output of the neural network and the value of the teaching signal becomes smaller than a predetermined value. As the result of the learning, the construction of the neural network is changed from that before the learning.
However, in a control system using a neural network as described above, in order to improve the control precision, high-precision learning of the neural network is indispensable.
For example, when performing control to follow a target value by using an estimate value, the estimate value sometimes deviates from the actual value. When the estimate value lies on the same side of the target value as the actual value, it is correct in regard to the direction of correction, and control proceeds toward the target value, giving a favorable result. However, when the estimate value lies on the opposite side of the target value from the actual value, the direction of correction is inverted, whereby control proceeds away from the target value, resulting in degraded control precision.
That is, in the conventional learning flow shown in FIG. 63, learning is performed so as to reduce only the absolute value of the difference between the neuro estimate value and the target value. So, depending on the positional relationship between the neuro estimate value and the target value, the neural network after learning advances control in the direction away from the target value, resulting in degraded control precision.
This will be explained taking the air-to-fuel ratio as an example. If the estimated air-to-fuel ratio deviates in the opposite direction from the target air-to-fuel ratio, there occurs a problem that the fuel becomes “lean” and the engine stops although the fuel must be “rich” to accelerate the engine.
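By way of illustration only, the two deviation cases described above may be distinguished in Python as follows. The function names and the sample values are hypothetical, and treating a deviation of exactly zero as "not confirmed" is an assumption of this sketch.

```python
def correction_direction_ok(estimate, actual, target):
    """True when the estimate lies on the same side of the target as the
    actual value, so that the correction moves toward the target."""
    return (estimate - target) * (actual - target) > 0


def count_inverted(samples, target):
    """Count (estimate, actual) pairs whose correction direction is inverted."""
    return sum(1 for est, act in samples
               if not correction_direction_ok(est, act, target))


TARGET_AF = 14.7
# Both estimates below have the same absolute error (0.4) relative to an
# actual air-to-fuel ratio of 15.0, yet only the first one leads to a
# correction in the right direction.
print(correction_direction_ok(15.4, 15.0, TARGET_AF))
print(correction_direction_ok(14.6, 15.0, TARGET_AF))
```

A performance function based only on the squared or absolute error rates these two estimates equally, which is the limitation of the conventional learning flow noted above.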
Furthermore, the above-described learning to improve the estimation precision is carried out such that learning data is not limited within a specific range, or learning data within the behavior range of the control object is prepared in advance. However, whether sufficient learning has been performed with the learning data or not is known only from the result of the control. So, usually, the following development routine must be repeated. That is, learning is performed and then evaluation for the learning is performed. Then, learning data in a domain having unsatisfactory evaluation result is collected to re-form the data, followed by re-learning. Here, “domain” is a concept indicating the operating state of the engine, which is determined according to a combination of regions of at least one input parameter.
Furthermore, based on the result of evaluation using evaluation data, when part of the learning data has a large error, usually the learning data in the domain having the large error is increased for re-learning to minimize the error. At this time, it is necessary to select additional data, considering the balance of distribution of the whole learning data, i.e., the balance of the number of learning data in each domain.
However, the re-learning with the selected additional data sometimes results in the following drawbacks. That is, in a domain where a satisfactory result was obtained in the previous learning, no satisfactory result is obtained in the re-learning. Further, the learning result degrades as a whole.
In this case, selection of learning data must be performed again. Because of the repetitions of data collection and re-learning, the whole learning process takes a lot of time until a satisfactory result is obtained.
As described above, it is very difficult to improve the estimation precision of a neural network over a broad estimation domain, and at present, trial and error is the only measure to solve this problem.
It is an object of the present invention to provide a parameter estimation control device which can express a control object, which has conventionally been expressed by a large-scale neural network, by using a plurality of small NN constructions with high precision, and which can perform analysis of stability.
It is another object of the present invention to provide a parameter estimation control device which can estimate parameters by using a neural network to perform stable control, without increasing cost and processes for development.
It is still another object of the present invention to provide a parameter estimation control method which can estimate parameters by using a neural network to perform stable control, without increasing cost and processes for development.
It is a further object of the present invention to provide a learning control device and a learning control method for calculating control quantities which follow target values, by using output values from a neural network and, more particularly, to those capable of realizing high-precision learning in which estimate values enabling high-precision follow-up control are obtained, improving, by re-learning, only the estimation precision in a domain of poor estimation precision without affecting other domains, and performing precise and efficient learning.
Other objects and advantages of the invention will become apparent from the detailed description that follows. The detailed description and specific embodiments described are provided only for illustration since various additions and modifications within the scope of the invention will be apparent to those of skill in the art from the detailed description.
According to a first aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, and this device comprises: control domain division means for selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimation means for estimating dynamic behavior of the control object by using, as a consequent part of the fuzzy operation, a neural network which receives a parameter indicating the operation state of each of the small domains into which the control domain is divided. Therefore, the NN (neural network) construction serving as an estimator in the consequent part of the operation is reduced in size, and so the NN output can be checked in advance, whereby an operation state X whose estimate error in the entire operation domain is within an allowable range can be obtained. Hence, the stability of the control system can be assured without providing a limiter or the like.
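By way of illustration only, the division of the control domain by a premise part and the use of a small estimator per small domain as the consequent part may be sketched in Python as follows. The triangular membership functions, the normalized weighted-sum combination (a Takagi-Sugeno style blend, substituted here because the passage does not specify the fuzzy operation), the choice of engine speed as the premise parameter, and the constant stand-ins for the small neural networks are all assumptions made for this sketch.

```python
def triangular(x, left, center, right):
    """Membership grade of x in a triangular fuzzy set (assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def fuzzy_estimate(premise, state, domains):
    """Blend the estimators of the small domains by membership grade.

    premise : value of the premise parameter (e.g. engine speed)
    state   : parameters indicating the operation state, passed to each
              small-domain estimator
    domains : list of ((left, center, right), estimator) pairs, where each
              estimator stands in for a small neural network
    """
    grades = [triangular(premise, *shape) for shape, _ in domains]
    total = sum(grades)
    if total == 0.0:
        raise ValueError("premise value lies outside every small domain")
    weighted = sum(g * est(state) for g, (_, est) in zip(grades, domains))
    return weighted / total


# Two hypothetical small domains over engine speed, each with a trivial
# stand-in estimator in place of a learned small network.
domains = [
    ((0.0, 1000.0, 2000.0), lambda state: 14.0),
    ((1000.0, 2000.0, 3000.0), lambda state: 15.0),
]
print(fuzzy_estimate(1500.0, None, domains))
```

Because each small-domain estimator receives only the inputs relevant to its own domain, its output over that domain can be tabulated in advance, which is the property the first aspect relies on for checking the estimate error.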
According to a second aspect of the present invention, there is provided a parameter estimation method for estimating parameters relating to input and output of a control object by using a neural network, and this method comprises the steps of: selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimating dynamic behavior of the control object by using, as a consequent part of the fuzzy operation, a neural network which receives a parameter indicating the operation state of each of the small domains into which the control domain is divided. Therefore, the same effects as described for the first aspect can be obtained.
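The division of the control domain described for the first and second aspects can be illustrated by a minimal sketch. It assumes triangular membership functions for the premise part and very small tanh networks for the consequent part; the names `memberships`, `small_network`, `estimate`, `CENTERS`, and `NETWORKS`, and all weight values, are hypothetical and are not taken from the disclosure.

```python
import math

def memberships(premise, centers):
    """Triangular membership grades of the premise parameter in sorted fuzzy sets."""
    grades = [0.0] * len(centers)
    if premise <= centers[0]:
        grades[0] = 1.0
    elif premise >= centers[-1]:
        grades[-1] = 1.0
    else:
        for i in range(len(centers) - 1):
            lo, hi = centers[i], centers[i + 1]
            if lo <= premise <= hi:
                t = (premise - lo) / (hi - lo)
                grades[i], grades[i + 1] = 1.0 - t, t
                break
    return grades

def small_network(state, hidden_weights, output_weights):
    """Small network for one domain: tanh hidden layer, linear output (assumed form)."""
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden))

def estimate(premise, state, centers, networks):
    """Blend the outputs of the per-domain networks by their membership grades."""
    grades = memberships(premise, centers)
    return sum(g * small_network(state, hw, ow)
               for g, (hw, ow) in zip(grades, networks) if g > 0.0)

# Three small domains along the premise parameter, one small network per domain.
CENTERS = [0.0, 1.0, 2.0]
NETWORKS = [([[1.0, 0.0]], [1.0]), ([[0.0, 1.0]], [2.0]), ([[1.0, 1.0]], [0.5])]
print(estimate(0.5, [0.5, 0.2], CENTERS, NETWORKS))
```

Because each per-domain network in this sketch is small, its output over the whole domain can be enumerated and checked in advance, which is the property the first aspect relies on.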
According to a third aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, and this device comprises: control domain division means for selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimation means comprising a map formation unit and a map combination unit, and estimating dynamic behavior of the control object by using map values output from the map combination unit. The map formation unit performs as follows: dividing plural parameters indicating the operation states of the respective small control domains into a predetermined number of parameter groups to be input to each of plural neural networks; dividing output parameters from each of the plural neural networks into a predetermined number of parameter groups; constructing, as a consequent part of the fuzzy operation, neural networks receiving these parameter groups; performing learning of a neural network construction obtained by combining plural neural networks which are obtained by repeating the above operation until the output from each neural network becomes single; and forming maps by using the outputs from the respective neural networks corresponding to the respective input parameter groups. The map combination unit combines the maps so formed in the same manner as combining the neural networks. Therefore, the same effects as described for the first aspect can be obtained.
According to a fourth aspect of the present invention, in the parameter estimation device of the third aspect, the neural network construction has a structure in which a plurality of three-layer networks, each having two inputs and one output, are combined. Therefore, the NN (neural network) construction serving as an estimator in the consequent part of the operation is reduced to a three-layer NN construction having two inputs and one output, whereby the same effects as described above are obtained.
According to a fifth aspect of the present invention, in the parameter estimation device of the third aspect, when determining dots of the maps when mapping the neural network construction used for the operation in the consequent part, these dots are determined by calculating the dot pitch in accordance with the maximum differential coefficients of the neural network and an allowable estimate error. Therefore, it is possible to form maps that can assure the estimate error.
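One way to relate the dot pitch to the maximum differential coefficient and the allowable estimate error is sketched below. The disclosure does not give the formula at this point, so the sketch assumes a one-input map read by nearest-dot lookup, for which the error is bounded by the maximum slope times half the pitch. The function names and the stand-in `network` are hypothetical.

```python
import math

def dot_pitch(max_slope, allowable_error):
    """Largest pitch for which nearest-dot lookup keeps the error within bounds.

    Assumed bound: |f(x) - f(x_k)| <= max_slope * pitch / 2.
    """
    return 2.0 * allowable_error / max_slope

def build_map(f, lo, hi, max_slope, allowable_error):
    """Tabulate f on [lo, hi] with an even pitch no larger than dot_pitch()."""
    n = max(1, math.ceil((hi - lo) / dot_pitch(max_slope, allowable_error)))
    pitch = (hi - lo) / n
    return lo, pitch, [f(lo + k * pitch) for k in range(n + 1)]

def lookup(table, x):
    """Return the map value at the dot nearest to x."""
    lo, pitch, values = table
    k = round((x - lo) / pitch)
    k = min(max(k, 0), len(values) - 1)
    return values[k]

def network(x):
    """Stand-in for a learned network; its maximum slope is 2.0 at x = 0."""
    return math.tanh(2.0 * x)

TABLE = build_map(network, -2.0, 2.0, max_slope=2.0, allowable_error=0.05)
print(len(TABLE[2]), lookup(TABLE, 0.33))
```

An interpolating lookup would permit a coarser pitch for the same error; the nearest-dot rule is used here only because its bound is the simplest to state.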
According to a sixth aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, wherein a sudden change in an output from a neural network which has learned is decided as noise, and a signal obtained by filtering the output of the neural network is used as an estimate value. Therefore, only a sudden change can be effectively removed without delaying the propagation timing such as the phase of the estimate value, thereby avoiding considerable degradation of control precision.
According to a seventh aspect of the present invention, in the parameter estimation device of the sixth aspect, a teaching signal used for learning of the neural network is given after adjusting its phase and gain, with regard to the filter characteristics. Therefore, the same effects as described for the sixth aspect can be obtained.
According to an eighth aspect of the present invention, in the parameter estimation device of the seventh aspect, to adjust the phase and gain of the teaching signal is to filter the teaching signal by an inverse model of a filter used for output to make a new teaching signal. Therefore, the same effects as described for the seventh aspect can be obtained.
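The relation between the output filter and the adjusted teaching signal in the sixth to eighth aspects can be sketched as follows. The filter type is not specified at this point, so a first-order low-pass filter with coefficient `a` is assumed, and all names are hypothetical. The teaching signal is passed through the inverse model of that filter, so that a network which reproduces the new teaching signal yields, after output filtering, the original teaching signal.

```python
def low_pass(signal, a, initial=0.0):
    """Output filter (assumed form): y[k] = a * y[k-1] + (1 - a) * u[k]."""
    out, prev = [], initial
    for u in signal:
        prev = a * prev + (1.0 - a) * u
        out.append(prev)
    return out

def inverse_low_pass(signal, a, initial=0.0):
    """Inverse model of low_pass: u[k] = (y[k] - a * y[k-1]) / (1 - a)."""
    out, prev = [], initial
    for y in signal:
        out.append((y - a * prev) / (1.0 - a))
        prev = y
    return out

# The original teaching signal is replaced by its inverse-filtered version.
TEACHING = [0.0, 0.2, 0.5, 0.9, 1.0, 1.0]
NEW_TEACHING = inverse_low_pass(TEACHING, a=0.6)

# If the network reproduced NEW_TEACHING exactly, the filtered output
# would coincide with the original teaching signal.
RECOVERED = low_pass(NEW_TEACHING, a=0.6)
print(RECOVERED)
```

In this sketch an isolated one-sample spike in the network output is attenuated by the filter, while the learned response is not delayed relative to the original teaching signal.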
According to a ninth aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, and this device comprises: control domain division means for selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimation means for estimating dynamic behavior of the control object by using, as a consequent part of the fuzzy operation, a neural network which receives time series data of the input and output of the control object for each of the small domains into which the control domain is divided. Therefore, analysis of stability is realized as in the case where the output can be observed. So, even when the NN (neural network) input term cannot be given by only input and output data of the control object, analysis of stability is possible by considering the NN input term as a disturbance term. Hence, stable NN estimate feed-back gains, which have conventionally been decided by trial and error, can be decided by calculation using weight coefficients of the NN.
According to a tenth aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, and this device comprises: control domain division means for selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimation means for estimating dynamic behavior of the control object by using, as a consequent part of the fuzzy operation, a neural network which receives time series data of the input of the control object and time series data of a neural network output for each of the small domains into which the control domain is divided. Therefore, the same effects as described for the ninth aspect can be obtained.
According to an eleventh aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, and this device comprises: control domain division means for selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimation means for estimating dynamic behavior of the control object by using, as a consequent part of the fuzzy operation, a neural network which receives time series data of the input of the control object and time series data of an estimate value of the output of the control object, which is obtained by an estimator, for each of the small domains into which the control domain is divided. Therefore, the same effects as described for the ninth aspect can be obtained.
According to a twelfth aspect of the present invention, there is provided a parameter estimation device for estimating parameters relating to input and output of a control object by using a neural network, and this device comprises: control domain division means for selecting at least one parameter which has strong correlation with non-linearity amongst input-output characteristics of the control object, as a parameter of a premise part of fuzzy operation, and dividing a control domain of the control object into a plurality of small domains by fuzzy estimation based on the parameter of the premise part; and estimation means for estimating dynamic behavior of the control object by using, as a consequent part of the fuzzy operation, a neural network which receives operation state parameters including at least one of time series data of the input and output of the control object and time series data of an estimate value of the output, for each of the small domains into which the control domain is divided. Therefore, the same effects as described for the ninth aspect can be obtained.
According to a thirteenth aspect of the present invention, in the parameter estimation device of the ninth aspect, a non-linear function f(x) used for a middle layer and an output layer of the neural network is represented by two linear functions, whereby the dynamic behavior of the neural network is given by a model which is represented by parameter of the linear functions and coefficients obtained by product-sum operation of connection coefficients of the neural network, and a control system is designed by using this model. Therefore, the NN (neural network) model can be converted to a model which can be stably analyzed, and stable NN estimate feed-back gains can be theoretically decided by using weight coefficients of the NN. Further, since a set value is provided for each control domain so that a value input to the non-linear function becomes lower than the set value, the width of the two linear functions can be reduced, whereby the result of the decision is prevented from being conservative.
According to a fourteenth aspect of the present invention, in the parameter estimation device of the thirteenth aspect, one of the two linear functions is given by a straight line which passes through the origin and has a gradient equal to the maximum differential coefficient α of the non-linear function, while the other is given by a straight line which passes through the origin and has a gradient of β. Therefore, the same effects as described for the thirteenth aspect can be obtained.
According to a fifteenth aspect of the present invention, in the parameter estimation device of the thirteenth aspect, the gradient β of the straight line which passes the origin satisfies 0 ≤ β ≤ α and is given by β = f(x1)/x1. Therefore, the same effects as described for the thirteenth aspect can be obtained.
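The two straight lines of the thirteenth to fifteenth aspects can be checked numerically with a short sketch. It assumes that f is the hyperbolic tangent, whose maximum differential coefficient is 1 at the origin, and that x1 is the set value bounding the input to the non-linear function; both are assumptions made for illustration, and the function names are hypothetical. The two gradients are written `alpha` and `beta` in the code.

```python
import math

def sector_gradients(f, max_slope, x1):
    """Return (alpha, beta): alpha is the maximum slope, beta = f(x1) / x1."""
    alpha = max_slope
    beta = f(x1) / x1
    if not 0.0 <= beta <= alpha:
        raise ValueError("beta must satisfy 0 <= beta <= alpha")
    return alpha, beta

def inside_sector(f, alpha, beta, x, tol=1e-12):
    """True if f(x) lies between the lines beta * x and alpha * x."""
    lower, upper = sorted((beta * x, alpha * x))
    return lower - tol <= f(x) <= upper + tol

ALPHA, BETA = sector_gradients(math.tanh, max_slope=1.0, x1=2.0)
print(ALPHA, BETA)
```

A smaller set value x1 gives a larger beta and therefore a narrower gap between the two lines, which corresponds to the less conservative decision mentioned for the thirteenth aspect.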
According to a sixteenth aspect of the present invention, there is provided a parameter estimation control device for estimating parameters relating to control of a control object by using a neural network, and controlling the control object according to the estimated parameters, and this device comprises: parameter estimation means for receiving a state quantity indicating the state of the control object, and generating an estimate value of the output of the control object according to the input state quantities, by using a neural network which has learned by using the result of detection from a predetermined sensor means as a teaching signal; and control quantity operation means for receiving the estimate value generated by the parameter estimation means, and generating a control quantity used for control of the control object, based on the estimate value, according to a variation adaptive operation process which is adaptive to a variation of the estimate value. Therefore, stable control can be performed based on the estimate value generated by the parameter estimation means.
According to a seventeenth aspect of the present invention, in the parameter estimation control device of the sixteenth aspect, the variation adaptive operation process performed by the control quantity operation means is a control object model adaptive operation process in which a controller adaptive to a control object model is designed by using the control object model. Therefore, stable control can be performed by designing a virtual controller adaptive to the variation by using the control object model, and performing processing based on the controller.
According to an eighteenth aspect of the present invention, in the parameter estimation control device of the seventeenth aspect, the controller is designed so that it performs stable control according to, as the variation, the maximum error detected with respect to the neural network which has learned. Therefore, the same effects as described for the seventeenth aspect can be obtained.
According to a nineteenth aspect of the present invention, the parameter estimation control device of the sixteenth aspect further comprises: a hardware sensor means for detecting the output of the control object; and the parameter estimation means generating an estimate value in accordance with the result of detection from the hardware sensor as well as the input state quantity. Therefore, stable control can be performed by using an inexpensive sensor having relatively low precision, whereby the number of processes required for learning of the neural network can be reduced.
According to a twentieth aspect of the present invention, in the parameter estimation control device of the sixteenth aspect, the variation adaptive operation process performed by the control quantity estimation means is based on an estimate value generated by a neural network which has learned by using the result of detection by a predetermined sensor means as a teaching signal. Therefore, even when an existing control object model cannot be used, stable control is realized by using a control object model substitute neural network.
According to a twenty-first aspect of the present invention, there is provided a parameter estimation control method for estimating parameters relating to control of a control object by using a neural network, and controlling the control object according to the estimated parameters, and this method comprises the steps of: making a neural network learn by using the result of detection from a predetermined sensor means as a teaching signal; generating an estimate value of the output of the control object in accordance with a state quantity indicating the state of the control object, by the neural network which has learned; and generating a control quantity used for control of the control object, based on the generated estimate value, according to a variation adaptive operation process which is adaptive to a variation of the estimate value. In this method, since a control quantity used for controlling the control object is generated by processing adaptive to variations of the estimate value generated, stable control is realized based on the estimate value.
According to a twenty-second aspect of the present invention, in the parameter estimation control method of the twenty-first aspect, the variation adaptive operation process performed by the control quantity operation means is a control object model adaptive operation process in which a controller adaptive to a control object model is designed by using the control object model. Therefore, the same effects as described for the seventeenth aspect can be obtained.
According to a twenty-third aspect of the present invention, in the parameter estimation control method of the twenty-second aspect, the controller is designed so that it performs stable control according to, as the variation, the maximum error detected with respect to the neural network which has learned the result from the sensor means. Therefore, the same effects as described for the twenty-second aspect can be obtained.
According to a twenty-fourth aspect of the present invention, the parameter estimation control method of the twenty-first aspect further comprises: detecting the output of the control object by using a hardware sensor means; and generating an estimate value in accordance with the result of detection from the hardware sensor means as well as the input state quantity. Therefore, the same effects as described for the nineteenth aspect can be obtained.
According to a twenty-fifth aspect of the present invention, in the parameter estimation control method of the twenty-first aspect, the variation adaptive operation process performed by the control quantity estimation means is based on an estimate value generated by a neural network which has learned the result of detection by a predetermined sensor means as a teaching signal. Therefore, the same effects as described for the twentieth aspect can be obtained.
According to a twenty-sixth aspect of the present invention, there is provided a learning control device comprising: a neural network for receiving a plurality of input parameter values relating to a parameter as an object of estimation, and estimating estimation object parameter values used for target follow-up control quantity operation from these input parameter values; error coefficient change means for changing values of weight coefficients by which a square error between a neuro estimate value output from the neural network and a teaching signal is to be multiplied, according to relationships among the neuro estimate value, the teaching signal, and a target value; and performance function operation means for operating a performance function for learning by using the weight coefficients; wherein learning of the neural network is performed based on the performance function. Therefore, learning with regard to a target value can be performed, whereby precision of target follow-up control by using a neuro estimate value can be improved as compared with that of the conventional learning control. Especially, it is possible to prevent the control precision from degrading in the transient state where the control object changes suddenly.
According to a twenty-seventh aspect of the present invention, there is provided a learning control method receiving a plurality of input parameter values relating to a parameter as an object of estimation, and performing learning control of a neural network which estimates estimation object parameter values used for target follow-up control quantity operation from those input parameter values, and this method comprises the steps of: changing values of weight coefficients by which a square error between a neuro estimate value output from the neural network and a teaching signal is to be multiplied, according to relationships among the neuro estimate value, the teaching signal, and a target value; operating a performance function for learning by using the weight coefficients; and performing learning of the neural network based on the performance function. Therefore, the same effects as described for the twenty-sixth aspect can be obtained.
According to a twenty-eighth aspect of the present invention, there is provided a learning control device comprising: a neural network for receiving a plurality of input parameter values relating to a parameter as an object of estimation, and estimating estimation object parameter values used for target follow-up control quantity operation from these input parameter values; estimation error sign decision means for deciding the kind of a sign of an error between a neuro estimate value output from the neural network and a target value; teaching error sign decision means for deciding the kind of a sign of an error between a teaching signal and the target value; sign comparison means for comparing the kinds of the respective signs decided by the estimation error sign decision means and the teaching error sign decision means; square error coefficient change means for increasing values of weight coefficients by which a square error between the neuro estimate value and the teaching signal is to be multiplied, when the sign comparison means decides that the signs are different from each other; and performance function operation means for operating a performance function for learning by using the weight coefficients; wherein learning of the neural network is performed based on the performance function. Therefore, calculation of correction quantity in the inverse direction, which occurs in a neuro control system obtained by conventional learning, is suppressed, whereby the control precision can be improved.
According to a twenty-ninth aspect of the present invention, there is provided a learning control device comprising: a neural network for receiving a plurality of input parameter values relating to a parameter as an object of estimation, and estimating estimation object parameter values used for target follow-up control quantity operation from these input parameter values; estimation error sign decision means for deciding the kind of a sign of an error between a neuro estimate value output from the neural network and a target value; estimation error absolute value calculation means for calculating an absolute value of the error between the neuro estimate value and the target value; teaching error sign decision means for deciding the kind of a sign of an error between a teaching signal and the target value; teaching signal error absolute value calculation means for calculating an absolute value of the error between the teaching signal and the target value; sign comparison means for comparing the kinds of the respective signs decided by the estimation error sign decision means and the teaching error sign decision means; absolute value comparison means for comparing the absolute values calculated by the estimation error absolute value calculation means and the teaching signal error absolute value calculation means; square error coefficient change means for increasing values of weight coefficients by which a square error between the neuro estimate value and the teaching signal is to be multiplied, when the sign comparison means decides that the kinds of the signs are identical, and the absolute value comparison means decides that the absolute value of the error between the neuro estimate value and the target value is smaller than the absolute value of the error between the teaching signal and the target value; and performance function operation means for operating a performance function for learning by using the weight coefficients; wherein learning of the neural network is performed based on the performance function. Therefore, the same effects as described for the twenty-eighth aspect can be obtained.
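The weighting rules of the twenty-eighth and twenty-ninth aspects can be sketched as two predicates that decide when the weight coefficient on the square error is increased. The weight values `base` and `raised`, the form of the performance function as a weighted sum of square errors, and all function names are assumptions made for illustration.

```python
def sign(value):
    """Kind of sign: 1 for positive, -1 for negative, 0 for zero."""
    return (value > 0) - (value < 0)

def signs_differ(estimate, teaching, target):
    """Twenty-eighth aspect: the two errors against the target have different signs."""
    return sign(estimate - target) != sign(teaching - target)

def same_sign_smaller_error(estimate, teaching, target):
    """Twenty-ninth aspect: same sign, and the estimate is closer to the target."""
    e, t = estimate - target, teaching - target
    return sign(e) == sign(t) and abs(e) < abs(t)

def performance(samples, rule, base=1.0, raised=4.0):
    """Weighted sum of square errors between neuro estimate and teaching signal."""
    total = 0.0
    for estimate, teaching, target in samples:
        weight = raised if rule(estimate, teaching, target) else base
        total += weight * (estimate - teaching) ** 2
    return total

# Each sample is (neuro estimate, teaching signal, target value).
SAMPLES = [(0.8, 1.2, 1.0), (1.3, 1.2, 1.0), (1.1, 1.2, 1.0)]
print(performance(SAMPLES, signs_differ))
print(performance(SAMPLES, same_sign_smaller_error))
```

In the first sample the estimate lies below the target while the teaching signal lies above it, so a correction computed from the estimate would act in the inverse direction; raising the weight there makes learning penalize that case more heavily.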
According to a thirtieth aspect of the present invention, there is provided a learning control device comprising: a neural network having a plurality of neuro constructions having their respective learning conditions, which receive a plurality of input parameter values relating to a parameter as an object of estimation, and estimate estimation object parameter values used for target follow-up control quantity operation from these input parameter values; state quantity detection means for detecting a state quantity relating to dynamic behavior of a control object; learning condition decision means for deciding the learning condition of the neural network in accordance with the detected value; and neuro selection means for selecting only a neuro output value corresponding to the present condition amongst the neuro constructions of the neural network in accordance with the detected value. Therefore, a construction performing divided learning is realized. Even when the dynamic behavior under the initial condition changes (e.g., change of the behavior of the control object with time), degradation of precision can be avoided by re-learning of a neuro construction adaptive to the condition where the behavior has changed.
According to a thirty-first aspect of the present invention, in the learning control device of the thirtieth aspect, when the present state quantity satisfies a specific learning condition, the neuro selection means multiplies a neuro output corresponding to the condition by a step function “1”, and multiplies other neuro outputs by a step function “0”, thereby selecting the neuro output. Therefore, the same effects as described for the thirtieth aspect can be obtained.
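The selection by step functions in the thirty-first aspect can be sketched as follows, assuming that each learning condition is a non-overlapping range of a single state quantity. The ranges, output values, and names are hypothetical.

```python
def step(condition_holds):
    """Step function: 1 when the learning condition is satisfied, otherwise 0."""
    return 1.0 if condition_holds else 0.0

def select_output(state, conditions, outputs):
    """Multiply every neuro output by its step function and sum the products."""
    return sum(step(lo <= state < hi) * out
               for (lo, hi), out in zip(conditions, outputs))

# Three learning conditions given as ranges of the detected state quantity.
CONDITIONS = [(0.0, 1000.0), (1000.0, 3000.0), (3000.0, 6000.0)]
print(select_output(1500.0, CONDITIONS, [0.1, 0.7, 0.9]))
```

Only the neuro construction whose condition matches the present state contributes to the result, so re-learning one construction leaves the outputs under the other conditions unchanged.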
According to a thirty-second aspect of the present invention, there is provided a learning control device comprising a neural network having a plurality of neuro constructions having their respective learning conditions, which receive a plurality of input parameter values relating to a parameter as an object of estimation, and estimate estimation object parameter values used for target follow-up control quantity operation from these input parameter values. This device comprises: state quantity detection means for detecting a state quantity relating to dynamic behavior of a control object; learning condition decision means for deciding the learning condition of the neural network in accordance with the detected value; learning data storage means for storing learning data sets which have learned connection coefficients of the respective neuro constructions; learning data formation means for forming learning data for each of the learning conditions in accordance with the state quantities; learning means for performing learning of neuro connection coefficients, by using the learning data set corresponding to the condition stored in the learning data storage means, and the learning data newly formed by the learning data formation means; and coefficient updation/neuro selecting means for selecting a neuro construction corresponding to the condition and updating the neuro construction to the connection coefficients which are the learning result. Throughout this disclosure, the word “updation” means the same as “update.” Therefore, even when the behavior changes and thereby the neuro estimation precision deteriorates, learning of only a neuro construction under the corresponding condition can be performed online, whereby learning is performed so that satisfactory control is achieved.
According to a thirty-third aspect of the present invention, in the learning control device of the thirty-second aspect, the learning data sets used for actual learning by the learning means are obtained by deleting as many old learning data sets as there are latest learning data sets, whereby the number of the learning data sets is kept constant. Therefore, unwanted increase in the learning time is avoided, and influence of the past data can be minimized to learn the latest condition.
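The constant-size learning data store of the thirty-third aspect can be sketched with a bounded queue. The capacity of five sets, the placeholder set names, and the function `refresh` are hypothetical.

```python
from collections import deque

def refresh(stored_sets, latest_sets):
    """Append the latest learning data sets to the bounded store.

    A deque created with maxlen discards one item from the old end for each
    item appended once it is full, so the number of sets stays constant.
    """
    stored_sets.extend(latest_sets)
    return stored_sets

# Store for one learning condition, holding at most five learning data sets.
STORE = deque(["set1", "set2", "set3", "set4", "set5"], maxlen=5)
refresh(STORE, ["set6", "set7"])
print(list(STORE))
```

Since the number of sets used for learning does not grow, the time needed for each re-learning pass stays bounded.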
According to a thirty-fourth aspect of the present invention, in the learning control device of the thirty-second aspect, the learning data sets stored in the learning data storage means are learning data sets which are always updated in the online state where the neural network itself performing neuro operation is an object of learning. Therefore, the same effects as described for the thirty-third aspect can be obtained.
According to a thirty-fifth aspect of the present invention, there is provided a learning control device comprising: a neural network having a plurality of neuro constructions having their respective learning conditions, which receive a plurality of input parameter values relating to a parameter as an object of estimation, and estimate estimation object parameter values used for target follow-up control quantity operation from these input parameter values; state quantity detection means for detecting a state quantity relating to dynamic behavior of a control object; learning condition decision means for deciding the learning condition of the neural network in accordance with the detected value; learning data storage means for storing learning data sets which have learned connection coefficients of the respective neuro constructions; learning data formation means for forming new learning data for each of the learning conditions in accordance with the state quantity; learning data set formation means for forming learning data sets for the respective neuro constructions, by using the learning data sets of all the neuro constructions corresponding to the learning condition stored in the learning data storage means, and the learning data newly formed by the learning data formation means; learning data updation means for updating the corresponding data in the learning data storage means to the newly formed learning data sets; learning means for performing learning of all the neuro connection coefficients corresponding to the learning condition by using the newly formed learning data sets; coefficient updation/neuro selection means for selecting all the neuro constructions and updating them to the connection coefficients which are the learning result; neuro construction selection means for selecting all the neuro constructions corresponding to the learning condition; and estimate calculation means for calculating a neuro estimate value to be used for control, from all the corresponding neuro outputs; wherein the estimate value is used for control quantity operation. Therefore, even when the dynamic behavior changes over plural conditions, the neuro control quantity is smooth, whereby satisfactory control is achieved.
According to a thirty-sixth aspect of the present invention, in the learning control device of the thirty-fifth aspect, in the neural network, neuro constructions under adjacent conditions have a common part of representation of control object dynamic behavior. Therefore, the same effects as described for the thirty-fifth aspect can be obtained.
According to a thirty-seventh aspect of the present invention, in the learning control device of the thirty-fifth aspect, the estimate calculation means calculates the average of all the corresponding neuro output values, as the neuro estimate value to be used for control. Therefore, the same effects as described for the thirty-fifth aspect can be obtained.
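The averaging of the thirty-seventh aspect, combined with the overlapping conditions of the thirty-sixth aspect, can be sketched as follows. The sketch assumes that the common part between adjacent conditions is an overlap of their ranges along a single state quantity; the ranges, output values, and names are hypothetical.

```python
def averaged_estimate(state, conditions, outputs):
    """Average the outputs of all neuro constructions whose condition holds."""
    matching = [out for (lo, hi), out in zip(conditions, outputs)
                if lo <= state <= hi]
    if not matching:
        raise ValueError("no learning condition covers the state quantity")
    return sum(matching) / len(matching)

# Adjacent conditions share a common part (800-1200 and 2800-3200).
OVERLAPPING = [(0.0, 1200.0), (800.0, 3200.0), (2800.0, 6000.0)]
print(averaged_estimate(1000.0, OVERLAPPING, [0.2, 0.4, 0.9]))
print(averaged_estimate(2000.0, OVERLAPPING, [0.2, 0.4, 0.9]))
```

Inside a common part two constructions contribute to the estimate, which softens the jump that would otherwise occur when the state quantity crosses from one condition into the next.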