1. Field of the Invention
The present invention relates to a system identifying device for precisely representing the features of a system as a mathematical model in learning control fields involving, for example, robot manipulators and various industrial plants.
2. Description of the Related Art
To apply a conventional control method to a controlled system and control it in practice, the features of the system must be precisely represented by a mathematical model. To this end, system identification algorithms have been developed for both linear and non-linear systems.
Recently, a growing number of studies have addressed robot control and object recognition using the learning abilities of a neural network. Neural network methods recognize an object by repeatedly learning from given data, even if the system model is uncertain or completely unknown.
However, a control method based on a neural network normally requires learning data as input/output pairs. That is, in response to input data, the neural network learns using known output data as a teaching signal. Accordingly, when a neural network is used as a learning control device, the range of applications is currently limited, because a teaching signal cannot always be obtained or no method of generating one is known.
The inverse kinematics of a multi-degree-of-freedom manipulator is explained below as an example.
Joints 1, 2, 3, ..., n of a manipulator are numbered from the joint nearest the manipulator stand to the joint farthest from it. The joint displacement angle of the i-th joint is denoted $q_i$. Thus, the joint displacement vector is represented as follows.
$$q = (q_1, q_2, q_3, \ldots, q_n)^t$$
where, here and hereinafter, the superscript t indicates transposition.
The vector representing the position of the tip of the hand of the manipulator is represented as follows.
$$x = (x_1, x_2, x_3, \ldots, x_m)^t$$
The relationship between the two vectors is as follows.
$$x = f(q) \qquad (1)$$
If a joint displacement q (hereinafter referred to as vector q) is given, then the coordinate x (hereinafter referred to as vector x) of the tip of the hand can be easily obtained from equation (1). During operation, an instruction specifies the coordinates of the trajectory of the tip of the hand. Therefore, vector q corresponding to a designated vector x can be obtained by solving equation (1) in reverse. That is:
$$q = f^{-1}(x) \qquad (2)$$
The problem of solving equation (2) is called the "inverse kinematics" problem. A joint displacement angle vector q does not necessarily exist for a given hand-tip position vector x. Even if one exists, it is not always unique.
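Equations (1) and (2) can be illustrated with a minimal sketch for a planar arm having two joints. The arm geometry, the link lengths `LINK1` and `LINK2`, and the function names `forward` and `inverse` are assumptions introduced only for this illustration and do not appear in the description above. The sketch shows that a reachable hand-tip position generally has two joint solutions and that an unreachable position has none.

```python
import math

# Hypothetical link lengths of a planar two-joint arm (illustrative assumption).
LINK1, LINK2 = 1.0, 0.8


def forward(q):
    """Equation (1): hand-tip position x = f(q) for joint angles q = (q1, q2)."""
    q1, q2 = q
    return (LINK1 * math.cos(q1) + LINK2 * math.cos(q1 + q2),
            LINK1 * math.sin(q1) + LINK2 * math.sin(q1 + q2))


def inverse(x):
    """Equation (2): every q with f(q) = x; an empty list if x is unreachable."""
    px, py = x
    # Law of cosines gives the cosine of the second joint angle.
    c2 = (px * px + py * py - LINK1 ** 2 - LINK2 ** 2) / (2.0 * LINK1 * LINK2)
    if abs(c2) > 1.0:
        return []  # no joint vector is associated with this hand-tip position
    s = math.sqrt(1.0 - c2 * c2)
    # Two mirror-image ("elbow") solutions unless the arm is fully stretched or folded.
    candidates = [s] if s == 0.0 else [s, -s]
    solutions = []
    for s2 in candidates:
        q2 = math.atan2(s2, c2)
        q1 = math.atan2(py, px) - math.atan2(LINK2 * s2, LINK1 + LINK2 * c2)
        solutions.append((q1, q2))
    return solutions


reachable = inverse((1.2, 0.5))    # two distinct joint vectors
unreachable = inverse((5.0, 0.0))  # beyond the arm's reach: no solution
```

A closed-form inverse of this kind is available only because the assumed arm is very simple, which is the limitation the following paragraph addresses.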
Thus, if a hand-tip position of a manipulator is given, the associated joint displacement angles can be obtained by solving equation (2). However, this method is effective only for a robot manipulator of simple structure. It does not apply to a manipulator of complex structure having a large number of joints, because the required analysis is so complicated that it cannot be carried out in practice. Since no group of teaching signals for control can thus be obtained for such a manipulator, an adaptive control device comprising, for example, an adaptive data processing device cannot be realized for it.
When a target hand-tip position $x_d$ (hereinafter referred to as vector $x_d$) is given, a controller should provide the manipulator with a joint angle displacement vector q that makes the actual hand-tip position vector x, which initially differs from vector $x_d$, match vector $x_d$. If the controller comprises a neural network, a correct solution of the inverse kinematics problem should be provided for the neural network as a teaching signal.
Described below are examples of conventional technologies that apply a neural network to solving an inverse kinematics problem, that is, to obtaining an operation parameter and realizing control by identifying the system.
First, an example of applying a neural network to the identification of a system operation parameter is an application of a Hopfield network (refer to ISCIE, May 22-24, 1991). However, this example cannot identify a non-linear system, and does not relate to an application for control.
Next, an application for control is explained by referring to FIGS. 1 through 6.
In these figures, an adaptive control device is realized by a neural network. A learning device is indicated by a straight line diagonally crossing the box of the adaptive control device, and provides the adaptive control device with a control input error signal so that the adaptive control device can learn data converting capabilities. "N" indicates an adaptive data processing device for realizing learning of the data converting capabilities of the adaptive control device.
FIGS. 1 through 4 show the conventional technology (refer to D. Psaltis, A. Sideris and A. A. Yamamura: A Multilayered Neural Network, IEEE Control Systems Magazine, Vol. 8, No. 2, pp. 17-21 (1988)). In the technology shown in FIG. 1, an adaptive control device 101 receives target control amount vector $x_d$ and outputs control operation amount vector q to a controlled object 103. An adaptive data processing device N 102 receives control status amount vector x output by the controlled object 103. With this configuration, the data converting capabilities of the adaptive control device 101 and the adaptive data processing device N 102 are learned such that their output signals, that is, vector q and an output q' (hereinafter referred to as vector q'), respectively, match each other. As a result, the adaptive control device 101 obtains the control operation amount vector q that correctly realizes the control specified by control target amount vector $x_d$ and provides it for the controlled object 103.
The conventional technology shown in FIGS. 2A and 2B operates in two phases. In the learning phase shown in FIG. 2A, a random control operation amount vector q is provided for a controlled object 201, which is actually operated to obtain its control status amount vector x. Vector x is provided for an adaptive control device 202, and the data converting capabilities of the adaptive control device 202 are learned such that a difference e (hereinafter referred to as vector e) between an output vector q' of the adaptive control device 202 and the control operation amount vector q provided for the controlled object 201 is minimized. When the controlled object 201 is actually controlled, as shown in FIG. 2B, the adaptive control device 202 having the learned data converting capabilities receives a control target amount vector $x_d$ equal to the desired control status amount. At this time, the control operation amount vector q output by the adaptive control device 202 is provided for the controlled object 201 to perform a controlling operation such that the control status amount of the controlled object 201 equals the target value.
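The two phases of FIGS. 2A and 2B can be sketched as follows under strong simplifying assumptions. The controlled object is replaced by a hypothetical scalar linear function `plant`, and the adaptive control device is replaced by a two-parameter linear model trained with a least-mean-squares update rather than a neural network. All names, the learning rate, and the sampling range are illustrative and are not taken from the cited technology.

```python
import random


def plant(q):
    """Hypothetical controlled object: status amount x for operation amount q."""
    return 2.0 * q + 1.0


def train_inverse_model(steps=5000, lr=0.05, seed=0):
    """Learning phase (FIG. 2A): fit q' = w * x + b from random operations."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        q = rng.uniform(-1.0, 1.0)  # random control operation amount
        x = plant(q)                # observed control status amount
        q_est = w * x + b           # output q' of the adaptive device
        e = q - q_est               # difference e to be minimized
        w += lr * e * x             # least-mean-squares update
        b += lr * e
    return w, b


# Control phase (FIG. 2B): the learned model maps a target x_d to an operation q.
w, b = train_inverse_model()
x_d = 2.0
q = w * x_d + b
x = plant(q)  # should be close to x_d once learning has converged
```

Because the training samples are generated before the model is used for control, the sketch also reflects the offline character of this method: if `plant` changed, `train_inverse_model` would have to be run again.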
According to the conventional technology shown in FIG. 3, control operation amount vector q output by an adaptive control device 301, which receives control target amount vector $x_d$, is given to a controlled object 302, and a calculator 303 for calculating an inverse Jacobian of the controlled object 302 is provided. With the calculator 303, a control operation difference $\Delta q$ (hereinafter referred to as vector $\Delta q$), which varies with the difference between control target amount vector $x_d$ and control status amount vector x, can be calculated. The data converting capabilities of the adaptive control device 301 are learned such that control operation difference vector $\Delta q$ is minimized. As a result, the adaptive control device 301 obtains the control operation amount vector q that correctly realizes the control specified by control target amount vector $x_d$ and provides it for the controlled object 302.
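The role of the inverse Jacobian can be illustrated with a minimal sketch that computes the control operation difference as the inverse Jacobian applied to the difference between the target and the actual status. The planar two-joint arm, its link lengths, and the function names are assumptions made only for this illustration. In the technology of FIG. 3 the difference serves as a learning signal for the adaptive control device 301, whereas the sketch simply adds it to the joint vector repeatedly to show that it points toward a correct operation amount.

```python
import math

# Hypothetical planar two-joint arm (illustrative assumption).
LINK1, LINK2 = 1.0, 0.8


def forward(q):
    """Control status x = f(q): hand-tip position of the assumed arm."""
    q1, q2 = q
    return (LINK1 * math.cos(q1) + LINK2 * math.cos(q1 + q2),
            LINK1 * math.sin(q1) + LINK2 * math.sin(q1 + q2))


def inverse_jacobian_step(q, x_d):
    """Return delta_q = J(q)^-1 (x_d - f(q)) for the assumed arm."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    # Jacobian J = df/dq of the forward kinematics.
    j11, j12 = -LINK1 * s1 - LINK2 * s12, -LINK2 * s12
    j21, j22 = LINK1 * c1 + LINK2 * c12, LINK2 * c12
    det = j11 * j22 - j12 * j21
    if abs(det) < 1e-12:
        raise ValueError("Jacobian is singular at this joint configuration")
    x = forward(q)
    ex, ey = x_d[0] - x[0], x_d[1] - x[1]
    # Closed-form 2x2 inverse applied to the status error (ex, ey).
    return ((j22 * ex - j12 * ey) / det, (-j21 * ex + j11 * ey) / det)


def correct(q, x_d, iterations=20):
    """Apply the inverse-Jacobian correction repeatedly, starting from q."""
    for _ in range(iterations):
        dq = inverse_jacobian_step(q, x_d)
        q = (q[0] + dq[0], q[1] + dq[1])
    return q


q_final = correct((0.3, 0.8), (1.2, 0.5))
```

Note that `inverse_jacobian_step` needs the analytic Jacobian of the arm, which is exactly the prior knowledge of the controlled object that this approach depends on.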
The conventional technology shown in FIG. 4 relates to a control method that combines the technologies shown in FIGS. 2A, 2B and 3. Control operation amount vector q output by a second adaptive control device 402, which receives control target amount vector $x_d$, is given to a controlled object 403, and a calculator 404 for calculating an inverse Jacobian of the controlled object 403 is provided. With the calculator 404, a control operation difference vector $\Delta q$, which varies with the difference between control target amount vector $x_d$ and control status amount vector x of the controlled object 403, can be calculated. The data converting capabilities of the second adaptive control device 402 are learned such that control operation difference vector $\Delta q$ is minimized. The data converting capabilities of a first adaptive control device 401 are learned such that an output vector q' of the first adaptive control device 401 equals control operation amount vector q provided for the controlled object 403. In this case, a subtracter 405 calculates a difference $\Delta q'$ between vector q and vector q' and provides it for the first adaptive control device 401. Since a Jacobian normally represents the relationship between very small amounts, the method shown in FIG. 3 is not necessarily appropriate when there is a large difference between vector $x_d$ and vector x. Therefore, the method shown in FIG. 2A is used to minimize the difference between vector $x_d$ and vector x.
The technology shown in FIG. 5 (refer to M. Kawato, Y. Uno, M. Isobe, and R. Suzuki: A Hierarchical Neural Network Model for Voluntary Movement with Application to Robotics, IEEE Control Systems Magazine 8, 8-16 (1988)) comprises a feedback control unit 502 which has a fixed gain K and receives the difference between control target amount vector $x_d$ and control status amount vector x of a controlled object 503. The sum of control operation amount vector q, output by an adaptive control device 501 which receives control target amount vector $x_d$, and an error amount output by the feedback control unit 502 is provided for the controlled object 503. The data converting capabilities of the adaptive control device 501 are learned such that the error amount output by the feedback control unit 502 is minimized. Thus, the adaptive control device 501 outputs a control operation amount vector q with which the control specified by control target amount vector $x_d$ can be performed more precisely. The feedback control unit 502 generates the above described error amount based on an inverse Jacobian of the controlled object 503.
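The learning rule of FIG. 5 can be sketched in a heavily simplified, static form. The scalar linear `plant`, the two-parameter feedforward model standing in for the adaptive control device 501, the trial-by-trial structure, and all numerical values are assumptions made for illustration. In particular, the sketch observes the status produced by the feedforward output alone and uses the resulting feedback output only as the learning signal, which is a simplification of the closed loop described above.

```python
import random


def plant(q):
    """Hypothetical controlled object: status amount x for operation amount q."""
    return 2.0 * q + 1.0


def feedback_error_learning(K, steps=4000, lr=0.05, seed=0):
    """Train a feedforward model q = w * x_d + b using the feedback output."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x_d = rng.uniform(-1.0, 3.0)  # control target amount
        q_ff = w * x_d + b            # output of the adaptive device
        x = plant(q_ff)               # resulting control status amount
        u_fb = K * (x_d - x)          # error amount from the fixed-gain unit
        # The feedback output itself is the learning signal; it shrinks
        # toward zero as the feedforward model approaches the inverse of plant.
        w += lr * u_fb * x_d
        b += lr * u_fb
    return w, b


# K = 0.5 is the inverse of the assumed plant's slope (its inverse Jacobian).
w, b = feedback_error_learning(K=0.5)
```

In this sketch a gain of the wrong sign would turn the update into gradient ascent, which illustrates why the fixed gain K has to be chosen with some knowledge of the controlled object.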
The conventional technology shown in FIG. 6 (refer to M. Jordan, in Neural Networks for Control, ed. W. Thomas Miller, III et al. (1990)) comprises an adaptive data processing device N 602 which receives control operation amount vector q output by an adaptive control device 601. Learning is performed such that the adaptive data processing device N 602 becomes a sequential system having the same input/output characteristics as a controlled object 603. After sufficient learning, the error amount between control status amount x' (hereinafter referred to as vector x') output by the adaptive data processing device N 602 and control target amount vector $x_d$ is back-propagated, with the internal status values of the data converting capabilities of the adaptive data processing device N 602 fixed, so as to yield an input error. The input error is used as a learning signal in learning the data converting capabilities of the adaptive control device 601. Thus, the adaptive control device 601 outputs the control operation amount vector q that realizes control target amount vector $x_d$.
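The two learning stages of FIG. 6 can be sketched as follows, again under strong simplifying assumptions. The controlled object is a hypothetical scalar linear function, and both the adaptive data processing device N and the adaptive control device are replaced by two-parameter linear models instead of neural networks. With such a model, back-propagating the output error to the input reduces to multiplying it by the fixed model slope `a`. All names and numerical values are illustrative.

```python
import random


def plant(q):
    """Hypothetical controlled object: status amount x for operation amount q."""
    return 2.0 * q + 1.0


def learn_forward_model(steps=3000, lr=0.1, seed=0):
    """Stage 1: fit x' = a * q + c so that the model imitates the plant."""
    rng = random.Random(seed)
    a, c = 0.0, 0.0
    for _ in range(steps):
        q = rng.uniform(-1.0, 1.0)
        err = plant(q) - (a * q + c)
        a += lr * err * q
        c += lr * err
    return a, c


def learn_controller(a, c, steps=3000, lr=0.02, seed=1):
    """Stage 2: train q = w * x_d + b through the frozen forward model."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x_d = rng.uniform(-1.0, 3.0)
        q = w * x_d + b
        x_model = a * q + c          # status x' predicted by the model
        out_err = x_d - x_model      # error at the model output
        in_err = a * out_err         # error back-propagated to the model input
        w += lr * in_err * x_d       # input error drives the controller update
        b += lr * in_err
    return w, b


a, c = learn_forward_model()         # the model must be learned first
w, b = learn_controller(a, c)
x = plant(w * 2.0 + b)               # status reached for the target x_d = 2.0
```

The strict ordering of the two calls mirrors the requirement that the forward model be learned before the controller, which is the offline aspect discussed below.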
However, the conventional technology shown in FIG. 1 requires that the data converting capabilities of the adaptive data processing device N 102 be learned before those of the adaptive control device 101. Thus, the learning undesirably becomes offline learning. Furthermore, if an output signal of the adaptive control device 101 matches an output signal of the adaptive data processing device N 102 before learning is completed, then the data converting capabilities of the adaptive control device 101 cannot be successfully learned.
The conventional technology shown in FIG. 2B also has the problem of offline learning. To turn the data converting capabilities of the adaptive control device 202 into those of a correct inverse system, the controlled object 201 must be provided with a large number of random control operation amount vectors q before learning is completed. Furthermore, if any parameter of the controlled object, such as the length of a link of a robot arm, changes, then learning must be started again.
Since the conventional technology shown in FIGS. 3 and 4 requires that an inverse Jacobian be calculated by the calculators 303 and 404, offline learning may be performed. Furthermore, knowledge of an inverse Jacobian for the controlled objects 302 and 403 is required to perform the calculation in the calculators 303 and 404.
The conventional technology shown in FIG. 5 has the merit of realizing online learning. However, the fixed gain K of the feedback control unit 502 must be appropriately set to successfully learn the data converting capabilities of the adaptive control device 501. To set the fixed gain K, knowledge of an inverse Jacobian for the controlled object 503 is required.
According to the conventional technology shown in FIG. 6, the data converting capabilities of the adaptive data processing device N 602 must be learned before those of the adaptive control device 601, which again results in offline learning.
The problems of the above described conventional technologies are summarized as follows. The technologies for identifying a system using a neural network have the problem that they cannot identify a non-linear system. The technologies for obtaining a mathematical model have difficulty in setting an exact model of the system of a controlled object when the system has non-linear redundant degrees of freedom, and have the problem that a solution of inverse dynamics for the mathematical model cannot be uniquely determined.
When a neural network is used in adaptive control of, for example, a robot, many conventional technologies use an offline learning method, which requires re-learning of the network whenever the features of the controlled object change.
There has been a further problem that a target controlling operation cannot be performed unless a fixed gain is appropriately provided for the feedback control unit 502 in the control method shown in FIG. 5. In a non-linear system in particular, it is all the more difficult to set the fixed gain.