Not applicable
Not applicable
The invention relates to methods and systems for training learning systems that deal with dynamical problems.
Given a reference system that processes an external input x(t) and produces a corresponding observed output y*(t), the learning problem consists in designing a learning system that adjusts its parameters such that it is capable of processing the same external input x(t) and producing a generated output yN(t) that is arbitrarily close to the observed output y*(t) (see FIG. 1).
What makes this problem difficult is that often nothing is known about the reference system, and only samples of the external input x(t) fed into the reference system and of the observed output y*(t) are available.
Generally speaking, the learning problem requires selecting an ensemble of basic building blocks for the learning system, choosing the architecture that connects these elements, designing a learning procedure to adjust the parameters of the ensemble, collecting the samples needed for the self-organizing process, and deciding when learning has to stop.
According to the type of external input and observed output, the learning problem can be divided into two main classes: static problems and dynamic problems. Static problems are those where the external input and the observed output do not change in time while the reference system processes the former and generates the latter. Dynamic problems are those in which time does play an essential role: the external input and the observed output can change through time.
Analogously, learning systems can be divided into two categories: static and dynamic systems. Static systems do not have recurrent connections, so as long as their input does not change, their output does not change either. They cannot generate dynamic behavior. On the other hand, dynamic systems have recurrent connections and can display dynamic behavior even in the absence of inputs.
The previous classifications are relevant because static problems can be solved using static or dynamic systems, but dynamic problems can only be solved using dynamic systems.
In order to design systems that learn to deal with static problems, it suffices to design a static system based on current feed-forward neural network theory. Feed-forward neural networks do not have feedback connections, which is why they are called Non Recurrent Neural Networks (NRNN) in the ensuing description. There are several proofs that ensure the convergence of these neural networks to the desired solutions [1,2,3,4,5,6]. There is a plethora of training methods that produce the desired results in practical amounts of time [5,6,7]. Several refinements have been designed to speed up neural network convergence [8,9,10]. Finally, existing theoretical results allow understanding how the architecture and training of these neuronal assemblies affect their generalization capabilities [6,11]. Although there is still much work to be done in this field, feed-forward neural network theory provides a solid base that can be used to design practical self-organizing systems that successfully deal with static problems.
There is enormous interest in discovering ways of designing systems that learn how to deal with dynamical processes using only samples taken from those processes. Welding robots in car-making factories pose a relatively simple problem because their tasks are very structured: the locations at which the welding torches have to be positioned are known a priori. The same is true for all well-defined dynamical problems in robotics and other areas. On the other hand, programming a robot to peel a potato with a knife is extremely difficult. Potatoes come in different shapes, the thickness of their skin varies, the quality of the blade of a knife changes through time, etc. The natural variability of the components makes peeling a potato with a knife a very unstructured dynamical problem. Due to this inherent variability the problem is very difficult to describe, and therefore difficult to reduce to formal expressions and solve using heuristics. The same is true for all highly unstructured problems in robotics and other areas. Because formal approaches fail in practice to provide solutions for a potato peeler system, an alternative approach would be to design a self-organizing system that uses samples taken from a person peeling potatoes. The same idea could be used in other unstructured dynamic problems as well.
Two important dynamical problems are trajectory generation and dynamical function-mapping. The trajectory generation problem requires designing a self-organizing system that learns from examples how to duplicate a spatio-temporal trajectory generated by a reference system. The dynamical function-mapping problem is a generalization of the trajectory generation problem: it requires designing self-organizing systems capable of learning how to map a set of spatio-temporal trajectories in some input space to a corresponding set of spatio-temporal trajectories in some output space according to the dynamics dictated by a reference system.
As stated before, current knowledge about static problems allows the design of practical solutions. The same cannot be said about dynamical problems, whose solutions require dynamic systems. In spite of the long-lasting effort in the automatic control and neural network fields, there are still no practical solutions for many of the problems of this class. In particular, all the effort spent in developing new architectures and training techniques for Recurrent Neural Networks (RNNs), that is, neural networks with feedback connections, has still not produced practical solutions for many of the dynamical problems. Even though it has been proven that an RNN can approximate any known dynamic system [12], and although several techniques for dynamical problems have been developed [13,14,15,16,17] and several optimizations have been proposed [18,19,20,21,22,23,24], it is still not possible to design RNNs that learn in a practical amount of time to produce random trajectories or map complex spatio-temporal spaces. In many of these cases the solution spaces are plagued with local minima and it is not always possible to find solutions [25]. Gradient descent techniques do not perform very well because gradients tend to vanish as time passes [26]. Non-gradient descent techniques [27,28,29] have been suggested, but their convergence to useful results is not always guaranteed or the time needed to find the solutions grows to impractical lengths. Some techniques work very well [30,31], but how to scale them up to higher dimensional spaces is still unknown. Other techniques work fine in any space, but are limited to tackling static problems [32,33].
Even though some patents address the trajectory generation and dynamic mapping problems, they fall short of giving a full-fledged solution. In [34] the generated output of the learning system is used as external input, but nothing is done to guarantee the stability of the system. Reference [35] presents a similar approach where the generated output controls a robotic hand, which sends sensor information back into the inputs as an external input. The difference in this case is that the generated output is not used as an external input directly, but through the robotic gripper system. In patent [36] the derivatives of the generated outputs are part of the external input, but nothing is said about the stability of the problem or the behavior of the system in low signal-to-noise environments. Reference [37] presents a training method for fully connected RNNs, but it does not work very well with simple trajectory generation problems. Patent [38] differs from the previously presented patents in that it uses a probabilistic framework instead of a neural network architecture. But like the previous patents, it also includes the generated output in the external input and ignores the stability issues. Finally, reference [39] presents a method limited to on-line trajectory learning. A common pattern in all these patents is that they either focus on the trajectory generation problem and do not address the dynamical function-mapping problem, or they ignore the stability problem, offering partial solutions not guaranteed to work in an arbitrary problem.
Summing up, existing literature and granted patents:
Focus mostly on the trajectory generation problem and do not address the more general case: the dynamical function-mapping problem.
Do not provide a simple and practical solution for dynamical problems in general. Some of the solutions work for simple trajectory generation problems, but how they scale to higher dimensionalities is not known. Others provide general solutions, but their operation is not very satisfactory.
Mostly ignore the stability problem and cannot guarantee convergence of the learning systems to a solution. This fact renders most of these approaches useless when it comes to designing all-purpose learning machines.
The invention consists of a learning system based on the Dynamical System Architecture (DSA). Every implementation of the DSA is composed of a dynamic system and a static system. The dynamic system generates an intermediate output o(t), which together with an external input x(t) is fed into the static system, which finally produces a generated output yN(t). Every time the dynamic system is reset it produces the same intermediate output o(t), which does not cross over itself during the temporal span of each of the events generated by the reference system whose behavior has to be duplicated. The static system can be anything that behaves like a trainable universal function approximator. Training uses the intermediate output o(t), the external input x(t), the observed output y*(t) produced by the reference system whose behavior is going to be mimicked, and a self-organizing procedure that only modifies the parameters of the static system. Training of the dynamic system is not necessary.
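The data flow described above can be illustrated with a minimal Python sketch. It is not the implementation of the invention, only one possible instance under stated assumptions: the dynamic system is a hypothetical linear ramp (RampDynamicSystem), the static system is a hypothetical random-feature regressor (RandomFeatureStaticSystem) standing in for any trainable universal function approximator, and the reference system is assumed to be y*(t) = x(t) sin(2πt) with a constant external input per event. All names and parameter values are chosen for illustration only.

```python
import numpy as np


class RampDynamicSystem:
    """Hypothetical fixed dynamic system: after each reset, o(t) rises
    linearly from 0 to 1, so it repeats exactly and never revisits a value."""

    def __init__(self, n_steps):
        self.n_steps = n_steps
        self.k = 0

    def reset(self):
        self.k = 0

    def step(self):
        o = self.k / (self.n_steps - 1)
        self.k += 1
        return o

    def run(self):
        """Reset, then return the intermediate output o(t) for one event."""
        self.reset()
        return np.array([self.step() for _ in range(self.n_steps)])


class RandomFeatureStaticSystem:
    """Hypothetical static system: fixed random tanh features followed by a
    linear readout. Only the readout weights `beta` are trained."""

    def __init__(self, n_in, n_hidden=200, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=2.0, size=(n_in, n_hidden))
        self.b = rng.uniform(-2.0, 2.0, size=n_hidden)
        self.beta = None

    def _features(self, Z):
        return np.tanh(Z @ self.W + self.b)

    def fit(self, Z, Y):
        # Least-squares fit of the readout; the dynamic system is untouched.
        self.beta, *_ = np.linalg.lstsq(self._features(Z), Y, rcond=None)

    def predict(self, Z):
        return self._features(Z) @ self.beta


def reference_system(x, t):
    """Assumed stand-in for the reference system: y*(t) = x(t) sin(2 pi t)."""
    return x * np.sin(2.0 * np.pi * t)


n_steps = 200
dynamic = RampDynamicSystem(n_steps)
o = dynamic.run()                      # intermediate output o(t)
t = np.linspace(0.0, 1.0, n_steps)     # time axis of one event

# Collect samples from three events, each with a different external input.
Z_parts, Y_parts = [], []
for amplitude in (0.5, 1.0, 1.5):
    x = np.full(n_steps, amplitude)                 # external input x(t)
    Z_parts.append(np.column_stack([o, x]))         # static-system input
    Y_parts.append(reference_system(x, t))          # observed output y*(t)
Z, Y = np.vstack(Z_parts), np.concatenate(Y_parts)

static = RandomFeatureStaticSystem(n_in=2)
static.fit(Z, Y)                       # only the static system is trained
y_N = static.predict(Z)                # generated output yN(t)
rmse = float(np.sqrt(np.mean((y_N - Y) ** 2)))
print(f"training RMSE: {rmse:.4f}")
```

In this sketch nothing is fed back from the static system to the dynamic system, and the same ramp is reused for every event; only the readout weights change during training.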
The invention offers a simple and practical solution for dynamic problems, specifically for the trajectory generation problem and its generalization: the dynamical function-mapping problem. This learning system is simple, easy to implement using existing techniques, and straightforward to scale to higher dimensionalities compared to other approaches, and it provides excellent performance compared to that offered by the prior art. If a stable dynamic system is used, the stability of an implementation of the DSA stops being a concern.
Furthermore, an implementation of the DSA has the following additional advantages:
It divides the learning system into a dynamic system and a static system, and reduces the learning problem to training the static system.
The same dynamic system can be used for all the dynamical problems, which means that the dynamic system can be previously synthesized. This makes it unnecessary to train the dynamic system.
The conditions imposed on the dynamic system by the DSA make it possible to choose very simple dynamic systems.
There are several alternatives for implementing the dynamic system: software routines, differential equation systems, RNNs, specific hardware, physical systems, etc. The dynamic system is not constrained to be an RNN or some probabilistic framework as it is in the prior art.
Any universal function approximator architecture can be used for the static system. The DSA does not require any one of these in particular: any of them is equally suitable.
The fact that the dimensionality of the external input and the generated output only affects the static system implies that scaling an implementation of the DSA to higher dimensionalities is much simpler than doing it with an arbitrary RNN.
The absence of feedback going from the static system to the dynamic system implies that if a stable dynamic system is synthesized, the learning system is inherently stable.
There are many training approaches that can be used with an implementation of the DSA: it is just a matter of choosing some universal function approximator architecture and the corresponding training procedure.
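As a rough illustration of how simple a candidate dynamic system can be, the following Python sketch checks the condition that the intermediate output o(t) does not cross over itself during one event. The helper crosses_itself and the three candidate signals are hypothetical and not part of the invention; the check is only a sampled proxy that flags any two time samples of o(t) lying closer than a tolerance.

```python
import numpy as np


def crosses_itself(o, tol=1e-9):
    """Return True if two distinct samples of o(t) coincide within `tol`.

    `o` has shape (n_samples,) or (n_samples, n_dims). This is a sampled
    proxy for the non-crossing condition, not a proof for continuous time.
    """
    o = np.asarray(o, dtype=float)
    if o.ndim == 1:
        o = o[:, None]
    # Pairwise Euclidean distances between all time samples.
    d = np.linalg.norm(o[:, None, :] - o[None, :, :], axis=-1)
    upper = np.triu_indices(len(o), k=1)   # each unordered pair once
    return bool(np.any(d[upper] < tol))


t = np.linspace(0.0, 1.0, 100, endpoint=False)   # one event, 100 samples

ramp = t                                          # monotonic: never repeats
fast_sine = np.sin(2.0 * np.pi * 2.0 * t)         # two periods: repeats values
spiral = np.stack([t * np.cos(6.0 * np.pi * t),   # 2-D outward spiral:
                   t * np.sin(6.0 * np.pi * t)],  # radius grows, no repeats
                  axis=1)

for name, signal in [("ramp", ramp), ("fast_sine", fast_sine), ("spiral", spiral)]:
    print(name, "crosses itself:", crosses_itself(signal))
```

Under these assumptions the ramp and the spiral would be acceptable candidates, while a scalar oscillation whose period is shorter than the event would not, because the static system would receive the same o(t) at different times.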
Further objects and advantages of the invention will become clearer after examination of the drawings and the ensuing description.