1. Field of the Invention
The present invention relates to an optimization method for determining a minimum value of an optimization function under constraints given by equations, to an apparatus therefor, and to a program therefor.
2. Description of the Related Art
A problem is considered of minimizing a real-valued function E(r) under constraints specified by the equations

$$S_i(r) = 0, \quad i = 1, \ldots, m, \qquad (1)$$

where it is assumed that r is an n-dimensional real vector, that is,

$$r = \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{pmatrix}, \quad r_1, r_2, \ldots, r_n \in \mathbb{R}. \qquad (2)$$

The problem of maximizing a real-valued function reduces to the above-described problem by inverting the sign of the function.
When there are no constraints, a commonly used algorithm for solving such a problem starts from an appropriate initial vector r(0) and searches for a point r at which the function E(r) reaches a minimum by using the gradient vector F = −∇E(r).
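As a minimal sketch of this unconstrained search, the descent along F = −∇E(r) can be written as follows; the quadratic test function, the fixed step size, and the stopping tolerance are illustrative assumptions, not part of the present description:

```python
def gradient_descent(grad_E, r0, step=0.1, tol=1e-10, max_iter=10000):
    # Move from r0 along F = -grad E(r) until the gradient (nearly) vanishes.
    r = list(r0)
    for _ in range(max_iter):
        F = [-g for g in grad_E(r)]
        if sum(f * f for f in F) ** 0.5 < tol:
            break
        r = [ri + step * fi for ri, fi in zip(r, F)]
    return r

# Hypothetical example: E(r) = (r1 - 1)^2 + (r2 + 2)^2, minimum at (1, -2).
grad_E = lambda r: [2 * (r[0] - 1), 2 * (r[1] + 2)]
r_min = gradient_descent(grad_E, [0.0, 0.0])
```

A fixed step size is used only for brevity; in practice the step is chosen by a line search along F.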
In recent years, the conjugate gradient method, including the Fletcher-Reeves method, the Polak-Ribiere method, etc., has also been used. The details of these algorithms are described in "Computational Methods in Optimization", by E. Polak, 1971, published by Academic Press. These methods have the advantage that, because each search direction is chosen orthogonal to the directions of the previously performed searches, the search proceeds efficiently, and the algebraic properties of the Hestenes-Stiefel conjugate gradient method can be taken advantage of where the function E(r) can be approximated to second order. Accordingly, these methods are frequently used.
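A conjugate gradient iteration of the Fletcher-Reeves type mentioned above may be sketched as follows; the Armijo backtracking line search, the restart safeguard, and the quadratic test function are illustrative assumptions rather than details taken from the cited reference:

```python
def backtracking(E, r, d, slope, step=1.0, shrink=0.5, c=1e-4):
    # Armijo backtracking line search along direction d (slope = grad.d).
    for _ in range(60):
        trial = [ri + step * di for ri, di in zip(r, d)]
        if E(trial) <= E(r) + c * step * slope:
            break
        step *= shrink
    return step

def fletcher_reeves(E, grad_E, r0, tol=1e-8, max_iter=500):
    # Conjugate gradient with the Fletcher-Reeves choice of beta.
    r = list(r0)
    g = grad_E(r)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0:                      # safeguard: restart along -g
            d = [-gi for gi in g]
            slope = -sum(gi * gi for gi in g)
        a = backtracking(E, r, d, slope)
        r = [ri + a * di for ri, di in zip(r, d)]
        g_new = grad_E(r)
        beta = sum(x * x for x in g_new) / sum(x * x for x in g)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return r

# Hypothetical quadratic test problem with minimum at the origin.
E = lambda r: 2 * r[0] ** 2 + r[1] ** 2 + r[0] * r[1]
grad_E = lambda r: [4 * r[0] + r[1], 2 * r[1] + r[0]]
r_min = fletcher_reeves(E, grad_E, [3.0, -2.0])
```

The Polak-Ribiere variant differs only in the formula for beta.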
When constraints are added, since such simple algorithms do not succeed, various contrivances have been made. As representative examples, the penalty method and the multiplication method (the Lagrangian method of undetermined coefficients) are well known. These methods are described in detail in, for example, Iwanami Lectures on Applied Mathematics, “Optimization Methods”, by Hiroshi FUJITA, Hiroshi KONNO, and Kunio TANABE, 1994, published by Iwanami Shoten.
In the penalty method, to express the constraints, a penalty function P(r) is introduced which becomes 0 when the constraints are satisfied and which becomes very large when they are not, so that the new function

$$\Omega(r) = E(r) + P(r) \qquad (3)$$

is minimized. A specific form of the penalty function may be considered as
$$P(r) = \sum_{i=1}^{m} \omega_i S_i(r)^2, \qquad (4)$$

where ωi is a positive number and is adjusted to an appropriate value during the process of searching for a minimum value.
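The penalty method of Equations (3) and (4) can be sketched as follows; the tenfold growth schedule for the weight ω, the step-size scaling, and the example problem (minimize r1² + r2² subject to r1 + r2 − 1 = 0, whose solution is (0.5, 0.5)) are illustrative assumptions:

```python
def penalty_minimize(grad_E, S, grad_S, r0, rounds=6, base_step=1e-3, iters=20000):
    # Gradient descent on Omega(r) = E(r) + omega * sum_i S_i(r)^2.
    # omega grows tenfold each round; the step shrinks with omega for stability.
    r = list(r0)
    omega = 1.0
    for _ in range(rounds):
        step = base_step / (1.0 + omega)
        for _ in range(iters):
            g = list(grad_E(r))
            # Add the penalty gradient 2 * omega * S_i(r) * grad S_i(r).
            for s_i, gs_i in zip(S(r), grad_S(r)):
                g = [gj + 2.0 * omega * s_i * gsj for gj, gsj in zip(g, gs_i)]
            r = [rj - step * gj for rj, gj in zip(r, g)]
        omega *= 10.0
    return r

# Hypothetical example: E(r) = r1^2 + r2^2 with S(r) = r1 + r2 - 1 = 0.
grad_E = lambda r: [2 * r[0], 2 * r[1]]
S = lambda r: [r[0] + r[1] - 1.0]
grad_S = lambda r: [[1.0, 1.0]]
r_min = penalty_minimize(grad_E, S, grad_S, [0.0, 0.0])
```

The gradual increase of ω illustrates the adjustment mentioned above: a finite ω leaves a small constraint residual, which shrinks as ω grows.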
In the multiplication method, an undetermined multiplier λi is introduced so that a new function of r and λi,
$$\Omega(r, \lambda_i) = E(r) + \sum_{i=1}^{m} \lambda_i S_i(r), \qquad (5)$$

is made stationary. The constraint is expressed as a stationary condition of Ω(r, λi) with respect to λi,

$$\frac{\partial \Omega(r, \lambda_i)}{\partial \lambda_i} = 0. \qquad (6)$$
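The stationary point of Equation (5) is a saddle: a minimum in r and a stationary point in λi. One standard way to seek it, sketched here under the assumption of the same hypothetical example problem as above (minimize r1² + r2² subject to r1 + r2 − 1 = 0), is gradient descent in r combined with gradient ascent in λ:

```python
def saddle_search(grad_E, S, grad_S, r0, lam0, step=0.05, iters=20000):
    # Descent in r and ascent in lambda on
    # Omega(r, lam) = E(r) + sum_i lam_i * S_i(r).
    r, lam = list(r0), list(lam0)
    for _ in range(iters):
        g = list(grad_E(r))
        for l_i, gs_i in zip(lam, grad_S(r)):
            g = [gj + l_i * gsj for gj, gsj in zip(g, gs_i)]
        r = [rj - step * gj for rj, gj in zip(r, g)]
        # Ascent step: d Omega / d lam_i = S_i(r) recovers the constraint.
        lam = [l_i + step * s_i for l_i, s_i in zip(lam, S(r))]
    return r, lam

grad_E = lambda r: [2 * r[0], 2 * r[1]]
S = lambda r: [r[0] + r[1] - 1.0]
grad_S = lambda r: [[1.0, 1.0]]
r_min, lam_min = saddle_search(grad_E, S, grad_S, [0.0, 0.0], [0.0])
```

For this example the iteration converges to r = (0.5, 0.5) with multiplier λ = −1; the simultaneous min-max character of the iteration is what makes saddle search more delicate than a plain extremal-value search.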
In addition to these methods, an algorithm using a dynamic system, in which a vector field is considered inside the space under consideration and the optimum point is an asymptotic solution of the vector field, is also being investigated (described in the above-described "Optimization Methods" in the Iwanami Lectures on Applied Mathematics).
In addition, recently, Smith et al. have proposed an algorithm in which the admissible set, that is, the set of points which satisfy the constraints, is regarded as a Riemannian manifold, and on that basis a Newton method and a conjugate gradient method have been considered ("Optimization Techniques on Riemannian Manifolds", S. T. Smith, AMS, Fields Institute Communications, Vol. 3, 1994, pp. 113–136). Furthermore, for the case in which the admissible set is a Grassmann manifold or a Stiefel manifold, a more detailed algorithm has been proposed ("The Geometry of Algorithms with Orthogonality Constraints", A. Edelman, T. A. Arias, S. T. Smith, SIAM J. Matrix Anal. Appl., Vol. 20, 1998, pp. 303–353). Since this algorithm is basically the same as in the case in which there are no constraints, it has the feature that additional variables or adjustment parameters, such as those which appear in the other methods, are unnecessary. This technique provides a new point of view on optimization methods with constraints, and a wide range of applications, to eigenvalue problems and the like, can be expected.
As described above, the multiplication method and the penalty method are widely used as algorithms for optimization problems under equation constraints. The former becomes a saddle-point search problem, so its calculation algorithm is complex compared with an extremal-value search. The latter has the problem that the selection of the penalty parameter must be contrived according to the problem at hand.
Furthermore, in the algorithm using the above-described dynamic system, a contrivance for setting an appropriate dynamic system is required.
Furthermore, in the above-described algorithm of Smith et al., during the search for the optimum point, a point is moved along a geodesic line on the admissible set. Smith et al. specifically give a solution of the geodesic equation for the cases of the Grassmann manifold and the Stiefel manifold. In principle, if a point is moved in accordance with this solution, the point should stay within the admissible set. In practice, however, there is a problem in that, owing to errors in the numerical calculations, the point deviates from the admissible set. Furthermore, for a case in which the admissible set is not one of these manifolds, no mathematical expression of a specific solution of the geodesic equation is given.
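For the special case in which the admissible set is the unit sphere |r| = 1, the numerical drift described above can be counteracted by renormalizing the point after each step. The following projected-gradient sketch is an illustration of that idea only, not the geodesic algorithm of Smith et al.; the quadratic E(r) is a hypothetical example:

```python
def project_to_sphere(r):
    # Pull a numerically drifted point back onto the set {r : |r| = 1}.
    n = sum(x * x for x in r) ** 0.5
    return [x / n for x in r]

def sphere_minimize(grad_E, r0, step=0.1, iters=2000):
    # Gradient step followed by renormalization, so rounding error
    # cannot accumulate off the admissible set.
    r = project_to_sphere(list(r0))
    for _ in range(iters):
        r = [ri - step * gi for ri, gi in zip(r, grad_E(r))]
        r = project_to_sphere(r)
    return r

# Hypothetical example: E(r) = 2 r1^2 + r2^2 on the unit circle,
# minimized at (0, +-1).
grad_E = lambda r: [4.0 * r[0], 2.0 * r[1]]
r_min = sphere_minimize(grad_E, [0.8, 0.6])
```

The renormalization plays the role of a retraction onto the constraint surface; for general admissible sets no such closed-form projection is available, which is precisely the difficulty noted above.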