As an introduction to the types of issues which arise in the application of processing systems to matrix processing, a discussion of linear programming will be given, including a discussion of what techniques are used for solving linear programming problems and of what role matrix processing plays in those techniques. From this discussion, it will be evident how the method of the invention is applicable to improve matrix processing performance.
The field of linear programming came into existence in the years following World War II, in which various types of logistical problems, many extremely complex and involving up to thousands of variables, needed to be solved as cost-efficiently as possible.
A typical problem might be one of supplying various types of goods from a number of distribution sites to a number of destinations. Each destination has certain requirements regarding how many of each type of goods it needs to have on hand. For a given type of goods at a given location, a certain cost per unit shipped is associated with each distribution point. Generally, the cost is related to the distance between the distribution point and the destination, or the mode of transportation which must be used in shipping the goods. Also, there may be constraints on the quantity of goods shipped. For instance, a given distribution site might have a limited quantity of a certain type of goods on hand. Thus, a nearby destination having a greater demand for those goods is required to ship some of its requirements from a more distant, more costly, distribution site. Also, since linear programming problems often arise out of "real-world" situations, zero is often a lower bound on the value of a variable. That is, the quantity of goods shipped from a given source to a given destination will likely have zero as a realistic lower bound.
The problem, then, is to find values for the quantities of each type of goods shipped from each distribution point to each destination, such that the total cost required for all the shipments is minimized, but all of the destinations are supplied with their requirements of all types of goods. This problem, as with essentially all linear programming problems, is expressed mathematically in the following general form: ##EQU1## subject to: EQU Ax=b (2)
where EQU L.sub.j .ltoreq.x.sub.j .ltoreq.U.sub.j ( 3)
That is, find a vector of x values such that the sum of products of the x values and a set of coefficients is minimized, given a set of constraints on the x values.
The first of these three expressions may be through of as a cost function. In the example given above, the variables x.sub.j represent the quantities of goods shipped, and the variables c.sub.j represent the costs per unit associated with shipping the goods. Thus, each term of the sum is a total cost for shipping one type of goods from one distribution point to one destination. The sum of all such terms is the grand total cost for all shipments. It will be seen, then, that minimizing this grand total cost of all such shipments is the basic objective of a linear programming problem.
The constraint Ax=b defines the "ground rules" for the linear programming problem under which the cost is to be minimized. In this expression, b is a vector of constants, in which each constant is the total quantity of one of the types of goods required by the destination. The A matrix and the b vector represent information such as the requirements of each location for each type of goods, etc. Note that Ax=b may also be a set of inequalities, such as Ax.ltoreq.b, etc.
Finally, the last expression represents constraints, if any, on the quantities of goods shipped from a given distribution point. In many linear programming problems, the lower bound L.sub.j is zero. As described above, the upper bound U.sub.j is a constraint, which may limit the maximum value for the quantity of a given type of goods shipped from a given location, if that is all there is on hand at that location.
Ax=b represents a system of linear equations, where x is a vector of variables, and A is a two-dimensional array of coefficients. Since A, x, and b are vectors or matrices, it will be seen that the expression Ax=b is s matrix multiplication problem. It will thus be seen that matrix processing capability is fundamental to solving linear programming problems.
In practice, the coefficient matrix A is usually sparse. A matrix is said to be "sparse" if a large portion of its terms are zero.
Given a system of n equations in n variables, it is possible to solve the system of equations to obtain unique values for the variables, if such unique values exist. However, in most realistic linear programming problems, the number of variables greatly exceeds the number of equations. Thus, there are a large number of possible solutions to such a system of equations. Because of this, it is possible to attempt to find one of the possible solutions which minimizes the total cost.
If the variables x.sub.j are visualized as defining an n-dimensional solution space for the linear programming problem, then the range of possible solutions can be represented as an n-dimensional polyhedron within that solution space. Because the equations defined by Ax=b are linear, the faces of the solution polyhedron are planar. Faces of the solution polyhedron are defined by the limiting values of various ones of the variables x.sub.j. It can be proven that the minimum cost is given by a solution represented by one of the vertices of the polyhedron.
The concept of a solution polyhedron in a solution space is a key to conceptualizing the different strategies for solving linear programming problems, which will now be presented.
The Simplex Method
The classic text by Dantzig, "Linear Programming and Extensions," Princeton University Press, Princeton, N.J. (1963), teaches a method, called the "simplex method", for solving a linear programming problem, i.e., finding values for the vector of x variables such that the cost is minimized. In the simplex method, a solution represented by one of the vertices of the solution polyhedron is selected, more or less at random. Then, moves are made in a sequence from one vertex to another, each move decreasing the total cost, until finally a vertex is reached from which no additional movement can reduce the cost any further. This final vertex is the solution to the linear programming problem.
The simplex method is executed as follows: A group of variables equal to the number of equations are selected, more or less at random, and are designated "basic variables." The remaining variables are called "nonbasic" or "independent" variables. The independent variables are set to feasible values, i.e., values within the range defined by the upper and lower limits given in the above expression. Since the lower limit L.sub.j is often 0, feasibility generally means that the variables have nonnegative values.
Using algebraic techniques similar to those used in solving n equations in n variables, i.e., adding linear combinations of the equations, the system of linear equations is placed into "canonical form," in which each equation has a coefficient 1 for one of the basic variables, and coefficients 0 for all the other basic variables. Thus, placing the equations in canonical form is equivalent to solving for the basic variables. This solution, if it is feasible, represents one of the vertices of the solution polyhedron.
Next, a better solution is sought as follows: The value of one of the independent variables is varied. Varying the value of the independent variable has the effect of changing the values of other variables. If varying the independent variable reduces the total cost, that independent variable is varied in value until one of the basic variables reaches a limiting value, such as zero. That basic variable is now replaced by the independent variable whose value had been changed. The values for the resulting new set of basic variables define another vertex of the solution polyhedron, connected to the first vertex by an edge. Varying the value of the independent variable had the effect of moving along an edge of the solution polyhedron which runs from the first vertex to the new vertex.
The process just described is repeated, moving from vertex to vertex, until no further change to any of the independent variables results in a cost reduction. At this point, the values of the basic and independent variables define the optimal solution to the linear programming problem, and the simplex method has been completed.
Since there are often on the order of thousands of variables in a realistic linear programming problem, solutions are typically calculated on computers. Because of the large numbers of calculations involved in matrix algebra, and the large number of iterations required for a technique such as the simplex method, computational efficiency is an important objective in a computer-implemented linear programming solver.
In particular, an implementation of the simplex method involves certain specific types of matrix calculations. First, in the expression Ax=b, Ax is broken up into two terms, i.e., two matrix products, representing basic variables x.sub.b and independent variables x.sub.n, as follows: EQU Ax=Bx.sub.b +Nx.sub.n =b (4)
This expression is then solved for x.sub.b as follows: EQU x.sub.b =B.sup.-1 b-B.sup.-1 Nx.sub.n ( 5)
The cost z associated with the selected values for the variables is given b y EQU z=c.sup.T.sub.B x.sub.B +C.sup.T.sub.N x.sub.N ( 6)
It is necessary to use the matrix transposes c.sup.T.sub.b and c.sup.T.sub.n of the basic and independent portions of the cost coefficient matrix for these matrix multiplication expressions to be in proper form. Therefore, a cost differential d.sub.j which is associated with changing the value of one of the independent variables associated with the j-th column of the N matrix is given by the expression EQU d.sub.j =c.sub.j -c.sup.T.sub.B B.sup.-1 a.sub.j ( 7)
where c.sub.j is the cost, absent the change of value of the independent variable, a.sub.j is the j-th column of N, and B.sup.-1 is the inverse matrix of B. For a change of value of an independent variable to produce an overall reduction in cost, d.sub.j should have a value less than zero.
The above expression is calculated many times during the course of solving a linear programming problem. Therefore, the matrix product c.sup.T.sub.B B.sup.-1, which is a one-dimensional vector, is designated as .pi..sup.T, and referred to as the "shadow prices" or "pi values." During the course of the simplex method, the matrix product .pi..sup.T A, or .pi..sup.T a.sub.j for individual columns of the matrix A, is calculated many times. The expression .pi..sup.T A is referred to as "multiplication on the left," referring to the position of the matrix .pi..sup.T to be multiplied with the coefficient matrix A. This calculation is made for every iteration of the simplex method, to determine whether the change in value of an independent variable is reducing the total cost. For these calculations, which may number in the thousands, A remains constant, but the pi values change as different variables are swapped in and out of the set of basic variables.
A different matrix calculation is made, following each iteration, to check for the feasibility of the solution. The vector of solutions x is multiplied by the coefficient matrix, to evaluate Ax=b. This is referred to as "multiplication on the right," again referring to the position of the vector to be multiplied with the coefficient matrix A. In the simplex method, multiplication on the right is executed much less frequently than multiplication on the left. Accordingly, in vector processing implementations of linear programming software, optimization for multiplication on the left has produced good computational efficiency for the simplex method. While this optimization decreases the efficiency of multiplication on the right, overall computational efficiency is not greatly reduced, since multiplication on the right is performed relatively infrequently.
Interior Point Methods
More recently, alternative methods for solving linear programming problems have been explored. These other methods have been referred to as "interior point" methods, because they can be characterized as moving through the interior of the solution polyhedron. This is in contrast with the simplex method, in which motion is only along the outer edges of the solution polyhedron.
Several different interior point methods are discussed in detail in Forrest and Tomlin, "Implementing Interior Point Linear Programming Methods in the Optimization Subroutine Library," IBM Systems Journal, Vol. 31, No. 1, 1992. One class of interior point methods, for instance, is called barrier methods. In a barrier method, non-linearities are introduced which prevent the limits from being violated. Then, calculus-based methods are used for the minimization. For example, in a logarithmic barrier method, the problem is to find values for the variables x.sub.j which minimize ##EQU2## subject to: EQU Ax=b (9)
where EQU L.sub.j .ltoreq.x.sub.j .ltoreq.U.sub.j ( 10)
In the above expressions, .mu..fwdarw.0 and .mu.&gt;0. Because .mu. is positive, the logarithm terms become large as the variable approaches the limit. However, since .mu. tends toward zero, the logarithm terms are small, and do not greatly change the result from what it would be if only the cost terms were summed. Since the logarithmic terms of the sum are not defined for values outside the upper and lower limiting values, the bounds cannot be violated.
In essence, solving such a problem involves taking a series of steps from an initially selected solution, each step having a given step length, and each step being taken in a direction, through the solution polyhedron, of reduced cost. The major computational burden of each step of the interior point method is a gradient projection, which is solved as a least-squares problem. The solution of the least-squares problem is a vector change. For the least-squares solution, pi values and reduced costs are updated in a manner similar to that described in connection with the simplex method.
In particular, the least-squares calculation involves multiplication on the right. The matrix product Ax is computed a much greater number of times in interior point methods than in the simplex method.
Considerations that Affect Computational Efficiency
From the foregoing, it will be seen that, whether the simplex method or an interior point method is used to solve a linear programming problem, there will be a great many matrix multiplications involving the coefficient matrix A. Matrix multiplication is generally implemented on a processor having vector processing capability as vector multiplication of rows and/or columns. Multiplications both on the left and on the right are used. Frequently, the A matrix is sparse, i.e., a large number of the terms of the A matrix have the value zero. Thus, computational efficiency is improved by reducing the number of multiplications involving zero terms.
In Forrest and Tomlin, "Vector Processing in Simplex and Interior Methods for Linear Programming", 22 Annals of Operations Research 71-100 (1990)2!, a method is employed for storing coefficients in computer memory for convenient access and efficient execution of multiplication on the left. Columns of the A matrix having large numbers of non-zero terms are stored separately for vectorization. Columns having relatively few non-zero terms are stored in blocks. A second block of memory contains indices, each index corresponding with one of the non-zero terms stored in the block, and indicating which position in the column of the A matrix the term occupied. The cost reduction is then calculated, using a vector processor, by reading a vector of indices from the second block, obtaining a vector of pi values, selecting each pi value for the vector based on the value of the corresponding index. Finally, the scalar product of the pi value vector and the corresponding row of the first block is added to a cost reduction value. This process is repeated for each row of the first block. Because this method involves multiplying rows of the A matrix terms, computational efficiency is achieved by storing the terms in memory row-wise.
Since the most frequently executed matrix multiplication in the simplex method is that on the left, the above data structure works well. However, since the advent of interior point methods, it has been common practice to follow interior point optimization by iterations of the simplex (or a simplex-like) method. For the newer interior point methods, where multiplication on the right (Ax) takes place more frequently, the above data structure is not well suited. Therefore, to accommodate the newer interior point methods, it is desirable to find a way of utilizing the above-described data structure efficiently to vectorize multiplication on the right.