1. Field of the Invention
The present invention relates to a method and apparatus for solving a wide range of numerical problems. More particularly, the invention relates to a method and apparatus that substantially reduces the processing time required to solve numerical problems by using independent processing elements operating in parallel and by exploiting the parallel nature of problems. The invention is a powerful framework for the design and hardware implementation of numerical algorithms, whereby the algorithms can be configured and scaled according to circuit characteristics, such as packing density and switching time, which constantly change following computer technology trends.
2. Prior Art
The use of numerical applications based on floating point calculations in data processing system environments has been common practice since the early years of computer technology. High speed processing has always been an important target in the development of methods and apparatuses that provide solutions to numerical problems.
Most of the proposed solutions have been implemented with emphasis on software rather than on hardware, and prior hardware-based solutions have always been proposed to solve specific or low level problems. For example, U.S. Pat. No. 4,477,879 by Wilson T. C. Wong, issued on Oct. 16, 1984, and assigned to Sperry Corporation, entitled Floating Point Processor Architecture Which Performs Square Root by Hardware, is a specific hardware-based solution for the calculation of the square root of floating point numbers. Another example, U.S. Pat. No. 4,849,923 by Samudrala et al., issued on Jul. 18, 1989, and assigned to Digital Equipment Corporation, entitled Apparatus and Method for Execution of Floating Point Operations, is limited to the execution of arithmetic operations on floating point operands. That invention takes a low level approach, since it focuses only on the execution of operations and does not propose solutions to high level and more generic numerical problems.
Recently, much attention has been given to parallelism, a concept used in the development of methods to solve numerical problems. Various approaches to parallelism, with complementary points of view, have been considered, and various methods have been developed.
One approach to parallelism is migration parallelism, where sequential algorithms are adapted to run on parallel architectures, so that different steps of a program are executed in parallel. This approach is embedded in applications such as those described in Pseudo Division and Pseudo Multiplication Processes by J. E. Meggitt, IBM J. Res. Develop., 6, 210-226 (1962); Cellular Logical Array for Nonrestoring Square-Root Extraction by H. H. Guild, Electron. Lett., 6, 66-67 (1970); Cellular Array for Extraction of Squares and Square Roots of Binary Numbers by J. C. Majithia, IEEE Trans. Comput., 21, 1023-1024 (1972); and Design of a High Speed Square Root Multiply and Divide Unit by J. H. P. Zurawski and J. B. Gosling, IEEE Trans. Comput., 36, 13-23 (1987). Although this approach focuses on parallelism, it actually converts sequential solutions into parallel ones. Since the underlying solution is not parallel in origin, it remains bound by its sequential structure. This limitation reduces the degree of parallelism and the range of processing speeds that can be achieved.
Another proposed approach is based on master/slave parallelism. In this case, the problem is partitioned into m sub-problems of the same kind by dividing the decision space into m parts. Each sub-problem is solved and interrelated to the others in a hierarchical fashion. This hierarchical interrelation between sub-problems, together with the fact that all sub-problems must necessarily be solved in order to reach the solution, may demand complex control activities. These activities may in turn cause overhead, which reduces the gain in processing time. Monte Carlo parallelism could be considered a special case of master/slave parallelism in which the decision space is sampled. This is a simple way of using parallelism to solve numerical problems, but due to its probabilistic nature, it is limited and not 100% reliable. Master/slave parallelism is cited in "The Logic of Computer Arithmetic" by I. Flores, Englewood Cliffs, Prentice Hall 1963 and embedded in the problems presented in Cellular Logic Array for Extracting Square Roots by K. J. Dean, Electron. Lett., 4, 314-315 (1968); Economic Pseudo-division Processes for Obtaining Square Roots, Logarithms and Arctan by B. P. Sarkar and E. V. Krishnamurthy, IEEE Trans. Comput., 20, 1589-1593 (1971); and Square-Rooting Algorithms for High-Speed Digital Circuits by S. Majerski, IEEE Trans. Comput., 34, 724-733 (1985).
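By way of illustration only, the master/slave partitioning described above may be sketched in software as follows. The sketch is not taken from any of the cited references. The function names (partition, solve_subproblem, master_minimize), the grid search used inside each sub-problem, and the choice of minimization as the example problem are all assumptions made for this illustration. Threads are used only to show the control structure and are not expected to yield a speed-up in a standard Python interpreter.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(lo, hi, m):
    """Divide the decision space [lo, hi] into m equal sub-intervals."""
    width = (hi - lo) / m
    return [(lo + i * width, lo + (i + 1) * width) for i in range(m)]


def solve_subproblem(f, interval, samples=1000):
    """Slave (hypothetical): grid-search one sub-interval for its smallest f(x)."""
    a, b = interval
    xs = [a + (b - a) * k / samples for k in range(samples + 1)]
    best_x = min(xs, key=f)
    return best_x, f(best_x)


def master_minimize(f, lo, hi, m=4):
    """Master (hypothetical): dispatch m sub-problems, wait for all, combine."""
    parts = partition(lo, hi, m)
    with ThreadPoolExecutor(max_workers=m) as pool:
        # Every sub-problem must finish before the master can combine results.
        results = list(pool.map(lambda p: solve_subproblem(f, p), parts))
    return min(results, key=lambda r: r[1])


if __name__ == "__main__":
    x, fx = master_minimize(lambda t: (t - 1.3) ** 2, 0.0, 4.0, m=4)
    print(x, fx)
```

In this sketch the master cannot report a result until every one of the m sub-problems has returned, which corresponds to the control overhead noted above. Replacing the fixed grid in solve_subproblem with randomly drawn sample points would give a Monte Carlo variant, whose result would then depend on the samples drawn.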
None of these proposed approaches is well suited to the global analysis of functions over search intervals, for example, finding all critical points of a function (zeros, minima, maxima, etc.) within an interval. The above mentioned approaches are limited to a local analysis of a specific problem. Generally, the methods and apparatuses of the prior art do not substantially take advantage of computer technology trends such as available chip packing density, switching time, and size of floating point representations. In addition, prior art methods and apparatuses do not provide great flexibility for the design of hardware-based algorithms.