This invention relates to parallel computing. More specifically, it relates to scaling and managing requests on a massively parallel machine.
Parallel computing is an area of computer technology that has experienced advances. Parallel computing is the simultaneous execution of the same task (split up and specially adapted) on multiple processors in order to obtain results faster. Parallel computing is based on the fact that the process of solving a problem can usually be divided into smaller tasks, which may be carried out simultaneously with some coordination. Parallel computing may be implemented in architectures optimized to execute in a mode of ‘Single Instruction, Multiple Data’ (‘SIMD’) or in a mode of ‘Multiple Instruction, Multiple Data’ (‘MIMD’).
A MIMD machine is a computer in which multiple autonomous processors simultaneously execute different instructions on different data. Distributed systems are generally recognized to be MIMD architectures, exploiting either a single shared memory space or a distributed memory space. Many common computer applications are implemented with MIMD architectures, including, for example, most accounting programs, word processors, spreadsheets, database managers, browsers, web applications, other data communications programs, and so on.
A SIMD machine is a computer that exploits multiple data streams against a single instruction stream to perform operations which may be naturally parallelized. SIMD machines are ubiquitous on a small scale, in digital speech processors, graphics processors, and the like. SIMD machines execute parallel algorithms, typically including collective operations. A parallel algorithm can be split up to be executed a piece at a time on many different processing devices, and then put back together again at the end to get a data processing result. Some algorithms are easy to divide up into pieces. For example, the job of checking all of the numbers from one to a hundred thousand to see which are primes could be done by assigning a subset of the numbers to each available processor and then putting the list of positive results back together. In this specification, the multiple processing devices that execute the individual pieces of a parallel program are referred to as ‘compute nodes.’ A SIMD machine is composed of compute nodes and other processing nodes as well, including, for example, input/output (I/O) nodes and service nodes.
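By way of illustration only, the prime-checking example above may be sketched as follows. The sketch assumes a small pool of worker threads standing in for compute nodes; the names `is_prime`, `partition`, `primes_in`, and `find_primes` are hypothetical and chosen for this example, and trial division is used only because it is simple, not because any particular primality test is required.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Trial division; adequate for a small illustrative range."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def partition(lo, hi, parts):
    """Split the inclusive range [lo, hi] into `parts` contiguous sub-ranges."""
    base, extra = divmod(hi - lo + 1, parts)
    chunks, start = [], lo
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread any remainder evenly
        chunks.append(range(start, start + size))
        start += size
    return chunks


def primes_in(chunk):
    """The piece of work assigned to one (simulated) compute node."""
    return [n for n in chunk if is_prime(n)]


def find_primes(lo, hi, num_workers=4):
    """Assign one sub-range per worker, then merge the partial results."""
    chunks = partition(lo, hi, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() returns results in chunk order, so the merged list is sorted.
        partial_results = list(pool.map(primes_in, chunks))
    return [p for part in partial_results for p in part]
```

A call such as `find_primes(1, 100000)` checks the full range described above. Threads are used here only to keep the sketch self-contained; on an actual parallel machine each sub-range would be dispatched to a separate compute node.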
Parallel algorithms are also designed to optimize the data communications requirements among the nodes of a SIMD machine. Parallel processors communicate in one of two ways: shared memory or message passing. Shared memory processing needs additional locking technology for the data, imposes the overhead of additional processor and bus cycles, and serializes some portions of the algorithm. Message passing uses high-speed data communications networks and message buffers, but this communication adds transfer overhead on the data communications networks, additional memory needed for message buffers, and latency in the data communications among nodes. Designs of SIMD machines use specially designed data communications links so that the communication overhead will be small, but it is the parallel algorithm that determines the volume of the traffic. It is possible to partition the machine into sets of compute nodes such that neighboring partitions are electrically isolated from each other. This allows multiple Message Passing Interface (MPI) jobs to execute concurrently.
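The contrast between the two communication styles may be sketched, under simplifying assumptions, as follows. Threads within a single process stand in for processors, a `threading.Lock` stands in for the locking technology that shared memory requires, and a `queue.Queue` stands in for a message buffer. The function names `sum_shared_memory` and `sum_message_passing` are hypothetical, and the sketch does not model network transfer overhead or latency.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Workers add partial sums into one shared variable guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)   # computed in parallel, no lock needed
        with lock:             # the update itself is serialized
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(chunks):
    """Workers share no state; each sends its partial sum as a message."""
    mailbox = queue.Queue()    # plays the role of a message buffer

    def worker(chunk):
        mailbox.put(sum(chunk))

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The receiver collects exactly one message per worker.
    return sum(mailbox.get() for _ in chunks)
```

Both functions return the same result for the same list of chunks. The shared memory version pays for the lock on every update, while the message passing version pays for buffer memory and for the transfer of each message.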