Massively parallel processing (MPP) is the coordinated processing of a program by multiple processors, each working on a different part of the program. Each processor may have its own operating system and memory. MPP speeds up large databases that handle massive amounts of data. An MPP database (MPP DB) can use multi-core processors, multiple processors and servers, and/or storage appliances equipped for parallel processing. This combination allows many pieces of data to be read and processed across many processing units at the same time, which increases speed.
MPP DB systems can include a large number of processors, servers, or other DB devices that are operated and regulated by one or more management systems. To simplify MPP DB management, the DB devices can be grouped into clusters, where an individual cluster can include an interface for communicating with the management system. A cluster can operate semi-autonomously.
If one or more MPP DB devices fail, quality of service is affected. A cluster may include thousands of servers, and the failure rate rises as clusters grow larger. In a large cluster, server failures are frequent and may occur as often as once per week or even once per day.
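The relationship between cluster size and failure frequency can be illustrated with a small calculation. The sketch below is only a simplified model: it assumes that servers fail independently and that each has the same daily failure probability. The function names and the example probability of 0.001 are hypothetical and are not taken from any particular system.

```python
def expected_failures_per_day(num_servers, p_fail_per_day):
    """Expected number of server failures per day.

    Assumes every server has the same daily failure probability
    (an illustrative assumption, not a measured value).
    """
    return num_servers * p_fail_per_day


def prob_any_failure_per_day(num_servers, p_fail_per_day):
    """Probability that at least one server fails on a given day.

    Assumes failures are independent, so the chance that no server
    fails is (1 - p) ** n and the complement is the chance of one or more.
    """
    return 1.0 - (1.0 - p_fail_per_day) ** num_servers


if __name__ == "__main__":
    p = 0.001  # hypothetical per-server daily failure probability
    for n in (10, 100, 1000, 10000):
        print(n, expected_failures_per_day(n, p), prob_any_failure_per_day(n, p))
```

Under these assumptions, a cluster of 1000 servers expects about one failure per day, and the probability of at least one failure on a given day is roughly 0.63, whereas a 10-server cluster with the same per-server probability sees a failure on about 1% of days.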
In an MPP DB, data can be partitioned across multiple servers or nodes, each with its own memory and/or processors to process its data locally. All communication between DB devices and systems takes place over network interconnections. An MPP DB scales out the processing of a large job by distributing the data to multiple servers and running individual portions of a transaction on those servers.
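The partition-then-process-locally pattern described above can be sketched in a few lines. This is a minimal single-machine illustration, not an actual MPP DB implementation: in-memory lists stand in for nodes, a thread pool stands in for the network interconnect, and hash partitioning is one assumed distribution strategy among several. All function names are hypothetical.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_nodes, key):
    """Assign each row to a node by hashing its partition key."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 is stable across runs, unlike Python's built-in hash() on str.
        node = zlib.crc32(str(row[key]).encode("utf-8")) % num_nodes
        partitions[node].append(row)
    return partitions


def local_sum(partition, group_key, value_key):
    """Work done on one node: aggregate only the rows stored locally."""
    totals = defaultdict(int)
    for row in partition:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def scatter_gather_sum(rows, num_nodes, group_key, value_key):
    """Distribute rows, run the per-node portion in parallel, merge results."""
    partitions = partition_rows(rows, num_nodes, group_key)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(
            pool.map(lambda p: local_sum(p, group_key, value_key), partitions)
        )
    # The coordinator merges the partial results returned by each node.
    merged = defaultdict(int)
    for partial in partials:
        for group, total in partial.items():
            merged[group] += total
    return dict(merged)


if __name__ == "__main__":
    rows = [
        {"region": "east", "amount": 10},
        {"region": "west", "amount": 5},
        {"region": "east", "amount": 20},
    ]
    print(scatter_gather_sum(rows, 3, "region", "amount"))
```

Because the rows are partitioned on the same key used for grouping, all rows for a given group land on one node, so each node's partial result is already final for its groups and the merge step is a simple union. Partitioning on a different key would still work here, since the merge step adds partial sums together.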