A distributed computing system is a group of processing units that work together to present a unified system to a user. Distributed computing systems are usually deployed to improve the speed, the availability, or both, of computing services beyond what a single processing unit can provide. Alternatively, distributed computing systems can be used to achieve desired levels of speed and availability within cost constraints.
Distributed systems can generally be described in terms of how they are designed to take advantage of specialization, redundancy, isolation, and parallelism. Specialization takes advantage of the separation of tasks within a system: tasks can be done faster by processing units that are dedicated to and specialized for those tasks. Redundancy is the complement of specialization; it refers to having multiple comparable processing units available for work. If there is a problem with any particular processing unit, other units can be brought in to handle the requests that would have gone to the problem unit. The resulting reliability is generally referred to as “high availability,” and a distributed system designed to achieve this goal is a high availability system. Isolation is related to redundancy. Part of the reason distributed systems can use redundancy to achieve high availability is that each processing unit can be isolated from the larger system, so a problem in one unit need not affect the others. Finally, parallelism is a characteristic of the computing tasks done by distributed systems. Work that can be split up into many independent subtasks is described as highly parallel or parallelizable. Different processing units in a distributed system can work on different parts of the same overall task simultaneously, yielding a faster overall result.
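As a rough illustration of redundancy and parallelism working together, the following Python sketch simulates three processing units inside a single process. Everything named here (`make_unit`, `run_with_failover`, `parallel_sum`, the unit names, and the choice of summation as the workload) is a hypothetical stand-in chosen for illustration; a real distributed system would dispatch work over a network to separate machines, and Python threads are used only to model concurrent units, not to demonstrate an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor


class UnitFailure(Exception):
    """Raised when a (simulated) processing unit cannot handle a request."""


def make_unit(name, healthy=True):
    """Return a hypothetical processing unit: a function that sums a chunk of numbers."""
    def unit(chunk):
        if not healthy:
            raise UnitFailure(f"{name} is unavailable")
        return sum(chunk)
    return unit


def run_with_failover(units, preferred, chunk):
    """Redundancy: try the preferred unit first, then fall back to the others in order."""
    n = len(units)
    for offset in range(n):
        unit = units[(preferred + offset) % n]
        try:
            return unit(chunk)
        except UnitFailure:
            continue  # the failure stays isolated to that unit; try the next one
    raise RuntimeError("no healthy processing unit available")


def parallel_sum(units, numbers, chunk_size):
    """Parallelism: split the work into independent chunks and process them concurrently."""
    chunks = [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]
    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        futures = [
            pool.submit(run_with_failover, units, i % len(units), chunk)
            for i, chunk in enumerate(chunks)
        ]
        return sum(f.result() for f in futures)


# One of the three simulated units is down; its chunks are rerouted to the others.
units = [make_unit("unit-a"), make_unit("unit-b", healthy=False), make_unit("unit-c")]
total = parallel_sum(units, list(range(1, 101)), chunk_size=10)
print(total)
```

In this sketch the caller never learns that one unit was unavailable: requests assigned to it are transparently handled by a comparable unit, which is the behavior that high availability is meant to provide. The work can be split this way only because each chunk's partial sum is independent of the others.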
The term “cloud computing” is frequently used to describe the trend toward using networked distributed systems such as the ones described above to perform computing tasks, with customers charged according to their specific use of bandwidth, disk space, CPU time, and other cloud resources. The companies providing these cloud computing architectures are sometimes called Internet Service Providers, Application Service Providers, or Cloud Computing Providers. As customers become more sophisticated, however, there is a need for instrumented distributed architectures that can provide real-time reports on various metrics across an entire distributed platform.