Distributed computing is required in many application domains, for example computational science, management of distributed database systems, management of distributed computing systems for bio-informatics, image analysis, and other applications in which very large amounts of data are to be processed or huge amounts of computational resources are required. However, orchestration of complex data flows of substantial amounts of data from live (streaming) sources is difficult with existing approaches.
The use of distributed computing systems is becoming more widespread. Often, computational processes are decomposed into multiple subprocesses which are executed on different computing systems, or computational processes are divided into fragments and the fragments are spread over multiple systems to be computed. Management of these distributed computing systems is typically carried out by a single entity which “owns” the computational process, and there is a general need to simplify and improve the manner in which the management is achieved. For example, existing tools to manage scientific workflows enable a scientist to make use of remote computing resources to carry out in-silico experiments. However, it is difficult to enable the experiments to be managed, in a simple and effective manner, by multiple users working on different computers at the same time. In addition, existing approaches are often unsuited to novice users who have little or no knowledge of the remote computing resources that may be used.
There is an increasing need to harness the cumulative power of multiple computing devices owned by a single person (e.g. an individual's laptop, office desktop computer and home computer) or to harness the power of grid and cloud computing. However, current systems do not enable this to be achieved in a simple-to-use and effective manner. As a result, it is difficult to engage in collaborative design, development and review of scientific or technical computing projects. In addition, it is difficult for executives, policy-makers and other users to visualize and use the results of any such collaborative computations.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known distributed computing systems.