1. Field of Art
The disclosure generally relates to the field of cluster computing, and in particular to the mapping of a logical data flow to a directed acyclic graph representative of a cluster computer system.
2. Description of the Related Art
Cluster computing includes multiple self-contained computing devices working sequentially or in parallel as a unified system. The computing devices interact with each other via a network to solve a common problem that can be divided into many smaller, manageable problems. For example, cluster computing systems are currently employed to process big data problem sets relating to healthcare, genomics, or social networking services. Logical data flow processing is well-suited for defining these problems in a cluster computing environment. In logical data flow processing, a computing problem may be represented by a directed acyclic graph having one or more sets of nodes connected by a set of edges. However, because cluster computing systems comprise multiple stand-alone computing devices, the number of computing resources and the configurations among them may vary significantly from cluster to cluster.
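By way of illustration only, the directed acyclic graph representation described above may be sketched as follows. The sketch is not taken from the disclosure: the step names (read_records, filter_records, and so on) and the helper functions execution_order and is_acyclic are hypothetical, and the standard Python graphlib module (Python 3.9 or later) is assumed for ordering the nodes.

```python
from graphlib import TopologicalSorter, CycleError


def execution_order(edges):
    """Return one valid execution order for a data flow.

    `edges` is an iterable of (source, target) pairs, each meaning that
    `source` must complete before `target` may run.
    """
    predecessors = {}
    for source, target in edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)
    # static_order() raises CycleError if the graph contains a cycle.
    return list(TopologicalSorter(predecessors).static_order())


def is_acyclic(edges):
    """Report whether the edges form a directed acyclic graph."""
    try:
        execution_order(edges)
        return True
    except CycleError:
        return False


# Hypothetical logical data flow: two inputs feed a grouping step
# whose result is written out.
flow = [
    ("read_records", "filter_records"),
    ("filter_records", "count_by_key"),
    ("read_lookup", "count_by_key"),
    ("count_by_key", "write_results"),
]

order = execution_order(flow)
print(order)
```

Any order returned places each node after all of the nodes it depends on. Steps with no dependency between them, such as read_records and read_lookup here, may appear in either order, which is one reason a single data flow can admit several execution pathways.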
Cluster computing environments comprise a wide array of computing configurations and a wide array of programming tools. This variety presents a challenge for designers seeking to implement a write-once, repeated-use application that is portable across different cluster computing systems. Even within a single cluster computing system, there may be several pathways for executing logic. Available programming tools are not abstract enough to account for all possible permutations of resources in cluster computing systems. Additionally, programming languages cannot determine structural compatibility or isolate structural components of cluster computing systems. As a result, current programming tools make it difficult to implement identical functionality reliably across different cluster platforms.