The amount of data that has to be processed to manage the operations of many enterprises, such as large-scale e-retail enterprises, has grown tremendously in recent years, especially as the customer bases of the enterprises have expanded to include millions of users spread across numerous countries and time zones, and as the product catalogs of the enterprises have expanded to include millions of objects. For some enterprises, for example, hundreds of gigabytes of operational data such as changes to inventory, pricing, sales results, supply chain status, expected demands, marketing campaigns, and the like may be generated and analyzed every day.
Software applications generated to automate various aspects of business operations sometimes represent business processes as workflows of jobs, with dependencies defined between various jobs. A given type of business process or application may, for example, be represented as a graph whose nodes represent respective jobs or tasks, and whose edges represent logical or data dependencies among the jobs or tasks. For example, warehousing-related applications of a large-scale retailer may involve the scheduling and execution of hundreds or thousands of workflow instances each day, with individual ones of the workflow instances comprising dozens or hundreds of jobs corresponding to respective workflow definitions or graphs. The jobs, as well as input data sets generated by diverse sources in different formats, and output data sets which may have varying formats depending on their destinations, may sometimes be stored as objects within databases or other types of storage repositories.
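By way of a non-limiting illustration, the graph representation described above may be sketched as follows. The job names, the `workflow` mapping, and the helper functions `execution_order` and `runnable_jobs` are hypothetical and are introduced here only for illustration; they are not drawn from any particular embodiment. The sketch assumes that a workflow definition is held in memory as a mapping from each job to the set of jobs on which it depends, and it uses the `graphlib` module of the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow definition: each key is a job (a graph node) and
# each value is the set of jobs it depends on (the incoming edges).
workflow = {
    "load_inventory": set(),
    "load_prices": set(),
    "forecast_demand": {"load_inventory"},
    "plan_restock": {"forecast_demand", "load_prices"},
}


def execution_order(deps):
    """Return one valid order in which the jobs may run, dependencies first."""
    return list(TopologicalSorter(deps).static_order())


def runnable_jobs(deps, completed):
    """Return the not-yet-completed jobs whose dependencies have all completed."""
    return sorted(
        job
        for job, needs in deps.items()
        if job not in completed and needs <= completed
    )


order = execution_order(workflow)
ready = runnable_jobs(workflow, {"load_inventory"})
```

In this sketch, `order` is one of possibly several valid orderings, and `ready` lists the jobs that may be scheduled once `load_inventory` has completed. A production system would typically persist jobs and their dependencies in a database or other storage repository rather than in an in-memory mapping.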
Many of the costs associated with data processing have fallen substantially in recent years, and such cost reductions may allow larger amounts of data associated with complex business workflows to be stored and analyzed relatively cheaply, e.g., using cloud-based computing and storage services. Nevertheless, in at least some large-scale workflow management systems, generating efficient responses to job status queries may represent a non-trivial technical challenge, due at least in part to the complexity of inter-job dependencies. In addition, the complexity and time-dependent nature of the dependencies may sometimes result in hard-to-diagnose errors in implementing the workflows.
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to. When used in the claims, the term “or” is used as an inclusive or and not as an exclusive or. For example, the phrase “at least one of x, y, or z” means any one of x, y, and z, as well as any combination thereof.