Businesses in today's global economy are increasingly called upon to implement relatively complex processing systems in order to efficiently and accurately manage increasing amounts of data and information, from both internal and external sources, using constantly evolving information technology (IT) infrastructure. Some IT systems become so complex that it is difficult, at best, to determine how, or how well, business processes are being implemented. As a result, it is similarly difficult to determine whether such IT systems are properly aligned with the needs of the business. These problems are exacerbated where heterogeneous systems (i.e., systems from different vendors and/or not designed to operate together) are linked together, e.g., human resource information systems communicating with separate accounting/payroll systems.
An example of this is shown in FIG. 1, which illustrates a fairly typical system 100. In particular, the system 100 comprises a plurality of remote users 102, often using client software, communicating with a typical business IT system 103 via a network 104. As shown, the IT system 103 comprises a proxy server 108 sitting behind a first firewall 106, and a web server 112 sitting behind a second firewall 110. Various application servers 116, 126, sitting behind yet another firewall 114, communicate with proprietary databases 128, 130, or with various legacy systems 120-124 through appropriate interface software 118. Given that each of the computing devices illustrated in FIG. 1 (i.e., user 102 devices; firewalls 106, 110, 114; servers 108, 112, 116, 126; databases 128, 130; interface software 118; legacy systems 120-124) may comprise one or more software applications involved in processing data within the IT system 103, it becomes remarkably complex to determine exactly how any given piece of data is processed, much less whether such processing is being carried out in an optimal manner. Although the examples described hereinabove have been restricted to business IT systems, those of skill in the art will appreciate that the problem of IT system complexity, and the attendant difficulty of analyzing such systems, is not restricted to the domain of business, and in fact may be found in a variety of entities/organizations.
Prior art techniques have failed to adequately address the need to develop an understanding of deployed (i.e., installed and operational) processes, sometimes referred to as “process discovery”, particularly in any sort of automated fashion. A technique commonly employed at present is to manually reverse engineer each component of a business process, particularly those components that are implemented using software applications. Where software is used, this may require analysts to review source code, if available, or to reconstruct such source code in order to understand the particular functions implemented by the software application. Not surprisingly, this is a time-consuming and expensive process that is prone to error.
So-called Application Response Measurement (ARM) techniques have been developed that allow analysts to measure the performance (i.e., response time) of deployed software applications. Using these techniques and corresponding suites of tools, analysts are able to determine how quickly data is processed, but are unable to develop any understanding of how the software under test is actually implemented, i.e., the internal configuration of the software. As a result, it may be difficult, if not impossible, to determine whether the process under consideration is sub-optimal in any fashion.
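By way of illustration only, the black-box character of response-time measurement can be sketched as follows. This is a minimal sketch and does not use the actual ARM programming interface; the names measure_response_time and process_record are hypothetical and are assumed purely for the example. The wrapper records how long each call takes but observes nothing about what happens inside the measured function.

```python
import time
from functools import wraps


def measure_response_time(func):
    """Hypothetical wrapper: records elapsed time per call, nothing else."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # Only the duration is captured; the internal steps of func stay opaque.
        wrapper.timings.append(time.perf_counter() - start)
        return result

    wrapper.timings = []
    return wrapper


@measure_response_time
def process_record(record):
    # Stand-in for a deployed application (assumed for illustration).
    return record.upper()


result = process_record("invoice-17")
print(f"calls measured: {len(process_record.timings)}")
```

The recorded timings indicate how quickly a record was processed, but reveal nothing about which internal tasks were performed or in what order.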
More recently, researchers at Eindhoven University of Technology have developed techniques for so-called “process mining” in order to develop models of existing processes. In particular, process logs are developed by obtaining couplets consisting of “case identifications” (i.e., identifications of particular data elements being processed) and corresponding “task identifications” (i.e., identifications of particular portions of the overall process operating upon a given data element) reported by a process. By analyzing such process logs, sequences of tasks that have been purposefully instrumented (i.e., modified to report the desired couplet information) can be identified. However, the value of this technique necessarily depends on the ability of the test designers to correctly identify the appropriate tasks for instrumentation. As currently understood, this technique does not appear to have the capability to discover parent/child processes that have not already been identified during the instrumentation phase.
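By way of illustration only, the analysis of such a process log can be sketched as follows. This is a minimal sketch under stated assumptions, not the researchers' actual algorithm: the log is assumed to be a time-ordered list of (case identification, task identification) couplets, and the function names build_traces and directly_follows, as well as the sample data, are hypothetical. The sketch groups couplets by case and counts how often one task is immediately followed by another.

```python
from collections import Counter, defaultdict


def build_traces(log):
    """Group time-ordered (case_id, task_id) couplets into per-case task sequences."""
    traces = defaultdict(list)
    for case_id, task_id in log:
        traces[case_id].append(task_id)
    return dict(traces)


def directly_follows(traces):
    """Count how often task a is immediately followed by task b within a case."""
    counts = Counter()
    for tasks in traces.values():
        for a, b in zip(tasks, tasks[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical log reported by instrumented tasks (assumed sample data).
log = [
    ("order-1", "receive"),
    ("order-2", "receive"),
    ("order-1", "approve"),
    ("order-2", "reject"),
    ("order-1", "ship"),
]

traces = build_traces(log)
follows = directly_follows(traces)
```

Note that only tasks that report couplets appear in the resulting sequences; any task that was not instrumented is absent from the log and therefore cannot be recovered by this kind of analysis.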
Thus, a need exists for techniques that allow for the analysis of process flows, preferably in an automated manner, that overcome the limitations of prior art techniques.