1. Field of the Invention
The present invention relates generally to network communications, and more particularly to automating a process for monitoring data processing jobs that are part of a Service Level Agreement (SLA) to ensure prompt completion of SLA job streams.
2. Related Art
Business enterprises using extensive data processing platforms commonly employ Production Operations personnel to ensure the successful initiation and execution of data processing jobs. The data processing jobs are computer programs that may run on mainframe or midrange computers and perform business-critical tasks, such as billing and order entry. Jobs may be on-line jobs that run in real time, or they may be batch jobs that run according to a schedule or other dependency. Often a business process is performed by a job stream, in which many jobs depend on the successful completion of previous jobs. A job stream is time-sensitive and must run according to a strict schedule.
Service Level Agreements (SLAs) are contracts used between Production Operations and the client organizations that are responsible for each application. SLAs specify certain parameters of job performance and execution, such as start/end times, to ensure that job streams are executed successfully. It is critical that Production Operations closely monitor job execution to ensure that the critical path of each job stream is executed according to its SLA. If a job is late in starting, slow to execute, or abnormally terminates, Production Operations must detect the condition and analyze its impact so that downstream dependencies are accounted for and the impact is minimized.
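By way of illustration only, the comparison of a job against its SLA parameters might be sketched as follows. The names `SlaSpec`, `JobRun`, and `check_sla`, the field names, and the job name `BILLING01` are hypothetical and are not drawn from any particular scheduler; the sketch assumes an SLA that specifies only a latest start time and a latest end time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class SlaSpec:
    # Hypothetical SLA parameters; the text names only start/end times.
    job_name: str
    latest_start: datetime
    latest_end: datetime


@dataclass
class JobRun:
    # Hypothetical snapshot of a job's observed state.
    job_name: str
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None
    abended: bool = False


def check_sla(spec: SlaSpec, run: JobRun, now: datetime) -> List[str]:
    """Return the SLA conditions that the run violates as of `now`."""
    violations = []
    if run.abended:
        violations.append("abnormal termination")
    # Late start: started after the deadline, or not yet started
    # although the deadline has already passed.
    start_ref = run.started_at if run.started_at is not None else now
    if start_ref > spec.latest_start:
        violations.append("late start")
    # Late end: ended after the deadline, or not yet ended
    # although the deadline has already passed.
    end_ref = run.ended_at if run.ended_at is not None else now
    if end_ref > spec.latest_end:
        violations.append("late end")
    return violations


spec = SlaSpec("BILLING01", datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 4, 0))
run = JobRun("BILLING01", started_at=datetime(2024, 1, 1, 2, 15))
print(check_sla(spec, run, now=datetime(2024, 1, 1, 3, 0)))  # ['late start']
```

In this sketch a job that started fifteen minutes past its latest start time is flagged while it is still running, which is the point at which any effect on downstream jobs can still be analyzed.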
Currently, the process of monitoring job execution, comparing it against SLAs, and managing downstream jobs in a critical path is a labor-intensive manual process. Production Operations personnel must go through paper logs of SLA specifications and manually compare them with job performance data. By the time the Production Operations personnel notice that a job has abnormally terminated (ABENDed) or started late, the impact on downstream jobs may have already made the SLA impossible to meet. If a job abnormally terminates and must be restarted, the impact on downstream jobs must be quickly analyzed so that appropriate actions, such as starting a job earlier than scheduled, may be taken. What is needed is an automated process that monitors job execution, compares each job against its SLA, and manages the downstream jobs related to the SLA accordingly.