The present invention relates generally to computer systems, and more specifically, to real-time analytics of machine-generated instrumentation data.
System Management Facility (SMF) is a component of IBM's z/OS® for mainframe computers that provides a standardized method for writing out records of activity to a file, or dataset. SMF provides instrumentation of baseline activities running on an IBM mainframe operating system, including activities such as input/output (I/O), network activity, software usage, error conditions, and processor utilization. SMF forms the basis for many monitoring and automation utilities.
Apache Spark™ is an open source computing framework that provides programmers with an application programming interface (API) centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items that is distributed over a cluster of machines and maintained in a fault-tolerant way. The availability of RDDs facilitates the implementation of both iterative algorithms, which visit their dataset multiple times in a loop, and interactive or exploratory data analysis (i.e., the repeated database-style querying of data). A deployment of Apache Spark requires a cluster manager and a distributed storage system.
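By way of illustration only, the properties of an RDD described above (read-only, partitioned, and rebuildable after a loss) may be sketched in a single process as follows. The class ToyRDD, its methods, and the sample readings are hypothetical names introduced for this sketch. They are not part of the Apache Spark API, and the sketch does not distribute data over a cluster. It assumes that fault tolerance is obtained by recomputing a lost partition from the recorded transformation that produced it.

```python
from typing import Callable


class ToyRDD:
    """Hypothetical single-process stand-in for an RDD (not Spark's API)."""

    def __init__(self, compute: Callable[[int], tuple], num_partitions: int):
        self._compute = compute   # lineage: how to rebuild partition i
        self._n = num_partitions
        self._cache = {}          # partition index -> materialized tuple

    @classmethod
    def parallelize(cls, items, num_partitions=2):
        data = tuple(items)       # tuples keep the source read-only
        # Round-robin split of the items into partitions.
        return cls(lambda i: data[i::num_partitions], num_partitions)

    def partition(self, i):
        # Materialize on first use; rebuild from lineage if it was lost.
        if i not in self._cache:
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def map(self, fn):
        # A transformation returns a NEW dataset; the parent is unchanged.
        return ToyRDD(lambda i: tuple(fn(x) for x in self.partition(i)), self._n)

    def filter(self, pred):
        return ToyRDD(
            lambda i: tuple(x for x in self.partition(i) if pred(x)), self._n
        )

    def drop_partition(self, i):
        """Simulate losing a partition, e.g. through a failed worker."""
        self._cache.pop(i, None)

    def collect(self):
        return [x for i in range(self._n) for x in self.partition(i)]


# Hypothetical sample data standing in for utilization readings.
readings = ToyRDD.parallelize([12, 85, 40, 97, 63, 5], num_partitions=3)

# Repeated querying: the same dataset is visited once per threshold.
counts = []
for threshold in (0, 50, 90):
    matching = readings.filter(lambda v, t=threshold: v > t)
    counts.append(len(matching.collect()))

# Chained transformations, then a simulated partition loss and recovery.
scaled = readings.filter(lambda v: v > 50).map(lambda v: v * 2)
before = sorted(scaled.collect())
scaled.drop_partition(0)
after = sorted(scaled.collect())  # partition 0 is recomputed from lineage

print(counts, before, after == before)
```

In this sketch each transformation returns a new dataset rather than modifying its parent, which is why the same readings can be visited repeatedly in the loop without being copied or altered.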