1. Field of the Invention
The invention relates to computer systems. Specifically, the invention relates to apparatus, systems, and methods for gathering trace data indicative of resource activity on a computer system.
2. Description of the Related Art
Computer and information technology continues to progress and grow in its capabilities and complexity. In particular, software applications have evolved from single monolithic programs to many hundreds or thousands of object-oriented components that can execute on a single machine or be distributed across many computer systems on a network.
Computer software and its associated data are generally stored in files. Generally, a file is stored in persistent storage such as a Direct Access Storage Device (DASD, i.e., a number of hard drives). Even large database management systems employ some form of files to store the data and potentially the object code for executing the database management system.
Business owners, executives, managers, administrators, and the like concentrate on providing products and/or services in a cost-effective and efficient manner. These business executives recognize the efficiency and advantages software applications can provide. Consequently, business people factor business software applications into long-range planning and policy making to ensure that the business remains competitive in the marketplace.
Instead of concerning themselves with details such as the architecture and files defining a software application, business people are concerned with business processes. Business processes are internal and external services provided by the business. More and more of these business processes are provided at least in part by one or more software applications. One example of a business process is internal communication among employees. Often this business process is implemented largely by an email software application. The email software application may include a plurality of separate executable software components such as clients, a server, a Database Management System (DBMS), and the like.
Generally, business people manage and lead most effectively when they focus on business processes instead of working with confusing and complicated details about how a business process is implemented. Unfortunately, the relationship between a business process policy and its implementation is often undefined, particularly in large corporations. Consequently, the effects of the business policy must be researched and explained so that the burden imposed by the business process policy can be accurately compared against the expected benefit. This may mean that computer systems, files, and services affected by the business policy must be identified.
FIG. 1 illustrates a conventional system 100 for implementing a business process. The business process may be any business process. Examples of business processes that rely heavily on software applications include an automated telephone and/or Internet retail sales system (web storefront), an email system, an inventory control system, an assembly line control system, and the like.
Generally, a business process is simple and clearly defined. Often, however, the business process is implemented using a variety of cooperating software applications comprising various executable files, data files, clients, servers, agents, daemons/services, and the like from a variety of vendors. These software applications are generally distributed across multiple computer platforms.
In the example system 100, an E-commerce website is illustrated with components executing on a client 102, a web server 104, an application server 106, and a DBMS 108. To meet system 100 requirements, developers write a servlet 110 and applet 112 provided by the web server 104, one or more business objects 114 on the application server 106, and one or more database tables 116 in the DBMS 108. These separate software components interact to provide the E-commerce website.
As mentioned above, each software component originates from, or uses, one or more files 118 that store executable object code. Similarly, data files 120 store data used by the software components. The data files 120 may store configuration settings, user data, system data, database rows and columns, or the like.
Together, these files 118, 120 constitute resources required to implement the business process. In addition, resources may include Graphical User Interface (GUI) icons and graphics, static web pages, web services, web servers, general servers, and other resources accessible on other computer systems (networked or independent) using Uniform Resource Locators (URLs) or other addressing methods. Collectively, all of these various resources are required in order to implement all aspects of the business process. As used herein, “resource(s)” refers to all files containing object code or data as well as software modules used by the one or more software applications and components to perform the functions of the business process.
Generally, each of the files 118, 120 is stored on a storage device 122a-c identified by either a physical or virtual device or volume. The files 118, 120 are managed by separate file systems (FS) 124a-c corresponding to each of the platforms 104, 106, 108.
Suppose a business manager wants to implement a business level policy 126 regarding the E-commerce website. The policy 126 may simply state: “Backup the E-commerce site once a week.” Of course, other business level policies may also be implemented with regard to the E-commerce website. For example, a load balancing policy, a software migration policy, a software upgrade policy, and other similar business policies can be defined for the business process at the business process level.
Such business level policies are clear and concise. However, implementing the policies can be very labor intensive, error prone, and difficult. Generally, there are two approaches for implementing the backup policy 126. The first is to back up all the data on each device or volume 122a-c. However, such an approach backs up files unrelated to the particular business process when the device 122a-c is shared among a plurality of business processes. Certain other business policies may require more frequent backups for other files on the volume 122a-c related to other business processes. Consequently, the policies conflict and may result in wasted backup storage space and/or duplicate backup data. In addition, the time required to perform a full copy of the devices 122a-c may interfere with other business processes and unnecessarily prolong the process.
The second approach is to identify which files on the devices 122a-c are used by, affiliated with, or otherwise comprise the business process. Unfortunately, no automatic process exists for determining all of the resources used by the business process, especially for business processes that are distributed across multiple systems. Certain logical rules can be defined to assist in this manual process. However, these rules are often rigid and limited in their ability to accurately identify all the resources. For example, such rules will likely miss references to a file on a remote server by a URL during execution of an infrequent feature of the business process. Alternatively, devices 122a-c may be dedicated to software and data files for a particular process. This approach, however, may result in wasted unused space on the devices 122a-c and may be unworkable in a distributed system.
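The limitation of rule-based identification can be illustrated with a minimal sketch. The path patterns, directory names, user document, and URL below are hypothetical and are chosen only for illustration; only the file names e-commerce.exe and e-comdata1.db are borrowed from the example implementation 128. The sketch assumes the rules are simple wildcard patterns, which is one possible form such rules might take.

```python
from fnmatch import fnmatchcase

# Hypothetical static rules an administrator might define by hand.
RULES = ["/opt/ecommerce/*", "/var/db/ecommerce/*.db"]

# Hypothetical resources the business process actually uses at run time.
USED = [
    "/opt/ecommerce/e-commerce.exe",
    "/var/db/ecommerce/e-comdata1.db",
    "/home/user1/orders.doc",               # user data in an unpredictable location
    "http://remote.example.com/rates.xml",  # remote file referenced by URL
]


def matched_by_rules(resource, rules=RULES):
    """Return True if any wildcard rule matches the resource name."""
    return any(fnmatchcase(resource, pattern) for pattern in rules)


# Resources that the rigid rules fail to identify.
missed = [r for r in USED if not matched_by_rules(r)]
print(missed)
```

In this sketch the rules identify the executable and the database file, but the user document and the URL-referenced remote file fall outside every pattern and are therefore omitted.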
Generally, a computer system administrator must interpret the business level policy 126 and determine which files 118, 120 must be included to implement the policy 126. The administrator may browse the various file systems 124a-c, consult user manuals, search registry databases, and rely on his/her own experience and knowledge to generate a list of the appropriate files 118, 120.
In FIG. 1, one implementation 128 illustrates the results of this manual, labor-intensive, and tedious process. Such a process is very costly due to the time required not only to create the list originally, but also to continually maintain the list as various software components of the business process are upgraded and modified. In addition, the manual process is susceptible to human error. The administrator may unintentionally omit certain files 118, 120.
The implementation 128 includes both object code files 118 (e.g., e-commerce.exe, also referred to as executables) and data files 120 (e.g., e-comdata1.db). However, due to the manual nature of the process and storage space concerns, efforts may be concentrated on the data files 120 and data specific resources. The data files 120 may be further limited to strictly critical data files 120 such as database files. Consequently, other important files, such as executables and user configuration and system-specific setting files, may not be included in the implementation 128. In addition, user data, such as word processing documents, may be missed because the data is stored in an unknown or unpredictable location on the devices 122a-c.
Other solutions for grouping resources used by a business process have limitations. One solution is for each software application that is installed to report to a central repository which resources the application uses. However, this places the burden of tracking and listing the resources on the developers who write and maintain the software applications. Again, the developers may accidentally exclude certain files. In addition, such reporting is generally done only during the installation. Consequently, data files created after that time may be stored in unpredictable locations on a device 122a-c. 
Information regarding activities and computing operations performed by resources of a computer processing system is useful for identifying a business process as well as for other purposes. For example, it is desirable to determine comprehensively what short term and long term (historical) activities have been conducted on a target computer system. In addition, it is desirable to know which resources conducted certain types of resource activities. Unfortunately, conventional systems and methods fail to provide these benefits.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method that gathers trace data indicative of resource activity. Beneficially, such an apparatus, system, and method would gather the trace data automatically and make the trace data accessible to other software processes by way of an Application Programming Interface and/or use of a standard data exchange format for the trace data. Furthermore, the apparatus, system, and method would allow for modular gathering of trace data such that different types of trace data may be gathered without significantly altering the apparatus, system, and/or method.
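The kind of modular gathering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the apparatus of the invention: the class name TraceGatherer, its methods, the sample collectors, the record fields, and the choice of JSON as the data exchange format are all hypothetical. The sketch shows how collectors for different types of trace data might be registered without altering the gatherer itself.

```python
import json
import time
from typing import Callable, Dict, List

# A collector is any callable that returns a list of trace records (dicts).
Collector = Callable[[], List[dict]]


class TraceGatherer:
    """Hypothetical gatherer that pulls records from registered collectors."""

    def __init__(self) -> None:
        self._collectors: Dict[str, Collector] = {}

    def register(self, trace_type: str, collector: Collector) -> None:
        """Add a collector for a new type of trace data."""
        self._collectors[trace_type] = collector

    def gather(self) -> List[dict]:
        """Run every collector and tag each record with its type and a timestamp."""
        records = []
        for trace_type, collect in self._collectors.items():
            for record in collect():
                records.append({"type": trace_type, "time": time.time(), **record})
        return records

    def export_json(self) -> str:
        """Serialize the gathered records in a standard exchange format (JSON here)."""
        return json.dumps(self.gather())


def sample_file_collector() -> List[dict]:
    # Hypothetical stand-in for a module that observes file activity.
    return [{"resource": "e-comdata1.db", "operation": "read"}]


def sample_network_collector() -> List[dict]:
    # Hypothetical stand-in for a module that observes network activity.
    return [{"resource": "http://remote.example.com/rates.xml", "operation": "get"}]


gatherer = TraceGatherer()
gatherer.register("file", sample_file_collector)
gatherer.register("network", sample_network_collector)
trace_json = gatherer.export_json()
print(trace_json)
```

Under these assumptions, supporting a further type of trace data requires only registering another collector, and other software processes can consume the result through export_json without knowledge of how each collector works.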