This invention relates generally to storage systems associated with computer systems, and more particularly to a method and apparatus for monitoring host-initiated events which are related to a particular storage system.
As is known in the art, computer systems generally include a central processing unit, a memory subsystem and a storage subsystem. According to a networked or enterprise model of a computer system, the storage subsystem, whether associated with or in addition to a local computer system, may include a large number of independent storage devices or disks housed in a single enclosure. This array of storage devices is typically connected to several computers (or hosts) via dedicated cabling or via a network. Such a model allows for the centralization of data which is to be shared among many users and also provides a single point of maintenance for the storage functions associated with the many computer systems.
One type of storage system known in the art includes a number of disk storage devices configured as an array (sometimes referred to as RAID). Such a system may include several arrays of storage devices. In addition to the arrays of storage devices, typical storage systems include several types of controllers for controlling the various aspects of the data transfers associated with the storage system. One type of controller is a host controller, which provides the interface between the host computers and the storage system. Another type of controller is a disk controller. There may be one or more disk controllers for each array of storage devices in a storage system. Each disk controller manages the transfer of data to and from its associated array of drives.
In addition to the controllers described above, advanced storage systems, such as the SYMMETRIX® storage systems manufactured by EMC Corporation, may include a very large memory which is coupled to each of the controllers in the system. The memory may be used as a staging area (or cache) for the data transfers between the storage devices and the host computers and may provide a communications path between the various controllers. Such systems provide superior performance to non-cache storage systems. The price paid for this superior performance is complexity. Since there may be many controllers and storage devices in a system, with data moving between several controllers of the system, there is the potential for operational errors which may be difficult to diagnose.
In order to properly diagnose a problem reported by a user of the above-described storage system, it would be helpful for the person performing the diagnosis to be able to examine the events which preceded the problem. However, present storage systems are unable to provide a host user with a history of host-initiated events and the storage system events which result from them. Thus, a specialized diagnostic procedure would typically be needed to pinpoint any system failures.
The specialized diagnostic procedure might require a site visit by service personnel so that they could access information about the storage system from a service computer connected directly to the storage system. The information gathered might be a trace record of the events which occurred within the storage system over a certain period of time. However, in order to be useful, this information would need to be correlated with any events which occurred at the host computer. Unfortunately, host trace information (if any exists at all) is typically routed to a host display or host log file, where it remains separate from the storage system trace.
Until now, there has been no way to interleave and log the host events with their corresponding storage system events. Before the present invention, there would be two separate histories on two different machines. Since the clock resolutions of the host and storage system may be different, the time stamp information associated with each trace file is not easily correlated. Service personnel are thus faced with the formidable task of trying to correlate the information stored in the individual files. It would therefore be advantageous to provide a means for allowing the host computers connected to a storage system to track the history of events initiated by the host along with the resultant events occurring on the storage system.
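By way of illustration only, the correlation difficulty described above may be sketched as follows. The sketch is not part of any described system. The record layout, the event names, the one-second host clock, the millisecond storage clock, and the helper names merge_by_timestamp and order_is_uncertain are all assumptions chosen for the example. It further assumes that the host truncates its time stamps, so that a host event stamped t may have occurred anywhere in the interval from t up to, but not including, t plus the host clock resolution.

```python
# Hypothetical trace records: (timestamp_in_seconds, source, event).
# Assumption: the host clock ticks once per second, while the
# storage system clock ticks once per millisecond.
host_trace = [
    (100.0, "host", "write request issued"),
    (101.0, "host", "read request issued"),
]
storage_trace = [
    (100.250, "storage", "cache slot allocated"),
    (100.900, "storage", "destage to disk started"),
    (101.100, "storage", "cache hit"),
]


def merge_by_timestamp(a, b):
    """Naively interleave two traces using their time stamps alone."""
    return sorted(a + b, key=lambda record: record[0])


def order_is_uncertain(host_ts, storage_ts, host_resolution=1.0):
    """Return True when the storage time stamp falls inside the window
    in which the (truncated) host event may actually have occurred, so
    the true order of the two events cannot be determined."""
    return host_ts <= storage_ts < host_ts + host_resolution


merged = merge_by_timestamp(host_trace, storage_trace)

# Collect every host/storage pair whose relative order is ambiguous.
uncertain = [
    (h[2], s[2])
    for h in host_trace
    for s in storage_trace
    if order_is_uncertain(h[0], s[0])
]

for record in merged:
    print(record)
print("pairs with uncertain order:", len(uncertain))
```

Under these assumptions, the merged listing places each host event ahead of the storage events stamped within the same second, yet every one of those orderings is only a guess. This is the ambiguity that a single, interleaved log of host and storage system events would avoid.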