In storage area networks, routers process commands between hosts and target devices. Typically, the hosts are connected to the router by a first data transport medium (e.g., a Fibre Channel transport medium) and the target devices are connected to the router by a second data transport medium (e.g., a SCSI transport medium). Because the manner in which information is transported changes at the router, it is difficult to understand the flow of information through the router. Because that flow is not well understood, it is often difficult to assess the cause of an error or other issue, particularly when the issue manifests itself only under heavy loads with data flowing to multiple hosts and devices at a time.
In prior systems, insufficient information is typically maintained to track a command through a router. To determine the source of an error associated with a command, the conditions under which the error occurred (including hardware, software and load conditions) must be replicated. The router must then be monitored to determine the cause of the error, assuming the error recurs under the replicated conditions. Replicating the conditions under which an error occurred can be time-consuming and costly.
To the extent that information that can aid in diagnosing errors or other issues associated with commands is maintained, it is typically kept in a chronological log that records each status message as it occurs. When the log becomes too long, the oldest entries are overwritten. In attempting to resolve an issue, a system administrator must typically print out the log to determine when errors occurred. However, it is often difficult to associate status messages in such a log with particular commands, or to determine whether an error results from issues at the host side or the target side of the router.
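The limitation of such a chronological log can be illustrated with a minimal sketch. It assumes a fixed-capacity log that stores only a sequence number and a message. The class name `ChronologicalLog`, its methods, and the message strings are hypothetical and chosen for illustration only; they do not describe any particular router's firmware.

```python
from collections import deque


class ChronologicalLog:
    """Hypothetical fixed-capacity status log (illustrative only).

    Entries are stored strictly in arrival order. Once the log is full,
    each new entry causes the oldest one to be discarded.
    """

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest item on overflow.
        self._entries = deque(maxlen=capacity)
        self._seq = 0

    def record(self, message):
        # Only a sequence number and free text are kept; nothing ties
        # the message to the command that produced it.
        self._seq += 1
        self._entries.append((self._seq, message))

    def dump(self):
        # Return the surviving entries, oldest first.
        return list(self._entries)


log = ChronologicalLog(capacity=4)
for msg in [
    "host-side: command received",
    "target-side: command dispatched",
    "host-side: command received",
    "target-side: timeout",
    "host-side: error reported",
]:
    log.record(msg)

entries = log.dump()
for seq, msg in entries:
    print(seq, msg)
```

In this sketch the first status message has already been overwritten by the time the error is reported, and no surviving entry indicates which of the two received commands the timeout belongs to.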