The present invention relates to a system and method for rapidly identifying the source regions of errors that occur in a storage area network (SAN). Isolating such faults or errors presents a challenge to network administration, particularly in networks that may include hundreds or even thousands of devices and may have extremely long links (up to 10 kilometers) between devices.
In systems currently in use, when a link in a network fails or a device causes an error, the conventional approach is to try to reproduce the event, such as a read or write command, that caused the error. Isolating fault regions in this way involves a substantial amount of trial and error, which is very expensive in time and resources, especially when a large number of components is involved.
As SANs grow larger and their links grow longer, especially with the use of very long fibre optic cables, it becomes more urgent to develop a fast and deterministic method and system, so that isolating errors in these larger systems does not become prohibitively expensive or time-consuming.
It is particularly desirable to provide such a system that scales efficiently as a network increases in size, preferably with minimal alteration to the fault isolation system or to the network itself.