1. Field of the Invention
The present invention relates generally to data transmission, and more particularly to a packet lockstep mechanism for ensuring reliable data packet throughput in a redundant system.
2. Description of the Prior Art
With early networked storage systems, files are made available to the network by attaching storage devices to a server, which is sometimes referred to as Direct Attached Storage (DAS). In such a configuration, the server controls and “owns” all of the data on its attached storage devices. A shortcoming of a DAS system is that when the server is off-line or not functioning properly, its storage capability and its associated files are unavailable.
At least the aforementioned shortcoming in DAS systems led to Network Attached Storage (NAS) technology and associated systems, in which the storage devices and their associated NAS server are configured on the “front-end” network between an end user and the DAS servers. Thus, the storage availability is independent of a particular DAS server availability and the storage is available whenever the network is on-line and functioning properly. A NAS system typically shares the Local Area Network (LAN) bandwidth, therefore a disadvantage of a NAS system is the increased network traffic and potential bottlenecks surrounding the NAS server and storage devices.
At least the aforementioned shortcoming in NAS systems led to Storage Area Networking (SAN) technology and associated systems. In SAN systems, storage devices are typically connected to the DAS servers through a separate “back-end” network switch fabric (i.e., the combination of switching hardware and software that control the switching paths).
FIG. 1 shows a block diagram of a Storage Area Network 10 of the prior art connected to a client 11 through a wide area network (WAN) such as the Internet 12, or a local area network (LAN) such as might be implemented within an enterprise. The SAN 10 includes an IP router 14, an IP switch 16, a plurality of servers 18, 20, 22, and different storage media represented as Redundant Arrays of Inexpensive Disks (RAID) 26, Just a Bunch of Disks (JBOD) 28, 30, and tape back-up 32, connected to the separate “back-end” network switch fabric 34 described above.
The deployment of prior SAN technologies in the growing enterprise-class computing and storage environment has created several challenges. One such challenge is to provide a scalable system in which thousands of storage devices can be interconnected. One solution has been to cascade together a multitude (tens to hundreds) of small SAN switches, however, a packet switched through such a system typically must make numerous hops before reaching a destination port. Performance (e.g., latency and bandwidth) and reliability decline in such systems. Additionally, systems including hundreds of interconnected switches are inherently difficult to manage and to diagnose for faults, both from hardware and software perspectives. Further still, since no SAN protocol is truly ubiquitous enough to be readily integrated with other networking architectures in a heterogeneous SAN environment, bridges and conversion equipment are required, increasing the costs to build and maintain such a system.
Another such challenge is to provide a system that is fault tolerant. A fault tolerant system is one that is capable of continuous service despite a component fault or failure, and additionally capable of continuous service through any subsequent repair. To this end, a hardware fault is typically detected and circumvented by switching to a redundant component.
A common redundancy scheme in SAN networking is to use two physical connections or ports with different addresses to feed two separate logical paths. In such a system, either of the redundant components may be active at a time, or a load sharing and balancing technique may be implemented. A popular method for monitoring redundant components is a “heartbeat” scheme in which a first processor, card, or other component systematically communicates with a second redundant processor, card, or component in order to verify the second component's operational state, or heartbeat. Lack of a reply from the second component is taken as an indication that the second component is faulty.
Some fault tolerant systems operate the redundant components in lockstep, meaning that the redundant components concurrently perform identical operations. Thus, in the event of a component failure, an application can continue to execute by using the output from the corresponding redundant component, and the failure will therefore be transparent to an end user. Lockstepping can therefore provide zero loss of data integrity upon a single component fault or failure.
In addition to lockstepping at the component level, lockstepping is sometimes also utilized at the processor level and at the circuit board level to provide a level of fault tolerance to entire system architectures. One lockstepping method for attaining fault tolerance at the processor level includes employing a fault tolerant core that is absolutely trusted. The core is commonly a third processor used to verify the integrity of duplicate systems by checking for errors. Each of the processing sets included in such a core typically are configured with a processor module, caches, RAM for that processing set, and PCI buses.
The processing sets are driven by synchronized clocks to execute both sets in lockstep. Since the clocks are locked, each transistor in a correctly functioning processor set will perform the same transaction as its sibling transistor on the sibling processor set, on a nanosecond by nanosecond basis. A transaction is taken from each processing set by a PCI bridge and the two transactions are compared. As described above with respect to lockstepped components, processors operating in lockstep have their transactions compared at each and every clock cycle. If the transactions are identical when compared, the transaction is passed to the physical PCI bus. However, if the transactions are not identical an exception handler typically identifies the faulty processing set.
Other lockstepping schemes compare processor transactions less frequently than at each clock cycle, for example, at the memory transaction level, at the bus level, and/or at the input/output level. Such schemes can be unpredictable as a result of component or card unpredictability. For example, small variations in the temperatures of identical components in identical processors running in parallel can trigger processor interrupts differently, causing the parallel processors to fall out of lockstep.
A functional lockstep arrangement for redundant processors, described in U.S. Pat. No. 5,226,152 to Klug et al., compares processor transactions at each write operation. Klug et al. teach that asynchronous inputs to redundant processors will generally fail in a clock lockstep mode. According to Klug et al., an asynchronous input signal can change during a sampling time, and there is a certain probability that the change will only be seen by one of the two processors, thus a comparison between the two processors would show a discrepancy, or failure, even though both processors were functioning properly.
Klug et al. further teach that when an input signal properly causes a processor interrupt to occur, it is possible for one processor to respond to the interrupt and begin execution of an interrupt service routine even though a parallel processor will not see the same signal until the next clock cycle. Hence, the two processors will fall out of clock lockstep, again, in the absence of a hardware fault.
In view of the preceding scenarios, it will be appreciated that the processor lockstepping schemes can fall out of lockstep even though all of the hardware is functioning correctly. Circuit board lockstepping schemes are similarly problematic. For example, unpredictable minute variations between circuit boards operating in parallel make it difficult to maintain a lockstep relationship between them.
In light of the aforementioned challenges with respect to implementing fault tolerant systems, a lockstep circuit for packet processing is required that will provide error checking and redundancy without generating faults in response to asynchronous data.