1. Technical Field
The present invention relates in general to the field of distributed computer networks and in particular to a method and apparatus for remote maintenance and error recovery within distributed data processing networks. Still more particularly, the present invention relates to a method and apparatus which permits network stations to automatically invoke a selected application or procedure in response to an error message or reconfiguration message from the network.
2. Description of the Related Art
Distributed data processing networks are well known in the art. In fact, in selected applications, networks of terminals or so-called "personal" computers are rapidly supplanting large concentrated data processing systems due to the ease of updating, modifying and relocating such systems. Examples of the types of networks in use today include bus topologies, ring topologies, star topologies, and tree topologies. Each topology interconnects the stations within the network in a distinctive manner.
Additionally, many different protocols exist for formatting data which is exchanged between stations within a network. For example, token ring networks allow unidirectional data transmission between data stations by a token passing procedure over one transmission medium so that transmitted data returns to the transmitting station. IEEE Standard 802.5 was approved for token ring networks in December of 1984.
Many other network protocols exist, including Advanced Peer-to-Peer Networking (APPN), Ethernet networks (as set forth in IEEE Standard 802.3), Token Bus networks (IEEE Standard 802.4), and others. Many of these networks are so-called "self-healing" networks in that the network system possesses the ability to reconfigure the network in response to an error condition. For example, within token ring networks, one station, referred to as the "active monitor," provides token monitoring and other functions. The active monitor station resolves lost tokens, frames which circle the ring more than once, clocking, and the presence of other active monitors on the ring. A timer within the active monitor station is typically utilized to detect a lost token by measuring the time required for the largest possible frame of data to circle the ring. If this time is exceeded, the active monitor assumes the token was lost, purges the ring and initiates a new token.
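The lost-token timer described above may be sketched as follows. This is a minimal illustration only, not an implementation of IEEE Standard 802.5: the class name `ActiveMonitor`, the field `max_frame_transit`, and the methods `token_seen` and `tick` are hypothetical names introduced for this example, and time is modeled as a plain number supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class ActiveMonitor:
    """Hypothetical model of an active monitor's lost-token timer."""

    # Assumed time for the largest possible frame to circle the ring.
    max_frame_transit: float
    # Time at which a token was last observed passing the monitor.
    last_token_seen: float = 0.0
    # Number of ring purges performed so far.
    purges: int = 0

    def token_seen(self, now: float) -> None:
        """Record that a token passed the monitor at time `now`."""
        self.last_token_seen = now

    def tick(self, now: float) -> bool:
        """Check the timer; return True if the ring was purged.

        If no token has been seen for longer than the largest frame
        needs to circle the ring, the token is assumed lost: the ring
        is purged and a new token is initiated.
        """
        if now - self.last_token_seen > self.max_frame_transit:
            self.purges += 1
            self.last_token_seen = now  # the new token starts circulating
            return True
        return False
```

In this sketch the timer fires only when the elapsed time strictly exceeds the limit, which mirrors the statement that the token is assumed lost "if this time is exceeded."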
Certain errors within such networks are often referred to as "hard" errors. Hard errors are permanent faults, usually in equipment, which cause a ring to cease operation within normal architecture protocols. A ring station which is downstream from a hard error will recognize such an error at its receiver input and begin transmitting beacon frames, paced at predetermined intervals, with an all-stations address. Beaconing continues until the fault is corrected. Recovery generally occurs by having each station perform a test in response to an error signal and then reattach itself to the ring only if the test is successful.
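The test-and-reattach step described above may be sketched as follows. This is a simplified illustration under stated assumptions: the function `recover_ring` and the representation of each station as a name paired with a self-test callable are hypothetical, and beacon framing and timing are not modeled.

```python
from typing import Callable, Dict, List


def recover_ring(stations: Dict[str, Callable[[], bool]]) -> List[str]:
    """Return the names of the stations that reattach to the ring.

    `stations` maps a (hypothetical) station name to its self-test.
    In response to the error signal each station runs its test and
    rejoins the ring only if the test succeeds, so a faulty station
    is left off the reconfigured ring.
    """
    return [name for name, self_test in stations.items() if self_test()]


# Example: station "B" fails its self-test and does not reattach.
ring = {"A": lambda: True, "B": lambda: False, "C": lambda: True}
reattached = recover_ring(ring)
```

Under these assumptions the reconfigured ring contains only stations "A" and "C", which illustrates how an error signal can remove the station that caused the fault.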
Thus, an error signal may be utilized to automatically reconfigure a ring network by removing a station which has caused the fault condition. Additionally, a station may be removed by utilizing an explicit command, such as "Auto Removal" which is utilized in an IBM Token-Ring Network to remove a device, such as an attached personal computer, from the data passing activity without human intervention by means of a token-ring adapter card. This technique, as well as other low level commands, is accomplished utilizing the so-called "Logical Link Control" (LLC) standard which is defined by IEEE Standard 802.2.
While the methods described above permit such networks to practice "self-healing," the end result may not be acceptable. For example, certain errors may result in a reconfiguration of the network which deletes a "bridge" or "gateway" station that is necessary for connectivity within the larger network. In such situations the bridge or gateway station must then be manually reset to reestablish communication between the two rings or between a ring and a host system. Additionally, certain network management facilities may suspend a station, requiring that station to undergo a manual Initial Program Load (IPL) to return the station to full communication.
Thus, it should be apparent that a need exists for a method and system which will permit remote maintenance and error recovery in distributed data processing networks.