1. Field of the Invention
The invention pertains to distributed computer networks, particularly to enhancing the reliability and functionality of network management functions in such networks by providing an inter-process message transmission method which guarantees that a message originated by a node will be received by at least one other node.
2. Description of the Prior Art
The invention is embodied in an EFTPOS (Electronic Funds Transfer/Point of Sale) system such as the one described in U.S. Pat. No. 4,879,716, "Resilient Data Communications System", issued Nov. 7, 1989 to McNally et al. (hereinafter, "the McNally patent").
A large number of point-of-sale (POS) terminals are distributed over a very large geographical area, perhaps on the order of an entire continent. A communications network is provided which transports data over the entire geographical area, and all the POS terminals are connected to it, through telephone lines and intelligent line concentrators (called network access controllers or "NACs"). Also connected to the communications network are computers operated by financial institutions.
The POS terminals are typically placed into service by merchants, who then accept transactions from consumers who carry plastic credit cards or debit cards which bear in machine-readable form an identification of a financial institution which maintains an account for the consumer, and an identification of that account. The primary function of the system is to forward from the POS terminals to the financial institution computers information identifying a consumer's account and a transaction the consumer wishes to make in that account, and to return from the financial institution to the POS terminal either an acceptance or rejection of that transaction.
A merchant wishing to place a POS terminal into service typically obtains the necessary equipment (the terminals and associated modems, etc.) from a "service provider" organization. Such an organization might have no role in the EFTPOS system beyond that of providing equipment; alternatively, larger merchants and financial institutions might themselves function as service providers, in which case the service-provider role is kept separate from the organization's role as merchant or financial institution.
In addition to the line concentrators for POS terminals and the computers of financial institutions connected to the communications network as described above, two other classes of equipment are connected to it which are ancillary to the system's aforementioned primary function: network management systems (NMSs), and management workstations (WSs). (WSs are not specifically discussed in the McNally patent, but are at the heart of Subscriber Access Facilities (SAFs) 12 and are attached to NMSs 14 to provide an interface between operators and NMSs.)
NMSs are responsible for overall control and monitoring of the EFTPOS system; WSs are used by the network provider organization and service provider organizations to control and monitor particular equipment and communication paths for which they are responsible. As described in the McNally patent, the NACs can be dynamically reconfigured and can report their present status; operators and administrators at the WSs may enter commands to reconfigure the systems or commands requesting information on the current status of the systems. Commands originating at a WS are passed to an NMS for verification that the action or information requested is within the purview of the requesting organization, and are acted upon by the NMS following that verification.
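The verification step described above can be illustrated with a minimal sketch. It assumes a simple table mapping each organization to the equipment it is responsible for; the `Command` fields, the `PURVIEW` table, and the organization and equipment names are all hypothetical and are not taken from the McNally patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    organization: str  # requesting organization (hypothetical field)
    target: str        # equipment identifier, e.g. a NAC (hypothetical field)
    action: str        # e.g. "reconfigure" or "status" (hypothetical values)


# Hypothetical table: the equipment each organization is responsible for.
PURVIEW = {
    "network-provider": {"NAC-1", "NAC-2", "NAC-3"},
    "service-provider-A": {"NAC-1"},
}


def verify_and_execute(cmd: Command) -> str:
    """Act on a command only if its target lies within the requester's purview."""
    if cmd.target not in PURVIEW.get(cmd.organization, set()):
        return "rejected"
    return f"accepted: {cmd.action} {cmd.target}"
```

In this sketch a command from an unknown organization, or one aimed at equipment outside the organization's set, is rejected before any action is taken, which mirrors the order of operations described: verification first, action second.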
The WSs and NMSs have software running in them to effect the entry of such commands and the responses to them. Each particular type of command typically invokes a particular path through the software, causing the execution of the code provided to perform the particular functions required for that command. A software entity dedicated to a discrete function is known in the software arts as a "process".
WSs and NMSs are distributed throughout the geographical area served by the system. The NMS in a particular region of the geographical area generally exercises direct control and monitoring of the POS terminals and NACs in that particular region. A request pertaining to such a terminal or NAC and originating from a process in a WS or NMS in a different region must be forwarded over the communications network to a process in the NMS having cognizance of the target NAC, and a response must be forwarded back to the requesting process.
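The forwarding just described can be sketched as a lookup of the NMS having cognizance of the target NAC. The `NAC_TO_NMS` table and the NMS and NAC names below are hypothetical, and the sketch models only which NMSs a request and its response pass through, not the underlying network transport.

```python
from typing import Dict, List

# Hypothetical assignment of NACs to the NMS having cognizance of them.
NAC_TO_NMS: Dict[str, str] = {
    "NAC-east-1": "NMS-east",
    "NAC-west-1": "NMS-west",
}


def request_path(origin_nms: str, target_nac: str) -> List[str]:
    """Return the NMSs a request visits, ending at the one managing the NAC."""
    owner = NAC_TO_NMS[target_nac]
    if owner == origin_nms:
        return [origin_nms]          # handled locally, no inter-NMS transfer
    return [origin_nms, owner]       # forwarded over the communications network


def response_path(path: List[str]) -> List[str]:
    """The response travels back to the requester along the reverse path."""
    return path[::-1]
```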
To enhance reliability of the EFTPOS system, provision exists to reconfigure NMSs, and the allocation of other system equipment to particular NMSs, in the event that one of the NMSs fails. The total number of NMSs is a function of total network size, within the constraints of system considerations. It would be operationally convenient to employ a small number of NMSs, each managing as many entities as possible, because this minimizes the number of inter-NMS message transfers; but each such NMS represents a large unit of potential failure. To guard against this, a larger number of NMSs, each managing fewer entities, might be employed, even though this is operationally inconvenient because it increases the number of inter-NMS message transfers required.
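The trade-off can be made concrete with a rough calculation. The sketch below rests on two simplifying assumptions that are not stated in the source: managed entities are divided evenly among the NMSs, and a request is equally likely to target an entity managed by any NMS. The function name `tradeoff` is hypothetical.

```python
def tradeoff(total_entities: int, nms_count: int):
    """Return (entities lost to view if one NMS fails, fraction of requests
    that need an inter-NMS transfer).

    Assumes an even split of entities and uniformly distributed targets;
    both are illustrative assumptions, not properties of the actual system.
    """
    entities_per_nms = total_entities / nms_count
    inter_nms_fraction = 1 - 1 / nms_count
    return entities_per_nms, inter_nms_fraction
```

Under these assumptions, 1000 entities on a single NMS need no inter-NMS transfers but all 1000 are affected by one failure, whereas with four NMSs a failure affects 250 entities while three quarters of requests cross between NMSs.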
In the present embodiment, most of the entities being managed by NMSs are NACs, each NMS typically managing hundreds of NACs. NACs do not require constant management; management of a NAC typically entails the occasional forwarding of configuration commands, and periodic (presently, on the order of once every ten minutes) polling of each NAC's status. However, a NAC may occasionally request immediate attention, as when it detects an alarm condition and forwards to an NMS an alarm message, which may constitute a request for corrective action.
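To give a sense of the polling load, the following sketch spreads the status polls of one NMS's NACs evenly across the ten-minute interval. The even staggering, the `poll_offsets` function, and the NAC names are assumptions made for illustration; the source states only the approximate polling period and that an NMS manages hundreds of NACs.

```python
POLL_INTERVAL_S = 600  # about ten minutes, per the description above


def poll_offsets(nac_ids, interval_s=POLL_INTERVAL_S):
    """Assign each NAC a start offset (in seconds) within the polling interval.

    Even spacing is an illustrative assumption, not the documented scheduler.
    """
    step = interval_s / len(nac_ids)
    return {nac: index * step for index, nac in enumerate(nac_ids)}
```

With 300 NACs this yields one poll every 2 seconds, which suggests why routine polling is a light load and why alarm messages, rather than polls, are the time-critical traffic.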
NMSs are also managed entities, inasmuch as each NMS monitors the others to check for continued functionality, and inasmuch as the NMSs exchange configuration information with each other and respond to it.
Should an NMS become non-functional, requests and messages destined for it will not be honored until the system is reconfigured to reallocate the processes formerly performed by that NMS. There might therefore be a period of time during which alarm messages forwarded from a NAC to an NMS are lost--namely, the period from when that NMS becomes non-functional until the completion of the system reconfiguration to reallocate that NMS's functions. Although a NAC could retransmit those messages after such reconfiguration, it would be unable to request corrective action for the alarm condition until such time as reconfiguration is effected.
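The loss window can be expressed as a simple filter over alarm timestamps. The function name `lost_alarms` and the use of plain numeric timestamps are hypothetical; the sketch models only the interval described above, from NMS failure to completion of reconfiguration.

```python
def lost_alarms(alarm_times, failed_at, reconfigured_at):
    """Return the alarm timestamps that fall inside the loss window.

    An alarm sent at or after the failure and before reconfiguration
    completes reaches no functioning NMS in this simplified model.
    """
    return [t for t in alarm_times if failed_at <= t < reconfigured_at]
```

For example, with a failure at time 100 and reconfiguration completing at time 400, alarms sent at times 150 and 399 are lost, while those sent at 50 and 400 are delivered.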