Data communication networks allow information exchange and sharing of computer resources, and thus enable an organization to take advantage of its total computing capabilities. It is increasingly common for computer resources to be arranged into local area networks (LANs), especially when data transfer is required among several resources, or stations, located at various places within a building or cluster of buildings.
Because organizations often either use computer equipment made by a number of different manufacturers, or desire to exchange information with other organizations that use different equipment, it became apparent in the late 1970s that a way to support high-speed data communication between different types of computers would be needed.
This prompted the Institute of Electrical and Electronics Engineers (IEEE) to begin its Project 802. The IEEE quickly reached two conclusions. First, because of diversity in design, getting different computers to communicate is a complex problem. It requires architecture decisions not only at low levels, such as agreeing upon suitable modulation schemes, but also at higher levels. Second, no single architecture is ideal for all applications.
The IEEE thus developed a LAN reference model having three "layers". A first layer, called the physical layer, is concerned with the nature of the transmission medium. A second layer, called the media access control (MAC) layer, is concerned with the details of signalling along the physical layer. Messages are exchanged, among many stations, in groups of elemental symbols. The basic message is called a frame at the MAC layer, with allowable frame types including both control frames and data frames. Data frames contain the information which is to be exchanged over the broadcast network, while control frames are used to issue instructions to each station, primarily to ensure that no two stations attempt to transmit at the same time. A third layer, the logical link control (LLC) layer, is concerned with establishing, maintaining, and terminating logical links between stations.
The IEEE also concluded that no single MAC-layer architecture would be ideal for all situations. Performance can be sacrificed for lower cost in some applications, such as the typical office, but in other environments, such as the typical factory, users will spend more money to obtain a network which is more robust. The IEEE 802.4 Token-Passing Bus Access Method was developed for these critical environments.
Even though the 802.4 standard specifies a fairly robust communication environment, failures still occur due to equipment malfunction, network mis-operation, or programming errors. There is a need to identify which station is the source of such failures, particularly since a single malfunctioning station may prevent use of the network by other stations. The failure source often needs to be located quickly, especially when the profitability of the organization is critically dependent upon the operation of the LAN, such as is often the case in a manufacturing environment.
The failure identification problem is further exacerbated by the presence of equipment manufactured by multiple, independent companies. Although all such LAN equipment operates in accordance with a standard protocol, it is often difficult to consistently obtain diagnostic information about each station in such a situation. This can be especially true if several vendors have chosen to implement the standard protocol in different ways, or require conflicting diagnostic procedures.
It is also necessary to understand the utilization of the LAN in order to locate and correct performance bottlenecks. Such bottlenecks often occur due to load imbalances, especially those caused by heavy traffic to and from certain stations, called bridges. Bridges serve as gateways for messages from one network to another, and thus are often a bottleneck. As a result, network managers often seek answers to questions such as (1) To what degree does each station utilize the medium? (2) Should the network be broken into multiple, interconnected networks for load balancing? and (3) How much of the traffic is being routed through bridges?
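Questions (1) and (3) come down to simple accounting over the frames seen on the medium. The following is a minimal sketch of that accounting, not a description of any existing monitor. It assumes that each observed frame has already been attributed to the station that physically transmitted it, and that the set of bridge stations is known; the function name `utilization_report`, the `(transmitter, octets)` record layout, and the `capacity_octets` parameter are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def utilization_report(frames, bridges, capacity_octets):
    """Summarize medium usage over one observation interval.

    frames          -- iterable of (transmitter, octets) records (assumed layout)
    bridges         -- set of station identifiers known to be bridges (assumed known)
    capacity_octets -- octets the medium could have carried in the interval
    """
    per_station = defaultdict(int)
    for transmitter, octets in frames:
        per_station[transmitter] += octets

    total = sum(per_station.values())

    # Question (1): fraction of the medium's capacity used by each station.
    share = {station: n / capacity_octets for station, n in per_station.items()}

    # Question (3): fraction of the observed traffic that was put on the
    # medium by a bridge.
    bridged = sum(n for station, n in per_station.items() if station in bridges)
    bridge_fraction = bridged / total if total else 0.0

    return share, bridge_fraction


# Hypothetical interval: stations "A" and "B" plus one bridge "BR".
frames = [("A", 100), ("B", 300), ("A", 100), ("BR", 500)]
share, bridge_fraction = utilization_report(frames, {"BR"}, capacity_octets=2000)
print(share)
print(bridge_fraction)
```

Question (2) is a judgment that a network manager would make from figures of this kind; the sketch only supplies the numbers, and it says nothing about how frames are attributed to a transmitting station in the first place.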
Certain diagnostic tools, called monitors, are presently available to help identify and isolate network failures as well as performance bottlenecks. Monitors are generally of two types, with each type having distinct disadvantages.
With the first type of monitor, diagnostic and performance information is collected in some form by each station. This information is then transmitted to a central location and combined with information from other stations.
There are several drawbacks to this approach. First, the information is physically difficult to collect from each station. If the LAN itself is used to transmit the information, certain types of LAN failures will also prevent collection of diagnostic data, and thus prevent proper diagnosis of the trouble. On the other hand, if a secondary path is used to collect information, expensive and cumbersome hardware must be added. Finally, the types of data which each station can collect may often be limited by performance constraints. In the absence of a previously standardized or agreed-upon set of parameters to be maintained, the management information collected from equipment manufactured by different suppliers may not be compatible, and in the worst case, may even lead to conflicting conclusions about equipment malfunctions.
In fact, this is presently such a problem that several industry organizations are proposing management standards, which will specify which information must be maintained by each station, as well as how the information should be exchanged.
A second type of monitor attaches directly to the LAN and detects and stores data packet traces, much in the same manner as a logic analyzer. These monitors are sometimes capable of recording the number and type of frames transmitted by each station. However, they also have several disadvantages.
First, since these monitors do not automatically determine which station is the source of errors, they require an operator who is knowledgeable about the network protocol, at least enough to recognize that certain frame types should not occur in certain situations. The operator must typically program the monitor with a data sequence to be triggered on, and then must manually review the traces occurring after the trigger to determine the source of a problem. Thus, these monitors do not report problems in real time, generally require programming to detect errors, and do not give automatic indication of the source of a problem.
Second, these monitors cannot automatically identify which frames are transmitted by stations that access the network through a bridge. This is because the source address of such a frame is not that of the bridge itself, but rather that of an originating station located on the other side of the bridge. The bridge merely forwards these frames to the local area network, without modifying address fields in the frame.
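The effect of forwarding without modifying address fields can be illustrated with a small sketch. Everything in it is hypothetical: the `Frame` record, the `forward` and `count_by_source` helpers, and the station names are illustrative only and are not taken from the 802.4 standard or from any particular monitor.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    source: str       # source address field carried in the frame
    destination: str  # destination address field carried in the frame


def forward(frame):
    """Model a bridge relaying a frame: the address fields are left untouched."""
    return frame


def count_by_source(frames):
    """Attribute each observed frame to the station named in its source field."""
    return Counter(frame.source for frame in frames)


# Local station "L1" transmits directly; remote station "R9" reaches the
# LAN only through the bridge "BR".
observed = [
    Frame(source="L1", destination="L2"),
    forward(Frame(source="R9", destination="L1")),
]
print(count_by_source(observed))
```

Counting by source address credits one frame to "L1" and one to "R9", and none to "BR", even though the bridge is the station that placed the second frame on the local medium. A monitor that relies on the source field alone therefore cannot tell from the frame that it arrived through a bridge.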
Thus, there is an unmet need for a data communications network monitoring device which reliably and quickly identifies faults, without requiring a high level of operator expertise. The monitor should not require the use of station resources, and should not use the network itself to transmit diagnostic information. The monitor should not require stations to observe agreed-upon management protocols. It should also measure network utilization not only by the directly attached stations, but also by stations connected to the network through bridges.