1. Technical Field
The invention disclosed broadly relates to the extraction of information from large bodies of data for high speed communication facilities. This invention is particularly well suited to the extraction of information which characterizes complex data communications networks.
2. Background Information
With the advent of FDDI, BISDN, and SONET, the day of the gigabit computer communications network is here, and the day of the terabit network is fast approaching. These high speed network environments demand new and powerful tools that require information from the network to assist with network design, network management, network control functions, and network services. One extremely important problem to solve is how to monitor the raw data from one or more high speed communications channels and convert it into useful "information" for a user, a service, or an algorithm whenever it is required. Previously, this problem has been viewed as that of "real-time" network monitoring and performance evaluation. Network monitoring is defined as the extraction, processing, collection, and presentation of dynamic information with respect to the operation of a system. Monitoring information is then used by network performance management (usually an individual) to evaluate the state of network resources in real time (usually via some type of display). Unfortunately, the present state of the art requires the involvement of highly skilled individuals.
Data collection requires the accumulation of information relevant to its use. Two approaches for network data collection are typically used:
1. Tracing and recording the actual data. The term "trace" refers to a record of all frames and bytes transmitted on a network, as well as environmental information. Two examples of environmental information are time stamps and control block information. A trace usually provides a complete picture of time dependent network behavior.
2. Collecting statistical information only. Statistical information is parametric information that is usable in mathematical models for performance evaluation. Unlike trace data, which keeps track of all information transmitted and its relative timing, statistical information is obtained by categorizing the data and keeping counters for each category. For example, we could categorize frames by frame length and count the number of frames of a particular length within a given time interval. Statistical approaches are not flexible and are typically geared to one particular usage (in the worst case just a user display). Statistical methods have well-known deficiencies and often lose part or all of the relevant information required (e.g. environmental information, timing references, activity dependencies). Statistics can alert you to the presence of a problem, but all too often a trace is required for its diagnosis.
Often, due to both the correlation of network activities and the "time dependent" nature of some network functions and services, the only tool previously available to capture all the required data was a trace. The trace approach for collecting network data has traditionally been accomplished via two methods:
1. Direct trace of network activity through memory to disk storage (we are assuming that present hardware technology will allow data capture at the media rate).
2. Preprocessing of trace data in memory so that only a subset of all the available network activity is written to disk storage.
These methods capture network activity so that an "after the fact" analysis on the captured network data can be done to obtain performance information, such as a performance assessment. To illustrate the limitations of traditional methods, we provide the following examples. Many consider that the data transfer rates (throughput) of existing networks bring performance analysis, performance monitoring, and performance problem determination techniques to their present technological limit due to:
1. Quantity of disk storage required. A 16 Mbit Token Ring could generate 2,000,000 bytes of data every second. A 600 Mbyte disk can be filled in 300 seconds (5 minutes). A 100 Mbit FDDI ring (just one-half of the dual FDDI ring) could generate 12,500,000 bytes of data every second; a 600 Mbyte disk can be filled in 48 seconds.
2. Read/write access time limitations for disk storage. Typical read/write access time is in the millisecond range, whereas data for a 16 Mbps Token Ring arrives in the microsecond range and FDDI approaches the nanosecond range today.
3. Speed with which instructions can be processed. A 100 byte packet could arrive from a 16 Mbit Token Ring every 50 microseconds. A 10 MIPS processor would only have 500 instructions between packet arrivals in which to process each packet. A 100 byte packet could arrive from an FDDI ring every 8 microseconds. A 10 MIPS processor would only have 80 instructions between packet arrivals in which to process each packet.
The information extracted from a data communications network can be used in many ways. A few examples follow.
1. Performance problem determination and analysis: collect actual frames and their time relationships. This means that statistical information is simply not enough. (Statistics often represent just another generic symptom of the problem.)
2. Performance monitoring: collect statistical information and report it at the "appropriate" intervals.
3. Benchmarking: collect actual data, but possibly use filters to gather only significant portions of this data.
4. Performance tuning and optimization: collect actual data, but possibly use filters to gather only significant portions of this data while preserving time dependencies. (Note: as network complexity grows, tuning may become unaffordable with present techniques.)
5. Workload analysis and reporting: collect actual data or statistical data depending on specific requirements.
6. Network sizing: collect actual data or statistical data depending on specific requirements.
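The statistics-only collection approach described earlier in this background (categorize frames by frame length, then count the frames in each category within a given time interval) can be illustrated with a minimal sketch. The function name, the input format of (timestamp, length) pairs, and the sample values are assumptions made for illustration only; they do not come from the disclosure.

```python
from collections import Counter, defaultdict


def count_frames_by_length(frames, interval_s=1.0):
    """Count frames per frame length within fixed time intervals.

    `frames` is an iterable of (timestamp_seconds, frame_length_bytes)
    pairs. This input format is an assumption for the sketch.
    Returns {interval_index: Counter({frame_length: count})}.
    """
    counts = defaultdict(Counter)
    for timestamp, length in frames:
        interval = int(timestamp // interval_s)  # which interval the frame falls in
        counts[interval][length] += 1            # one counter per length category
    return dict(counts)


# Hypothetical sample data: (arrival time in seconds, frame length in bytes).
frames = [(0.10, 64), (0.25, 1500), (0.90, 64), (1.05, 64), (1.70, 100)]
stats = count_frames_by_length(frames)

for interval, histogram in sorted(stats.items()):
    print(interval, dict(histogram))
```

Note that the resulting counters no longer record the order of the frames or their exact arrival times within an interval, which is the kind of lost timing reference that makes a trace necessary for diagnosis.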
As network speeds increase (e.g. FDDI, FDDI 1, SONET), it is becoming more apparent that traditional data collection approaches will no longer be adequate. The invention disclosed herein is designed to eliminate the necessity of tracing as a means of network information capture.
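The storage and processing figures quoted in this background for a 16 Mbit Token Ring and a 100 Mbit FDDI ring can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "Mbit" means 10^6 bits per second, "Mbyte" means 10^6 bytes, and a 10 MIPS processor executes 10 instructions per microsecond, which is consistent with the numbers given in the text.

```python
def disk_fill_seconds(disk_bytes, link_bits_per_second):
    """Seconds needed to fill a disk when tracing at the full media rate."""
    return disk_bytes * 8 / link_bits_per_second


def packet_interarrival_us(packet_bytes, link_bits_per_second):
    """Microseconds between back-to-back packets of the given size."""
    return packet_bytes * 8 * 1_000_000 / link_bits_per_second


def instructions_per_packet(mips, packet_bytes, link_bits_per_second):
    """Instruction budget between packet arrivals (1 MIPS = 1 instruction/us)."""
    return mips * packet_interarrival_us(packet_bytes, link_bits_per_second)


TOKEN_RING_BPS = 16_000_000   # 16 Mbit Token Ring
FDDI_BPS = 100_000_000        # 100 Mbit FDDI ring (one half of the dual ring)
DISK_BYTES = 600_000_000      # 600 Mbyte disk
PACKET_BYTES = 100            # packet size used in the text
PROCESSOR_MIPS = 10           # processor speed used in the text

for name, rate in (("Token Ring", TOKEN_RING_BPS), ("FDDI", FDDI_BPS)):
    print(
        name,
        disk_fill_seconds(DISK_BYTES, rate), "s to fill disk;",
        packet_interarrival_us(PACKET_BYTES, rate), "us between packets;",
        instructions_per_packet(PROCESSOR_MIPS, PACKET_BYTES, rate),
        "instructions per packet",
    )
```

Because all three quantities scale inversely with the link rate, each increase in media speed shrinks the disk fill time and the per-packet instruction budget by the same factor.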
The advent of high speed media such as CSMA/CD, Token Ring, and FDDI, along with recursive enveloping of multiple architectures, has brought considerable complexity and has changed the very nature of networking. Basically, the world is evolving to an encapsulation oriented, any-to-any network, using any media and any protocol at any time in any environment. We will refer to this environment as a KNA (Kluge Network Architecture) environment. Unlike older (and often proprietary) monolithic networks, heterogeneous multi-protocol and multi-vendor (KNA) networks do not come bundled with all the necessary management functions for monitoring, controlling, and diagnosing network problems. The marketplace focuses on KNA functionality. Thus, these environments lack an overall Information Collection Architecture and direction because the multiple products and protocols they use are being invented and modified on a daily basis. The information necessary to manage the environment is critical but constantly changing. In a KNA environment, only a totally flexible physical layer "tap" that is independent of hardware and protocol changes makes sense.