As computing devices continue to increase their capacity for processing large amounts of data, the efficient administration and monitoring of such data becomes ever more challenging. As computers are designed to handle exponentially greater amounts of information, it becomes increasingly difficult to quickly and efficiently collect data regarding the operation of those computers. The problem is found throughout the world of computing and presents many challenges to administrators charged with monitoring the communication networks that interconnect the millions of networked devices used by people around the world and that are central to the operations of the modern economy. As the amount of data proliferates, collecting and recording that data becomes both more difficult and more essential. Recording all this data requires an innovative approach.
Network administrators and others responsible for managing today's communication devices are often overwhelmed by the amount of data collected for measuring and storing network information. The collected data records the state of various measured variables and maintains logs of information that are useful in understanding usage patterns in the network as well as in diagnosing the source of problems when a network device fails during the course of its operation. There is a need for keeping accurate records of all performance data, and any other collected data, produced by a device that requires constant monitoring. Such data has many uses, including diagnosing problems in a device and understanding its usage patterns. As the monitoring period is extended and the amount of data increases, significant strain is placed on the processor and storage resources that handle the information gathered from the device's counters. To achieve effective network monitoring, steps need to be taken to ensure that the performance logs of measured network parameters are both accurate and comprehensive without creating excessive burdens on the processors and storage devices that process or store such information.
A pure software solution for network monitoring may not be optimal. Software implementations have various limitations; for example, clock inaccuracy can cause the software logic for recording counter information to execute erroneously. Also, a pure software solution can put significant strain on central processing unit (CPU) resources if the software needs to accomplish multiple functions.
Another potential solution for performance logging involves recording all collected data and then applying a compression algorithm to shrink the collected data to a more manageable and storable size. The problem with this compression-based solution is that compression algorithms themselves typically place a heavy compute burden on the logging infrastructure. The data compression solution therefore imposes a significant strain on processor resources, and problems of efficiency quickly arise because these resources become more and more taxed as the amount of data tracked, and consequently the amount of data compressed, increases.
Another currently used solution for performance logging involves recording only a selected portion of the entire set of data produced by the applicable counters. This is accomplished in a variety of ways, such as by periodically shutting off the collection of data or by recording only for predetermined amounts of time. The problem with this selective recording solution, as with all lossy data solutions, is that not all the data is recorded, so the log may lack information that is necessary for accurately understanding usage of the device or successfully diagnosing the source of an encountered problem. Hence, the data stored may not be sufficiently robust to support a comprehensive and accurate performance and diagnostic analysis.
In the case of networking devices, traffic load must be monitored accurately to effectively design and architect traffic management solutions. For example, the timescale over which the bit rate of the traffic is measured determines how much information can be deduced from the measurement. If the timescale is relatively long, then only the mean traffic load can be deduced, and it can be much more difficult to analyze network delays and packet-drop rates, which require a more granular approach. Hence, the traffic must be sampled at the rate of packet queuing. Timestamps must typically be recorded at intervals on the order of tens of milliseconds. However, making such accurate and frequent recordings of monitored data poses challenges in terms of software complexity, processor burdens, storage requirements and limitations on hardware resources.
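The effect of the measurement timescale can be illustrated with a minimal sketch. The trace below is synthetic and purely illustrative (it is not measured data), and the function name `bit_rates`, the 10 ms bin size and the byte counts are assumptions chosen for the example. The same byte counts are averaged once per 10 ms bin and once over a full second:

```python
def bit_rates(byte_counts, bin_ms, window_bins):
    """Return the average bit rate (bits per second) of each consecutive
    window of `window_bins` bins, where each bin spans `bin_ms` milliseconds."""
    rates = []
    window_ms = window_bins * bin_ms
    for start in range(0, len(byte_counts), window_bins):
        window = byte_counts[start:start + window_bins]
        # bytes -> bits, milliseconds -> seconds
        rates.append(sum(window) * 8 * 1000 / window_ms)
    return rates


# Synthetic one-second trace (assumed values): 100 bins of 10 ms each,
# a steady 1 Mbit/s background with a 50 ms burst at 100 Mbit/s.
trace = [1250] * 100
for i in range(40, 45):
    trace[i] = 125000

fine = bit_rates(trace, bin_ms=10, window_bins=1)      # 10 ms timescale
coarse = bit_rates(trace, bin_ms=10, window_bins=100)  # 1 s timescale

print("peak at 10 ms timescale (Mbit/s):", max(fine) / 1e6)
print("mean at 1 s timescale (Mbit/s):", coarse[0] / 1e6)
```

In this synthetic trace, the 10 ms view exposes a 100 Mbit/s burst, while the one-second view reports only a mean of 5.95 Mbit/s, so a burst that could fill a queue and cause delays or drops is not visible at the coarser timescale.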
There is a need for an efficient mechanism for lossless logging of information and data collected from network monitoring that will be useful for both diagnostic analysis and performance analysis and that will minimize storage and processing requirements. The stored data should be robust enough for further diagnostic and performance analysis.
There is also a need for a method and means for optimized network monitoring, wherein the benefits of robust data capture are not offset by a correspondingly significant increase in the resources required for that capture.