(1) Field of the Invention
The present invention relates to a communication statistic information collection apparatus and, more particularly, to a communication statistic information collection apparatus to be connected as a node apparatus to a communication network to collect statistics information, such as the number of packets and the number of bytes, individually for each of packet flows.
(2) Description of the Related Art
The Internet has established its position as an important social infrastructure and begun to be applied not only to conventional Best-Effort based data communication but also to data communication which requires quality assurance of communication of data including real time data such as voice or video data and transaction data for basic industries. As broader bandwidths are used for access lines based on ADSL (Asymmetric Digital Subscriber Line) and FTTH (Fiber To The Home) technologies, an amount of data communication also tends to increase.
Against this backdrop, communication services enterprises such as carriers and ISPs (Internet Service Providers) require a network monitoring function for collecting and analyzing statistics information such as an amount of data communication to recognize the state of communication over a network. In particular, there is a great demand for the advanced function of collecting and analyzing statistics information individually for each group of communication data (hereinafter referred to as a flow) which is classified according to the source and destination of data, an application, quality level, and the like.
The use of such flow-by-flow statistics information allows the carriers and the ISPs to check the state of quality assurance of communication at the time of providing a quality-assured communication service. In addition, it also allows traffic engineering (hereinafter abbreviated as TE) which responds to an increase in the amount of data by effectively using limited network resources. It further allows provisioning, which systematically prepares network resources by estimating customer demands and promptly provides the network resources in response to requests from users for communication bands, services, and the like, as well as the detection and analysis of an attack on network resources by an unauthorized user, billing, and the like. The statistic information collecting function is normally provided in a communication node apparatus which controls packet transfer over a network, such as a router, a switch, or the like.
As a method for collecting flow statistics information, a cache-type flow statistics technology described in, e.g., a publication entitled “NetFlow Overview,” Cisco Systems, Technical Marketing Internet Technologies Division, February 2003, pp. 1-109 (Non-Patent Document 1) has been known. A statistic information collection system using the above conventional technology is comprised of a plurality of statistic information collection apparatuses arranged in distributed relation over a network and a collector apparatus which analyzes traffic over the entire network based on statistics information reported periodically from these statistic information collection apparatuses.
The router having the cache-type statistic information collecting function is provided with a flow table for identifying a flow to which a received packet belongs and searches, every time a packet is received, the flow table by using a searching key composed of a combination of the header information items of the received packet, the input line interface number of the packet, and the like. The flow table is composed of a plurality of flow entries prepared for individual flows. Each of the flow entries defines a flow identification condition by a combination of, e.g., a source IP address, a destination IP address, a protocol type, a TOS (Type of Service) value, a source TCP (Transmission Control Protocol)/UDP (User Datagram Protocol) port number, a destination TCP/UDP port number, an input line interface number, and the like.
When the received packet matches any of the flow identification conditions registered in the flow table, statistics data such as the number of packets and the number of bytes is updated in the statistics information entry prepared in association with the flow identification condition. When the received packet does not match any of the flow identification conditions on the flow table as a result of searching the flow table, a new flow entry including, as the flow identification condition, the combination of the header information items, the input interface number, and the like serving as the searching key is added to the flow table. The contents of the statistics information entries are periodically monitored and the contents of the statistics information entry in which, e.g., the statistics data has remained unchanged for a certain time or longer are transmitted as statistics information data to the collector apparatus. The statistics information entry that has already been reported to the collector apparatus and the flow entry corresponding thereto are deleted from the table by an aging process.
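The cache-type procedure described above can be sketched as follows. This is a minimal illustration only, not the implementation of Non-Patent Document 1: the names `FlowKey`, `FlowStats`, `FlowCache`, `on_packet`, and `age`, the use of a hash table in place of a hardware flow table, and the idle-timeout parameter are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple, Tuple


class FlowKey(NamedTuple):
    """Searching key: header items plus the input line interface number."""
    src_ip: str
    dst_ip: str
    protocol: int
    tos: int
    src_port: int
    dst_port: int
    in_if: int


@dataclass
class FlowStats:
    """Statistics information entry associated with one flow entry."""
    packets: int = 0
    byte_count: int = 0
    last_update: float = 0.0


class FlowCache:
    """Hypothetical flow table; a dict stands in for the router's table."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout  # assumed aging threshold, in seconds
        self.table: Dict[FlowKey, FlowStats] = {}

    def on_packet(self, key: FlowKey, length: int, now: float) -> None:
        # Search the flow table; on a miss, register a new flow entry
        # whose identification condition is the searching key itself.
        stats = self.table.get(key)
        if stats is None:
            stats = FlowStats()
            self.table[key] = stats
        # Update the per-flow packet and byte counters.
        stats.packets += 1
        stats.byte_count += length
        stats.last_update = now

    def age(self, now: float) -> List[Tuple[FlowKey, FlowStats]]:
        # Entries unchanged for idle_timeout or longer are removed from the
        # table and returned so the caller can report them to the collector.
        expired = [k for k, s in self.table.items()
                   if now - s.last_update >= self.idle_timeout]
        return [(k, self.table.pop(k)) for k in expired]
```

Because every distinct key value creates its own entry, the table in this sketch grows with the number of distinct flows observed between aging passes.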
As another method for collecting the flow statistics information, a sampling-type flow statistics technology described in, e.g., RFC3176 (Non-Patent Document 2) of Internet Engineering Task Force (IETF) has been known. In a sampling-type statistic information collection system, each router selectively transfers a copy of a received packet to a collector apparatus where a flow is identified from the copy packet and collection and analysis of statistics information are performed.
Each of the routers composing the sampling-type statistic information collection system samples the received packets in accordance with a sampling rate set preliminarily by a network administrator and transfers the copies (copy packets) of the received packets that have been sampled to the collector apparatus in a prescribed encapsulation format. There are also cases where each of the encapsulated packets includes additional information for flow identification such as, e.g., the input line interface number of the received packet, an output line interface number specified from the header information of the received packet, and the IP address (Next Hop IP address) of the router to be the next transfer destination. The collector apparatus extracts the copy packet from the encapsulated packet transferred from the router, identifies the flow based on the header information of the copy packet and information for flow identification added as required, and updates the flow-by-flow statistics information.
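The division of work between router and collector in the sampling type can be sketched as follows. This is an illustration under stated assumptions, not the encapsulation format of Non-Patent Document 2: the class names `Header`, `SampleRecord`, `SamplingRouter`, and `Collector` are hypothetical, a deterministic 1-in-N counter stands in for whatever sampling method a real router uses, and the flow key chosen at the collector is arbitrary.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Header:
    """Header items of a received packet (simplified for illustration)."""
    src_ip: str
    dst_ip: str
    protocol: int
    length: int


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical encapsulation: copy of the header plus additional
    information for flow identification added by the router."""
    header: Header
    in_if: int
    out_if: int
    next_hop: str


class SamplingRouter:
    def __init__(self, one_in_n: int) -> None:
        if one_in_n < 1:
            raise ValueError("one_in_n must be at least 1")
        self.one_in_n = one_in_n  # sampling rate set by the administrator
        self._seen = 0

    def receive(self, header: Header, in_if: int, out_if: int,
                next_hop: str) -> Optional[SampleRecord]:
        # Assumption: deterministic sampling of every N-th received packet.
        self._seen += 1
        if self._seen % self.one_in_n != 0:
            return None
        return SampleRecord(header, in_if, out_if, next_hop)


class Collector:
    def __init__(self) -> None:
        # flow key -> [sampled packets, sampled bytes]
        self.stats: Dict[Tuple, List[int]] = defaultdict(lambda: [0, 0])

    def ingest(self, rec: SampleRecord) -> None:
        # Flow identification happens here, at the collector, so the key
        # may combine any header items and additional information.
        key = (rec.header.src_ip, rec.header.dst_ip,
               rec.header.protocol, rec.in_if)
        entry = self.stats[key]
        entry[0] += 1
        entry[1] += rec.header.length
```

In this sketch the router keeps no per-flow state; it only counts received packets and forwards copies, which is why the flow definition can be changed at the collector without touching the router.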
In the conventional cache-type statistic information collection system mentioned above, the combination of information factors used for the flow identification condition and the searching key is roughly fixed so that each of the routers collecting the statistics information uniformly collects the flow-by-flow statistics information individually for each of the received packets. This leads to the problem of increasing the number of flow entries registered in the flow table and the number of statistics information entries.
In general, flow identification condition information necessary for flow identification accounts for about 200 to 600 bits. As a method for searching the table at a high speed by using such multi-bit information as the searching key, there is a searching method using a CAM (Content Addressable Memory). The CAM method enables high-speed matching of a bit pattern indicative of the flow identification condition registered in the memory with a bit pattern given as the searching key. However, since the CAM is higher in per-bit cost than a normal semiconductor memory, the problem is encountered that, when the number of flow entries in each of the routers is increased as in the conventional cache-type statistic information collection system, numerous CAM chips are required to constitute the flow table and router cost becomes high.
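The relation between the number of flow entries and the number of CAM chips can be made concrete with a short calculation. Only the 200 to 600 bit range comes from the text above; the function name `cam_chips_needed`, the chip capacity, and the entry count are hypothetical values chosen purely for illustration.

```python
def cam_chips_needed(num_entries: int, bits_per_entry: int,
                     chip_capacity_bits: int) -> int:
    """Number of CAM chips required to hold the flow identification
    conditions, ignoring any per-chip overhead (a simplification)."""
    total_bits = num_entries * bits_per_entry
    # Integer ceiling division: a partially filled chip still counts.
    return -(-total_bits // chip_capacity_bits)


# Hypothetical figures: one million flow entries, 18 Mbit per CAM chip.
ENTRIES = 1_000_000
CHIP_BITS = 18 * 2**20

low = cam_chips_needed(ENTRIES, 200, CHIP_BITS)   # narrowest condition
high = cam_chips_needed(ENTRIES, 600, CHIP_BITS)  # widest condition
print(low, high)
```

Under these assumed figures the chip count scales linearly with both the entry count and the width of the flow identification condition, which is the cost pressure the paragraph above describes.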
Moreover, because the conventional cache-type statistic information collection system can use only the roughly fixed combination of header information items as the flow identification condition, it becomes difficult to use information factors other than the information items described above, e.g., a TCP flag for identifying a control packet type in a TCP connection, a destination MAC (Media Access Control) address and a source MAC address in an Ethernet header, a VLAN (Virtual LAN) identifier for identifying a virtual LAN, and the like for flow identification.
In the sampling-type statistic information collection system, on the other hand, flow identification is performed at the collector apparatus so that flow classification using header information in a higher protocol layer, which is impossible with the cache type wherein flow identification is performed at the router, and the collection of statistics information based on the analysis of the content of packet data are enabled. In the sampling type, it can also be specified whether sampling for generating a copy packet should be performed for each of the input line interfaces of the received packets. This allows the enhancement of the accuracy of flow statistics by, e.g., limiting packets for which statistics information is to be collected to the packets received from a specified input line.
In the conventional sampling-type statistic information collection system, however, the accuracy enhancement of flow statistics at the collector apparatus is limited because it is impossible to specify whether sampling should be performed based on a unit other than the input line interface, e.g., based on a protocol type. According to the prior art technologies, it is also necessary for the network administrator to preliminarily set whether sampling should be performed, which leads to the problem that the setting of whether sampling should be performed cannot be changed dynamically depending on a traffic situation.