1. Field of the Invention
The present invention relates to a method of generating Internet Protocol (IP) flows, and more particularly, to a method of generating IP traffic flows based on a time bucket, which is configured to generate flows using all IP packets arriving in a predetermined time bucket and collected from a high-speed line.
This work was supported by the IT R&D program of MIC/IITA[Project number: 2006-S-065-02, Project Title: Development of High-Speed Wireline (10 Gbs, 2.5 Gbps) for Charging].
2. Description of the Related Art
The term “flow,” as generally used on the Internet, can be defined as a set of IP packets having common characteristics, which are collected from traffic flowing on an Internet line. The traffic characteristic refers to the fixed values of the IP header and the L4 (transmission control protocol (TCP) or user datagram protocol (UDP)) header. Having common characteristics means that these fixed values are the same across the IP packets. Accordingly, IP packets having the same fixed values are considered to have the same traffic characteristic and a flow is formed using those IP packets.
For example, a flow can be defined as a set of IP packets having the same IP source address, IP destination address, protocol, IP source port number, and IP destination port number. Accordingly, such a flow includes general information on traffic flowing on the Internet line. Moreover, if additional information is added to the flow, information on all usage patterns and behaviors of traffic flowing on the Internet line can be collected and analyzed.
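The five-field flow definition above can be illustrated with a minimal sketch. The `Packet` type, its field names, and the helper functions are hypothetical and chosen only for illustration; they are not part of any described system.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet summary; field names are illustrative assumptions."""
    src_ip: str
    dst_ip: str
    protocol: int     # 6 = TCP, 17 = UDP
    src_port: int
    dst_port: int
    length: int       # bytes
    timestamp: float  # seconds


def flow_key(pkt):
    """Return the five fixed values that identify the flow a packet belongs to."""
    return (pkt.src_ip, pkt.dst_ip, pkt.protocol, pkt.src_port, pkt.dst_port)


def group_into_flows(packets):
    """Group packets so that packets with identical fixed values share one flow."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


packets = [
    Packet("10.0.0.1", "10.0.0.2", 6, 40000, 80, 60, 0.0),
    Packet("10.0.0.1", "10.0.0.2", 6, 40000, 80, 1500, 0.1),
    Packet("10.0.0.3", "10.0.0.2", 17, 5353, 53, 80, 0.2),
]
flows = group_into_flows(packets)
print(len(flows))  # 2: the two TCP packets share a key, the UDP packet does not
```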
The information to be added to a flow may include the start time and the end time of a flow, the number of IP packets, the total amount of bytes, and an interface number. Moreover, information on IP packets forming a flow may be added. However, in the case where all information on each IP packet is transmitted, the amount of information is too much for a high-speed line. Accordingly, most flow measurement systems generate a flow using only general information of the flow, i.e., basic statistical information, terminate the flow, and transfer a terminated flow to an analysis system to analyze the information on the terminated flow.
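As a rough illustration of the basic statistical information mentioned above, the following sketch accumulates the start time, end time, packet count, byte count, and interface number for a single flow. The `FlowRecord` class and the `summarize` helper are hypothetical names introduced here, and per-packet details are deliberately discarded.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """Hypothetical flow record holding only basic statistics."""
    key: tuple
    start_time: float
    end_time: float
    packet_count: int = 0
    byte_count: int = 0
    interface: int = 0

    def update(self, timestamp, length):
        """Fold one packet into the running statistics."""
        self.start_time = min(self.start_time, timestamp)
        self.end_time = max(self.end_time, timestamp)
        self.packet_count += 1
        self.byte_count += length


def summarize(key, packets, interface=0):
    """Build a record from a non-empty list of (timestamp, length) pairs."""
    first_ts = packets[0][0]
    record = FlowRecord(key, first_ts, first_ts, interface=interface)
    for ts, length in packets:
        record.update(ts, length)
    return record


record = summarize(
    ("10.0.0.1", "10.0.0.2", 6, 40000, 80),
    [(0.0, 60), (0.1, 1500), (0.4, 40)],
)
print(record.packet_count, record.byte_count)  # 3 1600
```

Because only totals are kept, the record stays the same size no matter how many packets the flow contains, which is why this form suits a high-speed line.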
To generate a flow by collecting traffic flowing on the Internet line, the following different timeout mechanisms are used to generate and terminate a flow using received IP packets. However, flows generated using the following timeout mechanisms include basic statistical information but do not include information on each IP packet.
The first timeout mechanism is a FIN timeout mechanism. In the FIN timeout mechanism, when a predetermined FIN timeout (basic time: 2 seconds) elapses after a last FIN or RST packet among IP packets of a predetermined flow arrives, a FIN timeout flow is generated. A flow measurement system terminates the FIN timeout flow and transfers the terminated FIN timeout flow.
The second timeout mechanism is an INACTIVE timeout mechanism. In the INACTIVE timeout mechanism, when a new IP packet does not arrive even after an INACTIVE timeout (basic time: 15 seconds) elapses after a last IP packet of a predetermined flow arrives, an INACTIVE timeout flow is generated and terminated, and the terminated INACTIVE timeout flow is transferred.
The third timeout mechanism is an ACTIVE timeout mechanism. In the ACTIVE timeout mechanism, when the flow continues for too long as IP packets continuously arrive during an ACTIVE timeout (basic time: 30 minutes) after a first IP packet of a predetermined flow arrives, an ACTIVE timeout flow is generated and terminated, and the terminated flow is transferred.
The fourth timeout mechanism is a MEMORY timeout mechanism. In the MEMORY timeout mechanism, when a new flow cannot be generated due to memory starvation of a flow measurement system, a MEMORY timeout flow is generated and terminated in oldest-first order, and the terminated flow is transferred.
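The four timeout mechanisms described above can be sketched as follows. This is a simplified model and not the logic of any particular flow measurement system: the `FlowState` fields, the order in which the checks are applied, and the interpretation of "oldest" as the earliest first-packet time are all assumptions made for illustration. The timeout values are the basic times quoted above.

```python
from dataclasses import dataclass
from typing import Optional

FIN_TIMEOUT = 2.0            # seconds after the last FIN or RST packet
INACTIVE_TIMEOUT = 15.0      # seconds after the last packet of the flow
ACTIVE_TIMEOUT = 30 * 60.0   # seconds after the first packet of the flow


@dataclass
class FlowState:
    """Hypothetical per-flow timing state."""
    first_seen: float
    last_seen: float
    fin_seen_at: Optional[float] = None  # arrival time of the last FIN or RST


def expiry_reason(flow, now):
    """Return which timeout terminates the flow at time `now`, or None."""
    if flow.fin_seen_at is not None and now - flow.fin_seen_at >= FIN_TIMEOUT:
        return "FIN"
    if now - flow.last_seen >= INACTIVE_TIMEOUT:
        return "INACTIVE"
    if now - flow.first_seen >= ACTIVE_TIMEOUT:
        return "ACTIVE"
    return None


def evict_for_memory(table, max_flows):
    """MEMORY timeout: pick the oldest flows so that one new flow fits.

    Assumes "oldest" means the earliest first_seen value.
    """
    excess = len(table) - max_flows + 1
    if excess <= 0:
        return []
    oldest_first = sorted(table, key=lambda k: table[k].first_seen)
    return oldest_first[:excess]


print(expiry_reason(FlowState(0.0, 10.0, fin_seen_at=10.0), now=12.5))  # FIN
print(expiry_reason(FlowState(0.0, 10.0), now=26.0))                    # INACTIVE
print(expiry_reason(FlowState(0.0, 1795.0), now=1800.0))                # ACTIVE
```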
However, when the above-described timeout mechanisms are used, flows are generated continuously in time, and a generated flow is not transferred until the time point at which it is terminated, which may be long after the traffic actually passes (up to 30 minutes in the case of the ACTIVE timeout flow). Thus, an analysis system that performs flow analysis based on a predetermined analysis period must have an analysis period longer than the ACTIVE timeout value, and in practice longer than about 1 hour. That is, due to non-real-time generation and transfer of flows, the result of analyzing flow information for a given time becomes available only after a minimum of 1 hour.
On the other hand, when flows are continuously generated regardless of an analysis period of an analysis system, it is difficult to analyze traffic statistical information and flows based on an analysis period. Since the continuously generated flows extend over a plurality of analysis periods, if the flows include general information but do not include information on all IP packets of the flows, traffic cannot be divided according to the analysis periods. Also, if the flows include information on all IP packets, the IP packets must be divided according to the analysis periods, thereby affecting performance of an analysis server.
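The difficulty described above can be made concrete with a small sketch. The function name `periods_spanned` and the example numbers are hypothetical. The point is that a flow record carrying only a start time, an end time, and totals reveals which analysis periods it overlaps, but not how its bytes are distributed between them.

```python
def periods_spanned(start_time, end_time, period):
    """Return the indices of the analysis periods that a flow overlaps.

    Period n is assumed to cover the half-open interval
    [n * period, (n + 1) * period).
    """
    first = int(start_time // period)
    last = int(end_time // period)
    return list(range(first, last + 1))


# A flow lasting from t = 3500 s to t = 3700 s with a 1-hour analysis period
# overlaps periods 0 and 1. Its total byte count cannot be divided between
# the two periods without per-packet information.
print(periods_spanned(3500.0, 3700.0, 3600.0))  # [0, 1]
```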
In addition, flows generated in a previous analysis period cannot be analyzed until all flows generated in a current analysis period are transferred. This is because the flows generated in the previous analysis period include flows that extend over the previous and the current analysis periods. Thus, after all the flows extending over the previous and the current analysis periods are generated and transferred (i.e., after the current analysis period finishes), the flows generated in the previous analysis period can be analyzed.
For example, if the analysis period is 1 hour, an analysis result becomes available only after a minimum of 2 hours, at the time point at which flow analysis is completed. If the analysis period is shortened in order to solve the above-described limitation, the ACTIVE timeout value of a flow measurement system must also be shortened. Such a method has only the minor effect of dividing an actual long flow into a plurality of flows, and cannot solve the limitation. Moreover, it is practically impossible to generate flows including information on each IP packet, because of constraints on transmission bandwidth, CPU processing speed, storage unit speed, storage capacity, and so on in connection with the large amount of information.
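The 2-hour figure can be reproduced with a simplified model, offered here only as an illustration. It assumes that no flow extends over more than two consecutive analysis periods (that is, the ACTIVE timeout is shorter than the analysis period) and that analysis itself takes no time. The function name is hypothetical.

```python
def earliest_analysis_time(period_index, period_seconds):
    """Earliest time at which period `period_index` can be analyzed.

    Period n covers [n * P, (n + 1) * P). Flows that start in period n may
    extend into period n + 1, so period n can be analyzed only after
    period n + 1 has finished, that is, at time (n + 2) * P.
    """
    return (period_index + 2) * period_seconds


# With a 1-hour analysis period, traffic from the start of period 0 can be
# analyzed no earlier than 7200 seconds (2 hours) later.
print(earliest_analysis_time(0, 3600))  # 7200
```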
Therefore, in order to overcome the above-described limitations, a flow generation unit is required that generates flows using all IP packets within a predetermined analysis period, based on a time bucket timeout flow generation mechanism, so that no flow extends over a plurality of analysis periods.
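One possible reading of this requirement is sketched below, purely as an illustration and not as the claimed implementation. It assumes that each packet is assigned to a time bucket by integer division of its arrival time, and that flows are keyed by the bucket index together with the packet's fixed values. The function name, the input tuple layout, and the 60-second bucket size are hypothetical.

```python
from collections import defaultdict


def bucket_flows(packets, bucket_seconds):
    """Aggregate packets into per-bucket flows.

    packets: iterable of (timestamp, key, length) tuples, where `key` holds
    the fixed values identifying the flow.
    Returns {(bucket_index, key): [packet_count, byte_count]}.
    """
    flows = defaultdict(lambda: [0, 0])
    for timestamp, key, length in packets:
        bucket = int(timestamp // bucket_seconds)
        stats = flows[(bucket, key)]
        stats[0] += 1
        stats[1] += length
    return dict(flows)


key = ("10.0.0.1", "10.0.0.2", 6, 40000, 80)
packets = [(58.0, key, 100), (59.5, key, 200), (60.0, key, 300), (61.0, key, 400)]
result = bucket_flows(packets, 60.0)
print(result[(0, key)], result[(1, key)])  # [2, 300] [2, 700]
```

Under this sketch, a long-lived connection yields one flow per bucket, so every flow lies entirely within a single bucket and can be transferred as soon as that bucket closes.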
Also, in order to overcome the above-described limitations, a unit is required that is applicable to a flow measurement system for IP traffic flowing on a high-speed Internet line, by implementing generation of a time bucket based flow, a FIN flow, and an INACTIVE flow using micro-coding of a Micro Engine (ME) that is executed on a network processor.