Technical Field
The present invention relates generally to network traffic flow management, and more particularly, to network traffic flow prediction and regulation using machine learning.
Description of the Related Art
In computer networks, traffic flow behavior may exhibit a characteristic commonly known as “elephant-mice” flows. For example, in relevant sampling performed across different types of networks (e.g., data center networks, hybrid networks, software-defined networks (SDN), etc.), the majority of traffic flows (e.g., ~80%) are typically small (e.g., less than 10 KB), although the majority of bytes transferred within a network are carried by the top 10% of large flows. Thus, the former flows (e.g., smaller than 10 KB) may be referred to as “mice” while the latter (e.g., large flows) may be referred to as “elephants”.
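By way of a non-limiting illustration, the elephant-mice split described above can be sketched as follows. The sketch assumes a single size threshold of 10 KB (taken from the example figure in the text) separates mice from elephants. The function names, the threshold constant, and the sample flow sizes are hypothetical and are introduced here only for illustration.

```python
from typing import Dict, List

# Assumed threshold: 10 KB, following the example figure given in the text.
MICE_THRESHOLD_BYTES = 10 * 1024


def split_flows(flow_sizes: List[int],
                threshold: int = MICE_THRESHOLD_BYTES) -> Dict[str, List[int]]:
    """Partition flow sizes (in bytes) into mice (< threshold) and elephants."""
    mice = [size for size in flow_sizes if size < threshold]
    elephants = [size for size in flow_sizes if size >= threshold]
    return {"mice": mice, "elephants": elephants}


def byte_share(subset: List[int], all_flows: List[int]) -> float:
    """Fraction of all transferred bytes carried by the given subset of flows."""
    total = sum(all_flows)
    return sum(subset) / total if total else 0.0


# Hypothetical sample: four small flows and one large flow (sizes in bytes).
sample_flows = [500, 2_000, 4_000, 8_000, 5_000_000]
groups = split_flows(sample_flows)
elephant_share = byte_share(groups["elephants"], sample_flows)
```

In this hypothetical sample, most flows fall below the threshold while nearly all bytes belong to the single large flow, which mirrors the skew described above. A deployed system would not generally know a flow's final size in advance, which is why detection is treated as a prediction problem.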
The elephant-mice flow characteristic is associated with abnormal network behavior and can cause application-level performance degradation in data centers and other types of networks. Detecting elephant flows is not a trivial network management task, because flow-generating applications typically provide no explicit signaling. Generally speaking, the elephant flow detection problem is related to the IP traffic classification problem, which tries to identify the application layer protocol (e.g., HTTP, P2P, VoIP, etc.) of each flow given flow parameters such as TCP/IP ports, packet length, and inter-packet gap.
One conventional method for IP traffic classification involves examining flow ports and matching them to the standard ports defined by IANA (Internet Assigned Numbers Authority). However, a drawback of this approach is that some applications hide their flow ports and/or use these standard ports to pass through a firewall. Moreover, some applications use arbitrary ports for their connections, which causes this method to fail. Another approach is based on payload inspection, which tries to match packet contents and/or signatures to well-known applications. This approach maintains up-to-date information on application protocol changes, but it does not work if the packets are encrypted and/or if user privacy must be maintained.
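As a minimal sketch of the port-based method discussed above, and of why it can fail, the following example looks up a flow's ports in a small table. The table is a hand-picked subset of well-known IANA port assignments, not the full registry, and the function name and the sample port numbers are hypothetical.

```python
from typing import Optional

# Small illustrative subset of IANA well-known port assignments.
WELL_KNOWN_PORTS = {
    21: "FTP",
    22: "SSH",
    25: "SMTP",
    53: "DNS",
    80: "HTTP",
    443: "HTTPS",
}


def classify_by_port(src_port: int, dst_port: int) -> Optional[str]:
    """Return the protocol registered for either port, or None if unknown.

    The destination port is checked first, since the server side of a
    connection normally listens on the registered port.
    """
    for port in (dst_port, src_port):
        name = WELL_KNOWN_PORTS.get(port)
        if name is not None:
            return name
    return None


# A client talking to a server on port 443 is labeled by the registered name.
label_standard = classify_by_port(51000, 443)

# A flow using arbitrary ports on both ends cannot be classified at all.
label_arbitrary = classify_by_port(51413, 62000)

# Any application that tunnels over port 80 is labeled "HTTP" regardless of
# what it actually carries, which is the mislabeling drawback noted above.
label_tunneled = classify_by_port(40000, 80)
```

The lookup only reflects which port a flow uses, not what the flow contains, so both failure modes described above (arbitrary ports and reuse of standard ports) follow directly from the design.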