1. Field of Invention
The present invention relates to a network firewall apparatus that detects peer-to-peer (P2P) application network traffic from a source host on a network to destination hosts external to the computer network.
2. Discussion of Conventional Known Methods
According to Sydnor, Knight, and Hollaar “A Report to the USPTO from the Office of International Relations” p 47 Conclusions:
“Government and Corporate IT-Security Managers: For anyone concerned about protecting the security of sensitive data or the security of computer networks, questions about whether features that can cause users to share files unintentionally were intended to do so are largely irrelevant. In either case—and as DHS has acknowledged—filesharing programs present a tripartite threat to the security of data and networks.
Filesharing programs can cause inadvertent sharing that can compromise entire networks: In networked environments, the effects of the “features” discussed above can be particularly devastating. For example, on some networks, a user who tries to store downloaded files in a folder like “Documents and Settings” can end up “sharing” all files created by all users of the network. Even home use of Filesharing programs can compromise government or corporate networks: Usability and Privacy notes that if a home computer has a VPN connection to a corporate or governmental network, a user can inadvertently “share” the portion of the network available through the VPN connection.
Filesharing programs can infect computers or networks with malicious code: To avoid vicarious liability for pervasive infringing uses of their programs, distributors of file sharing programs stopped registering or uniquely identifying individual users of their programs. Distributors knew that this would encourage distributors of malicious code to use popular downloads as a means to compromise computers and networks: “As you would expect, when files often come from anonymous and uncertified sources, the risk of that file containing a virus greatly increases.” As a result, research by the security company TruSecure found that 45% of popular downloaded files concealed malicious code.
Filesharing programs can contain vulnerabilities that hackers can exploit to steal sensitive data: DHS warns that Filesharing programs “can result in network intrusions and the theft of sensitive data . . . . [F]ederal government organizations have discovered the presence of P2P software on compromised systems while investigating cyber intrusions.” McGill University warns that some Filesharing programs are developed by “ragtag teams following ad hoc plans, resulting in barely functional, extremely buggy clients that are prone to security breaches.” All three of these risks increase because Filesharing programs—unlike most others—often appear to be designed to go where they are not wanted and to evade the security measures that could exclude them.” . . . “There will almost never be a legitimate business or governmental justification for employee use of Filesharing programs. Nevertheless, preventing employees from using these programs on corporate or government networks can be both difficult and expensive.”
Peer-to-peer (P2P) applications are frequently considered unwelcome guests in a network because they consume bandwidth. Network administrators have an obligation to protect and manage their resources as well as to avoid liability for piracy or other damage to intellectual property rights such as copyright. In addition to security concerns, peer-to-peer applications have the potential to degrade quality of service for all users in a network. As noted above, unsophisticated users of peer-to-peer applications may be manipulated into inadvertently exposing personal or confidential information.
Conventional firewalls are used to prevent network intrusion and the inward movement of malware, and they are poorly suited to controlling the proliferation of peer-to-peer applications. A conventional firewall may block selected ports, specific IP addresses, or ranges of addresses. In practice, it depends on receiving blacklists of IP addresses or ports to identify a server hosting an objectionable application.
It is a characteristic of peer-to-peer (P2P) applications that they are designed to circumvent fixed barriers such as firewalls. There is no limit to the number of hosts employed for peer-to-peer applications, and such applications quickly proliferate among many hosts, so compiling a list of IP addresses to block would be futile. Ports may be pseudo-randomly selected from a large range, so blocking a specific port would not stop a peer-to-peer application.
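The weakness of fixed port and address lists can be illustrated with a minimal Python sketch. The rule values and the names `BLOCKED_PORTS`, `BLOCKED_ADDRESSES`, `static_filter_blocks`, and `fraction_blocked` are hypothetical and chosen only for illustration; they do not describe any particular firewall product.

```python
# Hypothetical static rules; the port and address values are illustrative only.
BLOCKED_PORTS = {6346, 6881}
BLOCKED_ADDRESSES = {"198.51.100.7"}


def static_filter_blocks(dst_ip: str, dst_port: int) -> bool:
    """Return True if a fixed port rule or address rule would drop the packet."""
    return dst_port in BLOCKED_PORTS or dst_ip in BLOCKED_ADDRESSES


def fraction_blocked(port_range: range) -> float:
    """Fraction of the ports in port_range covered by the static port list."""
    return sum(port in BLOCKED_PORTS for port in port_range) / len(port_range)
```

Under these assumed rules, a peer that moves from port 6881 to any other port in the range 1024 to 65535 passes the filter, and the static list covers only a tiny fraction of the ports available to it.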
Stacy 20050213570 discloses a method for filtering malicious data packets in Denial of Service attacks. In paragraphs [0009]-[0011] Stacy discloses “[0009] As used herein, a dataflow is a stream of data packets that is communicated from a source node to a destination node. . . . [0010] . . . The hash table is typically organized as a table of linked lists, where each list may be indexed by the result of applying a conventional hash function to “signature” information. In this context, a signature is a set of values that remain constant for every packet in a data flow. For example, assume each packet in a first data flow stores the same pair of source and destination IP address values. In this case, a signature for the first data flow may be generated based on the values of these source and destination IP addresses. Likewise, a different signature may be generated for a second data flow whose packets store a different set of source and destination IP addresses than packets in the first data flow. Of course, those skilled in the art will appreciate that a data flow's signature information is not limited to IP addresses and may include other information, such as TCP port numbers, IP version numbers and so forth.
Each linked list in the hash table contains one or more entries, and each linked-list entry stores information corresponding to a particular data flow. . . . ”
In paragraph [0058] Stacy discloses “ . . . For example, the signature information extracted by the engine 524 may include, among other things, source or destination TCP port numbers, source or destination IP addresses, protocol identifiers and so forth.” In paragraph [0059] “The extracted signature information is then input to a hash-entry address generator 530 in the flow classifier. The hash-entry address generator includes a hash-function unit 532 that applies a predetermined hash function to the received signature information, thereby generating an n-bit resultant hash value.”
In paragraph [0068] Stacy discloses “ . . . In operation, the linked-list walker 526 locates a linked list in the hash table 600 using the list pointer 630 contained in the hash-table entry 610 whose memory address was generated by the hash-entry address generator 530. Then, the linked-list walker sequentially traverses (“walks”) the list's linked-list entries 650 until it identifies a matching entry that contains the packet's signature information 652 or until the end of the list is reached.”
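The lookup described in the quoted passages can be approximated by the following Python sketch. It is a simplified illustration, not Stacy's implementation: Python lists stand in for linked lists, the built-in `hash` stands in for the predetermined hash function, and the names `N_BITS`, `bucket_index`, and `FlowTable` are assumptions introduced here.

```python
from typing import Optional

N_BITS = 4  # assumed width of the hash value, giving 2**4 buckets


def bucket_index(signature: tuple) -> int:
    """Reduce a flow signature to an n-bit index into the table."""
    return hash(signature) & ((1 << N_BITS) - 1)


class FlowTable:
    """Table of per-bucket entry lists, each entry keyed by a flow signature."""

    def __init__(self) -> None:
        self.buckets = [[] for _ in range(1 << N_BITS)]

    def lookup(self, signature: tuple) -> Optional[dict]:
        # Walk the bucket's entries until one matches or the list ends.
        for entry in self.buckets[bucket_index(signature)]:
            if entry["signature"] == signature:
                return entry
        return None

    def insert(self, signature: tuple) -> dict:
        entry = self.lookup(signature)
        if entry is None:
            entry = {"signature": signature, "packets": 0}
            self.buckets[bucket_index(signature)].append(entry)
        return entry
```

A lookup of this kind answers only whether a given signature is already present; it does not by itself relate one signature to another.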
In paragraph [0071] Stacy discloses “At step 732, a packet-identifier engine 522 in the flow classifier identifies the type of data packet 160 received at the network interface 210. At step 736, signature information is extracted from a predetermined set of fields in the packet's descriptors and headers, based on the identified packet type. For example, the signature information may include TCP port number, IP addresses, protocol versions and so forth. At step 740, the extracted signature information is forwarded to a hash-entry address generator 530, in which a hash-function unit 532 calculates a hash of the signature information, . . . The hash of the signature information is used to create an index in the hash table 600.” Thus it can be appreciated that Stacy's linked list does not enable counting the number of destination ports utilized for a single destination Internet Protocol (IP) address, since a hash lookup yields only a match or no match. Nor can Stacy's linked list reveal whether a peer-to-peer application source is trying to connect by sending to a large number of destination IP addresses. Thus it can be appreciated that what is needed is a way to determine that a peer-to-peer application is trying to connect by transmitting to a non-repeating series of destination IP addresses, or trying to evade detection by transmitting to a non-repeating series of destination ports after it has connected to a destination host.
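The two counts identified as missing above can be sketched in Python as follows. This is only an illustration of the bookkeeping involved, under the assumption that each observed packet supplies a source IP address, a destination IP address, and a destination port; the class name `FanOutCounter` and its methods are hypothetical.

```python
from collections import defaultdict


class FanOutCounter:
    """Tracks distinct destinations per source and distinct ports per destination."""

    def __init__(self) -> None:
        self._dst_ips = defaultdict(set)    # source IP -> destination IPs seen
        self._dst_ports = defaultdict(set)  # (source IP, destination IP) -> ports seen

    def observe(self, src_ip: str, dst_ip: str, dst_port: int) -> None:
        self._dst_ips[src_ip].add(dst_ip)
        self._dst_ports[(src_ip, dst_ip)].add(dst_port)

    def distinct_destinations(self, src_ip: str) -> int:
        """Number of different destination IP addresses the source has sent to."""
        return len(self._dst_ips.get(src_ip, ()))

    def distinct_ports(self, src_ip: str, dst_ip: str) -> int:
        """Number of different destination ports used toward one destination."""
        return len(self._dst_ports.get((src_ip, dst_ip), ()))
```

Because sets discard repeats, a source that keeps returning to the same destination and port leaves both counts unchanged, while a non-repeating series of addresses or ports makes them grow.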
Segel 20070133419 discloses in paragraph [0022] “The traffic flow controller may instead select a traffic congestion management function to be applied to all communication traffic of the communication traffic stream.” In paragraph [0027] “Determining may involve one or more of: processing the received communication traffic to determine its type, and determining whether the received communication traffic belongs to a communication traffic stream . . . ” In paragraph [0032] “The identifier of a communication traffic stream may include a source and a destination of the communication traffic stream.” In paragraph [0005] Segel discloses “ . . . examining the DiffServ Code Point field in the IP header of the packet.” In paragraph [0060] Segel discloses “The expression “traffic stream” as used herein may refer to a communication session between two end points . . . A stream may be identified by source and destination IP address . . . and also use . . . IP port and protocol to distinguish different type of traffic between session end points. The phase “5-tuple” (of IP source and destination address, source and destination port, and protocol) is one example of a stream identifier . . . ” None of Segel's disclosures would treat a source sending packets to many diverse non-standard ports at a destination as a single stream. Segel does not disclose measuring packets sent to diverse destination ports for a destination IP address as a means of traffic type determination. In paragraph [0107] Segel discloses “The congestion management method 40 begins at 42 when communication traffic is received for transfer . . . At 44, a type of the received communication traffic is determined.” Thus it appears that what is needed is an improved method of determining the type of communication traffic without examining every received packet.
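A short Python sketch shows why a 5-tuple stream identifier does not group port-hopping traffic. The type `FiveTuple` and the function `packets_per_stream` are hypothetical names introduced for this illustration and are not taken from the cited reference.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FiveTuple(NamedTuple):
    """Stream identifier: addresses, ports, and protocol."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def packets_per_stream(packets: Iterable[FiveTuple]) -> Counter:
    """Count packets under each distinct 5-tuple."""
    return Counter(packets)
```

When a source changes the destination port on every packet, each packet carries a different 5-tuple, so the traffic appears as many one-packet streams rather than as one stream whose type could be determined.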
Bhikkaji 20070094730 discloses in paragraph [0011] “a method . . . for preventing a worm attack in a network . . . by correlating the spread of IP addresses in a worm's randomly generated IP address space, along with the worm's packet signature, and a role reversal behavior. The role reversal behavior implies that the role of a port changes from initially being a target to being a propagator of the worm attack.” In paragraph [0014] “A plurality of Worm Attack Identification caches . . . stores packets with a set of characteristics . . . the communication protocol, the IP address of the source, the IP address of the destination, the port address of the source, and the port address of the destination of the packet.” In paragraph [0015] “a count . . . for the number of packets . . . originating from a similar source IP address and source and/or destination port within a predefined timeframe.” In paragraph [0016] “compares the number of packets originating from a similar IP source address with a predefined first threshold (T1). First comparison module also compares the number of packets originating from similar IP source address with a predefined second threshold (T2).” Both thresholds are compared against the same measure: the number of packets originating from a similar IP source address. In paragraphs [0022]-[0023] physical ports on access switches are disclosed. It is understood by those skilled in the art of Internet Protocol that the source and destination ports of IP packets are not physical ports. Bhikkaji does not disclose counting the number of destination ports utilized for each destination IP address. In paragraph [0042] Bhikkaji teaches away from addressing the problem by disclosing “the invention . . . can be tuned to determine if the role reversal is happening in a higher magnitude than is possible in a normal peer-to-peer application. This is necessitated in order to prevent any false-positives.” Thus it can be appreciated that what is needed is a method to detect a peer-to-peer application which is actively avoiding detection by hopping among many source or destination ports.
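The packet-count comparison discussed above can be sketched in Python as follows. The sketch assumes a sliding time window and two numeric thresholds; the names `SourcePacketCounter` and `exceeds_thresholds` are hypothetical, and the code is not drawn from the cited reference.

```python
from collections import defaultdict, deque
from typing import Tuple


class SourcePacketCounter:
    """Counts packets per source IP address within a sliding time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._times = defaultdict(deque)  # source IP -> recent packet timestamps

    def observe(self, src_ip: str, timestamp: float) -> int:
        """Record one packet and return the source's count within the window."""
        times = self._times[src_ip]
        times.append(timestamp)
        # Discard timestamps that have fallen out of the window.
        while timestamp - times[0] > self.window:
            times.popleft()
        return len(times)


def exceeds_thresholds(count: int, t1: int, t2: int) -> Tuple[bool, bool]:
    """Compare the same packet count against both thresholds."""
    return count > t1, count > t2
```

Since both comparisons use the packet count alone, a source sending all of its packets to one port and a source spreading the same number of packets across many ports produce identical results.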
Furlong 20060167915 discloses a method to scan every character of every packet's payload to find a pattern match. However, it would be impractical to scan every packet passing through a gateway to discover whether a peer-to-peer application was operating within a network. Furlong does not disclose a method to efficiently determine whether a source within a network is generating peer-to-peer network traffic at all, nor does it examine the IP headers of a packet to determine whether further analysis of the packet is desirable. Thus it can be appreciated that what is needed is a method to identify when pattern matching such as Furlong's is needed and to limit the number of packets that consume resources in such pattern matching.
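The idea of limiting payload scanning to selected packets can be illustrated with a minimal Python sketch. It assumes that a set of suspected source addresses has already been obtained by some header-based means, and it uses a simple byte-substring test in place of a full pattern matcher; the function name `scan_selected` and its parameters are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Packet = Tuple[str, bytes]  # (source IP address, payload)


def scan_selected(
    packets: Iterable[Packet], suspect_sources: Set[str], pattern: bytes
) -> Tuple[int, List[Packet]]:
    """Scan payloads only for packets whose source is already suspected.

    Returns the number of payloads scanned and the packets that matched.
    """
    scanned = 0
    matches = []
    for src_ip, payload in packets:
        if src_ip not in suspect_sources:
            continue  # header check only; the payload is never read
        scanned += 1
        if pattern in payload:
            matches.append((src_ip, payload))
    return scanned, matches
```

In this arrangement the cost of payload inspection grows with the traffic of the suspected sources rather than with all traffic crossing the gateway.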
Sebayashi 20070166051 discloses in paragraph [0002] “communication traffic matches predetermined conditions for detecting suspicious attacking packets is checked at a repeater device. When matching traffic is detected, the repeater device generates a signature indicating a transmission band restriction value of the detected suspicious attacking packet, sends the signature to an adjacent repeater, . . . and thereafter performs the process of restricting the transmission band of suspicious attacking packets identified by the signature.” In paragraph [0011] “a . . . unit that determines whether a number of packets that satisfy a condition of the signature received from the adjacent repeater device within a unit time exceeds a predetermined threshold . . . ” Yet Sebayashi fails to disclose any method of determining whether a packet satisfies a condition of the received signature. It is known that conventional protection against Denial of Service (DoS) or Distributed Denial of Service (DDoS) attacks expects many sources directing packets to one or a few destination hosts. A condition of the signature for a conventional DDoS defending system would therefore include a small number of destination hosts. Sebayashi does not disclose a condition of NOT satisfying a signature as controlling passage of a packet. Thus it can be appreciated that what is needed is a method to determine when a suspected peer-to-peer application host may be attempting to connect to any one of a very large number of destination hosts, each with a unique IP address.
Thus it can be appreciated that what is needed is a more flexible system to control traffic which adapts to the specific peer-to-peer traffic found in a local area network, which identifies potential sources of peer-to-peer traffic, which efficiently identifies attempts to connect peer-to-peer applications, and which disposes efficiently of packets suspected to contain peer-to-peer content.