Voice over Internet Protocol (VoIP) is a general term for a family of transmission technologies used to deliver voice communications over IP networks such as the Internet or other packet-switched networks. Other terms frequently encountered and synonymous with VoIP are IP telephony, Internet telephony, voice over broadband (VoBB), broadband telephony, and broadband phone.
Internet telephony refers to communications services—voice, facsimile, and/or voice-messaging applications—that are transported via the Internet, rather than the public switched telephone network (PSTN). The basic steps involved in originating an Internet telephone call include conversion of the analog voice signal to digital format and translation of the signal into Internet protocol (IP) packets for transmission over the Internet; the process is reversed at the receiving end.
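By way of illustration only, the basic steps described above (digitizing the voice signal, dividing it into packets, and reversing the process at the receiving end) may be sketched as follows. The 8 kHz sample rate, 16-bit samples, 20 ms frame size, and the function names `digitize`, `packetize`, and `reassemble` are assumptions chosen for the example; an actual system would use a codec and a transport protocol rather than raw samples in a Python list.

```python
import math
import struct

SAMPLE_RATE = 8000   # samples per second (assumed narrowband telephony rate)
FRAME_MS = 20        # milliseconds of audio carried per packet (assumed)


def digitize(samples):
    """Quantize floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    return b"".join(struct.pack("<h", int(round(s * 32767))) for s in clipped)


def packetize(pcm, frame_bytes):
    """Split PCM bytes into (sequence_number, payload) pairs."""
    return [
        (seq, pcm[offset:offset + frame_bytes])
        for seq, offset in enumerate(range(0, len(pcm), frame_bytes))
    ]


def reassemble(packets):
    """Receiving end: order packets by sequence number and join the payloads."""
    return b"".join(payload for _, payload in sorted(packets))


# 100 ms of a 440 Hz tone stands in for the sampled analog voice signal.
analog = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]
pcm = digitize(analog)
frame_bytes = SAMPLE_RATE * FRAME_MS // 1000 * 2   # 160 samples of 2 bytes each
packets = packetize(pcm, frame_bytes)
```

The sequence number lets the receiver restore the original order even if packets arrive out of order, which is the part of the "reversed" process that the sketch models.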
VoIP systems employ session control protocols, such as the Session Initiation Protocol (SIP), to control the set-up and tear-down of calls, as well as audio codecs, which encode speech so that it can be transmitted over an IP network as a digital audio stream. The advantage of VoIP is that a single network can be used to transmit data packets as well as voice and video packets between users, thereby greatly simplifying communications.
SIP is an open signaling protocol for establishing many kinds of real-time and near-real-time communication sessions, which may also be referred to as dialogs. Examples of the types of communication sessions that may be established using SIP include voice, video, and/or instant messaging. These communication sessions may be carried out on any type of communication device, such as a personal computer, laptop computer, telephone, cellular phone, Personal Digital Assistant, etc. One key feature of SIP is its ability to use an end-user's Address of Record (AOR) as a single unifying public address for all communications. Thus, in a world of SIP-enhanced communications, a user's AOR becomes the single address that links the user to all of the communication devices associated with the user. Using this AOR, a caller can reach any one of the user's communication devices, also referred to as User Agents (UAs), without having to know each of the unique device addresses or phone numbers.
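The relationship between an AOR and the User Agents associated with it may be pictured as a simple one-to-many lookup table. The following is a minimal sketch under stated assumptions: the `Registrar` class, its methods, and the example addresses are hypothetical and are not taken from any particular SIP implementation.

```python
from collections import defaultdict


class Registrar:
    """Hypothetical in-memory table mapping an AOR to its device contacts."""

    def __init__(self):
        self._bindings = defaultdict(list)

    def register(self, aor, contact):
        """Bind a device contact address to the user's AOR (no duplicates)."""
        if contact not in self._bindings[aor]:
            self._bindings[aor].append(contact)

    def lookup(self, aor):
        """Return every device contact currently bound to the AOR."""
        return list(self._bindings.get(aor, []))


registrar = Registrar()
# Example addresses only: one public AOR, two devices (User Agents).
registrar.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060")
registrar.register("sip:alice@example.com", "sip:alice@198.51.100.7:5060")

# A caller needs only the AOR; the table supplies the device addresses.
contacts = registrar.lookup("sip:alice@example.com")
```

The caller in this sketch never handles the individual device addresses, which mirrors the role of the AOR as the single public address for the user.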
Since untrusted networks may be used to carry packets of information (i.e., either voice or data information), packet filtering is often employed in enterprise networks to ensure that malicious packets do not enter the enterprise network from an untrusted network, thereby compromising network devices and the network itself. Packet filtering is also referred to as application filtering, since a particular application is usually employed to filter packets as they enter an enterprise network. Most methods of application filtering depend entirely on packet contents to apply filtering rules. Application filtering based on packet content alone poses several limitations, such as:
(1) Performance Overhead: Application filtering relies heavily on regular expression searches for deep packet inspection of the application payloads, which is very processor intensive. Also, applying application filtering blindly to all packets received at the enterprise network can cause significant performance overhead.
(2) Security Holes: Packet contents can be forged, which makes it possible for an attacker to exploit application filtering. As one example, packet contents can be forged to obtain whitelist treatment from application filtering when that packet should not have received such treatment.
(3) Inflexibility: Filtering rules have limited flexibility because packet contents usually carry limited information. For example, packet contents do not indicate whether a connection coming from an IP address is a multiplexed connection or a simplex connection, so it is not possible to effectively apply strong rate limit policies to simplex connections.
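The first two limitations listed above can be made concrete with a small sketch of a filter that decides solely on packet contents. The patterns, the `TrustedPhone` header value, and the `content_filter` function are hypothetical and chosen for illustration; they do not describe any particular product.

```python
import re

# Hypothetical rules: a whitelist keyed on a header value, and a blocklist.
WHITELIST_PATTERNS = [re.compile(rb"User-Agent: TrustedPhone/\d+\.\d+")]
BLOCK_PATTERNS = [re.compile(rb"<script", re.IGNORECASE)]


def content_filter(payload):
    """Decide on a packet using only its contents.

    Every call runs regular expression searches over the whole payload,
    which is the per-packet cost noted in limitation (1).
    """
    if any(p.search(payload) for p in WHITELIST_PATTERNS):
        return "allow"
    if any(p.search(payload) for p in BLOCK_PATTERNS):
        return "drop"
    return "inspect"


request_line = b"INVITE sip:bob@example.com SIP/2.0\r\n"
genuine = request_line + b"User-Agent: TrustedPhone/1.2\r\n\r\n"
malicious = request_line + b"User-Agent: Other/0.1\r\n\r\n<SCRIPT>bad</SCRIPT>"
# The attacker copies the whitelisted header into an otherwise malicious packet.
forged = request_line + b"User-Agent: TrustedPhone/1.2\r\n\r\n<SCRIPT>bad</SCRIPT>"

decisions = [content_filter(p) for p in (genuine, malicious, forged)]
```

Because the filter in this sketch has nothing to consult except the payload, the forged packet receives the same whitelist treatment as the genuine one, as described in limitation (2).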
There have been some suggestions to modify traditional application filtering mechanisms to consider trust scores or trust levels in addition to packet contents. The currently proposed modifications create filtering rules once the trust score is determined. Prior art suggests adding dynamic filtering rules based on an activity/trust level derived from packets/connections. Currently available approaches to implementing trust-based filtering rules are very inefficient. For instance, if a VoIP server supports connections to 10,000 telephones, currently available filtering mechanisms will end up creating 10,000 different filtering rules, or more if more than one filtering rule is applied to each telephone. The generation of these filtering rules significantly degrades filtering performance and makes trust-based application filtering rules unusable. Therefore, most enterprise networks are not able to employ currently available trust-based filtering solutions.
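The scaling problem described above can be illustrated with a short sketch that generates one filtering rule per telephone. The trust scores, the threshold of 50, the address range, and the linear rule scan are all assumptions made for the example; actual filtering mechanisms may organize and evaluate their rules differently.

```python
import ipaddress

TRUST_THRESHOLD = 50   # assumed cut-off between trusted and untrusted devices


def build_per_device_rules(devices, rules_per_device=1):
    """Create rules_per_device rules for every (ip, trust score) entry."""
    table = []
    for ip, trust in devices.items():
        action = "allow" if trust >= TRUST_THRESHOLD else "rate_limit"
        for _ in range(rules_per_device):
            table.append((ip, action))
    return table


def match(table, src_ip):
    """Scan the rules in order; return the action and the comparisons made."""
    for comparisons, (ip, action) in enumerate(table, start=1):
        if ip == src_ip:
            return action, comparisons
    return "drop", len(table)


# 10,000 telephones with made-up addresses and made-up trust scores.
base = ipaddress.ip_address("10.0.0.0")
devices = {str(base + i): i % 100 for i in range(10_000)}

table = build_per_device_rules(devices)
last_ip = str(base + 9_999)
action, comparisons = match(table, last_ip)
```

Under these assumptions the rule table grows in direct proportion to the number of telephones, and a packet from the last telephone is compared against every rule before a match is found.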