Wireless mesh networks typically support multiple applications, including video, voice and data. These applications have different traffic characteristics and different requirements. For example, voice traffic is sensitive to delay, jitter and packet loss. Long packet delays, or high levels of packet loss or jitter, can adversely affect the end-user perception of voice quality. High levels of packet loss or high latencies can also result in calls being dropped, which is undesirable.
For this reason, it is frequently desirable to assign different levels of priority to different classes of applications and to treat these classes differently within the wireless mesh network. Various mechanisms exist to prioritize voice or video traffic differently from other traffic. Additionally, several standards, including IETF DiffServ and IEEE 802.11e, have been developed to mark packets at Layer 3 and Layer 2, respectively, to distinguish between different classes of traffic so that they can be treated differently as they traverse the infrastructure.
For example, a device conforming to the IETF DiffServ standards applies DSCP tags to its IP packets to allow network infrastructure to identify the class of traffic being carried in the packet. A device conforming to the IEEE 802.11e standard applies 802.11e priority tags to the Layer 2 frames carrying its packets to allow network infrastructure to identify the class of traffic being carried in the frame.
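As a rough illustration of how infrastructure might read such a tag, the following Python sketch extracts the DSCP value from a raw IPv4 header and maps it to a traffic class. The function names and the class labels are hypothetical and are not taken from any particular implementation. Only the position of the DSCP bits (the upper six bits of the second header byte) and the code point values come from the DiffServ specifications. The mapping of the AF4x code points to "video" is an assumed policy, not a requirement of the standard.

```python
# Sketch only: classify an IPv4 packet by its DSCP tag.
# Code point values are from the DiffServ RFCs; the class labels and the
# AF4x-to-video mapping are illustrative assumptions.

DSCP_EF = 46               # Expedited Forwarding, commonly used for voice
DSCP_AF4X = {34, 36, 38}   # AF41, AF42, AF43


def dscp_from_ipv4_header(header: bytes) -> int:
    """Return the 6-bit DSCP value from a raw IPv4 header."""
    if len(header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    if header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Byte 1 is the DS field: upper 6 bits are DSCP, lower 2 bits are ECN.
    return header[1] >> 2


def classify_by_dscp(header: bytes) -> str:
    """Map the DSCP tag to a hypothetical traffic class label."""
    dscp = dscp_from_ipv4_header(header)
    if dscp == DSCP_EF:
        return "voice"
    if dscp in DSCP_AF4X:
        return "video"
    return "best-effort"


if __name__ == "__main__":
    # Minimal 20-byte header: version 4, IHL 5, DS field 0xB8 (DSCP 46).
    sample = bytes([0x45, 0xB8]) + bytes(18)
    print(classify_by_dscp(sample))
```

A classifier of this kind is only as reliable as the tags it reads: an untagged or mis-tagged voice packet falls through to the default class.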
However, adoption of these standards is not uniform. Several handsets and other mobile devices do not tag packets consistently or correctly, making it difficult for the network infrastructure to accurately identify the type of traffic and to apply traffic prioritization rules based on the classification.
Further complicating the classification problem is the fact that many VoIP applications and implementations do not communicate over well-known ports but instead dynamically determine which ports to use as part of the call setup procedure. Some network infrastructure devices perform stateful packet inspection of packet flows, snooping the initial control protocol exchanges to identify the ports to be used during the call. While this approach works well for many well-known VoIP implementations, its utility is limited because several VoIP implementations encrypt control protocol exchanges, making them harder to snoop. In addition, voice implementations such as UMA carry phone calls over an IPsec VPN, making it hard or impossible to snoop the initial call-setup messages. Furthermore, stateful packet inspection is processing-intensive and may be hard to accomplish on processor-constrained systems.
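The port-snooping approach described above can be sketched as follows, assuming for illustration that the control protocol carries an SDP session description in clear text (as SIP commonly does). The text above does not name a specific control protocol, so the choice of SDP, the function name and the sample message are assumptions. The sketch also shows why the approach fails when the exchange is encrypted: no media line can be found in the payload.

```python
# Sketch only: learn the negotiated audio port by snooping a clear-text
# SDP body. SDP is an assumed example of a call-setup exchange.
import re
from typing import List

# An SDP media description for audio starts with "m=audio <port>".
_AUDIO_MEDIA_LINE = re.compile(r"^m=audio (\d+)", re.MULTILINE)


def audio_ports_from_sdp(payload: str) -> List[int]:
    """Return the audio ports announced in an SDP body, if any are visible."""
    return [int(m.group(1)) for m in _AUDIO_MEDIA_LINE.finditer(payload)]


if __name__ == "__main__":
    clear_text = (
        "v=0\r\n"
        "o=alice 2890844526 2890844526 IN IP4 host.example.com\r\n"
        "s=-\r\n"
        "c=IN IP4 192.0.2.10\r\n"
        "t=0 0\r\n"
        "m=audio 49170 RTP/AVP 0\r\n"
        "a=rtpmap:0 PCMU/8000\r\n"
    )
    print(audio_ports_from_sdp(clear_text))

    # Stand-in for an encrypted payload: nothing recognizable to snoop.
    opaque = "\x17\x03\x03\x00\x2a\x8f\x11\xc4"
    print(audio_ports_from_sdp(opaque))
```

A real inspector would also have to track each flow's state and expire learned ports when the call ends, which contributes to the processing cost noted above.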
It is desirable to have a system for reliably classifying voice calls over IP without relying on accurate tagging by the endpoints or on the ability to snoop call-setup exchanges.