1. Field of the Invention
The present invention relates to methods for Real-time Transport Protocol (RTP) packet authentication on a packet data network. In particular, the invention relates to methods for preventing toll fraud, privacy compromise, voice quality degradation, or denial of service (DoS) on Voice over IP networks.
2. Description of Background Art
Telephony via Voice over IP (VoIP) offers tremendous potential in rich features and cost savings. However, leveraging data networks and their corresponding communication protocols also carries attendant security vulnerabilities. Furthermore, the VoIP protocols for signaling and media transport themselves present additional vulnerabilities that might lead to toll fraud, privacy compromise, voice quality degradation, or denial of service (DoS).
In particular, the Real-time Transport Protocol (RTP) used as the basis for media transport is susceptible to several attacks, including third-party snooping of private conversations, injection of forged content, and introduction or modification of packets to degrade voice quality.
A description of RTP can be found in H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A Transport Protocol for Real-Time Applications,” IETF RFC 3550, July 2003, http://www.ietf.org/rfc/rfc3550.txt?number=3550.
In response to concerns about these RTP vulnerabilities, the Internet Engineering Task Force (IETF) Audio/Video Transport Working Group proposed the Secure Real-time Transport Protocol (SRTP), which provides confidentiality, message authentication, and replay protection for RTP traffic. Its companion protocol, Secure RTCP (SRTCP), provides similar security for RTP Control Protocol (RTCP) traffic.
SRTP specifies Advanced Encryption Standard (AES) encryption of the RTP payload and, to achieve enhanced security, a message authentication hash of the header and the encrypted payload using Keyed-Hashing for Message Authentication (HMAC-SHA1). Avaya products support this newer AES encryption as well as the earlier implemented Avaya Encryption Algorithm (AEA) encryption. HMAC-SHA1 is the default message authentication code, and its implementation is mandatory. However, optional message authentication codes are permitted. HMAC-SHA1 produces a 160-bit digest, and it is recommended that no more than 80 bits be truncated from the least significant end. However, 128 bits can be truncated (resulting in a 32-bit authentication tag) for bandwidth efficiency if the following conditions are met: (1) the RTP payload is stateless, (2) an attacker is unlikely to be able to intelligently modify the SRTP ciphertext, and (3) no data forwarding or access control decisions are made based on the RTP data.
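The truncation described above can be illustrated with a minimal sketch. It assumes the tag is simply the leading bytes of the HMAC-SHA1 digest over the header and encrypted payload; it omits other SRTP details (such as the rollover counter that SRTP includes in the authenticated data), and the function name, key, and packet contents are placeholders introduced only for illustration.

```python
import hashlib
import hmac


def auth_tag(auth_key: bytes, authenticated_portion: bytes, tag_bits: int = 80) -> bytes:
    """Return the leading tag_bits of the 160-bit HMAC-SHA1 digest (hypothetical helper)."""
    if tag_bits % 8 != 0 or not 0 < tag_bits <= 160:
        raise ValueError("tag_bits must be a multiple of 8 between 8 and 160")
    digest = hmac.new(auth_key, authenticated_portion, hashlib.sha1).digest()  # 20 bytes
    # Truncating from the least significant end keeps the leading bytes.
    return digest[: tag_bits // 8]


key = bytes(range(20))             # placeholder key, not a real SRTP session key
packet = bytes(12) + bytes(160)    # placeholder 12-byte header + 160-byte payload
tag80 = auth_tag(key, packet, 80)  # 10-byte tag (80 bits truncated)
tag32 = auth_tag(key, packet, 32)  # 4-byte tag (128 bits truncated)
```

Both tags come from the same digest, so the 32-bit tag is a prefix of the 80-bit tag; the shorter tag saves 6 bytes per packet at the cost of weaker authentication.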
Further descriptions of SRTP, AES, HMAC-SHA1, and AEA can be found, respectively, in M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K. Norrman, “The Secure Real-time Transport Protocol,” IETF RFC 3711, March 2004, http://www.ietf.org/rfc/rfc3711.txt?number=3711; National Institute for Standards and Technology (NIST), “Advanced Encryption Standard (AES),” FIPS Pub 197, http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf; H. Krawczyk, M. Bellare, and R. Canetti, “HMAC: Keyed-Hashing for Message Authentication,” IETF RFC 2104, February 1997, http://www.ietf.org/rfc/rfc2104.txt?number=2104; and R. Gilman, “An Efficient Encryption Algorithm for Audio Version 1.2,” Avaya COMPAS document 93923, Jul. 3, 2002.
One potential concern for SRTP is the overhead imposed by the authentication hash. As seen in FIG. 1, the entire header and encrypted payload are hashed via the HMAC-SHA1 algorithm to produce an 80-bit or 32-bit message authentication tag. The tag must be computed so that each packet can be authenticated.
Thus, one possible denial of service attack is to bombard a target with a series of forged packets, each of which contains an improper authentication tag.
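The cost of such an attack to the receiver can be sketched as follows. This is an illustrative model only, not an SRTP implementation: the `Receiver` class, its counter, and the packet layout (body followed by a 10-byte tag) are assumptions made for the example. The point it shows is that a forged packet is rejected only after a full HMAC-SHA1 computation has been performed on it.

```python
import hashlib
import hmac


class Receiver:
    """Hypothetical receiver that counts how often it must run HMAC-SHA1."""

    def __init__(self, auth_key: bytes, tag_len: int = 10):
        self.auth_key = auth_key
        self.tag_len = tag_len
        self.hmac_computations = 0

    def verify(self, packet: bytes) -> bool:
        if len(packet) <= self.tag_len:
            return False  # too short to carry a tag; rejected without hashing
        body, received_tag = packet[: -self.tag_len], packet[-self.tag_len :]
        self.hmac_computations += 1  # the full hash is paid even for forgeries
        expected = hmac.new(self.auth_key, body, hashlib.sha1).digest()[: self.tag_len]
        return hmac.compare_digest(expected, received_tag)


receiver = Receiver(b"k" * 20)  # placeholder key
# Five forged packets: 172 placeholder bytes followed by an arbitrary 10-byte tag.
forged = [bytes(172) + bytes([i]) * 10 for i in range(5)]
accepted = [receiver.verify(p) for p in forged]
```

Every forged packet is rejected, yet `receiver.hmac_computations` grows by one for each of them, which is the processor load the attacker is trying to create.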
FIG. 2 shows the steps for computing an authentication tag using HMAC-SHA1. The main computation consists of hashing N=L+3 blocks of 512 bits, where L=ceil(M/64) and M is the number of bytes in the RTP header and payload covered by the authentication tag. The full computation also includes a few XOR, padding, and copying operations, but the total execution time depends heavily on the hashing operations. The main processing for one SHA-1 hash consists of four rounds of 20 steps each. These four rounds contain a total of 740 32-bit logical and arithmetic operations (AND, OR, NOT, XOR, modulo addition). In addition to the hash operations, the two 512-bit XOR operations together require 32 32-bit logical XOR operations.
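The block count can be computed directly from the formula above. The sketch below implements only that formula, N=L+3 with L=ceil(M/64), and does not separately model SHA-1 padding; the function name and the 172-byte example size are chosen for illustration.

```python
import math

SHA1_BLOCK_BYTES = 64  # one 512-bit block


def hmac_sha1_blocks(m_bytes: int) -> int:
    """Number of 512-bit blocks hashed, N = L + 3 with L = ceil(M / 64)."""
    l_blocks = math.ceil(m_bytes / SHA1_BLOCK_BYTES)
    return l_blocks + 3


# Example: a packet with 172 authenticated bytes gives L = 3 and N = 6.
blocks_for_172 = hmac_sha1_blocks(172)
```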
However, these two XOR operations can be optimized away, an optimization that is most beneficial for small values of M. Each XOR operation in FIG. 2 is followed by a hash operation. The result of each XOR and hash operation can be precomputed and used to replace the original initialization vector (IV) corresponding to that hash operation. These two values need to be calculated only once because the Key and ipad/opad values never change after the initial SRTP exchange of keys. Thus, two hashes can be avoided in the steady state, which reduces the computational overhead to hashing L+1 blocks. For G.711 speech coding at 8000 Hz, each RTP packet payload contains 160 8-bit samples, or 160 bytes. The RTP header is 12 bytes long, assuming no CSRC fields. Thus, each RTP packet contains 172 bytes, which means L=ceil(172/64)=3 and N=L+3=6. With the optimization described above, N=L+1=4.
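The precomputation can be sketched with Python's standard `hashlib`, whose hash objects can be copied after absorbing data. This is a minimal illustration under that assumption, not the implementation the text describes: the class name is hypothetical, and the saved hash states play the role of the replacement IVs.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-1 block size in bytes


class PrecomputedHmacSha1:
    """Hypothetical helper that hashes the key pads once and reuses the states."""

    def __init__(self, key: bytes):
        if len(key) > BLOCK:
            key = hashlib.sha1(key).digest()  # long keys are hashed first, per HMAC
        padded = key.ljust(BLOCK, b"\x00")
        # One-time work: absorb (key XOR ipad) and (key XOR opad).
        self._inner = hashlib.sha1(bytes(b ^ 0x36 for b in padded))
        self._outer = hashlib.sha1(bytes(b ^ 0x5C for b in padded))

    def digest(self, message: bytes) -> bytes:
        inner = self._inner.copy()    # start from the precomputed inner state
        inner.update(message)
        outer = self._outer.copy()    # start from the precomputed outer state
        outer.update(inner.digest())
        return outer.digest()


key = bytes(range(16))   # placeholder key
packet = bytes(172)      # placeholder 172-byte header plus payload
mac = PrecomputedHmacSha1(key)
same = mac.digest(packet) == hmac.new(key, packet, hashlib.sha1).digest()
```

The result matches the standard `hmac` module, while the per-packet work no longer repeats the two key-pad blocks.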
Without this optimization, the authentication for each SRTP packet requires 6 SHA-1 hash operations, which equates to approximately 740*6=4440 logical operations. At one operation per clock cycle, this takes about 74 µs on a 60 MHz processor, not counting control flow instructions. An attacker can take advantage of this computational overhead by bombarding a victim with forged packets that invoke the authentication process with the sole intent of consuming processor cycles.
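The per-packet cost can be tallied with a short calculation. The sketch below is illustrative only: the 740 operations per hashed block is the figure quoted in the text, while the rate of one operation per clock cycle is an assumption, and the function names are hypothetical.

```python
OPS_PER_SHA1_BLOCK = 740  # 32-bit operations per hashed block, as quoted in the text


def estimate_ops(hash_blocks: int) -> int:
    """Logical and arithmetic operations needed to hash the given number of blocks."""
    return OPS_PER_SHA1_BLOCK * hash_blocks


def estimate_seconds(ops: int, clock_hz: float = 60e6, ops_per_cycle: float = 1.0) -> float:
    """Rough execution time, assuming a fixed number of operations per clock cycle."""
    return ops / (clock_hz * ops_per_cycle)


ops_unoptimized = estimate_ops(6)  # N = 6 blocks per packet
ops_optimized = estimate_ops(4)    # N = 4 blocks with precomputed key pads
seconds_unoptimized = estimate_seconds(ops_unoptimized)
```

Multiplying the per-packet time by the rate of incoming forged packets gives a rough estimate of the share of the processor an attacker can consume.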