Voice over Internet Protocol (VoIP) was developed as a feature in the original Request for Comments (RFC) document that defined the Internet Protocol in the early 1980s (IETF RFC 760: Information Sciences Institute at University of Southern California, “DoD Standard Internet Protocol,” January 1980). Early versions of VoIP used relatively inferior voice compression technologies and suffered from many of the same network problems we face today, such as Quality of Service (QoS) limitations, jitter, dropped calls, latency, and bandwidth constraints. While many of these problems have been addressed in more recent networking technology, security has largely remained an afterthought for VoIP.
There are currently two standards that define voice services over the Internet. The first is H.323, which is standardized by the ITU-T (ITU-T H.323 Standard, “Packet-based multimedia communications systems”, June 2006). The second is the Session Initiation Protocol (SIP), which was standardized by the IETF as RFC 3261 (IETF RFC 3261: J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, “SIP: Session Initiation Protocol”, June 2002). SIP was adopted by the 3GPP in 2001. For security, H.323 requires the use of H.235, while SIP has remained largely open, allowing VoIP implementers to use cryptographic methods such as SRTP, IPSec, and custom schemes (ITU-T H.235.1 Standard, “H.323 security framework: Baseline security profile,” September 2005; IETF RFC 3711: M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K. Norrman, “The Secure Real-time Transport Protocol (SRTP),” March 2004). Additionally, SIP uses a text-based format, which makes SIP implementations easy to understand, debug, and modify to meet customers' needs.
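To illustrate why a text-based protocol is easy to inspect and debug, the following is a minimal sketch of splitting a SIP request into its parts. The message is a hypothetical INVITE with made-up addresses and identifiers, and the helper name `parse_sip_request` is an assumption for illustration. The sketch ignores repeated headers, compact header forms, and line folding, all of which a real parser would need to handle.

```python
def parse_sip_request(message):
    """Split a SIP request into method, URI, version, headers, and body."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line is the request line: METHOD SP Request-URI SP SIP-Version.
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as URIs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, uri, version, headers, body


# Hypothetical request; every address and identifier here is invented.
EXAMPLE_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.org;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@example.com>\r\n"
    "From: Alice <sip:alice@example.org>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@client.example.org\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

method, uri, version, headers, body = parse_sip_request(EXAMPLE_INVITE)
```

Because every field is readable text, a captured message can be checked by eye or with a few lines of string handling, which is the practical advantage the paragraph above describes.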
In speech applications, a CODEC's performance is measured by its Mean Opinion Score (MOS). The MOS method averages the subjective ratings given by a panel of listeners. This method does not always produce the same result when repeated, but it can distinguish the performance of different CODECs. To increase the repeatability of MOS measurement, the ITU-T issued several standards, the most recent being ITU-T P.862, the perceptual evaluation of speech quality (PESQ) (ITU-T P.862, Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs, Amendment 2, November 2005). While originally developed for narrowband speech, it has been extended to wideband speech in P.862.2. In addition to the MOS measurements, to account for other factors inherent to VoIP, the ITU-T developed G.107, the E-model, which takes into account impairments from various sources such as delay (ITU-T G.107: The E-model, a computational model for use in transmission planning, August 2008). The E-model yields a better prediction of quality of service (QoS) for VoIP.
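The two quality measures above can be sketched as follows. The first function averages listener ratings on the usual five-point scale (1 = bad, 5 = excellent). The second applies the standard mapping from the E-model transmission rating factor R to an estimated MOS. The function names are assumptions for illustration, and computing R itself from delay and other impairments is outside the scope of this sketch.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a panel's ratings on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


def r_factor_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping used for 0 < R < 100.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


panel_mos = mean_opinion_score([4, 3, 5, 4])
estimated_mos = r_factor_to_mos(93.2)  # approximately 4.41
```

The subjective average depends on who is listening, which is why repeated panels can disagree, whereas the R-to-MOS mapping is deterministic for a given set of planning parameters.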
Narrowband speech is dominated by three CODECs in industry: the adaptive multi-rate (AMR) codec for GSM, the EVRC-B codec from Qualcomm for 1xEVDO, and Speex for open source VoIP applications (3GPP TS 26.104: ANSI-C code for the floating-point Adaptive Multi-Rate (AMR) speech codec; 3GPP2 TSG-C C.R0018-C v1.0: Software Distribution for Enhanced Variable Rate Codec (EVRC), Speech Service Options 3, 68, and 70, Minimum Performance Specification, January 2008; Speex: a free codec for free speech). AMR implements a discontinuous transmission method to achieve variable rate transmission. Any additional rate adjustment must be made before the encoder is invoked. The EVRC-B codec likewise achieves variable rate transmission by using a discontinuous transmission method. The Speex codec takes a different approach: it bases its quantization on the actual speech, which allows it to achieve better performance in variable data rate (VDR) applications. The new ITU-T standard G.729.1 describes a scalable variable data rate codec that is compatible with G.729 but can assist Quality of Service by allowing the bit rate to be adapted at intermediate nodes after transmission (S. Ragot, et al., “ITU-T G.729.1: An 8-32 kbit/s scalable coder interoperable with G.729 for wideband telephony and voice over IP,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 529-532, Apr. 15-20, 2007).
Variable data rate voice compression technology is dominated by several techniques. The first technique relies on voice activity detection (VAD) and is employed in modern cellular and VoIP systems. The main voice compression CODECs used in these systems are the AMR, EVRC-B, and Speex codecs discussed above.
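As a rough illustration of how voice activity detection combined with discontinuous transmission lowers the average bit rate, the sketch below marks frames as active when their energy exceeds a threshold and then averages the bits sent per frame. This is not the detector of any of the codecs named above. The energy threshold, the frame sizes, and the function names are all assumptions chosen for illustration, and real schemes typically also send occasional comfort-noise updates, which the `silence_bits` parameter can only approximate.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each complete frame of samples."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def vad_flags(samples, frame_len, threshold):
    """True for frames whose energy exceeds the (hypothetical) threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


def average_bit_rate(flags, active_bits, silence_bits, frame_ms):
    """Average bit rate in bit/s when silent frames carry fewer bits."""
    if not flags:
        return 0.0
    total_bits = sum(active_bits if f else silence_bits for f in flags)
    return total_bits * 1000.0 / (len(flags) * frame_ms)


# Toy signal: alternating silent and loud 160-sample frames (20 ms at 8 kHz).
samples = [0] * 160 + [1000] * 160 + [0] * 160 + [1000] * 160
flags = vad_flags(samples, frame_len=160, threshold=100.0)

# Illustrative numbers: 244 bits per active 20 ms frame (12.2 kbit/s when
# every frame is sent) and nothing at all during silence.
rate = average_bit_rate(flags, active_bits=244, silence_bits=0, frame_ms=20)
```

With half of the frames silent, the average rate in this toy case is half of the full rate, which is the basic source of the bandwidth savings attributed to discontinuous transmission.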
Variable data rate communication is relatively new for sensor technology and voice communication. Until recently, the majority of communication systems were designed for fixed bandwidth applications. Migrating to modern variable data rate communication systems has improved the signal-to-noise ratio (SNR) of signals and the Mean Opinion Score (MOS), decreased the outage probability, and increased the channel capacity of communication links and networks.
Sensor networks are becoming commonplace as their cost and power requirements decrease. These networks allow multiple types of information to be transmitted at various transmission rates. Newer systems allow feedback that can increase the efficiency of the system. One example is smart watering systems, which make more efficient use of water resources by reducing runoff and protecting against both over-watering and under-watering.
Recently, there have been several efforts to implement security for VoIP, but none of these methods implements security efficiently. They increase bandwidth by applying security as a blanket, without knowledge of the underlying data being transmitted. Our goal is to develop security methodologies for VoIP that take into consideration the limited available bandwidth of narrowband network technologies.
The primary challenge in implementing Secure Variable Data Rate (SVDR) communication is minimizing the overhead that security adds to variable data rate digital communications. Traditional techniques for implementing secure digital communications pad the data of size (l) with padding of size (p) for encryption and transmit the entire encrypted data of size (l+p), with additional overhead due to the Medium Access Control (MAC) header, Internet Protocol (IP) packet header, User Datagram Protocol (UDP) packet header, and optional Real-time Transport Protocol (RTP) packet header. Newer secure streaming media methods such as the Secure Real-time Transport Protocol (SRTP) use the RTP header to determine the initialization vector for decryption in segmented integer counter mode or f8 mode.
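The overhead accounting described above can be sketched as a short calculation. The IPv4 (without options), UDP, and fixed RTP header sizes are the standard minimums. The MAC header size, the 16-byte cipher block, the rule of padding only up to the next block boundary, and the function names are assumptions made for illustration; some padding schemes add a full block when the data is already aligned.

```python
IPV4_HEADER = 20         # bytes, IPv4 header without options
UDP_HEADER = 8           # bytes
RTP_HEADER = 12          # bytes, fixed RTP header without CSRC entries
ASSUMED_MAC_HEADER = 14  # bytes, assumption: Ethernet II header, no trailer


def padding_length(l, block_size=16):
    """Smallest padding p that brings l bytes up to a block boundary."""
    return (-l) % block_size


def packet_size(l, block_size=16, mac_header=ASSUMED_MAC_HEADER, use_rtp=True):
    """Total bytes on the wire: l + p plus the protocol headers."""
    p = padding_length(l, block_size)
    headers = mac_header + IPV4_HEADER + UDP_HEADER
    if use_rtp:
        headers += RTP_HEADER
    return l + p + headers


def efficiency(l, **kwargs):
    """Fraction of the transmitted bytes that is original data."""
    return l / packet_size(l, **kwargs)


# Hypothetical frame sizes: a 31-byte active frame and a 6-byte low-rate frame.
full_frame_total = packet_size(31)   # 31 + 1 + 54 = 86 bytes
small_frame_total = packet_size(6)   # 6 + 10 + 54 = 70 bytes
```

In this toy calculation the small frame shrinks the payload by 25 bytes but the packet by only 16, so the fixed headers and padding dominate at low data rates, which is the overhead problem the paragraph above identifies.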
The present invention focuses on improving the bandwidth efficiency of secure variable data rate communication. While there exist several ways to implement secure digital communication, and several ways to implement secure variable data rate digital communication, these approaches leave bandwidth efficiency to be gained. What is needed is a systematic method for implementing secure variable data rate digital communications that reduces bandwidth overhead.