1. Field of the Invention
The present invention relates to handling Real-time Transport Protocol (RTP) media packets in a Voice over Internet Protocol (VoIP) terminal, and more particularly, to an apparatus and method of handling RTP media packets in a VoIP terminal that allow network resources to be used more efficiently when voice packets are handled and transmitted to a correspondent by either a wired VoIP terminal or a Voice over Wireless LAN (VoWLAN) terminal that combines Wireless LAN (WLAN) and VoIP technology.
2. Description of the Related Art
In delivering video, voice, and facsimile messages over the Internet, a Voice over Internet Protocol (VoIP) system transmits real-time media such as voice and video. A user gains access to the Internet by using a Personal Computer (PC), by using any independent device that implements the Internet Protocol, or by placing a call to a gateway from an existing Public Switched Telephone Network (PSTN) phone.
The VoIP system is used because it has the following advantages.
First, integration of a telephone network and a data network reduces investment cost for network equipment. Because a telephone network for voice communication and a data network for data communication need not be deployed separately, the investment cost for network equipment is saved. Second, the integrated network reduces management cost and improves efficiency. By handling data and voice with one network, VoIP lowers management cost and improves efficiency, unlike an existing arrangement in which data and voice are handled by distinct networks. Third, VoIP works easily with Internet-based multimedia services. Using the same network for voice and data makes it possible to provide a number of additional services, such as video conferencing, which are difficult to provide with conventional circuit-switched telephones.
To provide the VoIP service, a means of discovering a correspondent and signaling to that correspondent is needed. VoIP signaling protocols include H.323 of ITU-T and Session Initiation Protocol (SIP) of IETF.
A number of H.323-based VoIP services have been developed. SIP, however, is easy to parse and compose and provides excellent extensibility. Further, SIP is text-based and thus is easy to implement, unlike H.323.
VoIP end-point devices, e.g., gateways, IP phones, PCs, and the like, perform voice communication by continuously transmitting and receiving packetized voice as RTP packets between a sender and a recipient over an IP network. However, this continuous RTP packet transmission and reception places a traffic load on the IP network and degrades the overall performance of VoIP equipment, because every RTP packet must be processed at a socket interface of a media processor.
The RTP provides an end-to-end transmission service in which real-time data such as audio and video is forwarded over a multicast or unicast network. The RTP has no concept of a connection. Typically, the RTP runs on top of the User Datagram Protocol (UDP) and utilizes the multiplexing and checksum services of the UDP.
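As an illustration of what each RTP packet carries on top of UDP, the following minimal sketch packs and unpacks the 12-byte fixed RTP header (version 2, with no CSRC list and no header extension). The function names and the sample field values are hypothetical and chosen only for this example; the sketch is not the terminal's implementation.

```python
import struct

RTP_VERSION = 2  # version field value used by current RTP


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension, or CSRC)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    # Network byte order: two bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Parse the fixed RTP header from the first 12 bytes of a datagram."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical values: payload type 0, first packet of a stream
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234ABCD)
fields = unpack_rtp_header(header + b"voice-payload-bytes")
```

The resulting header would be prepended to the voice payload and handed to a UDP socket; sequence numbers and timestamps let the recipient reorder packets and reconstruct timing, since the RTP itself keeps no connection state.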
In addition to the typical wired VoIP system, Voice over Wireless LAN (VoWLAN) technology, in which voice is forwarded over a widely deployed WLAN, has recently emerged as a new mobile telephone technology. This is because the VoWLAN realizes lower fees and greater convenience by adding mobility to a wired Internet telephone, i.e., a VoIP telephone.
The VoWLAN forwards voice over a wireless LAN network. In other words, the VoWLAN uses the wireless LAN as a medium, unlike an existing Internet phone working on a wired network.
The VoWLAN provides convenient voice communication by guaranteeing mobility within the coverage area of an Access Point (AP). Further, using an existing network considerably reduces telephony cost as compared with service from a telephone circuit provider. In particular, the VoWLAN enables a video telephone service desired by customers, and therefore, is advantageous for future customer services.
In VoIP, since all voice data is formed into RTP packets and is continuously forwarded over a data network, a predetermined network bandwidth is required to perform smooth communication.
To use network bandwidth efficiently in the VoIP system, two methods are available: a method that exploits silence, which is a characteristic of voice conversation, and an RTP multi-framing method in which several units of voice data are multiplexed into one RTP packet.
The silence-based method utilizes a silence processing scheme, such as silence suppression or Voice Activity Detection (VAD)/Comfort Noise Generation (CNG). Typically, a VoIP Digital Signal Processor (DSP) has a VAD/CNG function. When this function detects silence, it generates a smaller silence payload indicating the silence instead of a normal voice payload. The silence payload is sent to the correspondent via the RTP, and locally generated noise is played to the correspondent during the silence period, such that network bandwidth is saved and smooth communication is maintained.
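The silence processing described above can be sketched as follows. This is a simplified illustration only: it assumes a plain energy threshold as the voice activity detector, raw 16-bit PCM as the voice payload, and a one-byte noise-level descriptor as the silence payload. Real VoIP DSPs use codec-specific VAD/CNG algorithms, and the names `frame_energy`, `encode_frame`, and `ENERGY_THRESHOLD` are hypothetical.

```python
import struct

ENERGY_THRESHOLD = 1000.0  # assumed tuning value, not taken from any codec


def frame_energy(samples):
    """Mean squared amplitude of one frame of 16-bit PCM samples."""
    return sum(s * s for s in samples) / len(samples)


def encode_frame(samples, threshold=ENERGY_THRESHOLD):
    """Return (kind, payload): a full voice payload or a small silence payload."""
    energy = frame_energy(samples)
    if energy < threshold:
        # Silence: send only a one-byte noise level so that the receiving
        # side can generate comfort noise locally.
        noise_level = min(127, int(energy ** 0.5))
        return "silence", bytes([noise_level])
    # Voice: send the frame itself (uncompressed here for simplicity).
    return "voice", struct.pack("!%dh" % len(samples), *samples)


quiet_kind, quiet_payload = encode_frame([0] * 160)    # 20 ms at 8 kHz
loud_kind, loud_payload = encode_frame([1000] * 160)
```

In this sketch a silent 160-sample frame shrinks from 320 bytes to 1 byte of payload, which is where the bandwidth saving comes from; the detector still has to run on every frame, however.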
The multi-framing method is described below.
In VoIP communication, a voice payload periodically generated by a VoIP DSP is formed into an RTP packet and transmitted to a correspondent. To form the RTP packet, protocol header information for transmission, such as an Ethernet header, an IP header, a UDP header, an RTP header, and the like, is added to the voice payload in every RTP packet. This increases the size of the data actually transmitted and requires additional bandwidth.
The RTP multi-framing method multiplexes a number of voice payloads into one RTP packet within a predetermined limit and then transmits the RTP packet, instead of forming and transmitting an RTP packet directly after each voice payload is generated. This reduces the quantity of additional protocol header information that is transmitted, such that the total required network bandwidth is decreased.
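The bandwidth effect of multi-framing can be estimated with a short calculation. The sketch below assumes a 14-byte Ethernet header (preamble and frame check sequence ignored), a 20-byte IPv4 header without options, an 8-byte UDP header, and a 12-byte fixed RTP header. The 20-byte payload generated every 20 ms is an assumed example codec setting, and `bandwidth_bps` is a hypothetical helper name.

```python
# Assumed per-packet header sizes in bytes
ETHERNET_HEADER = 14
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
HEADER_BYTES = ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER + RTP_HEADER  # 54


def bandwidth_bps(payload_bytes, frame_ms, frames_per_packet):
    """Bits per second on the wire when several payloads share one packet."""
    packet_bytes = HEADER_BYTES + payload_bytes * frames_per_packet
    packets_per_second = 1000 / (frame_ms * frames_per_packet)
    return packet_bytes * 8 * packets_per_second


# Assumed example: 20-byte voice payload generated every 20 ms
single = bandwidth_bps(20, 20, 1)  # one payload per RTP packet
double = bandwidth_bps(20, 20, 2)  # two payloads per RTP packet
triple = bandwidth_bps(20, 20, 3)  # three payloads per RTP packet
```

Under these assumptions, one payload per packet needs 29,600 bit/s, two payloads per packet need 18,800 bit/s, and three need about 15,200 bit/s. The payload rate itself stays at 8,000 bit/s; only the header share shrinks.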
The silence-based method consumes considerable processing time in a terminal, and the multi-framing method causes a delay while two or three voice packets are being sequentially accumulated, thus deteriorating voice quality. In addition, if a multi-framed RTP packet is lost, all of the voice data multiplexed into that packet is lost at once, further deteriorating voice quality.
The above problems, as they affect the VoWLAN terminal, are described below in greater detail.
In the VoWLAN phone, battery duration is critical because the VoWLAN phone is a wireless communication terminal.
The silence-based method requires continual processing because an RTP packet is formed and transmitted over the network by the VoIP technology directly after each unit of packetized media data is generated during communication. This causes battery power of the VoWLAN phone to be consumed more rapidly during communication.
In the multi-framing method, generated voice packets are accumulated over a few frames and then transmitted. This delays voice packet delivery and, when a packet is lost, causes a large amount of voice data to be lost at once, resulting in voice quality degradation.
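The cost side of multi-framing can be expressed with a simple model. The sketch below assumes that the first payload in a packet waits until the remaining payloads have been generated, and that a lost packet removes every payload it carries. The 20 ms frame period is an assumed example value, and `multiframe_cost` is a hypothetical helper name.

```python
def multiframe_cost(frame_ms, frames_per_packet):
    """Return (accumulation delay, audio lost per dropped packet) in ms."""
    # The oldest payload waits for the remaining payloads to be generated.
    accumulation_delay_ms = (frames_per_packet - 1) * frame_ms
    # One lost packet takes all of its multiplexed payloads with it.
    audio_lost_per_packet_ms = frames_per_packet * frame_ms
    return accumulation_delay_ms, audio_lost_per_packet_ms


# Assumed example: one voice payload every 20 ms
costs = {n: multiframe_cost(20, n) for n in (1, 2, 3)}
```

With these assumptions, three payloads per packet add 40 ms of sender-side delay and expose 60 ms of audio to a single packet loss, compared with no added delay and 20 ms of audio for one payload per packet.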