The invention relates to controlling voice communications over a data network.
Data networks are widely used to link various types of network elements, such as personal computers, servers, gateways, network telephones, and so forth. Data networks may include private networks (such as local area networks or wide area networks) and public networks (such as the Internet). Popular forms of communications between network elements across such data networks include electronic mail, file transfer, web browsing, and other exchanges of digital data.
With the increased capacity and reliability of data networks, voice communications (including telephone calls, video conferencing, and so forth) over data networks have become possible. Voice communications over data networks are unlike voice communications in a conventional public switched telephone network (PSTN), which provides users with dedicated, end-to-end circuit connections for the duration of each call. Communications over data networks, such as IP (Internet Protocol) networks, are performed using packets or datagrams that are sent in bursts from a source to one or more destination nodes. Voice data sent over a data network typically shares network bandwidth with conventional non-voice data (e.g., data associated with electronic mail, file transfer, web access, and other traffic).
Various standards have been proposed for voice and multimedia communications over data networks. One such standard is the H.323 Recommendation from the International Telecommunication Union (ITU), which describes terminals, equipment, and services for multimedia communications over data networks.
Another standard for voice and multimedia communications is the Session Initiation Protocol (SIP), which establishes, maintains, and terminates multimedia sessions over a data network. SIP is part of a multimedia data and control architecture developed by the Internet Engineering Task Force (IETF). The IETF multimedia data and control architecture also includes the Resource Reservation Protocol (RSVP) for reserving network resources; the Real-Time Transport Protocol (RTP) for transporting real-time data and providing quality of service (QoS) feedback; the Real-Time Streaming Protocol (RTSP) for controlling delivery of streaming media; the Session Announcement Protocol (SAP) for advertising multimedia sessions by multicast; and the Session Description Protocol (SDP) for describing multimedia sessions.
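As an illustration of how encoded voice is carried in packets rather than over a dedicated circuit, the following is a minimal sketch of RTP packetization, assuming 8-bit G.711 mu-law audio at 8000 samples per second and 20 ms of audio per packet. The function names, the fixed SSRC value, and the zero-based sequence numbers and timestamps are hypothetical choices made for readability; an actual RTP sender would normally pick random initial values and would hand each packet to a UDP socket.

```python
import struct

RTP_VERSION = 2
PAYLOAD_TYPE_PCMU = 0      # G.711 mu-law, static payload type 0
SAMPLES_PER_PACKET = 160   # 20 ms of audio at 8000 samples per second


def build_rtp_packet(seq, timestamp, ssrc, payload, payload_type=PAYLOAD_TYPE_PCMU):
    """Prepend a 12-byte fixed RTP header (no CSRC list, no extension) to payload."""
    first_byte = RTP_VERSION << 6       # padding = 0, extension = 0, CSRC count = 0
    second_byte = payload_type & 0x7F   # marker bit = 0
    header = struct.pack(
        "!BBHII",                       # network byte order: 1 + 1 + 2 + 4 + 4 bytes
        first_byte,
        second_byte,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


def packetize(encoded_audio, ssrc=0x12345678):
    """Split already-encoded 8-bit audio into a list of RTP packets.

    Assumption for illustration: sequence numbers and timestamps start at zero.
    """
    packets = []
    for offset in range(0, len(encoded_audio), SAMPLES_PER_PACKET):
        chunk = encoded_audio[offset:offset + SAMPLES_PER_PACKET]
        seq = offset // SAMPLES_PER_PACKET
        # One byte per sample, so the byte offset equals the sample timestamp.
        packets.append(build_rtp_packet(seq, offset, ssrc, chunk))
    return packets
```

Each packet produced this way is independent of the others and carries its own sequence number and timestamp, which is what allows a receiver to reorder packets and detect loss when voice shares the network with other traffic.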
To perform voice communications over a data network, a typical computer system (such as a desktop computer system or a portable computer system) may be equipped with voice processing capabilities. Such capabilities include a microphone, earphones or speakers, and speech processing software. Typically, the speech processing software includes coder/decoders (CODECs) to encode and decode voice data. The speech processing software, including the CODECs, may be run on a microprocessor of a typical computer system. However, due to the intensive data processing typically required to process voice data, speech performance may not be optimal. For example, there may be delays associated with the transfer of such voice data due to the amount of time needed to process the voice data. Also, if CODECs that have lower resource requirements are selected, voice quality may suffer.
Also, the computer system needs to be fitted with speakers, microphones, and sound cards to enable speech processing. Further, such speakers, microphones, and sound cards may not provide the desired level of quality, or if they do, may be relatively expensive. Additionally, to add such speech processing components to a computer system may require some configuration to be performed by a user, a process that an unsophisticated user may have difficulty with.
Unless a computer system with powerful processing capabilities is provided, the voice quality provided by such a computer system is not at the level typically experienced (and expected) by users of standard telephones. Such “standard” telephones may include analog telephones coupled to a local or central switching office or digital telephones coupled to a private branch exchange (PBX) system. More recently, network telephones have been developed that are capable of being connected directly to a data network, such as an IP network. These network telephones are capable of placing telephony calls over a data network. The voice quality offered by such telephones is typically superior to that offered by computer systems, since such network telephones typically include dedicated digital signal processors (DSPs) that perform the data-intensive calculations involved in speech processing. However, existing network telephones do not provide desired multimedia presentation capabilities such as those offered by displays of computer systems. Thus, while network telephones offer superior speech capabilities, they do not have the desired multimedia capabilities. On the other hand, computer systems have superior multimedia capabilities, but they suffer from relatively poor speech processing performance.
A need thus exists for an improved method and apparatus for controlling voice communications over data networks.