Voice calls over packet-switched networks are performed using a codec at a transmitting device to create encoded packets that represent a user's voice. The encoded packets are forwarded over the packet-switched network and decoded at a receiving device. Examples of transmitting and receiving devices include a mobile phone, smart phone, Voice over Internet Protocol (VoIP) terminal, personal computer, or other telephony device. In one implementation, a mobile phone encodes a calling party's voice into a plurality of encoded packets. The mobile phone sends the encoded packets to the packet-switched network as a voice stream (e.g., a stream of packets). The packet-switched network forwards the voice stream to a mobile phone of a called party, where the encoded packets are decoded for playback to the called party. Analogously, the called party's voice is encoded and forwarded to the calling party for playback.
In this implementation, the mobile phone employs the Enhanced Variable Rate Codec (EVRC) standard to create encoded packets from the user's voice. Each encoded packet represents a 20-millisecond sample of the user's voice and/or background noise. Based on the sample, the encoded packet is created at one code rate of a plurality of predefined code rates, each with an associated packet size, as defined by the EVRC standard. Examples of code rates in EVRC include full rate, half rate, and eighth rate, which have packet sizes of 171 bits, 80 bits, and 16 bits, respectively. Encoded packets or frames that are encoded at the eighth rate (e.g., eighth rate frames, eighth rate packets, or rate ⅛ frames) are generally used for samples that are predominantly background noise. Such frames have a smaller frame size and use fewer network resources to transmit, and the background noise they carry is not necessary for conversation between the users, as will be understood by those skilled in the art.
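The relationship between packet size and network load can be illustrated with a short calculation. The following is a minimal sketch in Python that uses only the frame duration and packet sizes stated above; the names `FRAME_BITS`, `bit_rate_bps`, and `stream_bits` are hypothetical and chosen for illustration, and the figures cover payload bits only, not any transport overhead.

```python
FRAME_MS = 20  # each encoded packet represents 20 ms of audio (from the text)

# Packet sizes in bits for the code rates named in the text.
FRAME_BITS = {"full": 171, "half": 80, "eighth": 16}


def bit_rate_bps(rate):
    """Payload bit rate, in bits per second, of a stream encoded at one code rate."""
    return FRAME_BITS[rate] * 1000 // FRAME_MS


def stream_bits(rates):
    """Total payload bits for a sequence of frames given by their code rates."""
    return sum(FRAME_BITS[r] for r in rates)


if __name__ == "__main__":
    for name in FRAME_BITS:
        print(name, bit_rate_bps(name), "bit/s")
```

Under these figures, a stream of only full rate frames carries 8550 payload bits per second, while a stream of only eighth rate frames carries 800, which is why eighth rate frames are attractive for samples that are predominantly background noise.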
When the calling party is generally silent, such as when listening to the called party or during a pause in conversation, eighth rate frames are encoded by the transmitting device and forwarded to the called party. Since background noise is generally not an important part of the conversation, some of the eighth rate frames can be removed from the voice stream for transmission over the packet-switched network and replaced with another eighth rate frame before playback at the receiving device. The removed frames are substituted rather than simply discarded because a complete elimination of background noise from the voice stream sounds to the called party as if the call has been dropped or otherwise ended.
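The removal and substitution of eighth rate frames described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the method of any particular system: the `Frame` type, its `seq` sequence number, the `keep_every` thinning policy, and the choice to repeat the most recently received eighth rate frame are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    seq: int        # hypothetical sequence number, one per 20 ms slot
    rate: str       # "full", "half", or "eighth"
    payload: bytes


def thin_eighth_rate(frames, keep_every=4):
    """Transmit side: in each run of eighth rate frames, keep the first
    frame and every keep_every-th one after it; drop the others."""
    kept, run = [], 0
    for f in frames:
        if f.rate == "eighth":
            if run % keep_every == 0:
                kept.append(f)
            run += 1
        else:
            run = 0
            kept.append(f)
    return kept


def refill(received, last_seq):
    """Receive side: fill each missing slot with a copy of the most
    recently received eighth rate frame, so background noise persists."""
    out, filler = [], None
    expected = received[0].seq if received else 0
    for f in received:
        while expected < f.seq:
            if filler is not None:  # no substitute available: leave the gap
                out.append(Frame(expected, "eighth", filler.payload))
            expected += 1
        out.append(f)
        if f.rate == "eighth":
            filler = f
        expected = f.seq + 1
    while expected <= last_seq and filler is not None:
        out.append(Frame(expected, "eighth", filler.payload))
        expected += 1
    return out
```

In this sketch, a run of eight eighth rate frames is reduced to two transmitted frames, and the receiving side restores all eight slots for playback, so the called party continues to hear background noise rather than silence.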