Certain embodiments of the present invention relate to the encoding of signals in communication systems. More specifically, certain embodiments relate to a method and system for allocating memory during encoding of a datastream.
Packet based telephony such as Internet Protocol (IP) telephony may provide an alternative to conventional circuit switched telephony, the latter of which may typically require the establishment of an end-to-end communication path prior to the transmission of information. In particular, IP telephony permits packetization, prioritization and simultaneous transmission of voice, video and data traffic without requiring the establishment of an end-to-end communication path. IP telephony systems may capitalize on voice over packet (VoP) technologies, which may provide a means by which voice, video and data traffic may be simultaneously transmitted across one or more packet networks.
Voice quality (VQ) may define a qualitative and/or quantitative measure regarding the quality and/or condition of a received voice signal. Voice clarity may be an indicator of the quality or condition of a voice signal. Voice quality may be an important parameter that may ultimately dictate a quality of service (QOS) offered by a network service provider. The following factors, for example, may affect the voice quality and/or condition of a voice signal: noise, echo, and delay or packet latency. Moreover, the effects of these factors may be cumulative. In this regard, factors such as delay and latency may exacerbate the effects of echo. Delays that may affect the voice quality may include, but are not limited to, routing, queuing and processing delays.
Various VoP specifications, recommendations and standards have been created to ensure interoperability between various network components, and to create an acceptable QOS, which may include voice quality. For example, the International Telecommunication Union (ITU) ratified the H.323 specification, which defines the processes by which voice, video and data may be transported over IP networks for use in VoIP networks. H.323 addresses delay, for example, by providing a prioritization scheme in which delay sensitive traffic may be given processing priority over less delay sensitive traffic. For example, voice and video may be given priority over other forms of data traffic.
H.323 also addresses voice quality by specifying the audio and video coders/decoders (CODECs) that may be utilized for processing a media stream. A CODEC may be a signal processor such as a digital signal processor (DSP) that may be adapted to convert an analog voice and/or video signal into a digital media stream and to convert a digital media stream into an analog voice and/or video signal. In this regard, a coder or encoder portion of the CODEC may convert an analog voice and/or video signal into a digital media stream. Additionally, a decoder portion of the CODEC may convert a digital media stream into an analog voice and/or video signal. Regarding the CODEC for audio signals, H.323 may support recommendations such as the ITU-T G.711, G.722, G.723.1, G.728 and G.729 recommendations. ITU-T G.711 may support audio coding at 64 Kbps, G.722 may support audio coding at 64 Kbps, 56 Kbps and 48 Kbps, G.723.1 may support audio coding at 5.3 Kbps and 6.3 Kbps, G.728 may support audio coding at 16 Kbps and G.729 may support audio coding at 8 Kbps.
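By way of illustration only, the bit rates listed above may be related to the nominal number of payload bytes produced for a given duration of audio. The following Python sketch assumes that 1 Kbps denotes 1000 bits per second. The names CODEC_RATES_KBPS and payload_bytes are hypothetical and are introduced solely for this example. The frame sizes actually defined by each recommendation may differ from these nominal figures.

```python
# Nominal audio bit rates in Kbps, as listed in the text above.
# Assumption: 1 Kbps = 1000 bits per second.
CODEC_RATES_KBPS = {
    "G.711": [64],
    "G.722": [64, 56, 48],
    "G.723.1": [5.3, 6.3],
    "G.728": [16],
    "G.729": [8],
}


def payload_bytes(rate_kbps, duration_ms):
    """Return the nominal payload size in bytes for duration_ms of audio."""
    bits = rate_kbps * duration_ms  # (kilobits per second) * milliseconds = bits
    return bits / 8


if __name__ == "__main__":
    # Print the nominal payload size for 20 ms of audio at each listed rate.
    for codec, rates in CODEC_RATES_KBPS.items():
        for rate in rates:
            size = payload_bytes(rate, 20)
            print(f"{codec} at {rate} Kbps: {size:g} bytes per 20 ms")
```

Under these assumptions, a lower bit rate yields a proportionally smaller payload for the same duration of audio.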
The voice quality of a speech CODEC may be dependent on factors such as the type of encoding and/or decoding algorithm utilized by the CODEC. In general, some CODECs may utilize compression algorithms that remove redundant information from the analog signal. Such compression algorithms may permit at least a close replication of an original analog signal. In this case, the bandwidth required for transmitting any resultant signal may be reduced. Other CODECs may utilize algorithms that analyze the signal and retain only those portions that are deemed to be of cognitive importance. These algorithms may reproduce a close approximation to the original signal. Notwithstanding, in this latter case, bandwidth utilization may be superior to the former case where redundant information may be removed. Accordingly, depending on application requirements and hardware limitations, one or more algorithms may be utilized to optimize performance.
Moreover, although the economic attractiveness of VoP has lured network access providers and network transport providers away from traditional circuit switched networks, factors such as the extensiveness of embedded legacy systems and customer demands, for example, have dictated the coexistence of both packet switched and circuit switched networks. Accordingly, new technologies and techniques such as audio coding and decoding may be required to support various modes of operation utilized by each system.
Encoding an audio, video or data signal typically requires two main types of memory allocation. These may include an encoder instance memory allocation and a compressed voice or data memory allocation. The encoder instance memory may generally contain the state of an encoder, which may include information such as power levels, signal estimates, history, and filter coefficients. Complex encoders such as a G.729 encoder may contain more information in the instance memory than a simple encoder such as G.711. A simple encoder such as a G.711 encoder generally contains compander mode logic and optional packet loss concealment logic. In order to achieve compression, a plurality of basic packets may be encoded to create a compressed packet. In this manner, the size of a compressed packet is a multiple of the size of a basic packet. The larger compressed packet may be called a super-packet. For example, G.711 typically utilizes a basic 40 byte voice payload frame, which may represent five (5) milliseconds (ms) of speech or data. By compressing or chaining a plurality of G.711 basic voice payload frames, various sizes of super-packets may be created. For example, by chaining 2, 3, or 4 G.711 basic voice payloads, super-packets having payloads of 80, 120, or 160 bytes respectively, may be created. These may represent 10, 15, or 20 ms of speech or data respectively.
Basic payload bytes for other encoding schemes may also be compressed to create super-packets. For example, G.729 may utilize a basic 10 byte voice payload frame, which may represent ten (10) milliseconds (ms) of speech or data. By compressing or chaining a plurality of G.729 basic voice payload frames, various sizes of super-packets may be created. For example, by chaining 2, 3, or 4 G.729 basic voice payloads, super-packets having payloads of 20, 30, or 40 bytes respectively, may be created. These may represent 20, 30, or 40 ms of speech or data respectively.
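By way of illustration only, the super-packet arithmetic described above for G.711 and G.729 may be sketched as follows. The basic frame sizes and durations are taken from the text. The names BasicFrame, G711_FRAME, G729_FRAME and super_packet are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicFrame:
    """A basic voice payload frame (hypothetical helper type)."""
    codec: str
    payload_bytes: int
    duration_ms: int


# Basic frame parameters as stated in the text.
G711_FRAME = BasicFrame("G.711", payload_bytes=40, duration_ms=5)
G729_FRAME = BasicFrame("G.729", payload_bytes=10, duration_ms=10)


def super_packet(frame, count):
    """Return (payload_bytes, duration_ms) for count chained basic frames."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return frame.payload_bytes * count, frame.duration_ms * count


if __name__ == "__main__":
    # Print the super-packet size and duration for 2, 3 and 4 chained frames.
    for frame in (G711_FRAME, G729_FRAME):
        for count in (2, 3, 4):
            size, duration = super_packet(frame, count)
            print(f"{frame.codec} x{count}: {size} bytes, {duration} ms")
```

In this sketch, both the payload size and the represented duration of a super-packet scale linearly with the number of chained basic frames.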
In broadband communication and/or high speed communication systems, for example, which may utilize various high density speech processing systems, many channels may be simultaneously active. Since memory is required for tasks such as the simultaneous encoding and decoding of these channels, memory is a major system resource that requires optimal handling. Typically, for a range of voice compression algorithms, the more highly compressed the data, the greater the required amount of operating data and the greater the required amount of operating code. Furthermore, a lower bit rate algorithm requires a smaller packet buffer or memory to store a given duration of data. Existing decoders generally allocate a worst-case amount of memory to accommodate voice compression algorithm operating data and/or to accommodate operating code. Additionally, a worst-case amount of memory may also be allocated for packet buffering. In general, scenarios in which these worst-case allocations are applicable do not occur simultaneously. Accordingly, existing memory allocation schemes may not provide optimal solutions for allocating memory.
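By way of illustration only, the observation that the worst-case allocations do not occur simultaneously may be sketched numerically. The packet buffer sizes below correspond to 20 ms of payload as given above for G.711 and G.729. The instance memory sizes are hypothetical values chosen solely for this example and are not taken from any recommendation. The function names are likewise hypothetical, and the sketch is not intended to represent any particular allocation scheme.

```python
# Per-algorithm requirements as (instance_bytes, packet_buffer_bytes).
# Packet buffer sizes: 20 ms of payload, per the text (160 and 20 bytes).
# Instance sizes: HYPOTHETICAL values, for illustration only.
REQUIREMENTS = {
    "G.711": (64, 160),
    "G.729": (2000, 20),
}


def independent_worst_case(reqs):
    """Size each memory region for its own worst case, then add them."""
    max_instance = max(instance for instance, _ in reqs.values())
    max_buffer = max(buffer for _, buffer in reqs.values())
    return max_instance + max_buffer


def combined_worst_case(reqs):
    """Size one region for the largest total needed by any single algorithm."""
    return max(instance + buffer for instance, buffer in reqs.values())


if __name__ == "__main__":
    separate = independent_worst_case(REQUIREMENTS)
    combined = combined_worst_case(REQUIREMENTS)
    print(f"independent worst cases: {separate} bytes per channel")
    print(f"combined worst case:     {combined} bytes per channel")
    print(f"difference:              {separate - combined} bytes per channel")
```

Under these assumed figures, sizing each region for its own worst case reserves more memory per channel than any single algorithm would use, because the algorithm with the largest instance memory has the smallest packet buffer.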
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.