1. Field of the Invention
The present invention relates to the field of telecommunications and, more particularly, to an enhanced messaging protocol for controlling media service resources.
2. Description of the Related Art
The Media Resource Control Protocol (MRCP) is a protocol, developed by the Internet Engineering Task Force (IETF), for interfacing with media resources. The MRCP is designed to provide a mechanism for a client device requiring audio/video stream processing to control processing resources on a network. These media processing resources can include a speech recognizer, such as an automatic speech recognition (ASR) engine, a speech synthesizer, such as a text-to-speech (TTS) engine, a fax, a signal detector, and the like. Further, MRCP allows media processing resources to be utilized by a remotely located system, such as an Interactive Voice Response (IVR) system, a telephone application server, and/or a voice server.
The message format for MRCP is text based, with mechanisms to carry embedded binary data. This allows data such as recognition grammars, recognition results, synthesizer speech markup, and the like to be carried within MRCP messages conveyed between a client and the media resource server.
MRCP addresses the issue of controlling and communicating with the resource processing the stream, and defines the requests, responses, and events needed to do that. MRCP does not, however, address session control management, media management, reliable sequencing and delivery, or server or resource addressing, which are to be handled separately by a protocol such as the Session Initiation Protocol (SIP) or the Real Time Streaming Protocol (RTSP).
Turning to specifics of the MRCP, an MRCP message consists of a start-line, one or more header fields, an empty line indicating the end of the header fields, and an optional message body. An empty line can be represented by a carriage return line feed (CRLF).
generic-message = start-line message-header CRLF [ message-body ]
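By way of illustration only, the generic message layout above can be sketched in Python as follows. The function name split_message and the sample message in the usage note are hypothetical and are not taken from the MRCP specification. The sketch assumes the message has already been decoded to text and that the first empty line marks the end of the header fields.

```python
CRLF = "\r\n"

def split_message(raw: str):
    """Split a raw message into (start_line, header_lines, body).

    Hypothetical helper: mirrors
    generic-message = start-line message-header CRLF [ message-body ]
    """
    # The first empty line (two consecutive CRLFs) ends the header fields.
    head, separator, body = raw.partition(CRLF + CRLF)
    if not separator:
        raise ValueError("missing empty line after header fields")
    lines = head.split(CRLF)
    start_line, header_lines = lines[0], lines[1:]
    # The body is optional, so an empty string here means "no body".
    return start_line, header_lines, body
```

For example, calling split_message on the illustrative text "SPEAK 543257 MRCP/1.0\r\nContent-Type: text/plain\r\n\r\nhello" would yield the start-line, a one-element list of header lines, and the body "hello".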
The start line can be a request-line, a response-line, or an event-line. Accordingly, the start line can identify the type of message contained within the MRCP message. The three possible types of messages are a request message, a response message, and an event message.
start-line = request-line | response-line | event-line
A request message can be conveyed from a client to a server. The request message can include the name of a method to be applied, a space (SP) used as a field separator, a request identifier (request-id) that tags the request, another SP, and the version of the MRCP protocol in use.
request-line = method-name SP request-id SP mrcp-version CRLF
After receiving and interpreting a request message, a server resource can respond to the client with a response message. The response message can include the version of the MRCP protocol running on the server, a SP, a request-id that must match the one sent in the corresponding request message, a SP, a status-code indicating the success, failure, or other outcome of the request, a SP, and a request-state field indicating whether a job is pending, in-process, or complete.
response-line = mrcp-version SP request-id SP status-code SP request-state CRLF
When a server resource needs to communicate a change of state or an occurrence of an event to a client, the server can generate an event message. The event message can include an event name identifying the nature of the event generated by the media resource, a SP, a request-id that matches that sent in the request that caused the event, a SP, a request-state, a SP, and an mrcp-version.
event-line = event-name SP request-id SP request-state SP mrcp-version CRLF
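As a non-authoritative sketch, the three start-line forms described above can be told apart by the number of SP-separated fields and by the position of the version field. The function name parse_start_line, the dictionary keys, and the assumption that a version field begins with "MRCP/" are illustrative choices rather than details stated in this description. The sample values used in the usage note are likewise hypothetical.

```python
def parse_start_line(line: str) -> dict:
    """Classify a start-line as a request, response, or event (sketch)."""
    fields = line.split(" ")  # SP is the field separator

    def is_version(token: str) -> bool:
        # Assumption: the version field looks like "MRCP/<major>.<minor>".
        return token.startswith("MRCP/")

    # request-line = method-name SP request-id SP mrcp-version
    if len(fields) == 3 and is_version(fields[2]):
        return {"type": "request", "method": fields[0],
                "request_id": fields[1], "version": fields[2]}
    # response-line = mrcp-version SP request-id SP status-code SP request-state
    if len(fields) == 4 and is_version(fields[0]):
        return {"type": "response", "version": fields[0],
                "request_id": fields[1], "status_code": int(fields[2]),
                "request_state": fields[3]}
    # event-line = event-name SP request-id SP request-state SP mrcp-version
    if len(fields) == 4 and is_version(fields[3]):
        return {"type": "event", "event": fields[0],
                "request_id": fields[1], "request_state": fields[2],
                "version": fields[3]}
    raise ValueError("unrecognized start-line: %r" % line)
```

Under these assumptions, "SPEAK 543257 MRCP/1.0" is classified as a request, "MRCP/1.0 543257 200 COMPLETE" as a response, and "SPEAK-COMPLETE 543257 COMPLETE MRCP/1.0" as an event, with the shared request-id allowing a client to correlate all three.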
The message header can include one or more generic headers and one or more resource specific headers, where resource specific headers can include request headers and response headers.
message-header = 1*(generic-header | resource-header)
Each header consists of a field name followed by a colon and an optional field value, where the field name is a token and the field value consists of field content that does not include any leading or trailing linear white space (LWS).
header = field-name ":" [ field-value ]
field-name = token
field-value = *( field-content | LWS )
field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string>
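The header rule above can be illustrated with a short Python sketch. The function name parse_header and the sample header lines are hypothetical. The sketch treats LWS as spaces and tabs only and does not validate that the field name is a well-formed token.

```python
def parse_header(line: str):
    """Split one header line into (field_name, field_value) (sketch)."""
    # header = field-name ":" [ field-value ]
    name, colon, value = line.partition(":")
    if not colon or not name:
        raise ValueError("header must be 'field-name: [field-value]'")
    # Leading and trailing linear white space is not part of the field content.
    return name, value.strip(" \t")
```

Here parse_header("Content-Type: text/plain") would return the pair ("Content-Type", "text/plain"), and a header with no value, such as "X-Empty:", would return an empty string as its field value.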
When used as a communication protocol between a telephone application server and a componentized voice server, the MRCP has numerous shortcomings. First, MRCP messages do not include information sufficient to reference MRCP messages back to associated telephone calls. Such reference back information can be useful to enable end-to-end call tracing features, which can be highly beneficial when conducting debugging operations. Additionally, access to call identity permits a speech engine or other media processing resource to reference call information from call-specific information data stores, such as data stores established by a telephone gateway, a telephone application server, and/or a voice server.
Another shortcoming of the MRCP when used in a telephone application server context is that the MRCP does not include information sufficient to link MRCP messages back to a media gateway, such as a media converting component of a telephone voice server. The MRCP specification includes identification of audio input and output with a focus on a one to one allocation between calls and speech engines.
In other words, the input/output parameters provided by the MRCP focus on allocating one media resource per call. Once allocated, the resource is occupied for the duration of the call. This type of allocation can be referred to as call-based engine allocation, which can be highly inefficient because it fails to maximize the utilization of speech engines, such as ASR engines and TTS engines.
Cost effective telephony solutions do not allocate speech engines for an entire call. Rather, a speech engine is allocated for a turn of speech, where each turn represents a discrete speech request or work unit that a speech engine is to process. Because MRCP does not specify media sources within messages, utilizing MRCP for turn-based speech engine allocation can be problematic, and establishing proper communication channels for the dynamically allocated speech engines can result in processing delays.
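The difference between call-based and turn-based allocation can be made concrete with a simple, hypothetical model. The function names, the call length, and the turn lengths below are illustrative values only and are not measurements. The model deliberately ignores the channel establishment delays noted above.

```python
def engine_hold_seconds(call_seconds, turn_seconds, turn_based):
    """Seconds one engine stays reserved for a single call (hypothetical model)."""
    # Call-based: the engine is held from call start to call end.
    # Turn-based: the engine is held only while a turn is being processed.
    return sum(turn_seconds) if turn_based else call_seconds

def utilization(call_seconds, turn_seconds, turn_based):
    """Fraction of the reserved time the engine spends processing turns."""
    busy = sum(turn_seconds)
    held = engine_hold_seconds(call_seconds, turn_seconds, turn_based)
    return busy / held if held else 0.0

# Illustrative figures: a 180-second call containing four speech turns.
call_seconds = 180
turn_seconds = [4, 6, 5, 15]

call_based = utilization(call_seconds, turn_seconds, turn_based=False)
turn_based = utilization(call_seconds, turn_seconds, turn_based=True)
```

With these assumed figures, the engine is busy for 30 of the 180 seconds it is held under call-based allocation, whereas under turn-based allocation it is held only for the 30 seconds it is actually processing speech.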