The Session Initiation Protocol (SIP), a foundational session control protocol, is becoming an emerging workload in telecom Next-Generation Network (NGN) and IT collaboration solutions. SIP is a text-based message protocol. It operates independently of the underlying network transport protocols, establishing sessions between multiple users irrespective of whether the transferred data is text, audio, or video. In the SIP protocol stack, however, some computation-intensive operations, such as token parsing and security processing, occupy a large number of CPU cycles. As SIP-based applications become popular, these operations could become performance bottlenecks for SIP servers, such as proxy servers or application servers.
To address this, an SIP Offload Engine (SOE) architecture has been proposed. As shown in FIG. 1, a front end 110 parses an SIP message, binarizes it, and generates an “SIP Offload Engine (SOE) message”, abbreviated as SOE message hereinbelow. The objective of applying such offload technology is to move the computation-intensive operations from the server end to special appliances, such as front ends. In particular, the front end parses the tokens in the SIP message and transforms the text-based message into a binary SOE message, and the server then parses the SOE message. The term “token” is defined as an indecomposable part provided to an upper-layer logic through an interface, namely a character string delimited by separators such as semicolons or spaces. Thus, more CPU cycles may be freed up at the server end for upper-layer applications, improving the overall performance.
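The notion of a token defined above can be illustrated with a minimal sketch. It assumes only the two separators named in the text (semicolons and spaces); the function name `tokenize` and the sample header line are hypothetical and do not describe the SOE front end itself.

```python
import re
from typing import List

# Separators assumed for illustration: semicolons and spaces only.
SEPARATORS = re.compile(r"[; ]+")


def tokenize(line: str) -> List[str]:
    """Split one line of text into tokens at the assumed separators."""
    return [tok for tok in SEPARATORS.split(line.strip()) if tok]


tokens = tokenize("Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bK776")
```

A real SIP parser would recognize further separators (for example commas and angle brackets) and would treat quoted strings as single tokens.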
The SIP protocol enables end users to communicate with each other via messages. A message is either a request sent from a client to a server or a response sent from the server to the client. A message consists of a start-line, one or more header fields, an empty line indicating the end of the header fields, and an optional message-body. The generic structure of an SIP message is shown below:
    generic-message = start-line
                      message header field 1
                      message header field 2
                      . . .
                      . . .
                      CRLF
                      message-body [optional]
    start-line      = Request-Line / Status-Line
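The generic structure can be sketched in a few lines of code. This is a minimal illustration, assuming the message is already available as a single string with CRLF line endings; the function name `split_sip_message` and the sample addresses are hypothetical.

```python
def split_sip_message(raw: str):
    """Split a raw SIP message into (start_line, header_lines, body)."""
    # The first empty line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    return start_line, header_lines, body


raw = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "To: <sip:bob@example.com>\r\n"
    "From: <sip:alice@example.com>\r\n"
    "\r\n"
    "body bytes"
)
start_line, header_lines, body = split_sip_message(raw)
```

When the optional message-body is absent, `body` is simply the empty string.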
1. SIP Request Message
A request may be recognized by the presence of a Request-Line as the start-line. The format of a Request-Line is shown below:
    Request-Line = Method SP Request-URI SP SIP-Version CRLF
A method is an action associated with a session between end users. Examples of methods include REGISTER, INVITE, OPTIONS, ACK, CANCEL, and BYE, which are defined in the RFC 3261 specification, as well as other methods defined in separate RFC specifications. The Request-URI identifies the recipient of the SIP message. The SIP-Version is currently SIP/2.0 and is to be included in all messages. The CRLF terminates the Request-Line.
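A Request-Line of this form can be decomposed as sketched below. The sketch assumes exactly one space (SP) between the three elements, as the format states; the names `RequestLine` and `parse_request_line` are hypothetical.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    request_uri: str
    sip_version: str


def parse_request_line(line: str) -> RequestLine:
    """Parse 'Method SP Request-URI SP SIP-Version CRLF'."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed Request-Line: {line!r}")
    method, request_uri, sip_version = parts
    if sip_version != "SIP/2.0":
        raise ValueError(f"unsupported SIP version: {sip_version!r}")
    return RequestLine(method, request_uri, sip_version)


request = parse_request_line("INVITE sip:bob@example.com SIP/2.0\r\n")
```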
2. SIP Response Message
A response may be recognized by the presence of a Status-Line as the start-line. The format of a Status-Line is shown below:
    Status-Line = SIP-Version SP Status-Code SP Reason-Phrase CRLF
The Status-Code represents the result of the action taken in response to the request. The result of a request falls into one of the following categories:
(a) 100-199: The request was received and is being processed.
(b) 200-299: The request was received, understood, and accepted.
(c) 300-399: Further action needs to be taken to complete the processing of the request.
(d) 400-499: The request cannot be processed at the server, possibly due to bad syntax.
(e) 500-599: The server failed to fulfill an apparently valid request.
(f) 600-699: Global failure. The request cannot be processed by any server.
The Reason-Phrase is an English-like equivalent of the Status-Code. For example, for Status-Code 200, the Reason-Phrase is “OK”.
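The Status-Line format and the status-code ranges listed above can be sketched as follows. The class labels are the names RFC 3261 gives to the six ranges; the function names `parse_status_line` and `status_class` are hypothetical.

```python
# Class names as used in RFC 3261 for the 1xx..6xx ranges.
STATUS_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def parse_status_line(line: str):
    """Parse 'SIP-Version SP Status-Code SP Reason-Phrase CRLF'."""
    # maxsplit=2 keeps spaces inside the Reason-Phrase intact.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


def status_class(code: int) -> str:
    """Map a status code to its range (a) through (f)."""
    if not 100 <= code <= 699:
        raise ValueError(f"status code out of range: {code}")
    return STATUS_CLASSES[code // 100]


version, code, reason = parse_status_line("SIP/2.0 404 Not Found\r\n")
```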
Both request and response messages may have multiple message headers. These SIP header fields form a part of the SIP message. Each header conveys some information for the destination. The format of an SIP message header is shown below:
    field-name: [field-value]
It is noted that the field value could extend over multiple lines.
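Header parsing, including a field value that extends over multiple lines, can be sketched as below. The sketch relies on the RFC 3261 rule that a continuation line begins with a space or horizontal tab; the function name `parse_headers` and the sample headers are hypothetical.

```python
def parse_headers(header_lines):
    """Turn header lines into a list of (field-name, field-value) pairs."""
    headers = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: append it to the previous field value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            # Split at the first colon only; values may contain colons.
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return headers


headers = parse_headers([
    "Subject: I know you're there,",
    "  pick up the phone",
    "To: <sip:bob@example.com>",
])
```

A list of pairs is used rather than a dictionary because the same field-name, for example Via, may appear more than once in a message.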
The type of a header field can be thought of as being based on the function performed by that header. The RFC 3261 specification defines 44 types of headers. The major header types include, but are not limited to:
1. Originator fields: From, To
2. Routing fields: Via
3. Authentication: Proxy-Authenticate
It can be seen from above that an SIP message has the following three features: a) a large number of token values with variable lengths; b) line-by-line structure; and c) multiple tokens in each line. Therefore, how to binarize SIP messages is critical for the implementation of the offload technology.
As one existing approach, ASN.1 can be used to accommodate the token information in <Type, Length, Value> (TLV) form. This TLV approach is not efficient, however: because most of the values in an SIP message are strings of variable length, the position of a given token is not known in advance, and the parser may have to go through the whole message to get the information needed.
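The cost of the TLV approach can be seen in a simplified sketch. This is not actual ASN.1 encoding; it assumes a hypothetical layout with a 1-byte type and a 2-byte length, and the type codes 1 to 3 are invented for illustration.

```python
import struct

HEADER = "!BH"  # assumed layout: 1-byte type, 2-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def tlv_encode(fields):
    """Encode (type_code, value_bytes) pairs back to back."""
    out = bytearray()
    for type_code, value in fields:
        out += struct.pack(HEADER, type_code, len(value)) + value
    return bytes(out)


def tlv_find(buf, wanted_type):
    """Return (value, entries_visited); entries must be walked in order."""
    offset, visited = 0, 0
    while offset < len(buf):
        type_code, length = struct.unpack_from(HEADER, buf, offset)
        offset += HEADER_SIZE
        visited += 1
        if type_code == wanted_type:
            return buf[offset:offset + length], visited
        offset += length  # skip a value whose length varies per message
    return None, visited


buf = tlv_encode([(1, b"INVITE"), (2, b"sip:bob@example.com"), (3, b"SIP/2.0")])
value, visited = tlv_find(buf, 3)
```

Because each length is only known once its entry is read, reaching the last token requires visiting every entry before it.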
Another existing approach is to allocate a fixed position for each token. This approach also has several defects. First, storage efficiency suffers, as storage space is wasted between tokens of different lengths. Second, the blank storage space must be skipped while processing messages, which also reduces processing efficiency. Third, sufficient space is not reserved for “optional” tokens.
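The storage waste of the fixed-position approach can be quantified with a small sketch. The 32-byte slot size and the zero-byte padding are assumptions chosen for illustration; the function names are hypothetical.

```python
SLOT_SIZE = 32  # assumed fixed slot width in bytes


def fixed_encode(tokens, slot_size=SLOT_SIZE):
    """Place each token in its own fixed-width, zero-padded slot."""
    out = bytearray()
    for token in tokens:
        if len(token) > slot_size:
            raise ValueError(f"token does not fit in slot: {token!r}")
        out += token.ljust(slot_size, b"\x00")
    return bytes(out)


def fixed_get(buf, index, slot_size=SLOT_SIZE):
    """Direct access by position; the padding must still be stripped."""
    start = index * slot_size
    return buf[start:start + slot_size].rstrip(b"\x00")


tokens = [b"INVITE", b"sip:bob@example.com", b"SIP/2.0"]
buf = fixed_encode(tokens)
used = sum(len(token) for token in tokens)
wasted = len(buf) - used
```

In this example the three tokens carry 32 bytes of data but occupy 96 bytes, so two thirds of the buffer is padding, and a token longer than the slot cannot be stored at all.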
Therefore, there is a need for an approach to binarize an SIP message efficiently.