1. Field of the Invention
The present invention relates to identifying data communications over the Internet. More particularly, the invention relates to a method and apparatus for creating and using unique session identifiers for identifying individual data communications sessions between one apparatus and another.
2. Background
Security on the Internet is important to ensure the integrity of business transactions and the transfer of confidential information. Since existing means of personal identification, such as visual appearance and written signatures, are not exactly transferable to Internet transactions, new digital methods of identification must be employed. These new methods must not only provide positive identification, they must themselves be secure to prevent interlopers from misappropriating the identifying information.
Such identification schemes must conform to existing protocols for Internet data communications. For example, the Internet e-mail protocol described in RFC 822, published under the auspices of the Internet Architecture Board, dictates that binary data should not be sent as eight-bit code. That is, the most significant bit (MSB) of each byte of transferred data must have a "0" value or else transmission errors may occur. Common schemes for addressing this issue include transmission as seven-bit ASCII code, base64 encoding, uniform resource locator (URL)-encoding, and hex encoding. These methods, in turn, are limited by considerations such as character compatibility with the underlying message and decoding scheme, bandwidth and data storage requirements, and limitations imposed at the application level. A further consideration is the computational ease of encoding and decoding, e.g., power-of-two encodings such as base64 can use shift/AND operations, while non-power-of-two encodings such as base62 must use division/modulus operations.
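The computational difference between power-of-two and non-power-of-two encodings can be illustrated with a minimal sketch. The alphabets and the function names `encode_pow2` and `encode_base62` are assumptions chosen for illustration and are not taken from RFC 822 or from the invention itself.

```python
import string

# Hypothetical 64-character alphabet (letters, digits, '-' and '_').
# Every character is seven-bit ASCII, so the MSB of each byte is 0.
ALPHABET64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
# Hypothetical 62-character alphabet: the same characters without '-' and '_'.
ALPHABET62 = ALPHABET64[:62]


def encode_pow2(value: int) -> str:
    """Encode a non-negative integer six bits at a time using shift and AND."""
    if value == 0:
        return ALPHABET64[0]
    chars = []
    while value > 0:
        chars.append(ALPHABET64[value & 0x3F])  # mask off the low six bits
        value >>= 6                             # discard the bits just encoded
    return "".join(reversed(chars))


def encode_base62(value: int) -> str:
    """Encode a non-negative integer in base 62 using division and modulus."""
    if value == 0:
        return ALPHABET62[0]
    chars = []
    while value > 0:
        value, remainder = divmod(value, 62)    # no shift/mask shortcut exists
        chars.append(ALPHABET62[remainder])
    return "".join(reversed(chars))
```

Both functions emit only seven-bit characters; the first needs only bit operations per digit, whereas the second needs a division per digit.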
Identification schemes may also take advantage of existing data transmission methods. Form submission is commonly used to send information from one apparatus or computer to another apparatus or computer. The first computer provides the second computer with on-screen buttons and dialog boxes with which the user of the second computer can enter data. After the data is entered into the second computer, the data is encoded for transmission and sent to the first computer. If the data is relatively short, it may be directly appended to the URL in the header of the message to the first computer, separated by a "?". Data following the "?" is known as the query string, which is often limited in length because of the input buffer size of many servers. This method is known as GET mode. In an alternative method known as POST mode, longer data is sent in the body of the message to the first computer. Since information sent via either the GET or the POST method is usually primarily text, these transmissions are typically URL-encoded.
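The two submission modes can be sketched as follows, using Python's standard `urllib.parse` module. The helper names `build_get_url` and `build_post_body`, the example URL, and the field values are hypothetical and serve only to show where the URL-encoded data is placed in each mode.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_get_url(base_url: str, fields: dict) -> str:
    """GET mode: URL-encode the form fields and append them after a '?'."""
    return base_url + "?" + urlencode(fields)


def build_post_body(fields: dict) -> bytes:
    """POST mode: the same URL-encoded data travels in the message body."""
    return urlencode(fields).encode("ascii")


def read_query_string(url: str) -> dict:
    """Receiving side: recover the form fields from the query string."""
    return parse_qs(urlsplit(url).query)
```

For example, `build_get_url("http://example.com/form", {"name": "Ann Lee"})` yields `http://example.com/form?name=Ann+Lee`, and `read_query_string` applied to that URL recovers the original field value.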
Data transmission sessions between computers may use GET or POST transmitted data to identify a particular data communications session. For example, an external computer or client may submit its identifying information by GET mode, and the URL-encoded identifying information may be appended to the URL for the duration of the session. This URL-encoded identifying information is then passed between the computers for the duration of the session.
There are shortcomings to this technique. For example, when identifying information is not adequately modified before being used to identify a session, the identification may not be unique to a session. In this case, if two computers submit identical identifying information during overlapping sessions resulting in identical URL-encoded identifying information, a host computer or server will not be able to differentiate between external computers.
Another shortcoming is that URL-encoding is inefficient for non-text characters. While letters and digits are encoded with one byte per character, other characters require three bytes. Thus, if the identifying information does not consist almost exclusively of letters and digits, its URL-encoded form will require extra bandwidth and storage capacity. In some cases, URL-encoding may undesirably truncate or otherwise limit the identifying information.
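The overhead can be made concrete with a short sketch that compares the URL-encoded length of some data with its length under a six-bit code. Python's URL-safe base64 is used here only as a convenient stand-in for a six-bit code; it is an assumption for illustration, and the function names are hypothetical.

```python
import base64
from urllib.parse import quote


def url_encoded_length(data: bytes) -> int:
    """Length after URL-encoding: 1 character for an unreserved byte, 3 otherwise."""
    return len(quote(data, safe=""))


def six_bit_length(data: bytes) -> int:
    """Length under a six-bit code (URL-safe base64 with padding removed)."""
    return len(base64.urlsafe_b64encode(data).rstrip(b"="))


text_like = b"abc123"                  # letters and digits only
binary = bytes([0x80, 0xFF, 0x01])     # three non-text bytes
```

Under these assumptions the text-like input stays at six characters when URL-encoded, while the three non-text bytes grow to nine characters when URL-encoded but need only four characters under the six-bit code.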
The method and apparatus of this invention overcome these shortcomings.
A method and apparatus for using a session identifier to identify a specific data communications session between an apparatus and an external apparatus is disclosed. When a data communications session is initiated between the apparatus and an external apparatus, the external apparatus sends authenticating information to the apparatus. The apparatus uses the authenticating information to determine the identity and the privileges of the external apparatus for the particular session. A unique session identifier is created by the apparatus, and the session identifier is associated with the external apparatus's identity and privileges. The session identifier is passed between the apparatus and the external apparatus with each subsequent data communication in the session until the session is terminated. The apparatus uses the session identifier received with the data communications to identify the external apparatus and its privileges and allocate resources accordingly. The session identifier is encoded using a six-bit code, thereby making it compatible with the Internet e-mail protocol, while also optimizing data compression. The encoded session identifier may be transmitted by appending it to a URL like a query string.
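The sequence summarized above might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `SessionTable` class, the in-memory credential store, the 12-byte random identifier, and the use of URL-safe base64 as the six-bit code are all hypothetical choices.

```python
import base64
import secrets
from typing import Dict, NamedTuple, Optional, Tuple


class SessionRecord(NamedTuple):
    identity: str
    privileges: Tuple[str, ...]


class SessionTable:
    """Hypothetical server-side table mapping session identifiers to clients."""

    def __init__(self, credentials: Dict[str, Tuple[str, Tuple[str, ...]]]):
        # Assumed credential store: name -> (password, privileges).
        self._credentials = credentials
        self._sessions: Dict[str, SessionRecord] = {}

    def open_session(self, name: str, password: str) -> Optional[str]:
        """Authenticate the client, then create a unique session identifier."""
        entry = self._credentials.get(name)
        if entry is None or not secrets.compare_digest(
            entry[0].encode(), password.encode()
        ):
            return None
        while True:
            # 12 random bytes encode to exactly 16 six-bit characters.
            sid = base64.urlsafe_b64encode(secrets.token_bytes(12)).decode("ascii")
            if sid not in self._sessions:  # guarantee uniqueness among open sessions
                break
        self._sessions[sid] = SessionRecord(name, tuple(entry[1]))
        return sid

    def lookup(self, sid: str) -> Optional[SessionRecord]:
        """Identify the client and its privileges from a received identifier."""
        return self._sessions.get(sid)

    def close_session(self, sid: str) -> None:
        """Terminate the session so the identifier is no longer honored."""
        self._sessions.pop(sid, None)


def attach_to_url(url: str, sid: str) -> str:
    """Append the encoded identifier to a URL in the manner of a query string."""
    return url + "?" + sid
```

Because each identifier is generated per session rather than derived directly from the submitted credentials, two overlapping sessions opened with identical authenticating information still receive distinct identifiers in this sketch.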