The invention relates to the field of processing. More specifically, the invention relates to an interface for a security coprocessor.
Communication networks and the number of users of such networks continue to grow. Moreover, on-line sales over the Internet, both business-to-business and business-to-consumer, continue to proliferate. Additionally, the number of people who telecommute continues to grow. Both on-line sales and telecommuting are examples of uses of communication networks that typically involve private and sensitive data that needs to be protected during its transmission across the different communication networks.
Accordingly, security protocols (e.g., Transport Layer Security (TLS), Secure Sockets Layer (SSL) 3.0, Internet Protocol Security (IPSec), etc.) have been developed to establish secure sessions between remote systems. These security protocols provide a method for remote systems to establish a secure session through message exchange and calculations, thereby allowing sensitive data being transmitted across the different communication networks to remain secure and untampered.
FIG. 1 illustrates a two-phase client/server exchange to establish a secure session. In a first phase 105, the security negotiation phase, a network element 101 (the client) and a network element 103 (the server) exchange messages to negotiate security between the two network elements 101 and 103. The negotiation of security includes determining the algorithms (e.g., hashing algorithms, encryption algorithms, compression algorithms, etc.) to be employed by the two network elements 101 and 103. In a second phase 107, the key exchange phase, the network elements 101 and 103 exchange key information. The second phase 107 comprises the network elements 101 and 103 exchanging messages based on a selected public key algorithm and authenticating received messages. While the specific primitive tasks of these two phases vary for different security protocols, the primitive tasks for establishing a secure session can include the receiving of messages, transmitting of messages, generating of keys, generating of secrets, hashing of data, encrypting of data, decrypting of data, and calculating of random numbers.
Performing the tasks to establish a secure session is processor intensive. If a general purpose processor, acting as the host processor for a network element, performs these tasks, then the network element's system performance will suffer because resources will be consumed for the tasks. The results of poor system performance can impact a network and users in various ways depending on the function of the network element (e.g., routing, switching, serving, managing networked storage, etc.).
Coprocessors have been developed to offload some of the tasks from the host processor. Some coprocessors have been developed to perform a specific primitive task for the host processor (e.g., hash data). The addition of a task-specific coprocessor does not offload a significant amount of the secure session establishment tasks from the host processor. One alternative is to add multiple coprocessors to a network element, each performing a different task. Such an alternative is limited by physical constraints (e.g., number of slots to connect cards) and introduces the problem of multiple communications between the host processor and the multiple coprocessors.
Other coprocessors have been developed to perform more than one of the tasks required to establish a secure session. For example, assume a coprocessor can perform a cryptographic operation (i.e., an encrypt or decrypt), a key material generation operation, and a hash operation, and assume a server has received a request to establish an SSL 3.0 session. The server must call the coprocessor to decrypt a pre-master secret received from a client. To generate a master secret and key material, the host processor must then make 20 calls to the coprocessor (one for each hash operation). In just the beginning of establishing a single secure session, the host processor has made 21 calls to the multiple-task coprocessor. As illustrated by this example, a coprocessor that can perform multiple tasks does not solve the issue of resource consumption from multiple communications between the host processor and the coprocessor.
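By way of illustration only, the call count in the above example can be sketched as follows. The class `CountingCoprocessor` and the function `begin_ssl3_session` are hypothetical names introduced for this sketch, and the stub performs no actual cryptography; it merely counts host-to-coprocessor calls, using the figures of one decrypt and 20 hash operations given above.

```python
class CountingCoprocessor:
    """Hypothetical multiple-task coprocessor stub that only counts host calls."""

    def __init__(self):
        self.calls = 0

    def decrypt(self, data):
        self.calls += 1      # one host-to-coprocessor communication
        return data          # placeholder: no real decryption is performed

    def hash(self, data):
        self.calls += 1      # one host-to-coprocessor communication
        return data          # placeholder: no real hashing is performed


def begin_ssl3_session(coprocessor, hash_operations=20):
    """Count the calls made at the start of one session, per the example above."""
    # One call to decrypt the pre-master secret received from the client.
    coprocessor.decrypt(b"encrypted pre-master secret")
    # One call per hash operation to derive the master secret and key material.
    for _ in range(hash_operations):
        coprocessor.hash(b"hash input")
    return coprocessor.calls
```

Calling `begin_ssl3_session(CountingCoprocessor())` returns 21, matching the total described above; each of those calls is a separate communication between the host processor and the coprocessor.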
Despite the addition of these coprocessors, a large amount of resources is still consumed in establishing secure sessions. Establishment of a secure session may suffer from latency caused by multiple communications between the host processor and a multiple-task coprocessor or multiple single-task coprocessors. Multiple communications between the host processor and coprocessors consume system resources (e.g., bus resources, memory resources, clock cycles, etc.). The impact on the system can include limiting 1) the number of secure sessions that can be served and 2) the number of concurrent secure sessions that can be maintained by the system.
A method and apparatus for processing security operations are described. In one embodiment, a processor includes a number of execution units to process a number of requests for security operations. The number of execution units are to output the results of the number of requests to a number of output data structures associated with the number of requests within a remote memory based on pointers stored in the number of requests. The number of execution units can output the results in an order that is different from the order of the requests within a request queue. The processor also includes a request unit coupled to the number of execution units. The request unit is to retrieve a portion of the number of requests from the request queue within the remote memory and associated input data structures for the portion of the number of requests from the remote memory. Additionally, the request unit is to distribute the retrieved requests to the number of execution units based on availability for processing by the number of execution units.
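By way of illustration only, the use of pointers stored in the requests can be sketched as follows. The names `Request`, `OPERATIONS`, `execute`, and `demo`, the dictionary standing in for the remote memory, and the addresses are all assumptions made for this sketch; SHA-1 merely stands in for any security operation. Because each request carries a pointer to its own output data structure, each result lands in the correct location regardless of the order in which the requests are processed.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Request:
    opcode: str       # hypothetical operation name
    input_ptr: int    # address of the input data structure in remote memory
    output_ptr: int   # address of the output data structure in remote memory


# Hypothetical operation table; SHA-1 stands in for any security operation.
OPERATIONS = {"sha1": lambda data: hashlib.sha1(data).digest()}


def execute(request, memory):
    """Model one execution unit: read via input_ptr, write via output_ptr."""
    data = memory[request.input_ptr]
    memory[request.output_ptr] = OPERATIONS[request.opcode](data)


def demo():
    # A dict stands in for the remote memory (an assumption of this sketch).
    memory = {0x10: b"first", 0x20: b"second", 0x30: None, 0x40: None}
    queue = [Request("sha1", 0x10, 0x30), Request("sha1", 0x20, 0x40)]
    # Process in the reverse of queue order to model out-of-order completion.
    for request in reversed(queue):
        execute(request, memory)
    return memory
```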
In one embodiment, a method executes on a host processor. The method includes storing a number of requests for security operations within a request queue within a host memory, wherein the number of requests are in an order within the request queue. The method includes storing data related to the number of requests for security operations into a number of input data structures within the host memory. The method also includes allocating a number of output data structures within the host memory, wherein a coprocessor is to write results of the number of requests for the security operations into the number of output data structures. The coprocessor can write the results in an order that is different from the order of the requests within the request queue. Additionally, for each of the number of requests, a thread for execution on the host processor is allocated, wherein the thread periodically checks a value of a completion code stored in the output data structure for the associated request. The completion code indicates that the request is completed by the coprocessor.
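By way of illustration only, the host-side method can be sketched as follows, with one thread per request periodically checking a completion code. The names `PENDING`, `DONE`, `wait_for_completion`, `fake_coprocessor`, and `run_host`, the dictionary layout of the data structures, and the byte-reversal placeholder operation are assumptions of this sketch; the coprocessor is simulated by an ordinary thread that deliberately completes requests in reverse queue order.

```python
import threading
import time

PENDING, DONE = 0, 1   # hypothetical completion code values


def wait_for_completion(output, poll_interval=0.001):
    """Periodically check the completion code in an output data structure."""
    while output["completion_code"] != DONE:
        time.sleep(poll_interval)
    return output["result"]


def fake_coprocessor(request_queue):
    """Stand-in for the coprocessor; completes requests in reverse order."""
    for request in reversed(request_queue):
        output = request["output"]
        output["result"] = request["input"][::-1]   # placeholder operation
        output["completion_code"] = DONE            # set only after the result


def run_host(payloads):
    # Store the requests, their input data, and their allocated output
    # data structures (all held in ordinary Python objects here).
    request_queue = [
        {"input": p, "output": {"completion_code": PENDING, "result": None}}
        for p in payloads
    ]
    results = [None] * len(payloads)

    def waiter(index):
        results[index] = wait_for_completion(request_queue[index]["output"])

    # One polling thread per request.
    threads = [threading.Thread(target=waiter, args=(i,))
               for i in range(len(payloads))]
    for thread in threads:
        thread.start()
    coprocessor = threading.Thread(target=fake_coprocessor,
                                   args=(request_queue,))
    coprocessor.start()
    for thread in threads + [coprocessor]:
        thread.join()
    return results
```

In this sketch the result is written before the completion code is set, so that a polling thread never observes a completed request without its result.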
In an embodiment, a method includes retrieving, by a request unit, a number of requests for security operations from a host memory, wherein the number of requests are in an order within the host memory. The method also includes distributing, by the request unit, the number of requests for the security operations to a number of execution units. The distribution is based on availability of the number of execution units. Additionally, the method includes processing the number of requests for the security operations by the number of execution units. The method includes outputting results of the number of requests for the security operations to locations within the host memory, wherein an order of outputting of the results can be different from the order of the requests within the host memory.
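By way of illustration only, availability-based distribution and the resulting out-of-order output can be sketched as a small simulation. The function `simulate` and the notion of a per-request processing cost are assumptions of this sketch. Each request, taken in queue order, is handed to whichever execution unit becomes available first, and the requests are then listed in the order in which they finish.

```python
import heapq


def simulate(costs, num_units):
    """Return request indices in completion order.

    costs[i] is a hypothetical processing time for request i; requests are
    distributed in queue order to the earliest-available execution unit.
    """
    # Heap of (time at which the unit becomes available, unit id).
    units = [(0, unit) for unit in range(num_units)]
    heapq.heapify(units)
    finishes = []
    for request_id, cost in enumerate(costs):
        free_at, unit = heapq.heappop(units)     # earliest-available unit
        done_at = free_at + cost
        finishes.append((done_at, request_id))
        heapq.heappush(units, (done_at, unit))   # unit is busy until done_at
    finishes.sort()                              # order of result output
    return [request_id for _, request_id in finishes]
```

With two execution units and costs of 5, 1, and 1, `simulate([5, 1, 1], 2)` returns `[1, 2, 0]`: the second and third requests complete on one unit while the first is still being processed on the other, so the results are output in an order different from the order of the requests.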