The present invention relates generally to processors and processing systems and, more particularly, to a processor that transfers task context information between processing cores and hardware accelerators.
Conventional data processing systems include processors, such as general-purpose processors and digital signal processors, and hardware accelerators that operate in tandem with the processors. Hardware accelerators are functional units that perform computationally intensive operations, and thereby enhance the performance and efficiency of conventional data processing systems. Hardware acceleration is used in 3D graphics processing, signal processing, spam control in servers, cryptography-specific operations, and the like. Examples of hardware accelerators include cryptographic co-processors, compression accelerators, pattern-matching accelerators, encryption hardware accelerators, and input/output (I/O) accelerators such as security encryption controllers, Ethernet controllers, and network-attached storage accelerators.
FIG. 1 is a schematic block diagram of a conventional data processing system 100 that includes a processor 102, a register context memory 104, an addressable memory 106, an accelerator scheduler 108, a data bus 109, and first through third accelerator cores 110-114. The processor 102 includes a register 116 that stores context information corresponding to a program task executed by the processor 102. The register 116 may comprise multiple types of registers, for example thirty-two (32) general purpose registers and some additional special purpose registers.
The register 116 is connected to the register context memory 104. The processor 102 also is connected to the addressable memory 106 and the accelerator scheduler 108. The addressable memory 106 and the accelerator scheduler 108 are connected to the first through third accelerator cores 110-114 by way of the data bus 109.
The first accelerator core 110 processes a portion of the context information (also referred to as a “set of data”). The processor 102 executes an accelerator call instruction to forward the set of data to one of the first through third accelerator cores 110-114. Upon execution of the accelerator call instruction, the processor 102 performs an operand packing operation on the set of data, i.e., modifies the set of data by transforming it into a consolidated set of data. The processor 102 then stores the modified set of data, i.e., copies it to the addressable memory 106.
The accelerator scheduler 108 receives a first accelerator identification (ID) corresponding to the first accelerator core 110 from the processor 102 and schedules the first accelerator core 110 to process the modified set of data. The first accelerator core 110 fetches the modified set of data from the addressable memory 106, processes it, generates a set of results, and stores the set of results in the addressable memory 106. The processor 102 then fetches the set of results from the addressable memory 106 and performs an unpacking operation, i.e., modifies the set of results by transforming it into an unconsolidated set of results.
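By way of illustration only, the conventional flow described above may be modeled in software as follows. This is a minimal sketch under stated assumptions, not a description of any actual hardware: the function names, the dictionary standing in for the addressable memory 106, the 32-bit little-endian packing format, and the doubling workload assigned to the first accelerator core 110 are all hypothetical and are not taken from FIG. 1.

```python
import struct

# Hypothetical software model of the conventional flow of FIG. 1.
# All names, the packing format, and the workload are illustrative assumptions.


def pack_operands(values):
    """Operand packing: consolidate 32-bit values into one contiguous buffer."""
    return struct.pack(f"<{len(values)}I", *values)


def unpack_results(buffer):
    """Unpacking: turn the consolidated buffer back into separate values."""
    return list(struct.unpack(f"<{len(buffer) // 4}I", buffer))


def first_accelerator_core(memory):
    """Stand-in for accelerator core 110: fetch, process, store results."""
    packed = memory["operands"]                       # fetch from memory
    values = struct.unpack(f"<{len(packed) // 4}I", packed)
    results = [(v * 2) & 0xFFFFFFFF for v in values]  # assumed workload
    memory["results"] = struct.pack(f"<{len(results)}I", *results)


# Assumed mapping from accelerator ID to accelerator core.
ACCELERATORS = {1: first_accelerator_core}


def accelerator_scheduler(accelerator_id, memory):
    """Stand-in for scheduler 108: run the core selected by its ID."""
    ACCELERATORS[accelerator_id](memory)


def accelerator_call(register_values, accelerator_id, memory):
    """Stand-in for the accelerator call instruction executed by processor 102."""
    memory["operands"] = pack_operands(register_values)  # pack, then store
    accelerator_scheduler(accelerator_id, memory)        # send accelerator ID
    return unpack_results(memory["results"])             # load, then unpack


if __name__ == "__main__":
    addressable_memory = {}  # stands in for addressable memory 106
    print(accelerator_call([1, 2, 3], 1, addressable_memory))  # [2, 4, 6]
```

In this model every accelerator call passes through the shared memory twice, once for the operands and once for the results, and each pass is accompanied by a packing or unpacking step.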
The overall performance and efficiency of the data processing system 100 depend on the communication path between the processor 102 and the accelerator cores 110-114. Operations performed on the set of data, such as operand packing and unpacking, also affect the performance and efficiency of the data processing system 100. Because the modified set of data is stored in the addressable memory 106, the processor 102 must perform load and store operations, which introduce latency in data transfers between the processor 102 and the accelerator cores 110-114. The operand packing and unpacking operations introduce additional latency in the data processing operations performed by the processor 102 and the accelerator cores 110-114.
It would be advantageous to have a processor or data processing system that reduces the latency in the data transfer and processing operations.