The present invention relates to data processing by accelerators, and more specifically, to a system of extendible input/output data mechanisms for use by an accelerator.
In general, a central processing unit (CPU), or host, offloads specific processing tasks to accelerators to reduce the workload on the CPU. The use of accelerators, such as field programmable gate arrays (FPGAs) and graphics processing units (GPUs), to process specific tasks is becoming more widespread.
Currently, an interface between the host and an accelerator is implemented as a queue on the host, which holds jobs for the accelerator to process asynchronously. A control structure is typically used by the host to convey job information, such as which operations are to be executed by the accelerator, the locations of the input data in memory, and the locations in memory to which the output data is to be written. These data location values are traditionally static for the life of the job, which extends from the creation of the control structure in the queue until the job is complete. The static nature of the data location values limits the job to a fixed amount of input/output data, which must be determined before the creation of the control structure.
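The conventional arrangement described above can be illustrated with a minimal sketch in C. It is not taken from any particular accelerator interface: the names accel_job, accel_queue, accel_enqueue and accel_dequeue, the operation codes, and the queue depth of 8 are all hypothetical and chosen only for illustration. The sketch shows a control structure whose input and output locations are fixed when the structure is created, and a host-side queue that holds such structures until the accelerator takes them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation codes; the text does not name any. */
enum accel_op { ACCEL_OP_COMPRESS = 1, ACCEL_OP_ENCRYPT = 2 };

/* Hypothetical static control structure: every field is set once,
 * when the job is created, and stays fixed until the job completes. */
struct accel_job {
    enum accel_op  op;          /* operation the accelerator executes  */
    const uint8_t *input;       /* location of the input data          */
    size_t         input_len;   /* fixed amount of input data          */
    uint8_t       *output;      /* location to write the output data   */
    size_t         output_cap;  /* fixed amount of reserved output     */
};

#define QUEUE_DEPTH 8           /* assumed depth, for illustration only */

/* Host-side ring buffer of pending jobs. */
struct accel_queue {
    struct accel_job jobs[QUEUE_DEPTH];
    size_t head;                /* index of the next job to dequeue */
    size_t count;               /* number of jobs currently queued  */
};

/* Host side: queue a job. Returns 0 on success, -1 if the queue is full. */
int accel_enqueue(struct accel_queue *q, struct accel_job job)
{
    assert(q != NULL);
    if (q->count == QUEUE_DEPTH)
        return -1;
    q->jobs[(q->head + q->count) % QUEUE_DEPTH] = job;
    q->count++;
    return 0;
}

/* Accelerator side: take the oldest job. Returns 0 on success,
 * -1 if the queue is empty. */
int accel_dequeue(struct accel_queue *q, struct accel_job *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->jobs[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return 0;
}
```

In this sketch the buffers referenced by input and output must remain valid and reserved from the call to accel_enqueue until the job completes, because the control structure offers no way to change them afterwards.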
In many cases, specifying the entire input/output areas in host memory before the creation of the control structure may require locking a large amount of memory space for the entire duration of the job, including the time the job spends queued for the accelerator. In addition, since the amount of output data may not be known at the time the control structure is created, a worst-case estimate is commonly used to reserve adequate space for the output data.
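The cost of a worst-case reservation can be made concrete with a small arithmetic sketch in C. The bound used here, an assumed maximum expansion ratio multiplied by the input length, is only one hypothetical way a host might estimate the worst case; the function names worst_case_output and wasted_bytes are likewise illustrative, and overflow of the multiplication is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worst-case bound: assume the output can be at most
 * max_ratio times as large as the input. */
size_t worst_case_output(size_t input_len, size_t max_ratio)
{
    assert(max_ratio > 0);
    return input_len * max_ratio;
}

/* Bytes that were reserved (and locked) for the job but never written. */
size_t wasted_bytes(size_t reserved, size_t produced)
{
    assert(produced <= reserved);
    return reserved - produced;
}
```

For example, with a 4096-byte input and an assumed ratio of 8, worst_case_output reserves 32768 bytes. If the job actually produces 5000 bytes, wasted_bytes reports that 27768 bytes were held for the whole life of the job without being used.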