Peripheral devices often use host memory for data transfers. Software on a host may request that a supplying peripheral write data into host memory, and wait for acknowledgement of completion before requesting that a consuming peripheral read the data from host memory to complete the data transfer.
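The host-mediated sequence described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not an actual driver or device interface: the class names `HostMemory`, `Supplier`, and `Consumer`, the function `host_mediated_transfer`, and the single staging region at offset 0 are all hypothetical and introduced only for illustration.

```python
class HostMemory:
    """Hypothetical host memory model that counts bytes moved in and out."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.bytes_written = 0
        self.bytes_read = 0

    def write(self, offset, data):
        self.buf[offset:offset + len(data)] = data
        self.bytes_written += len(data)

    def read(self, offset, length):
        self.bytes_read += length
        return bytes(self.buf[offset:offset + length])


class Supplier:
    """Hypothetical supplying peripheral holding the data to transfer."""

    def __init__(self, payload):
        self.payload = payload
        self.pos = 0

    def write_to_host(self, mem, offset, length):
        chunk = self.payload[self.pos:self.pos + length]
        mem.write(offset, chunk)
        self.pos += len(chunk)
        return len(chunk)  # stands in for the completion acknowledgement


class Consumer:
    """Hypothetical consuming peripheral that collects the data."""

    def __init__(self):
        self.received = bytearray()

    def read_from_host(self, mem, offset, length):
        self.received += mem.read(offset, length)


def host_mediated_transfer(supplier, consumer, mem, staging_size):
    """Host software loop: request a write, wait for completion, request a read.

    Returns the number of transfer requests the host issued.
    """
    requests = 0
    while supplier.pos < len(supplier.payload):
        # Request 1: supplier writes the next chunk into the staging region.
        written = supplier.write_to_host(mem, 0, staging_size)
        requests += 1
        # Only after that completion does the host ask the consumer to read.
        consumer.read_from_host(mem, 0, written)
        requests += 1
    return requests
```

In this sketch every byte crosses host memory twice (once written, once read), and the host issues two requests per staged chunk.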
However, a data transfer staged through host memory consumes two units of host memory bandwidth for each unit of data transferred: one for the write into host memory and one for the read out of it. Further, a host processor may issue each transfer request to the peripherals and receive each completion notification from the peripherals, which introduces processor response-time latency into the data transfer and increases overhead for the processor. Such transfers therefore tax both host memory resources and host processor resources.
If the portion of host memory used for a data transfer is increased to reduce the effect of processor latency and to avoid excessive processor usage, additional latency may be introduced to the data transfer while waiting for the larger memory reads and writes to complete. Alternatively, if the portion of host memory used for a data transfer is minimized, the processor may have to issue transfer requests to the peripherals and receive completion notifications at an excessive rate, increasing both the burden on the processor and the processor latency introduced into the transfer. In other words, there are often conflicting demands: to use a small portion of host memory to minimize latency and memory usage on the one hand, or to use a large portion of host memory to tolerate longer processor response times and minimize processor usage on the other.
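The tradeoff can be made concrete with a rough back-of-the-envelope model. The sketch below is illustrative only and rests on simplifying assumptions that are not taken from the text: the function name `transfer_cost` and its parameters are hypothetical, each staged chunk is assumed to cost the processor one write request and one read request, and the delay before the consuming peripheral can begin reading is assumed to be the time to fill one staging region plus one processor response time.

```python
import math


def transfer_cost(total_bytes, staging_bytes, link_bytes_per_s, cpu_response_s):
    """Estimate costs of a host-staged transfer (hypothetical, simplified model)."""
    chunks = math.ceil(total_bytes / staging_bytes)
    return {
        # Every byte is written into host memory once and read out once.
        "host_memory_traffic_bytes": 2 * total_bytes,
        # Assumed: one write request and one read request per staged chunk.
        "processor_requests": 2 * chunks,
        # Assumed: the consumer cannot start until the first chunk is fully
        # written and the processor has reacted to its completion.
        "first_data_latency_s": staging_bytes / link_bytes_per_s + cpu_response_s,
        "host_memory_used_bytes": staging_bytes,
    }


if __name__ == "__main__":
    # Example figures are arbitrary and chosen only to show the trend.
    for staging in (4 * 1024, 256 * 1024):
        cost = transfer_cost(
            total_bytes=1024 * 1024,
            staging_bytes=staging,
            link_bytes_per_s=1e9,
            cpu_response_s=5e-6,
        )
        print(staging, cost)
```

Under these assumptions, a smaller staging region lowers the delay before data starts flowing and the memory consumed, but multiplies the number of requests the processor must handle; a larger region does the reverse. Host memory traffic stays at twice the transferred data in both cases.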