Field
Embodiments of the present disclosure generally relate to network traffic processing and memory access. In particular, embodiments of the present disclosure relate to direct cache access (DCA) for directing traffic from Network Input/Output (I/O) devices directly to processor caches.
Description of the Related Art
A typical computer system includes a host processor, a host memory, and a host cache. Memory access has increasingly become a performance bottleneck because of the speed discrepancy between the central processing unit (CPU) and memory. A CPU cache is a cache used by the CPU of a computing device to reduce the average time to access memory, commonly referred to as latency. Data located in cache memory may be accessed in much less time than data located in the host memory because the cache stores relevant data closer to the CPU. Thus, a CPU with a cache memory spends far less time waiting for instructions and operands to be fetched and/or stored.
While a host processor executes application programs that require access to data, the host cache temporarily holds data for use by the processor. When the host processor needs to read from or write to a location in host memory (also referred to as main memory), it first checks whether a copy of that data is present in the host cache. If so, the processor reads from or writes to the cache, which is much faster than reading from or writing to host memory.
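The lookup order described above can be illustrated with a minimal sketch. It is a toy model rather than a description of any particular processor: the `HostCache` class, its `read` method, and the dictionary standing in for host memory are hypothetical names introduced only for illustration.

```python
class HostCache:
    """Toy read path: check the cache first, fall back to host memory."""

    def __init__(self, host_memory):
        self.host_memory = host_memory  # address -> value (stand-in for host memory)
        self.lines = {}                 # cached copies, address -> value
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # A copy in the cache is returned without touching host memory.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Otherwise the slower host memory is read and a copy is kept.
        self.misses += 1
        value = self.host_memory[address]
        self.lines[address] = value
        return value


memory = {0x10: "a", 0x20: "b"}
cache = HostCache(memory)
first = cache.read(0x10)   # not yet cached: served from host memory
second = cache.read(0x10)  # cached copy: served from the cache
```

In this sketch the first access to an address counts as a miss and every later access to the same address counts as a hit, which is the behavior the paragraph relies on.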
Direct cache access may be used to avoid system bandwidth overload and bandwidth restrictions by placing the data directly into the processor's cache before, instead of, or in parallel with placing the data into system memory. Direct cache access (DCA) is an information processing system protocol that permits data from an input/output (I/O) device to be placed into a corresponding cache based on protocol-aware applications.
Even with the advent of DCA, existing solutions are still not able to optimize the amount of data that should be transferred and written into the CPU cache. If too little data is transferred into the CPU cache, the CPU must fetch the remainder from host memory, and the resulting cache misses incur memory access penalties. Similarly, if more data than needed is transferred into the CPU cache, other relevant data may be evicted from the cache, and later accesses to that evicted data also miss and incur memory access penalties. Current systems and communication techniques are therefore inefficient with respect to performance and speed.
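The trade-off described above can be made concrete with a small simulation. The following is only a sketch under stated assumptions: a fully associative cache with least-recently-used (LRU) replacement, a fixed set of application cache lines already resident, and a CPU that reads only the first few lines of each packet. The function `simulate` and all of its parameters are hypothetical and are not taken from any existing DCA implementation.

```python
from collections import OrderedDict


def simulate(capacity, working_set, packet_lines, lines_needed, lines_pushed):
    """Return the CPU miss count for one packet in a toy LRU cache model."""
    cache = OrderedDict()  # ordered from least to most recently used

    def insert(line):
        cache[line] = True
        cache.move_to_end(line)
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used line

    def cpu_read(line):
        if line in cache:
            cache.move_to_end(line)
            return 0  # hit
        insert(line)
        return 1      # miss: line fetched from host memory

    # Application data resident in the cache before the packet arrives.
    for i in range(working_set):
        insert(("app", i))
    # The I/O device places the first `lines_pushed` packet lines in the cache.
    for i in range(min(lines_pushed, packet_lines)):
        insert(("pkt", i))
    # The CPU reads the packet lines it needs, then returns to its own data.
    misses = sum(cpu_read(("pkt", i)) for i in range(lines_needed))
    misses += sum(cpu_read(("app", i)) for i in range(working_set))
    return misses


too_little = simulate(capacity=8, working_set=6, packet_lines=16,
                      lines_needed=2, lines_pushed=0)
matched = simulate(capacity=8, working_set=6, packet_lines=16,
                   lines_needed=2, lines_pushed=2)
too_much = simulate(capacity=8, working_set=6, packet_lines=16,
                    lines_needed=2, lines_pushed=16)
```

Under these assumptions, placing no packet lines in the cache leaves the CPU missing on the lines it needs, placing exactly the needed lines produces no misses, and placing the entire packet evicts both the application data and the packet lines the CPU reads first, so it produces the most misses of the three cases.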
In view of the foregoing, there is a need for improved DCA schemes.