Embodiments of the present invention generally relate to computing and more specifically to a computer memory architecture for hybrid serial and parallel computing systems.
Parallelism has provided a growing opportunity for increased performance of computer systems. Many parallel systems are engineered to perform tasks with high or massive parallelism, but are not sufficiently scalable to effectively support limited parallelism in code and, in particular, do not efficiently process serial code. In many applications, however, it is necessary to perform both serial and parallel processing. For example, contemporary personal computers (PCs) use a graphics processing unit (GPU) and a central processing unit (CPU) within the same system. The GPU is typically a separate subsystem from the CPU subsystem, and each may be made by a different manufacturer and be provided on a different circuit board with dedicated resources. The GPU handles, among other things, the parallel processing of data streams, while the CPU handles, among other things, user inputs and the control and management of GPU operation.
These conventional approaches often do not allow for efficient execution of coordinated, mixed (i.e., “hybrid”) parallel and serial processing modes. For example, memory management functions, such as partitioning, cache levels, and consistency management, could be optimized differently in a parallel computing system than in a serial computing system. Because different cache arrangements and techniques may be used in each mode, transitioning among processing modes is non-trivial and requires time and resources, as well as overall system organization.