Computerized systems are most generally used to maintain data. Data is created, modified, deleted, and read. In some types of systems, the worst-case time it takes to perform such operations is not important. That is, delays can be introduced when creating, modifying, deleting, and reading data, without affecting the needed average performance of the computerized system as a whole.
However, other types of systems, known as real-time systems, require that such worst-case delays be kept to a minimum, so that such systems essentially perform data-related operations in real-time, or in near-real-time. A real-time system may thus be considered a computer system that responds to operations by immediately updating the appropriate data and/or generating responses quickly enough to ensure that the system meets its response-time criteria. Therefore, delays that are introduced when creating, modifying, deleting, and reading data can hamper a system's ability to operate in real-time.
Some types of computerized systems use multiple processors. Such multiple-processor systems have to ensure serialized execution of critical sections of computer code that manipulate shared data structures. For example, if the data of a shared data structure is updated by one processor before it is read by another processor, it is important to ensure the order of these operations. That is, it is important to ensure that the data read by the latter processor is the updated version of the data as updated by the former processor. To ensure such serialized execution, various mechanisms for mutual exclusion can be employed. Mutual exclusion mechanisms ensure, for instance, that the data of a data structure is not read by one processor while another processor is currently updating that data.
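The serialized execution described above can be illustrated with a minimal sketch, assuming a POSIX threads environment in which two threads stand in for two processors. The names `shared_counter`, `counter_lock`, `worker`, and `run_two_workers` are hypothetical and do not come from any particular system.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000

/* Hypothetical shared data and the mutex that serializes access to it. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Each worker updates the shared data only inside the critical section. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        int rc = pthread_mutex_lock(&counter_lock);
        assert(rc == 0);
        shared_counter++;               /* critical section */
        rc = pthread_mutex_unlock(&counter_lock);
        assert(rc == 0);
        (void)rc;
    }
    return NULL;
}

/* Runs two workers concurrently; returns the final count, or -1 on failure. */
static long run_two_workers(void)
{
    pthread_t t1, t2;

    shared_counter = 0;
    if (pthread_create(&t1, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Because every increment occurs while the mutex is held, `run_two_workers()` is expected to return exactly twice `ITERATIONS`. Without the mutex, concurrent increments could be lost.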
Mechanisms for mutual exclusion that have traditionally been used include spin locks, semaphores, reader-writer spin locks, and non-blocking synchronization, among other conventional mechanisms. Even single-processor systems may require controlled concurrency when critical-section code can be executed from both a process context and an interrupt context. That is, during the updating of the data of a data structure by a process being executed by a processor, the processor may receive an interrupt which causes it to read that data. Therefore, it is important for the processor to recognize that the interrupt should not result in reading of the data until the process has finished updating the data.
For instance, for a spin lock, a process cannot update, or possibly cannot even read, a section of data until it acquires a lock on that data, such that it waits or “spins” until the lock can be acquired. While short-term mutual exclusion mechanisms like spin locks are simple to use, processor speeds have increased faster than memory interconnect speeds, so that the cost of acquiring spin locks increases with each generation of computer architecture. The wider this gap is, the more cycles a processor has to wait for a slow memory interconnect to respond. Therefore, it has become increasingly necessary to look for alternatives to conventional spin-waiting locking models. This is especially true in the case of real-time systems.
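A spin lock of the kind described above can be sketched with the C11 `atomic_flag` type. This is a minimal illustration and not a production lock; the names `demo_spinlock`, `DEMO_SPINLOCK_INIT`, `demo_spin_trylock`, `demo_spin_lock`, `demo_spin_unlock`, and `guarded_increment` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin lock built on a single atomic flag. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One acquisition attempt: true if the lock was free and is now held. */
static bool demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait ("spin") until the lock is acquired. */
static void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l)) {
        /* spin: each failed attempt is a trip over the memory interconnect */
    }
}

static void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Updates the shared data only while the lock is held; returns the new value. */
static int guarded_increment(demo_spinlock *l, int *shared)
{
    assert(l != NULL && shared != NULL);
    demo_spin_lock(l);
    int value = ++*shared;
    demo_spin_unlock(l);
    return value;
}
```

Each call to `atomic_flag_test_and_set_explicit` is an atomic read-modify-write on shared memory, which is the per-acquisition cost that the paragraph above attributes to spin locks.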
Read-copy-update (RCU) is one such alternative mutual exclusion approach. In RCU, readers, which are threads or processes trying to access, but not modify, data, can access shared data without having to acquire any conventional type of lock. However, writers, which are threads or processes trying to update such data, have to use a special callback scheme to update the data. Writers replace all the global references to the data with references to a new, updated copy, and use the callback scheme to free the old copy after all the processors have lost or released all local references to the old data.
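The copy-then-publish pattern described above can be sketched in user-space C11 as follows. This is only an illustration under stated assumptions: there is a single writer, and detection of the point at which all readers have released their references (the grace period) is not implemented, so the caller is assumed to invoke `run_deferred_frees` only when that condition is known to hold. The names `struct route`, `g_route`, `reader_next_hop`, `writer_set_next_hop`, and `run_deferred_frees` are hypothetical and are not the API of any particular RCU implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical RCU-protected data. */
struct route { int next_hop; };

/* Writer-side list of old copies awaiting the deferred-free callback. */
struct deferred_free { struct route *old; struct deferred_free *next; };

static _Atomic(struct route *) g_route;   /* the global reference */
static struct deferred_free *g_pending;   /* touched by the writer only */

/* Reader: acquires no lock, only loads the current global reference. */
static int reader_next_hop(void)
{
    struct route *r = atomic_load_explicit(&g_route, memory_order_acquire);
    return r ? r->next_hop : -1;
}

/* Writer: copy, modify the copy, publish it, and defer freeing the old copy. */
static int writer_set_next_hop(int hop)
{
    struct route *old = atomic_load_explicit(&g_route, memory_order_relaxed);
    struct route *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    if (old != NULL)
        *fresh = *old;                     /* copy */
    fresh->next_hop = hop;                 /* update the copy */

    if (old != NULL) {
        struct deferred_free *d = malloc(sizeof *d);
        if (d == NULL) {
            free(fresh);
            return -1;
        }
        d->old = old;
        d->next = g_pending;
        g_pending = d;
    }
    /* Publish: readers arriving from now on see the new copy. */
    atomic_store_explicit(&g_route, fresh, memory_order_release);
    return 0;
}

/* Callback stage: frees the old copies and returns how many were freed.
 * Assumption: called only after all readers have released old references. */
static int run_deferred_frees(void)
{
    int freed = 0;
    while (g_pending != NULL) {
        struct deferred_free *d = g_pending;
        g_pending = d->next;
        assert(d->old != NULL);
        free(d->old);
        free(d);
        freed++;
    }
    return freed;
}
```

The reader never writes to shared memory, whereas the writer allocates, copies, and queues work for later, which reflects the asymmetry in cost between the two sides.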
Because the write side of RCU is significantly more expensive in terms of execution time as compared to the read side, RCU is best suited for scenarios in which the data to be protected is read more often than it is written. For single-processor systems, RCU eliminates the need to mask interrupts for mutual exclusion purposes. RCU is thus suitable for mutual exclusion in network routing tables, device state tables, deferred deletion of data structures, and multiple-path input/output (I/O) device maintenance, among other applications.
However, the read side of such so-called “classic” RCU, while having nearly zero if not zero overhead to perform an RCU read-side critical section (of code), is nevertheless not well suited for usage in real-time systems. This is because classic RCU disables preemption during RCU read-side critical sections. Preemption allows a high-priority real-time task to interrupt, or preempt, the execution of a lower-priority non-real-time task, thereby permitting the real-time task to attain its response-time goal. Therefore, disabling preemption can degrade real-time response time or latency. While some real-time applications can tolerate such degraded latency, many more absolutely cannot.
Other types of RCU are adapted for usage in real-time systems, but require significant overhead in performing an RCU read-side critical section. For instance, readers of a data structure commonly employ memory barriers so that they do not have to acquire any type of conventional lock on the data structure. A memory barrier is an explicit instruction to a processor that causes the processor to order reads and writes to memory. That is, a memory barrier is more precisely a memory barrier instruction that places constraints on the order of execution of other instructions, such as read and write instructions. As such, the processor cannot reorder read or write accesses (i.e., memory loads and stores) across the memory barrier.
For example, a section of code may include three read or write instructions, followed by a memory barrier instruction, followed by another three read or write instructions. A processor executing this section of code may reorder the execution of the first three read or write instructions relative to one another, and may reorder the execution of the last three read or write instructions relative to one another. However, because of the memory barrier instruction, the processor is not allowed to reorder the first three read or write instructions relative to the last three read or write instructions, and vice-versa.
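The three-instructions, barrier, three-instructions arrangement described above can be sketched with C11 atomics. This is an approximation: `atomic_thread_fence` is a language-level fence whose guarantees are defined in terms of synchronization between threads, and it stands in here for the processor-level memory barrier instruction. The variables `a` through `f` and the functions `write_both_groups` and `read_group1_after_group2` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Six hypothetical shared variables, split into two groups of three. */
static atomic_int a, b, c;   /* written before the barrier */
static atomic_int d, e, f;   /* written after the barrier  */

static void write_both_groups(void)
{
    /* Group 1: relaxed stores, which may be reordered among themselves. */
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_store_explicit(&b, 2, memory_order_relaxed);
    atomic_store_explicit(&c, 3, memory_order_relaxed);

    /* Barrier: group 1 may not move below it, group 2 may not move above. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Group 2: relaxed stores, which may be reordered among themselves. */
    atomic_store_explicit(&d, 4, memory_order_relaxed);
    atomic_store_explicit(&e, 5, memory_order_relaxed);
    atomic_store_explicit(&f, 6, memory_order_relaxed);
}

/* Reader side: once a group-2 store is visible, a matching fence ensures
 * that every group-1 store is visible as well.
 * Returns a + b + c, or -1 if group 2 has not been observed yet. */
static int read_group1_after_group2(void)
{
    if (atomic_load_explicit(&f, memory_order_relaxed) != 6)
        return -1;

    atomic_thread_fence(memory_order_seq_cst);

    int sum = atomic_load_explicit(&a, memory_order_relaxed)
            + atomic_load_explicit(&b, memory_order_relaxed)
            + atomic_load_explicit(&c, memory_order_relaxed);
    assert(sum == 6);   /* group 1 cannot appear to lag behind group 2 */
    return sum;
}
```

The reader needs its own fence as well. The ordering is only observable when both the writing side and the reading side constrain their accesses, which is one reason barrier-based read-side code carries overhead.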
Utilizing memory barriers adds significant overhead to such real-time RCU read-side critical sections. Memory barrier instructions are expensive in terms of added overhead, because they may be performed thousands of times more slowly than other operations. Furthermore, existing real-time RCU approaches may also employ atomic instructions, where atomicity means that a number of instructions are all performed, or none of them are. Atomic instructions are also expensive in terms of added overhead, and also may be performed thousands of times more slowly than other operations.
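The all-or-nothing property of atomic instructions can be illustrated with C11 atomic read-modify-write operations. The following is a minimal sketch; the counter `refcount` and the functions `ref_get`, `ref_put`, `ref_value`, and `reset_if_equals` are hypothetical, and the sketch does not measure or demonstrate the cost of such instructions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter of the kind a read-side primitive might maintain. */
static atomic_int refcount;

/* Atomic increment: the load, add, and store happen as one indivisible step.
 * Returns the new value. */
static int ref_get(void)
{
    return atomic_fetch_add(&refcount, 1) + 1;
}

/* Atomic decrement; returns the new value. */
static int ref_put(void)
{
    int previous = atomic_fetch_sub(&refcount, 1);
    assert(previous > 0);   /* a put without a matching get is a bug */
    return previous - 1;
}

static int ref_value(void)
{
    return atomic_load(&refcount);
}

/* Compare-and-swap: the counter becomes `desired` only if it currently
 * equals `expected`; otherwise nothing is changed. */
static bool reset_if_equals(int expected, int desired)
{
    return atomic_compare_exchange_strong(&refcount, &expected, desired);
}
```

A failed `reset_if_equals` leaves the counter untouched, which is the "none of them are performed" half of atomicity.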
There is thus a need within the prior art for improved RCU performance within real-time systems, as well as within other types of systems. More specifically, memory barriers and atomic instructions should be used within the read side of RCU as sparingly as possible. For these and other reasons, therefore, there is a need for the present invention.