Currently, most computer programs use memory management mechanisms for dynamically allocating and deallocating memory blocks from their address spaces. When a program needs to allocate a memory block of a certain size, a dynamic memory allocation mechanism searches the available regions of the address space for a contiguous region that is large enough to accommodate the desired memory block size, and then updates its bookkeeping data to indicate that the allocated region is no longer available. When the program no longer needs a memory block that it has previously allocated, the dynamic memory allocation mechanism updates its bookkeeping data to indicate that the memory block is available for future allocation.
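The search-and-bookkeeping cycle described above can be sketched as a toy first-fit allocator. This is a minimal illustration only, not any particular allocator's design: the names `toy_alloc`, `toy_free`, `unit_in_use` and the 16-unit heap are all hypothetical, and the code is deliberately single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy heap: 16 equal-sized units, one bookkeeping flag each. */
#define HEAP_UNITS 16

static bool unit_in_use[HEAP_UNITS];   /* the allocator's bookkeeping data */

/* Search for `count` contiguous free units (first fit). On success, mark
 * them unavailable and return the index of the first unit; otherwise -1. */
int toy_alloc(size_t count)
{
    if (count == 0 || count > HEAP_UNITS)
        return -1;
    for (size_t start = 0; start + count <= HEAP_UNITS; start++) {
        size_t run = 0;
        while (run < count && !unit_in_use[start + run])
            run++;
        if (run == count) {
            /* Region found: record that it is no longer available. */
            for (size_t i = 0; i < count; i++)
                unit_in_use[start + i] = true;
            return (int)start;
        }
        /* Unit start+run is in use; the loop increment steps past it. */
        start += run;
    }
    return -1;
}

/* Mark the units of a previously allocated block as available again. */
void toy_free(int start, size_t count)
{
    assert(start >= 0 && (size_t)start + count <= HEAP_UNITS);
    for (size_t i = 0; i < count; i++)
        unit_in_use[(size_t)start + i] = false;
}
```

Nothing in this sketch protects `unit_in_use` against concurrent access, so two threads running `toy_alloc` at the same time could both claim the same units.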
In multi-threaded programs, multiple threads can concurrently use the dynamic memory allocation mechanism. To maintain correct program operation, proper synchronization between threads is required when they use the dynamic memory allocation mechanism concurrently. Without proper synchronization between threads, many serious problems may arise, such as the allocation of the same memory block more than once or at the same time, or losing the ability to reallocate a deallocated memory block. These problems may lead the program to crash or to produce incorrect results.
The conventional approach to synchronizing access to data shared among multiple threads is the use of mutual exclusion locking mechanisms. A mutual exclusion lock protecting one or more shared data items guarantees that, at any time, no more than one thread can access the protected data. Before a thread can access the protected data, it has to acquire the lock. When the thread is done with the data, it can release the lock. Further, at any time no more than one thread can hold the same mutual exclusion lock on a data item. If a primary thread holds a lock and other secondary threads need to acquire the same lock in order to access the data protected by the lock, then these secondary threads will have to wait until the primary thread releases the lock before they can acquire it and access the desired data.
A straightforward approach to synchronizing access to the dynamic memory allocation mechanism among multiple threads is to use a single lock. The use of a single lock ensures that whenever a thread needs to allocate or deallocate dynamic memory blocks it has to acquire that lock, perform its desired memory management operation and release the lock. For the sake of better throughput on multiprocessor systems, more sophisticated implementations of dynamic memory allocation use multiple locks in order to allow some concurrency of execution between threads running on different processors whenever these threads need to perform dynamic memory management.
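As a rough sketch of the single-lock approach, the following fixed-size block pool wraps every allocation and deallocation in one POSIX mutex. The pool layout and the names `pool_alloc`, `pool_free` and `pool_lock` are assumptions made for illustration, not taken from any specific implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_BLOCKS 4     /* hypothetical pool dimensions */
#define BLOCK_SIZE  64

typedef union block {
    union block *next;                  /* link used while the block is free */
    unsigned char bytes[BLOCK_SIZE];    /* payload while the block is allocated */
} block_t;

static block_t pool[POOL_BLOCKS];
static block_t *free_list;              /* shared bookkeeping data */
static int pool_ready;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take one block from the pool, or return NULL if none is available. */
void *pool_alloc(void)
{
    pthread_mutex_lock(&pool_lock);     /* acquire the single lock */
    if (!pool_ready) {
        /* First call: thread every block onto the free list. */
        for (size_t i = 0; i < POOL_BLOCKS; i++) {
            pool[i].next = free_list;
            free_list = &pool[i];
        }
        pool_ready = 1;
    }
    block_t *b = free_list;
    if (b != NULL)
        free_list = b->next;
    pthread_mutex_unlock(&pool_lock);   /* release the lock */
    return b;
}

/* Return a block obtained from pool_alloc to the pool. */
void pool_free(void *p)
{
    block_t *b = p;
    assert(b != NULL);
    pthread_mutex_lock(&pool_lock);
    b->next = free_list;
    free_list = b;
    pthread_mutex_unlock(&pool_lock);
}
```

Because every operation passes through `pool_lock`, threads on different processors are serialized. A thread that stalls or dies between the lock and unlock calls leaves every other caller waiting.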
A common problem of all the above-mentioned implementations that use locking is that the delay or crash of even one thread can cause the dynamic memory allocator to be deadlocked, which in turn may cause the program to be deadlocked or unable to allocate dynamic memory. For example, if a thread crashes while holding a lock, then without special help from the operating system the lock will remain unavailable indefinitely to other threads that may seek to acquire it.
Even if no threads crash, a thread can be interrupted while holding a lock. If the interrupt signal handler needs to acquire the same lock and the thread will not be scheduled until the signal handler completes, then this situation can lead to deadlock: the signal handler waits for a lock that will not be released, because the thread holding the lock will not be scheduled to run until the signal handler completes. For this reason, most systems prohibit the use of dynamic memory allocation functions in signal handlers.
An unconventional alternative to using locks is lock-free synchronization. Lock-free synchronization, which dates back to the IBM System/370, gives all threads unrestricted opportunity to operate on a shared data object. If an object is lock-free, then it is guaranteed that whenever a thread performs some finite number of steps towards an operation on the object, some thread, possibly a different one, must have made progress towards completing an operation on the object, regardless of the delay or crash failure of any number of other threads that may also be operating on the object. Therefore, if the dynamic memory allocation mechanism is implemented in a lock-free manner, then it will be immune to deadlock even if threads crash or are delayed arbitrarily, and irrespective of thread scheduling decisions made by the programming environment scheduler.
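To make the progress guarantee concrete, here is a minimal sketch of a lock-free operation built on C11 compare-and-swap. It is a bump allocator that only hands out memory and never reuses it, so it falls well short of a general-purpose allocator. The names `bump_alloc`, `arena_top` and the arena size are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ARENA_SIZE 1024   /* hypothetical arena; a multiple of ALIGNMENT */
#define ALIGNMENT  16

static _Alignas(ALIGNMENT) unsigned char arena[ARENA_SIZE];
static _Atomic size_t arena_top;   /* offset of the first unused byte */

/* Reserve `size` bytes (rounded up to ALIGNMENT) without taking a lock.
 * Returns NULL if the request cannot be satisfied. */
void *bump_alloc(size_t size)
{
    if (size == 0 || size > ARENA_SIZE)
        return NULL;
    size = (size + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);

    size_t old_top = atomic_load(&arena_top);
    for (;;) {
        if (size > ARENA_SIZE - old_top)
            return NULL;                /* arena exhausted */
        /* Try to move the top from old_top to old_top + size. If another
         * thread moved it first, the call fails and reloads old_top with
         * the current value, and the loop retries. */
        if (atomic_compare_exchange_strong(&arena_top, &old_top,
                                           old_top + size))
            return &arena[old_top];
    }
}
```

A thread repeats the loop only when its compare-and-swap fails, which happens only because another thread's compare-and-swap succeeded in the meantime. Some thread therefore always completes an allocation, and a thread that stalls or crashes at any point holds nothing that the others must wait for.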
Dynamic memory allocators known in the art either are not lock-free, require special support from the programming environment, are not generally applicable, or make trivializing assumptions. For example, it is trivial to design a lock-free memory allocator in which each thread owns a separate region of the address space and can allocate blocks only from that region, and in which a thread that deallocates a block simply adds it to its own available blocks. However, such a design can lead to unacceptable cases where one thread ends up with all available memory while other threads are unable to allocate new blocks.
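The failure mode of the trivial per-thread design can be shown with a small model. For determinism, each thread's private heap is represented here as an explicit `thread_heap_t` structure rather than real thread-local storage. That structure and the function names are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCKS_PER_THREAD 2   /* hypothetical size of each private region */

typedef struct node { struct node *next; } node_t;

/* One thread's private region and its list of available blocks. */
typedef struct {
    node_t storage[BLOCKS_PER_THREAD];
    node_t *free_list;
} thread_heap_t;

/* Put every block of the region on the owner's available list. */
void heap_init(thread_heap_t *h)
{
    h->free_list = NULL;
    for (size_t i = 0; i < BLOCKS_PER_THREAD; i++) {
        h->storage[i].next = h->free_list;
        h->free_list = &h->storage[i];
    }
}

/* Allocate only from the calling thread's own list; no sharing, no locks. */
node_t *heap_alloc(thread_heap_t *h)
{
    node_t *b = h->free_list;
    if (b != NULL)
        h->free_list = b->next;
    return b;
}

/* The deallocating thread keeps the block, whoever allocated it. */
void heap_free(thread_heap_t *h, node_t *b)
{
    b->next = h->free_list;
    h->free_list = b;
}
```

If a producer thread allocates blocks and a consumer thread deallocates them, every block migrates to the consumer's list. The producer's `heap_alloc` then returns NULL even though the consumer holds unused memory.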
What is needed is a dynamic memory allocator that is completely lock-free, is independent of special support from the programming environment, uses only widely supported hardware instructions, is general-purpose, is immune to deadlock even with the possibility of crash failures, is immune to deadlock regardless of the thread scheduling decisions of the programming environment, can support an arbitrary and dynamic number of threads, is not restricted to supporting a limited size of dynamic memory, and does not need to initialize the contents of significant parts of the address space.