Access to shared data structures, for example in a database management system, can generally be implemented either through locking or message-passing. In a locking approach, exclusive access to shared data is given to a thread that is currently acting on the data structure. Other threads needing access to that data structure are required to wait.
Many currently available software programs are written and optimized for execution on a single central processing unit (CPU), or on a relatively small number of physical CPU cores, using a procedural approach that includes synchronization via locks and deep call stacks. Procedural programming approaches generally include use of procedures (e.g., routines, subroutines, methods, functions, etc.) containing a series of computational steps to be carried out as part of one or more operations to be performed on one or more data structures. Procedural programming can be considered as a list of operations for the CPU to perform in a linear order of execution, optionally with loops, branches, etc. Locks are a type of synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock enforces a mutual exclusion concurrency control policy, for example to ensure that correct results for concurrent operations are generated as quickly as possible.
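By way of a non-limiting illustration, the locking approach described above can be sketched as follows. The names LockedCounter and run_locked are hypothetical and are introduced here only for illustration; the sketch assumes a simple shared counter as the data structure and uses the standard Python threading module.

```python
import threading


class LockedCounter:
    """A shared data structure guarded by one mutual-exclusion lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Only one thread at a time can hold the lock; any other thread
        # calling increment() must wait here until the lock is released.
        with self._lock:
            self.value += 1


def run_locked(num_threads=4, increments_per_thread=1000):
    """Have several threads update the same counter and return the total."""
    counter = LockedCounter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

In this sketch every update is serialized through the single lock, so the final value is correct, but threads that contend for the lock spend time waiting rather than working.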
In contrast, approaches for heavily parallelized operation more typically employ message-passing, in which multiple CPU cores communicate over fast interconnect channels (e.g., in a same machine or between two or more discrete machines). In message-passing, a requestor sends a message (which can, for example, include data structures, segments of code, raw data, or the like) to a designated message-passing worker. The message-passing worker executes code associated with the message (for example, based on message type), which in turn may generate further messages or generate a return message (which can, for example, include an operated-on data structure, segment of code, raw data, or the like). Processes can be synchronized in this manner, for example by requiring that a process wait for receipt of a message before proceeding. In the case of message-passing, however, the worker executing the process does not actually wait, but rather processes further messages in the meantime. The process effectively resumes once the reply message is received and its processing starts. The code for processing a single message in a message-passing arrangement is generally lock-free and uses a very shallow stack. A lock-free algorithm (also referred to as a non-blocking algorithm) ensures that threads competing for a shared resource do not have their execution indefinitely postponed by mutual exclusion.
Generally speaking, a stack is a section of memory used for temporary storage of information. Message-passing approaches generally provide superior performance to lock-based procedural code, for example because data are properly partitioned and no additional synchronization besides message queues is generally required. Message-passing operations can be performed by a message-passing worker, which, as used herein, refers to a type of thread or other operator for performing a set of instructions that implement a message-passing approach.
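By way of a non-limiting illustration, a message-passing worker of the kind described above can be sketched as follows. The names message_worker and run_message_passing, the message layout (type, payload, reply queue), and the "add" and "read" message types are all assumptions made for this sketch. The only synchronization is the message queue itself (here, the standard Python queue.Queue), and the data are owned by a single worker.

```python
import queue
import threading


def message_worker(inbox, handlers):
    """Process messages one at a time until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        msg_type, payload, reply_to = message
        # Dispatch on message type; each handler is short and takes no lock.
        result = handlers[msg_type](payload)
        if reply_to is not None:
            # Send a return message back to the requestor.
            reply_to.put(result)


def run_message_passing():
    inbox = queue.Queue()
    replies = queue.Queue()
    state = {"total": 0}  # touched only by the worker thread

    def handle_add(amount):
        state["total"] += amount
        return state["total"]

    def handle_read(_payload):
        return state["total"]

    handlers = {"add": handle_add, "read": handle_read}
    worker = threading.Thread(target=message_worker, args=(inbox, handlers))
    worker.start()

    for amount in (1, 2, 3):
        inbox.put(("add", amount, None))   # no reply requested
    inbox.put(("read", None, replies))     # reply requested
    total = replies.get()

    inbox.put(None)                        # ask the worker to stop
    worker.join()
    return total
```

For simplicity, the requestor in this sketch is the main thread, which blocks while waiting for the reply. In an arrangement where the requestor is itself a message-passing worker, it would instead continue processing other messages until the reply arrives.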
Performance problems can also arise when the number of worker threads is limited while the number of objects to be processed is much larger. Such an arrangement often results in a worker thread being assigned several objects. However, when a worker thread frequently switches between two or more of its assigned objects, the CPU cache can be thrashed, with data for one object repeatedly evicting data for another, thereby causing performance to degrade or collapse.
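By way of a non-limiting illustration, the arrangement described above can be sketched as follows. The names assign_objects and count_object_switches are hypothetical, the modulo-based assignment is only one assumed way of mapping objects to workers, and the switch count is merely a rough proxy for cache pressure rather than a measurement of it.

```python
from collections import defaultdict


def assign_objects(object_ids, num_workers):
    """Map each object to one of a limited number of workers (assumed: modulo)."""
    assignment = defaultdict(list)
    for object_id in object_ids:
        assignment[object_id % num_workers].append(object_id)
    return dict(assignment)


def count_object_switches(message_stream):
    """Count how often consecutive messages for one worker target different objects."""
    switches = 0
    previous = None
    for object_id in message_stream:
        if previous is not None and object_id != previous:
            switches += 1
        previous = object_id
    return switches


# Ten objects shared among three workers: every worker owns several objects.
workers = assign_objects(range(10), 3)

# The same four messages for worker 0, in two different arrival orders.
interleaved = count_object_switches([0, 3, 0, 3])
grouped = count_object_switches([0, 0, 3, 3])
```

Under these assumptions, the interleaved order forces the worker to change objects on every message, whereas the grouped order changes objects only once, which suggests why frequent switching can put more pressure on the CPU cache.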