The present invention relates generally to communications between tasks that are working cooperatively, for example on executing a database query, and more specifically to a queue protocol for enabling two tasks to communicate via a pair of queues in shared memory without having to use any synchronization constructs to coordinate use of the pair of queues.
In the context of a task execution tree, in which a parent task sends requests to one or more child tasks, each parent-child pair of tasks communicates in an asymmetric fashion. One task (the requesting task) provides requests and the other (the replying task) replies to these requests. A reply is a collection of entries that may be variable in size. Each of the reply entries is processed separately by the requester.
It is a goal of the present invention that the requesting task should be able to issue multiple requests in quick succession (essentially simultaneously, but in a defined order), and that new requests can be issued by the requesting task before replies have been received for the previously issued requests.
Additional goals or requirements of the present invention are:
Both tasks should be able to execute concurrently.
Requests from one task and replies by the other should be communicated to each other via a pair of queues that are in shared memory.
No synchronization primitives (semaphores, critical sections, spin-locks, etc.) should be required between the two communicating tasks, even though they share the use of the pair of queues for communicating requests and replies.
Flow control should be accomplished by the use of a queue protocol between the tasks. Flow control is the mechanism used to prevent a task that produces excessive output from starving other tasks of resources and of their ability to produce output, which would otherwise lead to deadlocks and other inefficiencies. The queue protocol should block operation of any task that would overflow its output queue or underflow its input queue until the respective imminent overflow or underflow condition is removed by the task sharing the use of the respective queue.
The queue protocol should be implemented in an efficient manner.
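The flow-control goal listed above can be illustrated with a minimal Python sketch. The bounded list used as a buffer, the generator-based tasks, and the round-robin loop in `run` are all illustrative assumptions rather than part of the described system. The point shown is only that a task which would overflow its output or underflow its input stops making progress until the other task removes the condition.

```python
def producer(buf, capacity, items):
    """Write items into buf, blocking (yielding) while buf is full."""
    for item in items:
        while len(buf) >= capacity:   # imminent overflow: wait for the consumer
            yield "blocked"
        buf.append(item)
        yield "wrote"


def consumer(buf, out, expected):
    """Move `expected` items from buf to out, blocking while buf is empty."""
    while len(out) < expected:
        while not buf:                # imminent underflow: wait for the producer
            yield "blocked"
        out.append(buf.pop(0))
        yield "read"


def run(capacity=3, n=10, producer_turns=4):
    """Interleave the two tasks, deliberately favoring the producer."""
    buf, out = [], []
    prod = producer(buf, capacity, range(n))
    cons = consumer(buf, out, n)
    peak = 0                          # largest buffer occupancy observed
    while len(out) < n:
        for _ in range(producer_turns):
            next(prod, None)          # returns None once the producer is done
            peak = max(peak, len(buf))
        next(cons, None)
    return out, peak
```

Calling `run()` returns the items in their original order together with the peak buffer occupancy, which never exceeds `capacity` even though the producer is given four turns for every consumer turn.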
In summary, the present invention is a system and method for executing database queries in which a set of task data structures representing a directed graph of logically interconnected tasks is stored in a computer memory. The directed graph represents an execution plan for executing at least a portion of a specified database query.
Also stored in the computer memory are a pair of queues for each pair of interconnected tasks represented by the set of task data structures. One of the queues in each pair is a down queue for sending requests from a parent task to a child task, and the other of the queues is an up queue for sending replies from the child task to the parent task.
Each queue is a circular buffer and includes a head pointer that points to a next location in the queue to be read, and a tail pointer that points to a next location in the queue in which data can be written. Each queue has an associated size. Further, a queue is empty when its head pointer is equal to its tail pointer; and it is full when the head pointer is equal to a next location in the queue after the location pointed to by the tail pointer.
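The empty and full conditions just stated reduce to simple index arithmetic. The following Python sketch uses integer indices in place of pointers; the function names are assumptions chosen for illustration.

```python
def next_index(i, size):
    """Advance a circular-buffer index by one slot, wrapping at the end."""
    return (i + 1) % size


def is_empty(head, tail):
    """Empty: the next location to read is the next location to write."""
    return head == tail


def is_full(head, tail, size):
    """Full: the head equals the location just after the tail."""
    return head == next_index(tail, size)


def used_slots(head, tail, size):
    """Number of entries currently waiting to be read."""
    return (tail - head) % size
```

One consequence of these definitions is that a queue of a given size holds at most size minus one entries, because one slot is always left unused so that the full state can be told apart from the empty state.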
Every task except the root task in a task execution tree reads requests from a down queue in a respective one of the pairs of queues, generates a corresponding result and writes the result as a reply into the up queue in the respective pair of queues. Similarly, all tasks except leaf node tasks write requests into one or more down queues and read the corresponding replies in the corresponding up queues.
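The request and reply flow between one parent and one child can be sketched as follows. This is a single-threaded Python illustration under stated assumptions: `Ring`, `QueuePair`, `ChildTask`, the `work` callback, and the convention of tagging each reply entry with its request are all hypothetical names and choices, not details taken from the described system.

```python
class Ring:
    """Tiny circular buffer standing in for one shared-memory queue."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0                 # next slot to read
        self.tail = 0                 # next slot to write

    def empty(self):
        return self.head == self.tail

    def full(self):
        return (self.tail + 1) % len(self.slots) == self.head

    def put(self, item):              # caller must have checked not full()
        self.slots[self.tail] = item
        self.tail = (self.tail + 1) % len(self.slots)

    def get(self):                    # caller must have checked not empty()
        item = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        return item


class QueuePair:
    def __init__(self, size):
        self.down = Ring(size)        # requests: parent -> child
        self.up = Ring(size)          # replies:  child -> parent


class ChildTask:
    def __init__(self, pair, work):
        self.pair = pair
        self.work = work              # hypothetical: request -> list of reply entries
        self.pending = []             # reply entries not yet written to the up queue

    def step(self):
        # Write as many pending reply entries as the up queue can take.
        while self.pending and not self.pair.up.full():
            self.pair.up.put(self.pending.pop(0))
        # Only then pick up the next request, if one is waiting.
        if not self.pending and not self.pair.down.empty():
            self.pending = self.work(self.pair.down.get())


def demo():
    pair = QueuePair(size=4)
    child = ChildTask(pair, lambda n: [(n, i) for i in range(n)])
    requests, replies = [2, 0, 3], []
    expected = sum(requests)          # request n produces n reply entries
    while len(replies) < expected:
        # The parent issues every request it can before reading any reply.
        while requests and not pair.down.full():
            pair.down.put(requests.pop(0))
        child.step()
        while not pair.up.empty():
            replies.append(pair.up.get())
    return replies
```

In `demo`, the parent issues all three requests before any reply arrives, and the replies are variable in size: two entries, none, and three entries respectively.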
Each task checks that a queue is not full before writing data into that queue, and checks that the sibling queue is not empty before reading data from the sibling queue. In addition, a task updates the tail pointer for a queue only after it has written data into the location in the queue to which the tail pointer is updated, to ensure that the other task does not attempt to read that queue location until the new data has been written into it. Further, each task writes and reads data to and from respective ones of the queues without first acquiring ownership of a corresponding synchronization mechanism. That is, the aforementioned queue usage protocol is sufficient, by itself, to ensure that two tasks do not make conflicting use of the queues, despite the fact that the pair of queues are in shared memory and are not protected by a synchronization mechanism. In effect, the head and tail pointers and the queue protocol rules act as a de facto synchronization mechanism.
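The ordering rule described above (write the data first, then update the tail pointer) can be sketched in Python with one writer thread and one reader thread that share a queue without any lock. The class and function names are assumptions made for illustration. The sketch relies on the Python interpreter executing the two assignments in `try_put` in program order; a shared-memory implementation in a compiled language would additionally have to make sure the data write becomes visible to the other task before the tail update does.

```python
import threading
import time


class SpscQueue:
    """Single-writer, single-reader circular queue with no lock."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0                    # advanced only by the reader
        self.tail = 0                    # advanced only by the writer

    def try_put(self, item):
        nxt = (self.tail + 1) % self.size
        if nxt == self.head:             # queue is full: do not write
            return False
        self.slots[self.tail] = item     # 1. write the data
        self.tail = nxt                  # 2. only then publish it to the reader
        return True

    def try_get(self):
        if self.head == self.tail:       # queue is empty: nothing to read
            return False, None
        item = self.slots[self.head]     # 1. read the data
        self.head = (self.head + 1) % self.size   # 2. only then free the slot
        return True, item


def run_demo(count=200, size=8):
    """Pass `count` integers from a writer thread to the calling thread."""
    q = SpscQueue(size)
    received = []

    def writer():
        for i in range(count):
            while not q.try_put(i):      # blocked while the queue is full
                time.sleep(0)            # let the reader run

    t = threading.Thread(target=writer)
    t.start()
    while len(received) < count:
        ok, item = q.try_get()
        if ok:
            received.append(item)
        else:
            time.sleep(0)                # blocked while the queue is empty
    t.join()
    return received
```

Each index has exactly one writer: only the writing task changes `tail` and only the reading task changes `head`. That single-writer property is what lets the head and tail values stand in for a synchronization mechanism in this sketch.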
In some implementations, the tasks in the task tree may be divided for execution into two or more processes. For instance, a first plurality of the tasks may be executed in a first process, and a second plurality of the tasks may be executed in a second process. A first interprocess communication task in the first process and a second interprocess communication task in the second process exchange interprocess communication messages so as to communicate requests and replies between a first task in the first process and a second task in the second process. A first pair of queues is used for sending requests and replies between the first task and the first interprocess communication task, and a second pair of queues is used for sending requests and replies between the second task and the second interprocess communication task. The first and second interprocess communication tasks and the first and second pairs of queues operate together so as to simulate operation of a single pair of queues for communicating requests and replies between two tasks in a same process.
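The relay arrangement described above can be sketched in Python as follows. Everything here is a simplifying assumption made for illustration: the sketch runs in a single process, unbounded `collections.deque` objects stand in for the shared-memory queues, the `IpcLink` mailboxes stand in for whatever interprocess message transport is actually used, and the doubling performed by `second_task` is placeholder work.

```python
from collections import deque


class QueuePair:
    def __init__(self):
        self.down = deque()        # requests
        self.up = deque()          # replies


class IpcLink:
    """Stand-in for the interprocess message transport (an assumption)."""

    def __init__(self):
        self.to_second = deque()   # messages travelling to the second process
        self.to_first = deque()    # messages travelling to the first process


def ipc_task_first(pair, link):
    """First process: behaves like the child end of the first queue pair."""
    while pair.down:
        link.to_second.append(("request", pair.down.popleft()))
    while link.to_first:
        _kind, payload = link.to_first.popleft()
        pair.up.append(payload)


def ipc_task_second(pair, link):
    """Second process: behaves like the parent end of the second queue pair."""
    while link.to_second:
        _kind, payload = link.to_second.popleft()
        pair.down.append(payload)
    while pair.up:
        link.to_first.append(("reply", pair.up.popleft()))


def second_task(pair):
    """The remote child task; doubling the request is placeholder work."""
    while pair.down:
        pair.up.append(pair.down.popleft() * 2)


def demo():
    first_pair, second_pair, link = QueuePair(), QueuePair(), IpcLink()
    for request in (1, 2, 3):                 # the first task issues requests
        first_pair.down.append(request)
    ipc_task_first(first_pair, link)          # requests leave the first process
    ipc_task_second(second_pair, link)        # and arrive in the second
    second_task(second_pair)                  # the second task replies
    ipc_task_second(second_pair, link)        # replies leave the second process
    ipc_task_first(first_pair, link)          # and arrive in the first
    return list(first_pair.up)
```

From the point of view of the first task, which only ever touches `first_pair`, the result is the same as if the second task were reading and writing that pair directly.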