Distributed systems have evolved from highly parallel, structured, and homogeneous systems to flexible, highly scalable, and heterogeneous systems. These systems communicate via message passing and may not assume a regular network architecture. Peer-to-peer (P2P) content sharing networks and P2P voice networks are examples of such irregular distributed multi-computer systems. Highly optimized distributed data structures and algorithms have been developed for various distributed systems. However, the applicability of data structures and algorithms that suit traditional distributed systems needs to be revisited in the context of modern distributed systems. One such data structure is the distributed priority queue (DPQ), a fundamental data structure used in a variety of distributed applications.
Typically, distributed algorithms focus on optimal schemes to parallelize or distribute classic algorithms. In the case of DPQs, several algorithms have been developed to parallelize or distribute the heap-based priority queue data structure. These algorithms, however, do not preserve the locality of items; that is, items may be moved from node to node when the DPQ is re-heapified. It is also often necessary to know the number of nodes in a distributed system in order to map a DPQ algorithm onto it. Furthermore, in many min-heap-based DPQ algorithms, the root node becomes a processing bottleneck. These properties are undesirable for several modern distributed applications, such as the distributed call center example described below.
A distributed call center application consists of loosely connected distributed agent nodes that can receive and service calls from external users. Agent nodes place the calls in a queue and begin servicing them with an automated interactive voice response (IVR) system. Calls have to stay on the agent node at which they arrive, and associated data is generated while the calls are serviced. These queued calls that are being serviced can only be interrupted by the next available agent, which removes the head of the queue and answers the associated call. DPQ algorithms that move items from node to node each time an item is added to or removed from the queue are not suitable for implementing such a call center. Such systems would benefit from a method of administering a queue that allows items to remain on the nodes at which they arrive, that maintains a global logical queue distributed across the network, and that does not create a bottleneck on any single node.
A priority queue is formally defined as an ordered set of items where any item i consists of the pair (Weight_i, Record_i). Here Weight_i is typically a numerical value which determines the priority of an item in the queue, and Record_i is the data associated with the item. Two basic operations that can be performed on a priority queue are: 1) insert, which inserts a new item i with a predefined priority Weight_i into the priority queue and 2) deleteMin, which removes the item with highest priority or minimum weight (head of the queue) from the queue and returns it. (The phrases “item with minimum weight,” “item with highest priority” and “head of the queue” are used interchangeably hereinafter.)
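The definition above can be illustrated with a minimal single-node sketch in Python. The class name PriorityQueue, the method name delete_min, the call labels, and the arrival-order tie-breaker for equal weights are assumptions made for illustration only; they are not part of the definition.

```python
import heapq
import itertools


class PriorityQueue:
    """Min-priority queue of (weight, record) items; the lowest weight is the head."""

    def __init__(self):
        self._heap = []
        # Assumption: items with equal weight are served in arrival order.
        self._arrival = itertools.count()

    def insert(self, weight, record):
        """Insert a new item with a predefined weight."""
        heapq.heappush(self._heap, (weight, next(self._arrival), record))

    def delete_min(self):
        """Remove and return the item with minimum weight (head of the queue)."""
        if not self._heap:
            raise IndexError("delete_min from an empty priority queue")
        weight, _, record = heapq.heappop(self._heap)
        return weight, record

    def __len__(self):
        return len(self._heap)


pq = PriorityQueue()
pq.insert(5, "call-A")
pq.insert(2, "call-B")
pq.insert(9, "call-C")
head = pq.delete_min()  # the item with minimum weight: (2, "call-B")
```

Here the whole queue lives in one process, so the head is always directly known to that process.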
Each node stores a list of items, and collectively the nodes form a logical distributed queue. The insert or deleteMin requests can come to any node and in any order, independent of other requests. There are several challenges to realizing a DPQ in such loosely coupled message passing networks, and among these is the fact that queue items are distributed across different nodes. Due to the lack of any specific architectural connection between nodes, it is difficult to track the head of the queue. Another challenge in these systems is that the deleteMin and insert operations can happen at any node in real time. The DPQ framework should preserve priority order under such operations. Without any assumptions on the architecture, predicting the results of such concurrent operations is difficult. On the other hand, waiting to synchronize these operations through serialization is expensive and not scalable. DPQ algorithms in such systems should find a way to localize the effects of these operations but still maintain the logical DPQ.
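To make the notion of a logical distributed queue concrete, the following is a single-process Python simulation, offered only as a sketch. The Node class, the delete_min helper, and the node and call names are hypothetical. The sketch deliberately uses a naive strategy that polls every node for its local head, and it models neither real message passing nor concurrent requests.

```python
import heapq


class Node:
    """One node holding only the items that arrived at it (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self._local = []  # local min-heap of (weight, record) pairs

    def insert(self, weight, record):
        # The item stays on the node at which it arrived.
        heapq.heappush(self._local, (weight, record))

    def peek_min(self):
        return self._local[0] if self._local else None

    def pop_min(self):
        return heapq.heappop(self._local)


def delete_min(nodes):
    """Naive global deleteMin: poll every node's local head, remove the smallest."""
    candidates = [
        (node.peek_min(), index)
        for index, node in enumerate(nodes)
        if node.peek_min() is not None
    ]
    if not candidates:
        raise IndexError("delete_min from an empty distributed queue")
    _, owner = min(candidates)
    return nodes[owner].name, nodes[owner].pop_min()


nodes = [Node("n0"), Node("n1"), Node("n2")]
nodes[0].insert(7, "call-A")
nodes[2].insert(3, "call-B")
nodes[1].insert(5, "call-C")
served = [delete_min(nodes) for _ in range(3)]
```

Items never leave the node at which they arrived, yet they are removed in global priority order. The cost is that every deleteMin contacts every node, and the result is only well defined if requests are handled one at a time, which is the serialization expense described above.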