Typical queuing systems contain multiple queues, which a user can access by pushing data onto a queue or popping data from a queue. The more complex queuing systems typically have a common pool of memory shared amongst all or a group of queues. The common pool of memory is used for storing queue entries as they are pushed onto individual queues. One flexible approach is for the common pool of memory to be managed dynamically.
In a dynamic queue system, the common memory locations that are available for use by queues are tracked and managed. This tracking mechanism is sometimes referred to as the free list or free queue. Dynamic queue systems usually support two queue operations: push and pop. A push operation adds an entry to a queue, while a pop operation removes an entry. A push operation causes a memory location to be de-allocated from the free list and allocated to the particular queue. The data being pushed onto the queue is then stored at the newly allocated memory location. A pop operation returns the data from the queue, and the common memory pool location in which the data was stored is then de-allocated from the queue and re-allocated to the free list.
With dynamically allocated memory, individual queues are formed by a linked list. For every queue entry, there is a pointer to the next entry in the queue. Each queue typically has a read and write pointer for the start and end of the linked list. The locations in the common pool which are not allocated to a particular queue are maintained as part of a separate linked list, with its own write and read pointer, or a simple head of stack pointer. Either way this is usually called the free list. Another method is to use a separate dedicated first in first out (FIFO) memory where the addresses of the un-allocated common memory are stored within an array.
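The two free-list organizations described above can be sketched in software as follows. This is a minimal illustration only, not a hardware description: the class names `FreeList` and `FreeFifo`, the method names, and the `NIL` end-of-list marker are assumptions introduced here and do not appear in the original text.

```python
from collections import deque

NIL = -1  # hypothetical end-of-list marker (assumption, not from the text)


class FreeList:
    """Free locations kept as a linked list with a head-of-stack pointer."""

    def __init__(self, depth):
        # linked_list[a] holds the address of the location that follows a.
        self.linked_list = [a + 1 for a in range(depth)]
        self.linked_list[depth - 1] = NIL
        self.stack_ptr = 0  # head of the free list

    def allocate(self):
        addr = self.stack_ptr
        if addr == NIL:
            raise MemoryError("no free locations")
        self.stack_ptr = self.linked_list[addr]  # next free location becomes the head
        return addr

    def release(self, addr):
        self.linked_list[addr] = self.stack_ptr  # link freed location to the old head
        self.stack_ptr = addr                    # freed location becomes the new head


class FreeFifo:
    """Alternative: a dedicated FIFO holding the addresses of free locations."""

    def __init__(self, depth):
        self.fifo = deque(range(depth))

    def allocate(self):
        return self.fifo.popleft()  # raises IndexError when no location is free

    def release(self, addr):
        self.fifo.append(addr)
```

With the linked list form, the most recently released location is the next one allocated, whereas the FIFO form hands locations back in the order they were released.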
FIG. 1 depicts a classic singly linked list queue structure, indicating a typical relationship between queue data 10, queue entries 12 and pointers 14. As shown in FIG. 1, the reference numeral 14 is used generically to refer to a number of different types of pointers, such as: a read pointer; pointer to next; and write pointer.
FIGS. 2 and 3 depict physical and logical views, respectively, of a typical dynamic queue system with a linked list memory 16, a common memory pool 18 and the queue pointers: read 20, read next 22, and write 24, with queue pointers of each type being provided for each queue. The linked list memory 16 and common memory pool 18 are typically implemented as a single port random access memory (RAM), while the queue pointers 20, 22 and 24 are either register-based or stored in a RAM. In this example, the linked list memory 16 and common memory 18 each have eight memory locations, addressed A0 through A7. Because every entry in the common memory pool 18 must have a pointer to the next entry, the linked list memory 16 and common memory 18 are required to have an equal number of locations.
When allocated to a particular queue, a common memory pool location contains a queue entry. FIGS. 2 and 3 illustrate two queues: Queue 1 26 and Queue 2 28. When un-allocated, a common memory pool memory location is part of the free list 32, shown in FIG. 3. In this example, the start address of the free list is held in a stack pointer register 30 shown in FIG. 2. The physical view of FIG. 2 shows the memory contents, and the logical view of FIG. 3 displays the links from one entry to the next with respect to Queue 1 26, Queue 2 28 and Free List 32.
FIGS. 4 and 5 show physical and logical views, respectively, of a push operation performed on the typical dynamic queue system of FIG. 2. In such a system, the following memory operations are needed for a queue push operation:
Step 34: A read of the queue read/write pointer memory 20, 22, 24 is needed to retrieve the read and write queue pointers of the queue being operated on. This step is sometimes not needed as the queue pointers can be stored in registers instead of RAMs.
Step 36: Write the new queue entry 38 into the common memory 18. The free list stack pointer value 30 is used as the address to the common memory where the queue entry is to be written. This is typically a 1 cycle write operation to the common memory. As the stack pointer is stored in a register, the address for common memory is immediately available.
Step 40: Update the free list 32 by removing the newly allocated common memory. This is done by reading the linked list memory to get the next item in the free list and updating the free list stack pointer 30. This is typically a 1 cycle read operation of the linked list memory 16. The free list stack pointer register is then updated on the next cycle.
Step 42: Update the queue write pointer 24 with the current contents of the free list stack pointer register 30. This effectively moves the write pointer to the latest entry in the queue. This is typically a 1 cycle write operation to the queue read/write pointer memory.
Step 44: Update the queue linked list 16 with a pointer to the new entry. This takes the last entry currently in the queue and creates a link to the newly pushed data, effectively lengthening the linked list by one. This is typically a 1 cycle write operation to the linked list memory at the address pointed to by the current write pointer.
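The push steps above can be modeled sequentially as in the following sketch. It is a software illustration under stated assumptions, not the pipelined hardware: the class name `QueueSystem`, the `NIL` marker, and the handling of an empty queue and of a full pool are hypothetical additions, and the read next pointer 22 is left out for brevity.

```python
NIL = -1  # hypothetical "no entry" marker; the original text does not define one


class QueueSystem:
    def __init__(self, depth=8, num_queues=2):
        self.common = [None] * depth                      # common memory pool 18
        self.linked_list = [a + 1 for a in range(depth)]  # linked list memory 16
        self.linked_list[-1] = NIL
        self.stack_ptr = 0                                # free list stack pointer 30
        self.queues = [{"read": NIL, "write": NIL} for _ in range(num_queues)]

    def push(self, q, data):
        ptrs = self.queues[q]                 # step 34: fetch the queue pointers
        new_addr = self.stack_ptr
        if new_addr == NIL:                   # assumption: a full pool raises an error
            raise MemoryError("common memory pool is full")
        self.common[new_addr] = data          # step 36: write the entry at the stack pointer
        self.stack_ptr = self.linked_list[new_addr]  # step 40: next free location becomes the head
        old_write = ptrs["write"]
        ptrs["write"] = new_addr              # step 42: write pointer moves to the new entry
        if old_write == NIL:
            ptrs["read"] = new_addr           # assumption: first entry also sets the read pointer
        else:
            self.linked_list[old_write] = new_addr   # step 44: link the old tail to the new entry
        return new_addr
```

Note that the sketch performs one read (step 40) and one write (step 44) of the linked list memory per push, matching the two linked list transactions counted in the text.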
FIGS. 6 and 7 show physical and logical views, respectively, of a pop operation performed on the typical dynamic queue system of FIG. 2. The following memory operations are needed for a queue pop operation:
Step 46: A read of the queue read/write pointer memory 20, 22, 24 is needed to retrieve the read and write queue pointers of the queue being operated on. This is sometimes not needed as the queue pointers can be stored in registers instead of RAMs.
Step 48: Read the queue entry from the common memory pool 18 using the queue read pointer 20 for queue 1 as the address to the common memory. This is a 1 cycle read operation of the common memory.
Step 50: Update the read pointer 20 for queue 1. The next read pointer value is written into the current read pointer register. This is a 1 cycle write operation into the read pointer register.
Step 52: Update the next read pointer 22 for queue 1. The linked list memory 16 is read to retrieve the address of the entry following the next read pointer. The next read pointer value is used as the address to the linked list memory and the data returned is written into the next read pointer register 22. This is a 1 cycle write operation into the next read pointer register, and a 1 cycle read operation of the linked list memory.
Step 54: Place the newly un-allocated memory at the top of the free list stack 32. The stack pointer 30 is updated with the current read pointer value. This is a 1 cycle write operation into the free list stack pointer register.
Step 56: The linked list value of the newly un-allocated memory is updated to point to the next entry in the free list. The read pointer value is used as the address to the linked list memory and the data written is the current value of the free list stack pointer.
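The pop steps above can likewise be modeled sequentially. The sketch below is illustrative only: the class name `QueueSystem`, the `NIL` marker, the initial memory contents, and the treatment of a queue that becomes empty are assumptions made here. Because software executes the steps one after another rather than in a pipeline, the old stack pointer value is saved before step 54 overwrites it so that step 56 can write it into the linked list memory.

```python
NIL = -1  # hypothetical end-of-list marker (assumption, not from the text)


class QueueSystem:
    def __init__(self):
        # Illustrative state for an eight-location pool: queue 1 holds the
        # entries at addresses 0 -> 2, queue 2 holds address 1, and
        # addresses 3..7 form the free list.
        self.common = ["a", "b", "c", None, None, None, None, None]
        self.linked_list = [2, NIL, NIL, 4, 5, 6, 7, NIL]
        self.queues = {
            1: {"read": 0, "read_next": 2, "write": 2},
            2: {"read": 1, "read_next": NIL, "write": 1},
        }
        self.stack_ptr = 3  # head of the free list

    def pop(self, q):
        ptrs = self.queues[q]                  # step 46: fetch the queue pointers
        freed = ptrs["read"]
        if freed == NIL:                       # assumption: popping an empty queue is an error
            raise IndexError("queue is empty")
        data = self.common[freed]              # step 48: read the entry at the read pointer
        nxt = ptrs["read_next"]
        ptrs["read"] = nxt                     # step 50: read pointer takes the next read value
        # step 52: follow the link from the old next read pointer
        ptrs["read_next"] = self.linked_list[nxt] if nxt != NIL else NIL
        if nxt == NIL:
            ptrs["write"] = NIL                # assumption: mark the queue as empty
        old_top = self.stack_ptr
        self.stack_ptr = freed                 # step 54: freed location tops the free list
        self.linked_list[freed] = old_top      # step 56: link it to the previous top
        return data
```

As with push, each pop performs one read (step 52) and one write (step 56) of the linked list memory.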
It is important to note that dynamic queue system implementations are usually pipelined. Therefore, many of the memory transactions previously described will occur concurrently. Because the memory transactions are pipelined, a push or pop operation will always encounter latency in completing. Regardless of latency, for maximum performance of a dynamic queue system, the system must complete a push or pop operation at every clock cycle.
Each push or pop queue operation requires one transaction to the common memory pool and two transactions to the linked list memory. Typical hardware implementations will use a single port RAM or a bank of single port RAMs for the common pool memory. The selection of RAM for the linked list memory is more critical, as two memory transactions are required. The least costly storage in terms of area utilization is a single port RAM for the linked list memory, but the performance impact is such that the system would be limited to one push or pop operation every two clock cycles.
Alternatively, some systems utilize register-based storage for the linked list. This option is feasible for systems that provide only a small amount of queue storage. For larger dynamic queue systems, a register-based approach does not provide the density of RAMs.
One more approach is for the linked list memory to use a dual port RAM, as these can perform both a read and a write within the same clock cycle. This meets the performance goals. However, the hardware area of the linked list storage can again quickly approach the footprint of the common pool memory, making such a system very costly and effectively negating some of the advantages of a dynamic queue system. In these situations it can take more hardware resources to maintain the linked list storage than the actual queue entries themselves.
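The two-cycle limit follows directly from the transaction counts. The sketch below is a simplified steady-state model, assuming a fully pipelined system with no other stalls; the function name and parameters are hypothetical and only the per-operation transaction counts come from the text.

```python
from math import ceil

COMMON_TRANSACTIONS_PER_OP = 1       # one access to the common memory pool
LINKED_LIST_TRANSACTIONS_PER_OP = 2  # one read and one write of the linked list memory


def cycles_per_operation(linked_list_accesses_per_cycle, common_accesses_per_cycle=1):
    """Minimum clock cycles per push or pop, set by the busier memory."""
    linked_list_cycles = ceil(LINKED_LIST_TRANSACTIONS_PER_OP / linked_list_accesses_per_cycle)
    common_cycles = ceil(COMMON_TRANSACTIONS_PER_OP / common_accesses_per_cycle)
    return max(linked_list_cycles, common_cycles)


single_port = cycles_per_operation(1)  # single port RAM: one access per cycle
two_access = cycles_per_operation(2)   # a memory able to read and write in the same cycle
```

Under this model a linked list memory limited to one access per cycle yields two cycles per operation, while one that can service a read and a write in the same cycle yields one.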
For ASIC (Application Specific Integrated Circuit) based implementations, queuing systems are required to be flexible and high performance while minimizing hardware resources. To maximize bandwidth, queues should be accessible for push and pop operations at every clock cycle.