In many applications, it is necessary for one process executing on a computer system to communicate with one or more other processes executing on the same or other computer systems. The mechanism used to carry out these communications varies from system to system. One mechanism that has facilitated process-to-process communication in a variety of systems is a message queue. Processes send information to other processes by enqueuing messages in the message queue. The receiving processes obtain the information by dequeuing the messages from the message queue. Typically, these messages are read in a first-in first-out manner. Implementations of message queues are described in U.S. Pat. Nos. 7,181,482, 7,185,033, 7,185,034, 7,203,706, 7,779,418, 7,818,386, 7,680,793, 6,058,389, and 8,397,244, the contents of which are incorporated herein by reference in their entirety.
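For illustration only, the enqueue and dequeue behavior described above can be sketched as a minimal in-process queue. The class name `MessageQueue` and the sample payloads are hypothetical and are not taken from any of the referenced patents; an actual message queue would also handle concurrency, persistence, and communication across processes.

```python
from collections import deque


class MessageQueue:
    """Minimal first-in first-out message queue (illustrative sketch only)."""

    def __init__(self):
        self._messages = deque()

    def enqueue(self, message):
        # A sending process adds a message at the tail of the queue.
        self._messages.append(message)

    def dequeue(self):
        # A receiving process removes the oldest message from the head.
        # Returns None when the queue is empty.
        if not self._messages:
            return None
        return self._messages.popleft()


queue = MessageQueue()
for payload in ("order-1", "order-2", "order-3"):
    queue.enqueue(payload)

# Messages come back in the order in which they were enqueued.
received = [queue.dequeue() for _ in range(3)]
```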
A message queue may be implemented in memory or on persistent secondary storage, such as a magnetic disk, an optical disk, or a solid-state drive. An in-memory message queue allows queue operations to take place in memory, thereby reducing I/O latency. However, memory is generally a more limited resource. Thus, it cannot always be assumed that a message queue can be completely implemented in memory.
An in-memory message cache that is backed by secondary storage may be used to store at least a portion of the messages in the message queue in memory. This allows queue operations to be performed in memory without limiting the size of the message queue to the available memory. For example, database-backed queues may be architected to handle extremely large queues. In a database-implemented message queue, an enqueuing process uses a connection to the database, or an enqueue session, to enqueue messages, and dequeuers use dequeue sessions to dequeue messages. Various caching algorithms exist for selecting a subset of data to store in memory. These algorithms include suboptimal algorithms, such as first-in, first-out (FIFO) and least recently used (LRU), as well as optimal algorithms, such as optimal page replacement (OPT) for virtual memory swapping, which requires knowledge of future accesses.
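As a minimal sketch of a message cache backed by secondary storage, the following example keeps a bounded number of messages in memory and uses LRU as the eviction policy. Several details are assumptions made for illustration rather than features of any particular implementation: the class name `MessageCache`, the use of a Python dictionary as a stand-in for secondary storage, and the write-through behavior on `put`.

```python
from collections import OrderedDict


class MessageCache:
    """Bounded in-memory cache over a backing store (illustrative sketch)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity            # maximum number of cached messages
        self.backing_store = backing_store  # dict standing in for secondary storage
        self._cache = OrderedDict()         # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def put(self, msg_id, message):
        # Assumption: write-through, so every message also reaches storage.
        self.backing_store[msg_id] = message
        self._admit(msg_id, message)

    def get(self, msg_id):
        if msg_id in self._cache:
            self.hits += 1
            self._cache.move_to_end(msg_id)  # mark as most recently used
            return self._cache[msg_id]
        # Cache miss: the message must be read from the backing store.
        self.misses += 1
        message = self.backing_store[msg_id]
        self._admit(msg_id, message)
        return message

    def _admit(self, msg_id, message):
        self._cache[msg_id] = message
        self._cache.move_to_end(msg_id)
        # Evict least recently used entries once the cache exceeds capacity.
        while len(self._cache) > self.capacity:
            self._cache.popitem(last=False)


store = {}
cache = MessageCache(capacity=2, backing_store=store)
for msg_id in (1, 2, 3):
    cache.put(msg_id, f"message-{msg_id}")

first = cache.get(1)  # message 1 was evicted, so this read is a miss
third = cache.get(3)  # message 3 is still cached, so this read is a hit
```

In this sketch the queue can grow to any size the backing store allows, while only `capacity` messages occupy memory at a time.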
Conventional implementations of message queues do not scale well. Specifically, as the number of dequeue sessions increases, the contention for the “hot” messages at the head of the queue increases, thereby degrading performance. In addition, when the enqueue sessions and dequeue sessions are spread across several systems, the amount of communication on the network and/or interconnect between systems can become excessive. Sharded queues address some of these issues. A sharded queue includes one or more shards. Within each shard, the messages are ordered based on enqueue time. However, no message order is enforced between shards. Typically, a dequeue session dequeues messages from each shard in a first-in first-out order. However, no dequeue order is enforced between shards. Implementations of sharded queues are described in U.S. Patent Application Pub. No. 2014/0372486, U.S. Patent Application Pub. No. 2014/0372489, and U.S. Patent Application Pub. No. 2014/0372702, the contents of which are incorporated herein by reference in their entirety.
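The ordering properties of a sharded queue can be sketched as follows. The class name `ShardedQueue` and the mapping of an enqueue session to a shard by taking the session identifier modulo the number of shards are hypothetical choices made for this example; the referenced publications are not represented as using this mapping.

```python
from collections import deque


class ShardedQueue:
    """Queue split into independent FIFO shards (illustrative sketch only)."""

    def __init__(self, num_shards):
        self.shards = [deque() for _ in range(num_shards)]

    def enqueue(self, session_id, message):
        # Assumption: each enqueue session is pinned to one shard.
        shard = self.shards[session_id % len(self.shards)]
        shard.append(message)

    def dequeue(self, shard_index):
        # Messages leave a given shard in enqueue order. Nothing here
        # orders messages of one shard relative to another shard.
        shard = self.shards[shard_index]
        return shard.popleft() if shard else None


sharded = ShardedQueue(num_shards=2)
sharded.enqueue(0, "a1")
sharded.enqueue(1, "b1")
sharded.enqueue(0, "a2")

from_shard_0 = [sharded.dequeue(0), sharded.dequeue(0)]
from_shard_1 = sharded.dequeue(1)
```

Because each shard has its own head, dequeue sessions working on different shards do not contend for the same messages.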
The performance of a system is affected by properties of the message cache, such as the size of the message cache (e.g., an amount of memory allocated to the message cache). When the size of the message cache is too large, unused portions of the message cache still consume memory, reducing the availability of memory for other system processes.
When the size of the message cache is too small, the overhead of moving messages between secondary storage and the message cache can consume excessive computational resources and can reduce system throughput. Furthermore, when the size of the message cache is too small, there is an increased chance of messages not being present in the message cache when a queue operation is requested. This causes I/O latency when processing the requested queue operation because the message must be retrieved from secondary storage at the time of the request. Such factors can significantly affect the performance of a system.
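The tradeoff between a message cache that is too large and one that is too small can be illustrated with a deliberately simplified model. Everything in the example is an assumption made for illustration: the function name `evaluate_cache_size`, the latency figures, and in particular the approximation that the hit rate equals the fraction of the working set that fits in the cache. Actual hit rates depend on the workload and on the caching algorithm.

```python
def evaluate_cache_size(cache_msgs, working_set_msgs,
                        memory_latency_us=1.0, storage_latency_us=1000.0):
    """Return (unused_slots, expected_latency_us) for a candidate cache size.

    Assumption: the hit rate is the fraction of the working set that fits
    in the cache. The latency values are hypothetical placeholders.
    """
    hit_rate = min(1.0, cache_msgs / working_set_msgs)
    # Slots beyond the working set hold no useful messages.
    unused_slots = max(0, cache_msgs - working_set_msgs)
    # Hits are served from memory; misses are served from secondary storage.
    expected_latency_us = (hit_rate * memory_latency_us
                           + (1.0 - hit_rate) * storage_latency_us)
    return unused_slots, expected_latency_us


too_small = evaluate_cache_size(cache_msgs=50, working_set_msgs=100)
too_large = evaluate_cache_size(cache_msgs=200, working_set_msgs=100)
```

Under these assumed numbers, the undersized cache wastes no memory but half of its requests pay the storage latency, while the oversized cache serves every request from memory but leaves 100 slots unused.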
Thus, there is a need for techniques for message cache sizing.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.