Background and Relevant Art
Computer systems and related technology affect many aspects of society. Indeed, the computer system's ability to process information has transformed the way we live and work. Computer systems now commonly perform a host of tasks (e.g., word processing, scheduling, accounting, etc.) that prior to the advent of the computer system were performed manually. More recently, computer systems have been coupled to one another and to other electronic devices to form both wired and wireless computer networks over which the computer systems and other electronic devices can transfer electronic data. Accordingly, the performance of many computing tasks is distributed across a number of different computer systems and/or a number of different computing environments.
In some computing environments, messages are temporarily stored (or “queued”) in a queue prior to processing. The queue provides buffering capabilities to compensate for differences in connection speeds, to permit asynchronous communication, etc. A queue and the service that utilizes data from the queue are typically run on a single computer system. Unfortunately, this can result in a bottleneck for data processing. As the number of other computer systems sending data to the queue increases, the responsiveness of the service decreases. At some volume of data, the queue and/or the server may lack sufficient resources to process the data in a timely manner (or at all).
Further, typical queue arrangements result in a single point of failure for the server. That is, if the queue or the machine where the queue is running malfunctions or crashes, queue state can be lost. When the queue is restarted, there may be no way for the queue to regain the lost queue state. Accordingly, computer systems may be required to resubmit data to the queue to get it processed.
In general, the potential for a data bottleneck and/or loss of queue state tends to reduce queue availability. That is, if a queue is overwhelmed or busy, other computer systems may view the queue as unable to process data. Further, when queue state is lost, other computer systems can also view the queue as unable to process data. In either case, the queue (even if running) is essentially unavailable for its intended purpose.
Machines can be clustered to provide increased availability for queues. For example, a database can be run on a cluster. Messages can be written to the database durably and then replicated to other machines on the cluster. However, clustering requires the allocation of resources for durable storage to increase availability. Thus, facilitating increased queue availability through clustering and durable storage may not be an efficient allocation of resources when the queued data is short-lived and some data loss is tolerable.