The present disclosure relates to an apparatus for controlling persisting of data to disk.
In recent years, the ability of application programs to communicate with each other or with system provided services in a computer system or network without having to become involved in the complexities of particular operating systems or communication protocols has been much enhanced by the development of Message Oriented Middleware (MOM) systems. MOM comprises software executable on a computer operating system which provides a common programming interface by means of which applications can communicate with other applications without specific knowledge of the different operating systems and/or protocols which may be used by those applications.
Some MOM products employ message queuing which allows programs to send and receive application specific data to each other without having a private, dedicated logical connection established between them. Instead, the applications communicate using messages containing a message descriptor and the application specific data. The messages are held on queues by a queue manager. The queue manager is effectively the runtime component of the MOM product and may also be referred to as a messaging server.
With the increasing demand for high performance messaging, there is a current focus on identifying mechanisms for more efficient processing within messaging systems.
Current messaging systems typically enable a delivery model to be specified such that, for example, messages that contain business critical data are persisted to disk (or other forms of reliable storage device), such that a message can still be recovered and processed following e.g., machine reboots and application failures/shutdown. In more detail, a queue can be configured with a defined quality of service such that messages stored using the queue are not lost under any circumstance (e.g., network power outage)—messages containing business critical data are sent to such a queue and subsequently persisted to disk. Note that a queue is an artifact that provides a layer of abstraction and administration from the disk itself.
Some messaging systems extend this concept further and provide a message reliability level for finer grained control associated with the quality of service provided with regards to the conditions under which messages can be discarded. In some scenarios, it is acceptable for messages to be discarded, e.g., if a server computer system becomes resource constrained, while in others, it may be preferable to retain messages even at the risk of overloading the server computer system.
For example, in a system 100 shown in FIG. 1, a client application 105 generates one or more messages 106, 107 and 108 and sends each of the messages to a message queue 120 of a messaging system 115 residing on a server computer system 110. Message reliability levels having a plurality of values can be associated with the message queue 120 such that e.g., the messages 106, 107 and 108 can be persisted to disk.
If the reliability level is fairly low, such that at least some messages may be discarded, there is a risk that messages are lost, e.g., at restart of the server computer system or when the server computer system has a heavy workload. However, as the reliability level is increased, such that at least some messages are written to disk, the messaging system incurs additional overhead and associated system performance decreases. Further, read/write operations to disk can be slow.
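The trade-off described above can be illustrated with a minimal sketch. The level names `BEST_EFFORT` and `ASSURED`, the class `SketchQueue`, and the in-memory list standing in for the disk are all hypothetical and are not taken from any particular messaging product; the sketch only shows that messages held at a low reliability level do not survive a restart, whereas persisted messages do.

```python
from enum import IntEnum


class Reliability(IntEnum):
    """Hypothetical reliability levels; real products define their own."""
    BEST_EFFORT = 1  # messages may be discarded, e.g., at restart
    ASSURED = 2      # messages are persisted before being acknowledged


class SketchQueue:
    """Toy queue whose persistence behavior depends on its reliability level."""

    def __init__(self, level):
        self.level = level
        self.memory = []  # volatile copy of the queued messages
        self.disk = []    # stands in for reliable storage (assumption)

    def put(self, message):
        self.memory.append(message)
        if self.level >= Reliability.ASSURED:
            # The extra write is the overhead paid for higher reliability.
            self.disk.append(message)

    def restart(self):
        # Volatile state is lost; only persisted messages are recovered.
        self.memory = list(self.disk)
```

In this sketch, a queue created with `Reliability.BEST_EFFORT` is empty after `restart()`, while a queue created with `Reliability.ASSURED` recovers every message at the cost of one additional write per message.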
Although clustering of physical disks can be used to aid the situation, the associated costs of, e.g., extra hardware and power consumption do not make it an attractive solution for messaging systems. A more efficient approach is required to ensure that the physical disk does not become the system bottleneck while still providing for reliability.
There are a number of current solutions which address the above constraints and bottlenecks.
In one solution, message data are persisted to disk and a copy of the data is also cached, so that the data can be read without accessing the disk. Although this solution offers improved read access to the data, the same volume of data still has to be written to disk.
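The caching approach can be sketched as a write-through store. The class name `WriteThroughStore`, the one-file-per-message layout, and the `bytes_written` counter are assumptions made purely for illustration; the sketch shows that reads can be served from the cache while every byte of every message is still written to disk.

```python
import os


class WriteThroughStore:
    """Toy write-through store: every put goes to disk and to a cache."""

    def __init__(self, directory):
        self.directory = directory
        self.cache = {}
        self.bytes_written = 0  # total volume written to disk

    def put(self, message_id, data):
        path = os.path.join(self.directory, message_id)
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the data to the storage device
        self.bytes_written += len(data)
        self.cache[message_id] = data

    def get(self, message_id):
        # Serve from the cache when possible; fall back to disk otherwise.
        if message_id in self.cache:
            return self.cache[message_id]
        with open(os.path.join(self.directory, message_id), "rb") as f:
            data = f.read()
        self.cache[message_id] = data
        return data
```

After a `put`, a subsequent `get` does not touch the disk, but `bytes_written` grows by the full message size on every `put`, which is the limitation noted above.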
In another solution, asynchronous I/O is used to write message data to disk. Although this solution improves performance when writing to disk, the same volume of data still has to be written to disk. Also, there is a risk that messages are lost, e.g., if a messaging system fails before the asynchronous write is completed.
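The asynchronous approach can be sketched using a background worker thread in place of operating-system asynchronous I/O, which is a simplifying assumption. The names `write_durably` and `AsyncWriter` are hypothetical. The caller regains control as soon as the write is submitted, and the message is safe only once the returned future has completed; a failure before that point would lose the message.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def write_durably(path, data):
    """Write data to path, force it to disk, and return the byte count."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return len(data)


class AsyncWriter:
    """Toy asynchronous writer backed by a single worker thread."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)

    def submit(self, path, data):
        # Returns immediately; the write happens on the worker thread.
        return self._pool.submit(write_durably, path, data)

    def close(self):
        # Wait for all outstanding writes to finish before shutting down.
        self._pool.shutdown(wait=True)
```

A caller that needs a durability guarantee would wait on the future returned by `submit` (for example, by calling its `result()` method) before acknowledging the message, which gives back part of the latency benefit.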
In yet another solution, message compression can be used to reduce the size of message data being written to disk. However, compression can be a CPU-intensive operation and, if a message is small (e.g., less than a few KB in size), compression can actually increase the message's size.
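The size penalty for small messages can be observed with a minimal sketch. The use of zlib (DEFLATE) is an assumption chosen for illustration, and the helper name `compressed_sizes` and the sample payloads are hypothetical; other compression schemes have different fixed overheads.

```python
import zlib
from typing import Tuple


def compressed_sizes(payload: bytes) -> Tuple[int, int]:
    """Return (original_size, compressed_size) for payload using zlib."""
    return len(payload), len(zlib.compress(payload))


# A very small message: the fixed header and checksum outweigh any savings.
small = b"order=42"

# A larger, repetitive message: compression reduces the size substantially.
large = b"order=42;" * 1000

small_original, small_compressed = compressed_sizes(small)
large_original, large_compressed = compressed_sizes(large)
```

In this sketch, `small_compressed` exceeds `small_original` because the zlib stream carries a header and a trailing checksum regardless of payload size, whereas `large_compressed` is well below `large_original`.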