Data storage systems such as voice mail servers store and retrieve data such as phone messages for a large user base. This user base may consist of tens, hundreds or even thousands of users. Because they service a large user base, voice mail servers must be capable of simultaneously storing and retrieving multiple phone messages. The voice mail server stores each phone message in a set of disks known as a multidisk array. Users periodically retrieve these stored phone messages through a user interface. This user interface typically provides the user with several options for handling the stored messages such as reviewing, forwarding, deleting, or keeping stored messages. Typically, the voice mail server has the ability to store dozens of messages for each user. In addition, a voice mail server servicing hundreds or thousands of users may, during peak usage, be required to simultaneously store many dozens of voice mail messages.
Each stored phone message occupies a significant amount of disk storage space. Consequently, the voice mail server must have a correspondingly large data storage capability. This data storage capability is provided by the set of disks in the multidisk array. When a voice message requiring storage is received by the voice mail server, one of the disks in the multidisk array is selected, under a load balancing scheme, and the voice message is written to the selected disk. In order to balance the number of messages on each hard disk in the multidisk array, prior art load balancing schemes focused on selecting the disk in the multidisk array with the greatest amount of free disk space ("least full scheme") for recording incoming messages. However, this least full scheme presents problems in periods of high demand, in which a large number of requests to store phone messages are received over a short time interval and the load on the multidisk array is high. For example, if fifty voice messages are concurrently received by a voice mail server having a multidisk array containing twenty-five disks, each message would be routed to the disk having the most available disk space. Thus a queue length of forty-nine messages would develop as the second through fiftieth messages waited for access to the selected disk while the first message is written to the selected disk. A long queue of messages imposes an unacceptable delay on the voice mail server. Each queued message must be stored in random access memory (RAM) cache. As the example illustrates, a very long queue can quickly develop under a rigid least full scheme. This long queue may overflow the available RAM cache. As a result, the voice mail server may be forced to stop receiving messages during times of high demand. Such a situation is highly undesirable.
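The queue build-up in the fifty-message example can be illustrated with a minimal sketch. It assumes that the free-space figures are not refreshed until a write completes, so every message in the concurrent burst sees the same "least full" disk. The function name `assign_least_full` and the free-space values are hypothetical and chosen only for illustration.

```python
def assign_least_full(free_space, num_messages):
    """Route a burst of concurrent messages under a rigid least full rule.

    Assumption: free space is not updated while the burst is being
    assigned, so every message is routed to the same disk.
    Returns the number of messages routed to each disk.
    """
    routed = [0] * len(free_space)
    # Pick the disk with the greatest amount of free space.
    target = max(range(len(free_space)), key=lambda d: free_space[d])
    for _ in range(num_messages):
        routed[target] += 1
    return routed


# Hypothetical free-space values for twenty-five disks (arbitrary units).
free_space = [100 + d for d in range(25)]
routed = assign_least_full(free_space, 50)

# One message is being written; the rest wait in the RAM cache.
waiting = max(routed) - 1
idle_disks = routed.count(0)
```

Under these assumptions, `waiting` is forty-nine and twenty-four of the twenty-five disks receive no write requests at all, which matches the queue described in the example.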
The least full scheme is particularly undesirable in situations in which a new disk has been added to the multidisk array. Because the new disk is empty, all disk write requests are routed to the new disk despite the fact that there are several other available disks that could be written to in a parallel fashion. As a result, the voice mail service experiences undesirable delay. If traffic on the voice mail server is high at the time the disk is added, the long queue of messages waiting to be stored on the new disk may result in the failure of the voice mail server to receive new incoming messages.
To circumvent the problems presented by a least full disk assignment scheme, some prior art systems exploit the ability of the multidisk array to write to several disks in a parallel manner. One way to use this parallel resource is to implement a round robin disk assignment scheme. Under a round robin approach, write requests are assigned to the disks in a fixed sequential order that repeats after every disk has been assigned a write request. For example, if a voice mail server having twenty-five disks concurrently receives fifty messages, the round robin disk assignment scheme will assign message 1 to disk 1, message 2 to disk 2, and so forth. Once each of the twenty-five disks has been assigned a message, the round robin approach will cycle back to the first disk. Thus, in the example, the 26th message will be queued to disk 1, the 27th message will be queued to disk 2, and so forth until all fifty messages are assigned to a disk.
As a result, a queue length of only one or two develops in the example. Although the round robin approach provides a shorter queue length than the least full scheme, the method is problematic because it does not provide a means for balancing the number of messages on each disk in the multidisk array in all possible scenarios. For example, if a 26th disk is added to the existing 25 disks in the voice mail server described above, the round robin disk assignment scheme will assign the same number of incoming messages to each of the first twenty-five disks as it will to the 26th disk, and the number of messages on the 26th disk will therefore remain out of balance with respect to the original 25 disks for an extended period of time. If the reason for adding the 26th disk is that the other disks were running out of room, the round robin assignment method is inconsistent with the reason for adding the new disk and may result in one or more disks actually being completely filled (a condition which should never occur).
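Both properties of the round robin scheme, the short queue and the persistent imbalance after a disk is added, can be sketched as follows. The function name `assign_round_robin`, the stored-message count of 1000 per original disk, and the 2600 subsequent messages are hypothetical values used only for illustration; deletions by users are ignored.

```python
from itertools import cycle, islice


def assign_round_robin(num_disks, num_messages):
    """Assign messages to disks in a fixed, repeating sequential order.

    Returns the number of messages assigned to each disk.
    """
    assigned = [0] * num_disks
    for disk in islice(cycle(range(num_disks)), num_messages):
        assigned[disk] += 1
    return assigned


# Fifty concurrent messages across twenty-five disks: two per disk.
burst = assign_round_robin(25, 50)

# Hypothetical state after a 26th, empty disk is added to 25 disks
# that each already hold 1000 messages.
stored = [1000] * 25 + [0]
incoming = assign_round_robin(26, 2600)
after = [s + n for s, n in zip(stored, incoming)]

# The new disk receives the same share as every other disk, so the
# original gap in stored messages is unchanged.
gap = after[0] - after[25]
```

In this sketch every disk receives exactly two of the fifty burst messages, and after 2600 further messages the 26th disk still holds 1000 fewer messages than each of the original disks.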
In another example, if a large number of messages are deleted from a particular disk by users, the round robin approach will not compensate for the reduced number of messages on that disk relative to the other disks. The total load on the disk having fewer messages will be lighter than the load on the other disks because it will be accessed, on average, fewer times by users retrieving stored messages. This disproportionate disk access may well result in data storage inefficiency that causes further delay and an overloaded server.
Viewed from a different perspective, disks containing an excessive number of messages relative to other disks ("overloaded disks") in the multidisk array have a higher probability of being accessed by users who are retrieving stored messages. Thus the overloaded disks are, on average, in use more often than the remaining disks. The round robin approach does not compensate for this disproportionate usage. Consequently, under the round robin disk assignment scheme, the voice mail server may experience delays as incoming messages are assigned to disks that are servicing a large number of requests for stored messages. In periods of high demand, these delays will affect the speed with which messages are retrieved or stored.
Accordingly, it is an object of the present invention to provide a system and method for improving the way a data storage system, such as a voice mail server, selects a disk in a multidisk array to record incoming messages.