The present invention relates to servicing of input/output (“I/O”) requests by a storage device controller. The present invention is described and illustrated with reference to an embodiment included in a disk array controller that services I/O requests from a number of remote computers. However, alternative embodiments of the present invention may be employed in controllers of many other types of storage devices as well as in a general electronic server that carries out electronic requests generated by electronic client devices intercommunicating with the general electronic server.
FIG. 1 is a block diagram of a standard disk drive. The disk drive 101 receives I/O requests from remote computers via a communications medium 102 such as a computer bus, fibre channel, or other such electronic communications medium. For many types of storage devices, including the disk drive 101 illustrated in FIG. 1, the vast majority of I/O requests are either READ or WRITE requests. A READ request requests that the storage device return to the requesting remote computer some requested amount of electronic data stored within the storage device. A WRITE request requests that the storage device store electronic data furnished by the remote computer within the storage device. Thus, as a result of a READ operation carried out by the storage device, data is returned via communications medium 102 to a remote computer, and as a result of a WRITE operation, data is received from a remote computer by the storage device via communications medium 102 and stored within the storage device.
The disk drive storage device illustrated in FIG. 1 includes controller hardware and logic 103 including electronic memory, one or more processors or processing circuits, and controller firmware, and also includes a number of disk platters 104 coated with a magnetic medium for storing electronic data. The disk drive contains many other components not shown in FIG. 1, including read/write heads, a high-speed electronic motor, a drive shaft, and other electronic, mechanical, and electromechanical components. The memory within the disk drive includes a request/reply buffer 105 which stores I/O requests received from remote computers, and an I/O queue 106 that stores internal I/O commands corresponding to the I/O requests stored within the request/reply buffer 105. Communication between remote computers and the disk drive, translation of I/O requests into internal I/O commands, and management of the I/O queue, among other things, are carried out by the disk drive I/O controller as specified by disk drive I/O control firmware 107. Translation of internal I/O commands into electromechanical disk operations in which data is stored onto, or retrieved from, the disk platters 104 is carried out by the disk drive I/O controller as specified by disk media read/write management firmware 108. Thus, the disk drive I/O control firmware 107 and the disk media read/write management firmware 108, along with the processors and memory that enable execution of the firmware, compose the disk drive controller.
Individual disk drives, such as the disk drive illustrated in FIG. 1, are normally connected to, and used by, a single remote computer, although it has been common to provide dual-ported disk drives for use by two remote computers and multi-port disk drives that can be accessed by numerous remote computers via a communications medium such as a fibre channel. However, the amount of electronic data that can be stored in a single disk drive is limited. In order to provide much larger-capacity electronic data storage devices that can be efficiently accessed by numerous remote computers, disk manufacturers commonly combine many different individual disk drives, such as the disk drive illustrated in FIG. 1, into a disk array device, increasing both the storage capacity and the capacity for parallel I/O request servicing through concurrent operation of the multiple disk drives contained within the disk array.
FIG. 2 is a simple block diagram of a disk array. The disk array 202 includes a number of disk drive devices 203, 204, and 205. In FIG. 2, for simplicity of illustration, only three individual disk drives are shown within the disk array, but disk arrays may contain many tens or hundreds of individual disk drives. A disk array contains a disk array controller 206 and cache memory 207. Generally, data retrieved from disk drives in response to READ requests may be stored within the cache memory 207 so that subsequent requests for the same data can be more quickly satisfied by reading the data from the quickly accessible cache memory rather than from the much slower electromechanical disk drives. Various elaborate mechanisms are employed to maintain, within the cache memory 207, data that has the greatest chance of being subsequently re-requested within a reasonable amount of time. The disk array controller 206 may also elect to store data received from remote computers via WRITE requests in cache memory 207 in the event that the data may be subsequently requested via READ requests or in order to defer slower writing of the data to physical storage media.
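By way of illustration only, the read-caching behavior described above can be sketched as follows. The sketch assumes a least-recently-used replacement policy, which is only one of the many possible mechanisms for retaining data likely to be re-requested; the class name `ReadCache` and the callable `read_from_disk` are hypothetical and do not appear in the figures.

```python
from collections import OrderedDict


class ReadCache:
    """Small LRU read cache in front of a slow backing store (hypothetical sketch)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity              # maximum number of cached blocks
        self.read_from_disk = read_from_disk  # slow path: callable(block) -> data
        self.blocks = OrderedDict()           # block -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.blocks:
            # Fast path: the data is already in cache memory.
            self.hits += 1
            self.blocks.move_to_end(block)    # mark as most recently used
            return self.blocks[block]
        # Slow path: fetch from the disk drive, then keep a copy in the cache.
        self.misses += 1
        data = self.read_from_disk(block)
        self.blocks[block] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data
```

A second READ of the same block is then satisfied from the cache without invoking the slow path, which corresponds to the short-duration servicing discussed with reference to the later figures.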
Electronic data is stored within a disk array at specific addressable locations. Because a disk array may contain many different individual disk drives, the address space represented by a disk array is immense, generally many thousands of gigabytes. The overall address space is normally partitioned among a number of abstract data storage resources called logical units (“LUNs”). A LUN includes a defined amount of electronic data storage space, mapped to the data storage space of one or more disk drives within the disk array, and may be associated with various logical parameters, including access privileges, backup frequencies, and mirror coordination with one or more LUNs. Remote computers generally access data within a disk array through one of the many abstract LUNs 208–215 provided by the disk array via internal disk drives 203–205 and the disk array controller 206. Thus, a remote computer may specify a particular unit quantity of data, such as a byte, word, or block, using a bus communications media address corresponding to a disk array, a LUN specifier, normally a 64-bit integer, that specifies a LUN, and a 32-bit, 64-bit, or 128-bit data address within the logical data address partition allocated to the LUN. The disk array controller translates such a data specification into an indication of a particular disk drive within the disk array and a logical data address within the disk drive. A disk drive controller within the disk drive finally translates the logical address to a physical medium address.
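The first translation step, from a LUN specifier and a data address within the LUN to a particular disk drive and a logical address within that drive, can be sketched as follows. The mapping shown is an assumption made purely for illustration: each LUN is taken to be a concatenation of contiguous extents on internal disk drives, and the names `Extent`, `LUN_MAP`, and `translate`, along with all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    drive: int         # index of an internal disk drive
    start_block: int   # first logical block of the extent on that drive
    num_blocks: int    # length of the extent in blocks


# Hypothetical LUN map: LUN specifier -> ordered extents forming its address space.
LUN_MAP = {
    0: [Extent(drive=0, start_block=0, num_blocks=1000),
        Extent(drive=1, start_block=500, num_blocks=1000)],
    1: [Extent(drive=2, start_block=0, num_blocks=2000)],
}


def translate(lun, block):
    """Map (LUN, block address within the LUN) to (drive, block address on the drive)."""
    if block < 0:
        raise ValueError("negative block address")
    offset = block
    for extent in LUN_MAP[lun]:
        if offset < extent.num_blocks:
            return extent.drive, extent.start_block + offset
        offset -= extent.num_blocks   # skip past this extent
    raise ValueError(f"block {block} is outside LUN {lun}")
```

Under these assumed values, block 1000 of LUN 0 falls at the start of the second extent and therefore resolves to logical block 500 of drive 1. The final translation to a physical medium address is left to the disk drive controller and is not modeled here.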
Normally, electronic data is read and written as one or more blocks of contiguous 32-bit or 64-bit computer words, the exact details of the granularity of access depending on the hardware and firmware capabilities within the disk array and individual disk drives as well as the operating system of the remote computers generating I/O requests and characteristics of the communication medium interconnecting the disk array with the remote computers.
The disk array controller fields I/O requests from numerous remote computers, queues the incoming I/O requests, and then services the I/O requests in as efficient a manner as possible. Many complex strategies for I/O request servicing are employed, including strategies for selecting queued requests for servicing in an order that optimizes parallel servicing of requests by the many internal disk drives. In similar fashion, individual disk drive controllers employ various strategies for servicing I/O requests directed to the disk drive, including reordering received requests in order to minimize the relatively slow electromechanical seeking operations required to position the read/write heads at different radial distances from the center of the disk platters.
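As a hedged illustration of request reordering within an individual disk drive, the following sketch orders pending requests in a single upward sweep of the read/write heads followed by a single downward sweep, a technique commonly called elevator scheduling. This is one well-known strategy assumed here for illustration, not necessarily the strategy used by any particular disk drive controller; the function names and cylinder numbers are hypothetical.

```python
def elevator_order(head, cylinders):
    """Order requests as one upward sweep from the head position, then one downward sweep."""
    upward = sorted(c for c in cylinders if c >= head)
    downward = sorted((c for c in cylinders if c < head), reverse=True)
    return upward + downward


def seek_distance(head, order):
    """Total radial distance travelled when servicing requests in the given order."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


# Hypothetical example: head at cylinder 50, requests in arrival order.
arrival_order = [95, 10, 60, 20, 80]
reordered = elevator_order(50, arrival_order)   # [60, 80, 95, 20, 10]
fifo_cost = seek_distance(50, arrival_order)    # 280 cylinders
swept_cost = seek_distance(50, reordered)       # 130 cylinders
```

In this assumed example, reordering reduces total head travel from 280 to 130 cylinders relative to servicing the requests in arrival order.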
The present invention is related to a somewhat higher-level optimization with regard to I/O request servicing. The disk array has no control over the order and timing of I/O requests received from the numerous remote computers that concurrently access the disk array. However, the disk array controller must attempt to service incoming I/O requests in such a way as to guarantee a maximum response time to requesting remote computers as well as to guarantee servicing of some minimum number of I/O requests per unit of time. Many types of application programs running on remote computers, including applications that display or broadcast streaming video or audio data, require that the data be received reliably at specified data transfer rates without interruptions in the flow of data greater than specified maximum interruption times.
In FIGS. 3–5, referenced in this section, and in FIG. 6, referenced in the Detailed Description of the Invention section that follows, time-dependent servicing of I/O requests by a disk array controller is illustrated for three remote computers, “h1,” “h2,” and “h3,” and the disk array, “a.” In FIGS. 3–6, I/O request servicing is plotted along a horizontal timeline, and all four figures employ similar illustration conventions.
FIG. 3 illustrates a short time slice of desirable I/O request servicing by a disk array controller on behalf of three remote computers. In FIG. 3, and FIGS. 4–6 that follow, the horizontal axis 301 is a timeline, and I/O request servicing on behalf of remote computer “h3” is shown along horizontal line 302, I/O request servicing on behalf of remote computer “h2” is shown along horizontal line 303, I/O request servicing on behalf of remote computer “h1” is shown along horizontal line 304, and overall I/O request servicing by the disk array controller and internal disk drives is shown along the timeline 301. For the sake of simplicity, I/O request servicing is shown in FIGS. 3–6 as being of either short duration, such as I/O request servicing represented by block 305 in FIG. 3, or of long duration, as, for example, I/O request servicing represented by block 306 in FIG. 3. Short-duration I/O request servicing corresponds to reading data from, or writing data to, the memory cache (207 in FIG. 2), and long-duration I/O request servicing corresponds to immediately reading data from, or writing data to, internal disk drives.
In FIG. 3, the large block of I/O request servicing 307 by the disk array controller comprises servicing of the individual I/O requests represented by blocks 308–311 on behalf of remote computer “h1,” blocks 305, 312, and 313 on behalf of remote computer “h2,” and blocks 306 and 314 on behalf of remote computer “h3.” For additional simplicity of illustration, it is assumed, in the examples illustrated in FIGS. 3–6, that the disk array controller can service only one I/O request at any given time. As noted above, disk array controllers can normally concurrently service hundreds or thousands of I/O requests, but the principles illustrated in FIGS. 3–6 apply to any fixed limit or capacity for concurrently servicing I/O requests, and since it is easier to illustrate the case of the disk array controller having the capacity to service only one I/O request at a time, that case is assumed in FIGS. 3–6. FIG. 3 illustrates a desirable I/O request servicing behavior in which I/O request servicing is fairly distributed among all three remote computers “h1,” “h2,” and “h3.” Such desirable I/O request servicing occurs when, for example, I/O requests are generated in time in a statistically well-distributed manner among the remote computers, and no remote computer generates more than some maximum number of I/O requests per unit of time. That maximum is the number of I/O requests that the disk array controller can service using a fraction of its I/O request servicing capacity small enough to ensure that sufficient capacity remains to concurrently service the I/O requests generated by the other remote computers accessing the disk array.
Unfortunately, the fortuitous desirable behavior illustrated in FIG. 3 may quickly degenerate into undesirable patterns of I/O request servicing due, in part, to high levels of I/O requests generated by one or more remote computers. FIG. 4 illustrates a short time slice of undesirable I/O request servicing behavior. In FIG. 4, remote computer “h2” is generating the vast bulk of I/O requests at the apparent expense of servicing of I/O requests for remote computers “h1” and “h3.” Note that the disk array “a” is servicing I/O requests at nearly full capacity, represented in FIG. 4 by the large blocks 401 and 402 of I/O request servicing activity. In the time slice illustrated in FIG. 4, the disk array controller services fourteen I/O requests on behalf of remote computer “h2,” while servicing only three I/O requests on behalf of each of remote computers “h1” and “h3.” Assuming that remote computers “h1” and “h3” have made additional I/O requests in the time slice illustrated in FIG. 4, the disk array is servicing I/O requests preferentially on behalf of remote computer “h2” at the expense of remote computers “h1” and “h3.”
This undesirable I/O request servicing behavior may arise for many different reasons. Remote computer “h2” may be faster than the other remote computers, and may therefore generate requests at a higher rate. Alternatively, remote computer “h2” may, for some reason, have faster communications access to the disk array than either of the other remote computers. As a third alternative, the I/O requests from all three remote computers may arrive at the disk array at generally equivalent rates, but either by chance or due to peculiarities of input queue processing by the disk array controller, the disk array controller may end up processing, at least during the time slice illustrated in FIG. 4, I/O requests on behalf of remote computer “h2” at a higher rate than it services I/O requests on behalf of the other remote computers.
FIG. 5 illustrates a simple throttling methodology that can be applied by the disk array controller to prevent one or some small number of remote computers from monopolizing I/O request servicing by the disk array controller. To practice this methodology, the disk array controller divides the timeline of I/O request servicing into discrete intervals, indicated in FIG. 5 by the vertical axis 501 and evenly spaced vertical dashed lines 502–506. The timeline shown in FIG. 5 starts at time t=0 and includes six discrete subintervals, the first subinterval spanning I/O request servicing between time t=0 and time t=1, the second subinterval spanning I/O request servicing between time t=1 and time t=2, and so on. In order to prevent monopolization of I/O request servicing by one or a few remote computers, the disk array controller services up to some maximum number of I/O requests for each remote computer during each subinterval. In the example illustrated in FIG. 5, the disk array controller services up to one I/O request for each of remote computers “h1” and “h3” during each subinterval, and services up to two I/O requests during each subinterval for remote computer “h2.” Thus, remote computer “h2” receives I/O request servicing from the disk array at up to two I/O requests per subinterval. If the subintervals represent 100 milliseconds, then remote computer “h2” can receive servicing from the disk array controller of up to twenty I/O requests per second. Of course, if a remote computer generates fewer I/O requests per second than the maximum number of I/O requests that the disk array controller can service for that remote computer, the remote computer will receive servicing of fewer than the maximum number of I/O requests per second. The disk array controller may contract with remote computers for a specified maximum rate of I/O request servicing, as shown in FIG. 5, or may alternatively provide each remote computer accessing the disk array with some maximum rate of I/O request servicing calculated to ensure adequate response times to all remote computers while optimizing the I/O request servicing load of the disk array.
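The per-subinterval limit described above can be sketched, under stated assumptions, as a simple counter that is reset at each subinterval boundary. The class name `IntervalThrottle`, its methods, and the use of integer millisecond timestamps are hypothetical choices made for this illustration; only the quotas (one request per subinterval for “h1” and “h3,” two for “h2”) and the 100-millisecond subinterval are taken from the example.

```python
class IntervalThrottle:
    """Fixed-subinterval throttle: at most quotas[host] requests per subinterval."""

    def __init__(self, quotas, interval_ms):
        self.quotas = quotas              # host -> maximum requests per subinterval
        self.interval_ms = interval_ms    # subinterval length in milliseconds
        self.current_interval = None
        self.serviced = {}                # host -> requests admitted this subinterval

    def may_service(self, host, now_ms):
        """Return True, and count the request, if the host still has quota at now_ms."""
        interval = now_ms // self.interval_ms
        if interval != self.current_interval:
            # A new subinterval has begun: every host starts again with a full quota.
            self.current_interval = interval
            self.serviced = {}
        if self.serviced.get(host, 0) >= self.quotas[host]:
            return False
        self.serviced[host] = self.serviced.get(host, 0) + 1
        return True

    def max_rate_per_second(self, host):
        """Upper bound on the servicing rate implied by the quota."""
        return self.quotas[host] * 1000 / self.interval_ms


throttle = IntervalThrottle({"h1": 1, "h2": 2, "h3": 1}, interval_ms=100)
rate_h2 = throttle.max_rate_per_second("h2")   # 2 * 1000 / 100 = 20.0 requests per second
```

With a quota of two requests per 100-millisecond subinterval, the bound works out to twenty I/O requests per second for “h2,” in agreement with the figure given in the text.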
The simple scheme of providing servicing of up to a maximum number of I/O requests per unit of time to remote computers can be used to prevent monopolization of I/O request servicing by one or a small number of remote computers, as illustrated in FIG. 4. However, this simple scheme may introduce inefficient and non-optimal operation of the disk array and disk array controller. In FIG. 5, each block of I/O request servicing, such as block 507, is labeled with the time at which the disk array controller receives the I/O request from the requesting remote computer. For example, the I/O request that initiated I/O request servicing represented by block 507 in FIG. 5 was received by the disk array controller at time t=2.1. However, servicing of this I/O request was delayed, as indicated in FIG. 5 by the position of the leading edge of block 507 at approximately time t=2.5, because the disk array controller was occupied with servicing of the I/O request represented by block 508 on behalf of remote computer “h2.” Note, again, that in the very simplified examples illustrated in FIGS. 3–6, it is assumed that the disk array controller can service only one I/O request at any particular time. As noted earlier, disk array controllers can normally process a great number of I/O requests simultaneously by distributing them among internal disk drives for parallel execution. Nevertheless, the principle of I/O request servicing throttling is applicable regardless of the number of I/O requests that can be serviced concurrently by a disk array, and, since it is easier to illustrate the case in which the disk array can only handle one I/O request at a time, that minimal case is illustrated in FIGS. 3–6.
The major problem with the simple throttling scheme illustrated in FIG. 5 is that it can lead to blocking situations in which the disk array has I/O request servicing capacity that cannot be employed because of enforcement of throttling, although I/O requests are outstanding. This problem is illustrated in FIG. 5 in time subintervals 5 and 6. The disk array controller receives I/O requests from remote computer “h2” at times t=4.0, t=4.1, t=4.2, and t=4.4. The disk array controller also receives an I/O request from remote computer “h3” at time t=4.0. In this example, no previously received I/O requests are outstanding. The disk array controller first services the I/O request received from remote computer “h2” at time t=4.0, represented in FIG. 5 as block 509, next services the I/O request received from remote computer “h3” at time t=4.0, represented in FIG. 5 as block 510, and then services the I/O request received from remote computer “h2” at time t=4.1, represented in FIG. 5 as block 511. At the point in time when the disk array controller completes servicing of the I/O request represented by block 511, t=4.5, the disk array controller has already received two additional I/O requests from remote computer “h2.” However, the disk array controller has, in subinterval 5, serviced the maximum number of I/O requests allotted to remote computer “h2.” Therefore, the disk array controller may not begin servicing these additional I/O requests until time t=5.0, the start of the next subinterval. Unfortunately, at time t=4.5, the disk array controller has no other outstanding I/O requests. Therefore, as indicated in FIG. 5 by the lack of I/O request servicing between time t=4.5 and time t=5.0, the disk array controller is temporarily stalled, although it has capacity for servicing I/O requests and has outstanding I/O requests to service. Repercussions of this temporary stall can be seen in subinterval 6 in FIG. 5.
At times t=5.0 and t=5.1, the disk array controller receives I/O requests from remote computers “h1” and “h3,” respectively. However, the disk array controller must first service the outstanding I/O requests received from remote computer “h2,” represented in FIG. 5 by blocks 513 and 514. Thus, servicing of the requests received at time t=5.0 and time t=5.1 is delayed unnecessarily, since the I/O requests received from remote computer “h2” could have been serviced between time t=4.5 and time t=5.0 but for enforcement of the throttling scheme by the disk array controller.
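The stall described above can be reproduced with a small simulation, offered only as a sketch. Times are expressed in integer ticks of one tenth of a subinterval, so that t=4.0 corresponds to tick 40. The arrival times and quotas follow the example; the individual service durations are assumptions, chosen so that the third request completes at t=4.5, because the text does not state them. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(requests, quotas=None, ticks_per_interval=10):
    """Service requests one at a time in arrival order, optionally throttled.

    requests: list of (arrival_tick, host, duration_ticks), sorted by arrival.
    quotas:   host -> maximum requests started per subinterval, or None for no throttling.
    Returns (start tick of each request, ticks stalled with work waiting).
    """
    pending = list(enumerate(requests))
    starts = [None] * len(requests)
    used = {}                      # (subinterval, host) -> requests started
    now = stalled = 0
    while pending:
        interval = now // ticks_per_interval
        arrived = [p for p in pending if p[1][0] <= now]
        allowed = [p for p in arrived
                   if quotas is None
                   or used.get((interval, p[1][1]), 0) < quotas[p[1][1]]]
        if allowed:
            index, (_, host, duration) = allowed[0]   # oldest serviceable request
            used[(interval, host)] = used.get((interval, host), 0) + 1
            starts[index] = now
            now += duration
            pending.remove(allowed[0])
            continue
        # Nothing can start: wait for the next arrival or subinterval boundary.
        wake = (interval + 1) * ticks_per_interval
        future = [p[1][0] for p in pending if p[1][0] > now]
        if future:
            wake = min(wake, min(future))
        if arrived:                # requests are waiting but throttling forbids them
            stalled += wake - now
        now = wake
    return starts, stalled


# Arrivals from the example; durations (third field) are assumed values.
REQUESTS = [(40, "h2", 2), (40, "h3", 1), (41, "h2", 2), (42, "h2", 2),
            (44, "h2", 2), (50, "h1", 1), (51, "h3", 1)]
QUOTAS = {"h1": 1, "h2": 2, "h3": 1}

throttled = simulate(REQUESTS, QUOTAS)   # ([40, 42, 43, 50, 52, 54, 55], 5)
unthrottled = simulate(REQUESTS)         # ([40, 42, 43, 45, 47, 50, 51], 0)
```

Under these assumed durations, the throttled controller is stalled for five ticks, from t=4.5 to t=5.0, and the request received from “h1” at t=5.0 does not begin servicing until t=5.4. Without throttling, the same request begins servicing immediately at t=5.0.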
Manufacturers of disk array controllers, providers of network-based data storage, data intensive application program designers, and computer services end users have recognized the need for a better methodology for preventing monopolization by one or a few remote computers of servicing of I/O requests by disk arrays while, at the same time, providing more optimal I/O request servicing by disk array controllers.