One application of the inventive method for operating a disk storage system is in a network such as a local area network.
A local area network (LAN) for handling video data is illustrated in FIG. 1. The network 10 comprises a shared transmission medium 12 and a plurality of stations 14 connected to the shared transmission medium. In addition, a server 16 is connected to the shared transmission medium. A disk storage system 20 for storing a plurality of video files is connected to server 16. A plurality of the stations 14 typically wish to access the disk storage system simultaneously to retrieve video files stored in the disk storage system or to write video files into the disk storage system. As indicated above, the invention is also applicable to a stand-alone system wherein a disk storage system is accessed to retrieve streams for display on a local monitor or a disk storage system is accessed to store streams received from an external source.
Streaming data differs from transactional data as follows. With transactional data applications, the data rate associated with the traffic source is highly variable, i.e., it exhibits a high peak-to-average ratio. In contrast, the data rate associated with the transmission of a stream is relatively constant and is generally higher than the average rate associated with a transactional source.
In stream-oriented applications such as video, the process according to which data is produced by a source (e.g., a disk storage system) and consumed by a destination (e.g., a decoder at an end-user station) is relatively continuous and steady. As far as the storage system is concerned, two distinct but similar operations are of interest: (i) the recording of streams produced by sources, in which case the storage system is the consumer, and (ii) the playback of previously recorded stream-oriented files, in which case the storage system is the source. These processes are schematically illustrated in FIG. 2 and FIG. 3, respectively.
The storage requirements for streaming data such as video and multimedia data are different from the storage requirements for typical LAN data, which is transactional in nature. The files are larger by an order of magnitude or more. Even with compression techniques, the physical storage needs are large. While laser disks and CD ROMs provide cost-effective storage, they are awkward for providing simultaneous access to multiple users. A preferred storage medium for video files is the magnetic disk storage system.
A magnetic disk storage system 20 is illustrated in FIG. 4A. The disk storage system 20 comprises a plurality of disk drives 200. Each disk drive 200 comprises a disk 21 and a controller 210. The disk drive 200 is shown in greater detail in FIG. 4B. As shown in FIG. 4B, the disk 21 of the disk drive 200 comprises a plurality of platters 22. Each platter 22 has one or two magnetic surfaces, a bottom surface 23 and/or a top surface 24, for recording. Associated with each recording surface 23 or 24 is a read/write head 26. In the disk 21 of FIG. 4B, let h denote the number of heads, and thus, usable surfaces. The heads 26 are moved to particular locations on the platter surfaces 23, 24 by the actuator 28, which is controlled by the controller 210. Thus, the controller 210 controls the positioning of the read/write heads 26 and the transfer of data in both directions between the magnetic surfaces and a local buffer 30 which forms part of the controller 210. The controller 210 also manages the transfer of data across the SCSI bus 220 (see FIG. 4A) into and out of a buffer internal to the adapter 230. The adapter 230 is then in charge of the transfer of data, via the system bus 250, into and out of the server computer system 16, which includes the memory 260, CPU 270, and network interface 280. In the case of a stand-alone system, the computer system 16 may not be a server and may not include a network interface.
As shown in FIG. 5, each recording surface 23, 24 is divided into a number of concentric tracks. Tracks on all surfaces which are located at the same radius form a cylinder. The number of tracks in a cylinder is thus equal to h. Let c denote the number of tracks per surface (and thus also the number of cylinders), and consider the tracks (and thus cylinders) to be numbered sequentially 1, ..., c, starting with the outer track (cylinder). Each track is divided into a number of fixed-size sectors. Due to the circular geometry of the surface, the number of sectors in a track is not the same for all tracks; there are more sectors in outer tracks than in inner tracks.
As shown in FIG. 6, the cylinders in the disk are divided into subsets of contiguous cylinders called zones, such that the number of sectors per track in a zone is the same for all tracks in the zone. We let $Z$ denote the number of zones, and consider the zones to be numbered sequentially from $0$ to $Z-1$ starting with the outer zone on the disk. In FIG. 6, the number of sectors in a track of zone $i$ is designated $\sigma_i$ and the number of cylinders in zone $i$ is designated $k_i$. Note that not all disks are organized into zones.
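By way of illustration only, the zoned geometry described above may be sketched in Python as follows. The names `Zone`, `total_sectors` and `zone_of_cylinder` are hypothetical, and the three-zone layout used in the example is an assumed one, not that of any particular drive. With h heads, the sketch computes the number of sectors on the disk as h times the sum over zones of the number of cylinders in the zone times the number of sectors per track in the zone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    cylinders: int          # k_i: number of cylinders in zone i
    sectors_per_track: int  # sigma_i: sectors in each track of zone i


def total_sectors(zones, heads):
    """Sectors on the whole disk: h * sum_i(k_i * sigma_i)."""
    return heads * sum(z.cylinders * z.sectors_per_track for z in zones)


def zone_of_cylinder(zones, cylinder):
    """Return the zone index (0 = outer zone) of a cylinder numbered from 1."""
    first = 1  # number of the first (outermost) cylinder of the current zone
    for index, zone in enumerate(zones):
        if first <= cylinder < first + zone.cylinders:
            return index
        first += zone.cylinders
    raise ValueError("cylinder number outside the disk")


# Assumed layout for illustration: outer zones hold more sectors per track.
example_zones = [Zone(100, 96), Zone(100, 80), Zone(100, 64)]
example_total = total_sectors(example_zones, heads=13)
```

For a disk that is not organized into zones, the same functions apply with a single `Zone` covering all cylinders.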
The disk rotates continuously at a constant speed of R rotations per minute, and the read/write heads are moved all together from one cylinder to another, as needed. All I/O transactions are for an integral number of sectors, the specific number of which depends on the application. To limit the overhead caused by head movement when writing or reading a block of data, the sectors on the disk are used consecutively and sequentially, going from sector to sector on a given track, from track to track in a given cylinder, and from cylinder to cylinder.
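The sequential ordering described above (sector to sector, then track to track, then cylinder to cylinder) may be illustrated by the following Python sketch. It assumes, for simplicity, a disk without zones, so that every track holds the same number of sectors. The function names and the zero-based numbering of logical sectors, heads and sectors within a track are assumptions made for the example; cylinders are numbered from 1 as in the text.

```python
def locate(logical_sector, heads, sectors_per_track):
    """Map a zero-based logical sector number to (cylinder, head, sector)."""
    sectors_per_cylinder = heads * sectors_per_track
    cylinder, rest = divmod(logical_sector, sectors_per_cylinder)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder + 1, head, sector  # cylinders are numbered from 1


def crossings(first_sector, count, heads, sectors_per_track):
    """Count head switches and cylinder changes within one block.

    The block occupies logical sectors first_sector .. first_sector+count-1.
    Returns (head_switches, cylinder_changes).
    """
    last_sector = first_sector + count - 1
    track_changes = (last_sector // sectors_per_track
                     - first_sector // sectors_per_track)
    first_cyl, _, _ = locate(first_sector, heads, sectors_per_track)
    last_cyl, _, _ = locate(last_sector, heads, sectors_per_track)
    cylinder_changes = last_cyl - first_cyl
    # Every cylinder change is also a track change; the remaining track
    # changes are head switches within a cylinder.
    return track_changes - cylinder_changes, cylinder_changes
```

A block that stays within one track yields `(0, 0)`, so that only blocks straddling a track or cylinder boundary incur additional head movement.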
An example of a disk drive is the HP C2240 drive which has h=13 read/write heads, a total of c=2051 cylinders, and a rotational speed of R=5400 rotations/minute. The 2,051 cylinders comprise 1981 data cylinders, 69 spares, and one for logs and maintenance information. They are organized into eight zones.
When a request for an I/O operation is placed in the disk storage system (say a read or write operation for some number of consecutive sectors), the heads are first moved to the cylinder where the first sector is located; the delay incurred in this operation is referred to as the seek time ($X_{seek}$). The head corresponding to the appropriate track then waits until the first sector appears under it, incurring a delay referred to as the rotational latency ($X_{ro}$). Once the first sector is located, the head begins reading or writing the sectors consecutively at a rate determined by the rotational speed; the time to read or write all sectors constituting the block is referred to as the transfer time ($X_{transfer}$). Note that if the block of data spans sectors located on more than one track in a given cylinder, then a switch from one head to the next is made at the appropriate time, thus incurring a so-called head switch time. If the block spans sectors located on multiple cylinders, then a head movement from one cylinder to the next takes place, thus incurring a track-to-track seek time each time this occurs. Accordingly, in performing an I/O operation, a certain amount of time is required. To assess the performance of a disk supporting an application, an analysis of the time required in each transaction must be undertaken.
The total time required to perform a read or write operation for a block, $T_{I/O}(\text{block})$, is the sum of the seek time, the rotational latency, and the transfer time:

$$T_{I/O}(\text{block}) = X_{seek} + X_{ro} + X_{transfer}$$
FIG. 7 shows how the total time $T_{I/O}$ for a block is divided into seek time, rotational latency, and transfer time. As shown in FIG. 7, the transfer time includes some head switch times and/or track-to-track seek times. It should be noted that seek times, rotational latencies and transfer times may be random and not known a priori.
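The decomposition of the I/O time into seek time, rotational latency and transfer time may be illustrated numerically by the following Python sketch. The function and parameter names are hypothetical. The sketch assumes that one full track passes under the head per rotation, so that reading n sectors of a track holding s sectors takes n/s of a rotation, and it adds any head switch and track-to-track seek times to the transfer time. Apart from the rotational speed of 5400 rotations per minute, the numerical values in the example (a 10 ms seek, 64 sectors per track, a 32-sector block) are assumptions chosen for illustration and are not specifications of any drive.

```python
def rotation_time_ms(rpm):
    """Duration of one full rotation, in milliseconds."""
    return 60_000.0 / rpm


def io_time_ms(seek_ms, latency_ms, sectors, sectors_per_track, rpm,
               head_switches=0, head_switch_ms=0.0,
               cylinder_changes=0, track_to_track_ms=0.0):
    """Total I/O time for one block: seek + rotational latency + transfer."""
    transfer_ms = (sectors / sectors_per_track * rotation_time_ms(rpm)
                   + head_switches * head_switch_ms
                   + cylinder_changes * track_to_track_ms)
    return seek_ms + latency_ms + transfer_ms


# Assumed example: half a rotation of latency, block of half a track.
rotation = rotation_time_ms(5400)
example_io = io_time_ms(seek_ms=10.0, latency_ms=rotation / 2,
                        sectors=32, sectors_per_track=64, rpm=5400)
```

Because the seek time and the rotational latency vary from one operation to the next, a single call such as the one above gives only one possible value of the I/O time, not a bound on it.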
Note that to obtain the total time required to transfer the data into the system's memory, one should also account for any additional time that may be incurred in contending for the SCSI bus and in transferring the data from the controller's buffer to the system's memory. However, as these operations take place to a large degree simultaneously with the transfer of data off the disk into the controller's memory, such additional delay is negligible and may be ignored.
The most important requirement on the storage system in supporting an active stream is to maintain the continuity of the stream. In the case of playback, data must be retrieved from the disk and made available to the consumer (e.g., a decoder) no later than the time at which it is needed so as to avoid letting the decoder underflow. Similarly, when a stream is being recorded, the writing of data on the disk must keep up with the rate at which the data is being produced so as to avoid letting the buffer (e.g., the buffer 30 of FIG. 4B) overflow and thus losing data. Thus, to maintain continuity, every I/O operation must be completed within some stringent time constraint.
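The continuity requirement for playback may be illustrated by the following Python sketch, which is a simplified model and not the inventive method. It assumes fixed-size blocks, a consumer that drains data at a constant rate, and consumption that begins an optional start-up delay after the first block arrives; under these assumptions block i is needed once blocks 0 through i-1 have been consumed. The function name and all parameter names are hypothetical.

```python
def first_underflow(arrival_times_s, block_bytes, rate_bytes_per_s,
                    start_delay_s=0.0):
    """Return the index of the first block that arrives too late, or None.

    arrival_times_s[i] is the time (in seconds) at which block i has been
    read from the disk and is available to the consumer.
    """
    if not arrival_times_s:
        return None
    play_time_s = block_bytes / rate_bytes_per_s  # time to consume one block
    start_s = arrival_times_s[0] + start_delay_s  # consumption begins here
    for index, arrival_s in enumerate(arrival_times_s):
        deadline_s = start_s + index * play_time_s
        if arrival_s > deadline_s:
            return index  # the consumer would underflow waiting for this block
    return None
```

In this model, a start-up delay gives every later block additional slack at the cost of delaying the beginning of playback.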
In view of the foregoing, it is an object of the present invention to provide a method for performing I/O transactions in a disk storage system so that the continuity of a plurality of streams is simultaneously maintained.