Computer data storage and retrieval architectures have become highly evolved. On the one hand, large mainframe input/output subsystems, such as the IBM 3390, follow a hierarchical design having a plurality of controllers extending from a central processing unit which negotiate a control path between main memory and a particular mass storage device and provide control timing of the data transfer. A separate hierarchy of connections may be established between the main memory and the mass storage device as a channel over which the data to be stored or retrieved actually flows. High performance and low storage latency are achieved by providing many storage devices which are connected redundantly to multiple central processing units (CPUs) via hierarchical control and data paths. This particular arrangement closely or tightly couples the mass storage device to the CPU and thereby renders practical the adaptation of operating systems and applications programs to take advantage of certain characteristics of the mass storage device, such as the hard disk drive. One early data recording format exploiting this tight coupling is known as "count-key-data" or "CKD". CKD is a variable block length recording scheme in which each record is recorded in its entirety in accordance with a "count" field which specifies the record's address and the length of its data. The key field provides the record's "key" or search argument, and is followed by the data field comprising the record information. By using the key field, a preliminary search for records of interest can be carried out at the drive controller level rather than exclusively at the host CPU level. The CKD format is to be contrasted with the more modern and prevalent "fixed block architecture" (FBA) format.
Beginning about twenty years before the filing of this patent application, a merchant market arose for hard disk drives as storage device commodities sold separately from mainframe and mini-computers. These merchant market hard disk drives found application and acceptance in computerized word processing systems and in other applications where floppy disk systems provided insufficient storage capability. Since these low cost hard disk drives developed to meet the growing limitation of floppy disk drives, interfaces between the computer-based applications and the hard disk drives developed along the lines of the floppy disk drive. Those low level interfaces simply specified e.g. a storage device, a volume number, a "cylinder", a head (which specifies one track within the cylinder of tracks) and a physical record or "sector" address for a block of data to be written or read back. The drive responded directly to the controller by stepping a head actuator to the cylinder location and selecting the appropriate head. In other words, the interface specified a physical storage space, as indicated by the specific cylinder, head and sector (CHS), which was made unique for each disk drive, and the controller knew the approximate state of the drive in real time. The so-called "ST-506" interface, meaning the interface used by the Seagate Technology ST-506 5 1/4 inch, 5 megabyte hard disk drive, found widespread use in the 1980s during the development and proliferation of the now-ubiquitous personal computer.
Later, hard disk drive designs became more sophisticated and began including embedded disk drive controllers. With this development it became practical for the disk drive itself to begin mapping logical block address (LBA) space to physical storage locations on the disk drive. The computer host and its drive controller specified LBAs which the on-board drive controller translated into physical cylinder, head and sector values, and direct control of storage addresses on the drive thereupon became masked to the host controller. One example of an early bus level interface is provided by commonly assigned U.S. Pat. No. 4,639,863 of co-inventor Harrison and others entitled: "Modular Unitary Disk File Subsystem", the disclosure thereof being incorporated herein by reference. This plug-in PC bus level hard disk drive led directly to the development of the so-called "integrated drive electronics" or IDE interface, later followed by an "enhanced integrated drive electronics" or EIDE interface, which has become an industry standard drive interface convention.
Beginning in 1981, an industry standards working group developed and the American National Standards Institute (ANSI) later adopted a "small computing system interface" (SCSI) specification (X3.131-1986) which provides another higher level interface structure implementing some of the more advantageous channel disconnect and reconnect features associated with the mainframe mass storage systems such as the IBM 3390 subsystem mentioned above. A pioneering hard disk drive implementing SCSI is described in commonly assigned U.S. Pat. No. 4,783,705 to Moon et al. and entitled: "High Capacity Disk File with Embedded Sector Servo and SCSI Interface", the disclosure thereof being incorporated herein by reference. Over the years, the SCSI specification has evolved to provide wider bandwidth and higher data transfer speeds, but its command structure has remained quite stable.
In SCSI, a command descriptor block (CDB) is issued by a SCSI initiator to a SCSI target over the bus. The CDB includes e.g. six bytes: an operation code (one byte), an LBA (about 2.5 bytes, sharing its first byte with a logical unit number), a block transfer length (one byte) and a control byte. The operation code tells the target how long the CDB will be and what operation, such as read or write, the target is to perform. The LBA tells the target where the user data is expected to be located in LBA space. A typical block size is 512 bytes of user data. The block transfer length tells the target how many blocks of data beginning at the LBA are to be transferred by this transaction or sequence. Finally, the control byte is used for special purposes such as command linking or queuing.
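By way of illustration only, the command structure described above may be sketched in Python as follows. The sketch assumes the six-byte CDB layout of the SCSI standard, in which byte 0 is the operation code, the 21-bit LBA shares byte 1 with a 3-bit logical unit number, byte 4 is the transfer length and byte 5 is the control byte. The function names build_cdb6 and parse_cdb6 are hypothetical and are not part of any standard or library.

```python
READ_6 = 0x08   # operation code for the six-byte read command
WRITE_6 = 0x0A  # operation code for the six-byte write command


def build_cdb6(opcode, lba, blocks, lun=0, control=0):
    """Pack a six-byte CDB (hypothetical helper, for illustration)."""
    if not 0 <= lba < (1 << 21):
        raise ValueError("LBA must fit in 21 bits")
    if not 1 <= blocks <= 256:
        raise ValueError("a six-byte CDB transfers 1 to 256 blocks")
    if not 0 <= lun < 8:
        raise ValueError("logical unit number must fit in 3 bits")
    # Byte 1 carries the logical unit number in its upper 3 bits and
    # the 5 most significant LBA bits in its lower 5 bits.
    byte1 = (lun << 5) | ((lba >> 16) & 0x1F)
    # In the six-byte read/write commands a length byte of 0 means 256.
    length = blocks & 0xFF
    return bytes([opcode, byte1, (lba >> 8) & 0xFF, lba & 0xFF,
                  length, control])


def parse_cdb6(cdb):
    """Unpack a six-byte CDB into its fields."""
    if len(cdb) != 6:
        raise ValueError("expected exactly six bytes")
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    return {
        "opcode": cdb[0],
        "lun": cdb[1] >> 5,
        "lba": lba,
        "blocks": cdb[4] or 256,
        "control": cdb[5],
    }


if __name__ == "__main__":
    cdb = build_cdb6(READ_6, lba=0x012345, blocks=8)
    print(cdb.hex())
    print(parse_cdb6(cdb))
```

Note that nothing in the CDB describes the kind of data being transferred or its intended use; the target receives only an operation, an address and a length.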
Regardless of the particular host-to-disk interface, the present state of the art is for a computer host to control LBA storage in a hierarchical or layered arrangement. As shown in FIG. 1A a host computer may include an application layer, an operating system library layer, a file system layer, a volume manager layer, and a device driver layer. The application, operating system library, file system, volume manager and device driver are most frequently embodied as software within the host computer or computing system.
A computing application which is active within the host computer at an uppermost layer may require a data unit, byte, block or record comprising one or more blocks to be written to, or retrieved from, e.g. disk storage. The request for the records is sent from the application layer to the next lower, operating system library layer. The operating system library together with the file system and volume manager layers are primarily responsible for scheduling accesses to storage including disk storage and for converting the specific request for a data record into a logical block address and block length value for transmission to the disk drive. This process includes identifying the particular storage resource which will be called upon to service the request by the application, and therefore specifies a particular device driver, which comprises a fifth level in the FIG. 1A depiction.
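The conversion of a record request into a logical block address and block length value may be illustrated by the following minimal Python sketch. It assumes, purely for illustration, that the requested record lies within a single contiguous extent of 512-byte blocks whose starting LBA is already known to the file system; the function name byte_range_to_blocks and its parameters are hypothetical.

```python
BLOCK_BYTES = 512  # typical user data block size


def byte_range_to_blocks(extent_start_lba, offset, length,
                         block_bytes=BLOCK_BYTES):
    """Convert a byte range within a contiguous extent into
    (starting LBA, block count). Hypothetical helper."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first_block = offset // block_bytes
    last_block = (offset + length - 1) // block_bytes
    return extent_start_lba + first_block, last_block - first_block + 1


if __name__ == "__main__":
    # A 100-byte record starting 500 bytes into the extent straddles
    # two blocks, so two whole blocks must be transferred.
    print(byte_range_to_blocks(1000, 500, 100))
```

In this sketch the only values handed down toward the device driver are the starting LBA and the block count; the nature of the record itself is not conveyed.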
The device driver is conventionally a software control program or routine which includes the code needed to use an attached peripheral device, such as a generic hard disk drive. The driver is device-specific, and so it takes into account the particular physical and electrical characteristics of the device to enable the host to communicate with the device. While the device driver layer is intended to describe the physical characteristics of a particular peripheral device, in practice only a few generic devices are supported by available device drivers, and actual physical hardware may differ widely from the generic model supported by the particular device driver. The fact that there is a wide disparity between the logical model, as seen by the device driver software, and the actual physical hardware, and that this disparity is unknown to the host and its applications, is a fundamental drawback of the prior approaches, and one which has led and contributed to the present invention.
Continuing with FIG. 1A, an interface structure IF enables the host computer to connect electrically to e.g. a peripheral storage device, such as a hard disk drive. The interface may be embodied as a cable or bus structure, and it may adhere to a signaling convention, such as EIDE or SCSI. Generally speaking, the interface connects to an embedded controller within the hard disk drive which receives and acts upon commands from the host, and returns status indications, adjunct to the primary function of recording and retrieving data blocks to and from internal storage disks.
As shown in FIG. 1B, the device driver receives the request for storing or retrieving a particular data block, e.g. block n, from the operating system, and maps out the address of block n within linear address space comprising the LBA of the hard disk drive. The embedded controller within the hard disk drive translates the LBA for block n into a physical cylinder, head and sector location (CHS) for block n in accordance with an internal static mapping or translation process which is not visible to the host application. Logical block n is always mapped to physical space m (except in the infrequent instance where physical space m is defective and the mapping is rerouted to a spare location m', shown in broken line in FIG. 1B). This static mapping process has the property that it is a one-for-one map, a one-for-one function: for every logical block at address n there is one and only one physical space at location m within the disk drive hardware. This static mapping process has the additional property that it is unknown to, and therefore transparent to, the host CPU. And, the process has the performance limiting property that it is dependent entirely upon the LBA values and does not take into account the type or kind of data or the use to which the data will be put by the application program or operating/file system.
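The static mapping process described above may be sketched in Python as follows. This is an illustration under stated assumptions, not the mapping of any actual drive: it assumes a uniform geometry with the same number of sectors on every track, whereas a real drive's internal map must also account for its particular physical layout. The class name StaticMap, the function lba_to_chs and the spare table are hypothetical.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Translate an LBA into (cylinder, head, sector) for an assumed
    uniform geometry. Sectors are numbered from 1 by convention."""
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector = divmod(remainder, sectors_per_track)
    return cylinder, head, sector + 1


class StaticMap:
    """One-for-one map from logical block n to physical space m,
    rerouted to a spare location m' only when m is defective."""

    def __init__(self, heads, sectors_per_track, spares=None):
        self.heads = heads
        self.sectors_per_track = sectors_per_track
        # Hypothetical defect table: LBA -> spare (cylinder, head, sector).
        self.spares = dict(spares or {})

    def locate(self, lba):
        if lba in self.spares:
            return self.spares[lba]
        return lba_to_chs(lba, self.heads, self.sectors_per_track)


if __name__ == "__main__":
    drive = StaticMap(heads=16, sectors_per_track=63,
                      spares={1008: (2491, 15, 63)})
    print(drive.locate(63))    # normal static mapping
    print(drive.locate(1008))  # rerouted to the spare location
```

The result of locate depends only on the LBA value. No argument describes the type of data or its intended use, which is the performance limiting property noted above.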
In order to maximize data storage, disk drives typically implement data track zone formats on disk surfaces. Data track zones are needed to adjust to the fact that while the disk rotational velocity remains constant, the relative velocity between the disk and the flying head varies with the disk radius. By implementing data track zones, transfer rates are changed to be optimized for a minimum radius of a particular zone of data tracks. Data on tracks of outer zones is written and read at rates faster than data recorded on tracks of inner zones. A long record may extend across both zones and storage surfaces, thereby necessitating handling of rotational latencies, as well as track seeking and head select delays or interruptions. Since the mapping is carried out internally within the drive, the host computer and the host applications or disk operating system programs do not have direct control or knowledge of the data patterns directly written to the storage surface.
One important reason that the drive's internal mapping process is invisible to the host computer is that each disk drive product design may have many variables, such as number of disks and heads, number and location of radial data zones defined on each disk storage surface, data transfer rates for each of the data zones, block lengths, interleaves, error correction code processes, etc. Since the drive owns its own internal static data mapping process, the drive may very easily emulate a generic drive contemplated at the device driver layer or level.
For example, a real hard disk drive product (Quantum Bigfoot 1.2 AT) may have a single storage disk with two storage surfaces and one head per surface. Each surface of the real drive may be organized into 16 radial zones, and have 5738 tracks per surface (cylinders), for a total of 11,476 tracks on both surfaces. A total number of user data sectors is 2,512,844, with 512 user bytes per sector plus overhead. The outermost data zone includes 273 sectors per track, while the innermost data zone includes 144 sectors per track. All of the tracks are interrupted by 96 servo wedges or spokes which extend generally radially across each disk surface. Information in the wedges, which is not available to the host, is used for controlling head position and for locating user data block locations in a track being followed by a particular head. In this example the real drive has a formatted capacity of 1286.57 megabytes. Yet, an entirely different logical addressing format may be used to describe this real drive at the device driver layer of the host. One logical address format used to emulate this particular drive by widely available device drivers is 2492 logical cylinders, 16 heads and 63 sectors per track. While this virtual logical addressing format yields an overall storage capacity of 1286.11 megabytes, it presents a very misleading picture to the host computer of the actual physical geometry of the real drive. The host has no knowledge of high speed, high bandwidth outer zone locations, or of the low speed, low bandwidth inner zone locations. It has no control over whether a record will be mapped within a single zone at a uniform data transfer rate, or will be spread out across several zones and buffered by an on-board cache memory of the drive. The host has no idea whether the disk includes one storage surface, or 12 surfaces, or two heads or 16 heads. The host does not know if the sector patterns are skewed from surface to surface to accommodate head select latencies, etc.
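The figures in the foregoing example may be checked with the following short Python sketch, which uses only the numbers stated above and takes a megabyte to be 1,000,000 bytes. The variable names are chosen for illustration. The 1286.57 megabyte figure appears to be the exact byte count truncated, rather than rounded, to two decimal places.

```python
SECTOR_BYTES = 512  # user bytes per sector

# Real drive: total user data sectors as stated in the example.
real_sectors = 2_512_844
real_bytes = real_sectors * SECTOR_BYTES        # 1,286,576,128 bytes

# Logical format presented to the device driver:
# 2492 cylinders x 16 heads x 63 sectors per track.
logical_sectors = 2492 * 16 * 63                # 2,511,936 sectors
logical_bytes = logical_sectors * SECTOR_BYTES  # 1,286,111,232 bytes

# Sectors present on the real drive but not addressable
# through the logical format.
unaddressed_sectors = real_sectors - logical_sectors

# Physical track count: 5738 tracks per surface on two surfaces.
total_tracks = 5738 * 2

# At constant rotational velocity, the data rate of a track is
# proportional to its sector count, so the outermost zone transfers
# data faster than the innermost zone by this factor.
outer_to_inner_rate = 273 / 144

if __name__ == "__main__":
    print("real capacity (MB):   ", real_bytes / 1e6)
    print("logical capacity (MB):", logical_bytes / 1e6)
    print("unaddressed sectors:  ", unaddressed_sectors)
    print("total tracks:         ", total_tracks)
    print("outer/inner rate:     ", outer_to_inner_rate)
```

The outer zone thus transfers data at nearly twice the rate of the inner zone, a difference which the logical format of 63 sectors on every track conceals entirely from the host.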
Thus, one drawback of the current approach of fitting a virtually unlimited number of physical device designs into a relatively small number of logical addressing formats for which there are widely available device drivers, is that the real drive's physical plant storage space and layout design is not necessarily considered or optimized by the host in specifying or allocating LBA space. This is primarily because the use of 512 bytes of data as the conventional block size is no longer useful to current disk storage surface configurations from the disk drive's point of view, nor is it useful from an application's point of view. Another drawback is that certain data types, such as multimedia video and sound files, require a high level of continuity and speed in playback, which is not, or may not be, achieved by the drive in carrying out its own internal data mapping arrangement.
Recently, it has been proposed to distribute mainframe-like computing functions at various locations within an enterprise or across a network. A number of proposals have been made for redundant arrays of inexpensive (more recently "independent") disks (RAID) in order to assemble huge storage arrays out of smaller, redundantly mirrored or backed up, devices. Even more recently, studies have been undertaken to define and specify "network attached storage devices". However, even with the introduction of distributed storage in parallel with the proliferation of the "World Wide Web", the fundamental storage control strategy has remained that a host will specify an LBA and present data to a storage resource. In response to the request, the storage resource (e.g. hard disk drive) will map that LBA to a physical location without particular regard to the kind or type or size of data being sent for storage, or the particular characteristics of the storage location as either particularly suited or somewhat unsuited for the particular data, or the particular requirements of the host program or operating system.
Thus, a hitherto unsolved need has remained for an improved mass storage interface which enables dynamic remapping and storage of data within the device to be carried out, based upon a determination of the type or size of data being recorded and the characteristics and behaviors of available storage space within the device.