Most computers presently exhibit performance limitations that arise neither from CPU performance nor from the operating speeds of memory devices. Rather, a computer's real performance, i.e. its useful throughput, is severely limited by the relatively slow speed of accessing those devices. In other words, there exists a serious "bottleneck" between the CPU and the mass storage devices which slows overall system performance. What is needed is to improve overall system performance by reducing the access time to a mass storage device and speeding the transfer of data between that device and the CPU (or local RAM).
Known Protocols and Device Controllers
Heretofore, access to mass data storage has been limited to devices that use high and low-level protocols and their supporting hardware. Disk drives, for example, communicate with the computer system through high and low-level protocols implemented in a combination of hardware and software, sometimes called a device controller.
The device controller, as such, acts as interpreter between the computer system and the storage device. For the computer system to transmit data to, or receive data from, a storage device (e.g. a disk drive), the sequence of high-level protocol steps listed in the following Table 1 is carried out:
TABLE 1
______________________________________
Generic High-Level Protocol
______________________________________
1. Computer transmits command and data address to controller via system bus.
2. Controller reads command and decodes device and data addresses.
3. Controller translates system command into protocol commands.
4. Controller transmits protocol commands to referenced device.
5. Device responds to commands.
6. Controller transmits "ready to receive" to sending device.
7. Controller receives data from sending device.
8. Controller transmits "ready to send" to receiving device.
9. Controller awaits "ready to receive".
10. Controller transmits data to receiving device.
______________________________________
The basic protocol described above is repeated until the command is completed, and it applies whether the command is a READ or a WRITE. Some kind of protocol is required to allow one device to communicate with another device. The device controller communicates with the storage device through one of several protocol definitions. The protocol is time consuming and thus slows the overall system. Examples of known high-level device controller to storage device protocols are SCSI (Small Computer System Interface), IDE (Integrated Drive Electronics), ESDI (Enhanced Small Disk Interface), ST506 (a hard disk drive model number designation) and SCSI II (Small Computer System Interface ver. II).
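The repetition of the Table 1 sequence can be sketched as a simple step trace. This is an illustrative model only: the step wording paraphrases Table 1, the names `TABLE_1_STEPS`, `protocol_trace` and `total_steps` are hypothetical, and the number of repetitions is an assumed input rather than a figure taken from any real controller.

```python
# Illustrative model only. Step wording paraphrases Table 1; nothing here
# measures real hardware, and `repetitions` is a hypothetical input.
TABLE_1_STEPS = (
    "Computer transmits command and data address to controller",
    "Controller reads command and decodes device and data addresses",
    "Controller translates system command into protocol commands",
    "Controller transmits protocol commands to referenced device",
    "Device responds to commands",
    'Controller transmits "ready to receive" to sending device',
    "Controller receives data from sending device",
    'Controller transmits "ready to send" to receiving device',
    'Controller awaits "ready to receive"',
    "Controller transmits data to receiving device",
)


def protocol_trace(repetitions):
    """Yield (repetition, step_number, description) for each protocol step."""
    if repetitions < 0:
        raise ValueError("repetitions must be non-negative")
    for rep in range(1, repetitions + 1):
        for number, description in enumerate(TABLE_1_STEPS, start=1):
            yield rep, number, description


def total_steps(repetitions):
    """Number of protocol steps executed over all repetitions."""
    return sum(1 for _ in protocol_trace(repetitions))


if __name__ == "__main__":
    for rep, number, description in protocol_trace(1):
        print(f"pass {rep}, step {number}: {description}")
```

The step count grows linearly with the number of repetitions, which is the sense in which the protocol overhead accompanies every portion of a READ or WRITE command.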
Low-level protocols are implemented at the basic hardware communications level where one device communicates with another device through a hardware medium. High-level protocols always embed low-level protocols to handle the low-level communications between the controller and the device.
An example of a high-level controller-to-storage-device protocol with embedded low-level protocols is shown in Table 2 below. Table 2 illustrates a SCSI system, as defined in the American National Standard X3.131-1986, during a simple READ command. Here, the INITIATOR device is the SCSI controller and the TARGET device is a SCSI disk drive.
TABLE 2
______________________________________
SCSI Read Protocol Overhead
BUS PHASE       ACTION
______________________________________
Bus-Free        Wait for Bus Free condition.
Arbitration     INITIATOR drives BSY signal, places ID on bus
                INITIATOR determines priority
                INITIATOR drives SEL signal active
Selection       INITIATOR releases BSY
                INITIATOR sets TARGET ID and own ID on data bus
                INITIATOR drives I/O signal inactive
                INITIATOR asserts ATN signal
                INITIATOR waits for TARGET to drive BSY signal
Message Out     TARGET drives C/D and MSG signals active
                INITIATOR sends IDENTIFY to indicate logical unit
Command         TARGET asserts C/D signal
                TARGET negates I/O and MSG signal
                INITIATOR sends (READ) command (REQ/ACK handshake)
                TARGET disconnects, asserts C/D, I/O and MSG signals
                INITIATOR reads disconnect (REQ/ACK handshake)
                TARGET disconnects
Arbitration     TARGET drives BSY and ID active
Reselection     TARGET drives INITIATOR ID active
                TARGET drives SEL and I/O active
                TARGET releases BSY
                INITIATOR detects reselection
                INITIATOR asserts BSY
                TARGET drives BSY active and releases SEL
                INITIATOR detects SEL change
                INITIATOR releases BSY (BSY held active by TARGET)
Message In      TARGET drives C/D, I/O and MSG signals active
                TARGET writes byte on data bus (Logical ID of reselecting unit)
                TARGET asserts REQ (start of REQ/ACK handshake)
                INITIATOR reads byte
                INITIATOR asserts ACK
                TARGET reads ACK and releases REQ
                INITIATOR detects REQ change and releases ACK
Data In         TARGET drives I/O active
                TARGET drives C/D and MSG signals inactive
Data transfer   TARGET writes byte to bus
(loop)          TARGET asserts REQ
                INITIATOR reads byte
                INITIATOR asserts ACK
                TARGET releases REQ
                INITIATOR releases ACK
Status          TARGET asserts C/D and I/O signals
                TARGET negates the MSG signal
                TARGET places status byte on bus
                TARGET asserts REQ
                INITIATOR reads byte
                INITIATOR asserts ACK
                TARGET releases REQ
                INITIATOR releases ACK
Bus Free        TARGET releases all asserted signals
                INITIATOR and TARGET disconnect.
______________________________________
It may be observed from the foregoing example that known protocols cause significant delay in every mass storage access operation.
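To make the overhead of Table 2 concrete, the listed actions can simply be counted. The sketch below assumes one particular reading of Table 2 in which each listed action counts as one step; the per-phase counts, the names `TABLE_2_FIXED_ACTIONS`, `fixed_actions` and `actions_for_read`, and the 512-byte example are illustrative assumptions. The result is a count of bus actions, not a timing measurement.

```python
# Illustrative tally only. Each entry is (bus phase, number of actions listed
# for that phase in Table 2), under the assumption that every listed action
# counts as one step. These are action counts, not measured delays.
TABLE_2_FIXED_ACTIONS = [
    ("Bus-Free", 1),
    ("Arbitration", 3),
    ("Selection", 5),
    ("Message Out", 2),
    ("Command", 6),
    ("Arbitration (by TARGET)", 1),
    ("Reselection", 8),
    ("Message In", 7),
    ("Data In", 2),
    ("Status", 8),
    ("Bus Free", 2),
]

# The "Data transfer (loop)" phase lists six actions per byte transferred.
ACTIONS_PER_BYTE = 6


def fixed_actions():
    """Actions performed once per READ command, independent of data size."""
    return sum(count for _, count in TABLE_2_FIXED_ACTIONS)


def actions_for_read(byte_count):
    """Total listed actions for a READ that transfers `byte_count` bytes."""
    if byte_count < 0:
        raise ValueError("byte_count must be non-negative")
    return fixed_actions() + ACTIONS_PER_BYTE * byte_count


if __name__ == "__main__":
    print("fixed actions per READ:", fixed_actions())
    print("actions for a 512-byte READ:", actions_for_read(512))
```

Under this reading, every READ carries a fixed set of actions before and after the data phase, plus a full REQ/ACK handshake for each byte moved.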
System Bus and CPU Local Bus Architectures
In general, a storage device controller (e.g., a disk drive controller) communicates with another controller type, such as the direct memory access (DMA) controller or the central processing unit (CPU), through a low level protocol implemented over the system bus. Examples of system bus architectures are the Industry Standard Architecture (ISA) and Extended Industry Standard Architecture (EISA). Additionally, there are many vendor-specific system bus architectures.
Data storage devices thus are isolated from the computer's system bus by the communications and high and low-level protocols implemented in the device controller. The device controller, in turn, is isolated from the system's CPU and memory residing on the CPU local bus by a system bus interface and low-level protocols. This conventional architecture is illustrated in FIG. 1. Therefore, for the CPU to access stored data, the related commands and all transferred data must traverse multiple protocol levels; a computer system can reach a disk drive only through these several protocols. Each protocol level, however, introduces delays and overhead. The overall effect upon the system is that system data throughput is bottlenecked at the protocol level. What is needed is to speed data transfer between the CPU and a storage device by reconsidering the prior art device controller architecture and associated protocols.
Multiple Data Transfers
Network and other file servers (e.g. UNIX systems) transfer data twice across the system bus in order to transmit data stored on fixed media to the requesting device. First, the data is transmitted from the storage media into system memory through the system bus. Next, the data is transmitted through the system bus to the controller or port from which the request originated. From there, the data is transmitted to the requesting device, for example a PC connected on a local area network. The double transfer occurs whether or not direct memory access techniques are employed. Such multiple transfers are inefficient and reduce system throughput.
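The cost of the double transfer can be illustrated with simple arithmetic. The sketch below assumes that the system bus is the only bottleneck and that the two crossings cannot overlap on the shared bus; the function names and the example figures are hypothetical and are not taken from any particular system.

```python
# Illustrative arithmetic only. Assumes the system bus is the sole bottleneck
# and that the two crossings of the same data cannot overlap on the bus.

def system_bus_bytes(payload_bytes, crossings=2):
    """Bytes carried by the system bus to deliver `payload_bytes` of data."""
    return payload_bytes * crossings


def delivered_rate_bound(bus_bytes_per_second, crossings=2):
    """Upper bound on the data rate seen by the requesting device."""
    return bus_bytes_per_second / crossings


if __name__ == "__main__":
    # Hypothetical example: a 1,000,000-byte file on an 8,000,000 byte/s bus.
    print(system_bus_bytes(1_000_000))
    print(delivered_rate_bound(8_000_000))
```

Under these assumptions, the requesting device can receive data at no more than half the rate the system bus could otherwise sustain.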
CPU Local Cache
Heretofore, before a CPU could access program executable images or data files, they were first transferred from storage into the computer's main memory using a mechanism such as Direct Memory Access (DMA). While the DMA occurred, the CPU was typically "asleep" waiting for transfer completion. Once the transfer was completed, a signal such as an interrupt notified the CPU that the program or data was ready. Only after the interrupt signal arrived could the CPU access the information or program image.
In the case of an executable image, the CPU would typically load the first part of the image into its local program cache and then execute out of program cache memory. The computer system would typically attempt to "stay ahead" of the CPU by loading the cache with some form of look-ahead algorithm. A similar construct was used for the CPU data cache. While CPU local cache memory is helpful, a need to improve mass storage access remains.
Prior Art Attempts to Improve Storage Access Speed
1. The "RAM Disk" or "RAM Drive"
Heretofore, access to data at CPU local bus speeds has been limited to segments of computer system memory addressed by an operating system dependent software device driver so as to emulate a disk drive. One of the problems with this solution is the size of the emulated disk drive (or "RAM drive"). The RAM drive is constructed from system memory, so it subtracts from available system resources and cannot exceed a fraction of total system memory size. Another problem is that a RAM drive must be loaded each time that the system is powered on.
RAM drives also are unable to correct any detected errors. In fact, under MS-DOS™, a RAM disk data error generates a parity error signal that will completely halt the machine with complete loss of all non-saved data. No RAM drive can operate as a bus master.
Battery backed-up expansion memory RAM drives that reside on the system bus solve the power-on load problem, as long as the computer is continuously powered or is turned off for less time than is provided by the battery backup. Even so, such RAM drives still require an operating system software device driver, with its attendant overhead, in order to operate, and they still reside on the computer's system bus with all its attendant delays (described above).
2. Disk Drives
Disk drives comprise spinning media with mechanical arms that move a read/write head over the spinning media. The concept is similar in nature to a record player with the ability to pick the track to begin playing. However, a data file will typically occupy many tracks and requires many accesses. Every time that a new track is accessed, the mechanical arms must move. This operation is called a seek. The mechanical delays further exacerbate the problem of rapid data access. Mechanical disk drives, therefore, are part of the problem rather than the solution.
3. Semiconductor Disk
Semiconductor disks (SCD) are solid-state memory products that are used to emulate a mechanical disk drive. The advantage of an SCD is elimination of mechanical delays, thus providing a very fast access time to data. Many of these devices have error correction circuitry (ECC). Only very expensive models offer data scrubbing, wherein stored data is checked for errors on a continual basis and any correctable errors fixed. All SCD devices use a high-level protocol, such as SCSI, for a communications interface. Therefore, they are subject to both high and low-level protocol delays.
A typical SCD will have an access time of approximately 350 microseconds. The fastest of these devices offer data access in approximately 125 microseconds. This is how long the drive takes to start to provide information to the first protocol interface. Once the data has traversed the first protocol level, the data must then pass through the additional system bus protocol interface to get to system memory. So while an SCD offers a fast storage medium, it does nothing to relieve the existing protocol "bottleneck" between the CPU and the storage medium.
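The access times quoted above imply an upper bound on how many independent accesses per second such a device can begin. The sketch below is simple arithmetic on those two figures. It assumes strictly serial accesses and ignores the subsequent protocol and system bus delays, so real rates would be lower; the function name is hypothetical.

```python
# Illustrative arithmetic only. Assumes accesses are strictly serial and that
# the quoted access time is the only delay, which overstates real throughput.

def max_accesses_per_second(access_time_microseconds):
    """Upper bound on accesses per second for a given access time."""
    if access_time_microseconds <= 0:
        raise ValueError("access time must be positive")
    return 1_000_000 / access_time_microseconds


if __name__ == "__main__":
    for label, access_time in (("typical SCD", 350), ("fastest SCD", 125)):
        rate = max_accesses_per_second(access_time)
        print(f"{label}: at most about {rate:.0f} accesses per second")
```

Even the faster figure bounds the device at about 8,000 accesses per second before any controller or system bus protocol delay is added.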
4. Disk Caching
Another attempt to address the high-speed data access problem is disk caching. Caching technology is cumbersome and has a tremendous overhead. In caching technology, an attempt is made to keep the data most likely to be requested in high speed memory. Unfortunately, this has no benefit when accessed files exceed the size of the cache buffer. Some benefit is realized when small data segments are accessed repeatedly, but the overhead of tracking which data elements should be retained in cache memory and which swapped out reduces some of the gains realized. Moreover, disk caching devices reside on the computer system bus with all its attendant delays. Recall that the system bus is isolated from the CPU as described above with regard to FIG. 1.
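The observation that caching gives no benefit when an accessed file exceeds the cache size can be illustrated with a small simulation. The sketch below assumes a least-recently-used (LRU) replacement policy and block-granular accesses; the source does not state which policy a given disk cache uses, and the function name `lru_hits` and the block counts are hypothetical.

```python
from collections import OrderedDict

# Illustrative simulation only. Assumes an LRU replacement policy, which the
# source does not specify, and a cache measured in whole blocks.

def lru_hits(accesses, capacity):
    """Count cache hits for a sequence of block numbers under LRU."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    cache = OrderedDict()  # keys are block numbers, oldest entry first
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used block
            cache[block] = True
    return hits


if __name__ == "__main__":
    # A 5-block file read sequentially twice through a 4-block cache.
    large_file = list(range(5)) * 2
    print("hits, file larger than cache:", lru_hits(large_file, 4))

    # Two blocks accessed repeatedly through the same 4-block cache.
    small_set = [0, 1] * 3
    print("hits, small repeated set:", lru_hits(small_set, 4))
```

Under this assumed policy, the sequential re-read of a file only one block larger than the cache produces no hits at all, while the small repeated working set hits on every access after the first pass.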