The present invention is directed to data transfer systems and more specifically to a method and apparatus for optimizing the speed and efficiency of data transfer between two computer hardware devices having different input and output rates.
Optimizing an input flow to match an output flow is a common theme in many arts. A properly calculated input flow often yields higher efficiency of output flow than an improperly calculated input flow, even though the improperly calculated input flow may contain more material or data. In other words, a faster input rate does not always yield the optimum output rate.
Flow systems often include one or more storage areas, working areas, or “tanks” with a dynamic equilibrium afforded by balancing input and output so that the amount of material in the tank, or the amount of data in a storage area, remains at an efficient level and does not overrun the capacity of the tank or storage area.
An example of this principle is a freeway during rush hour. If the number of cars being allowed onto the freeway at the entrance ramps can be restrained to equal the number of cars exiting or passing through the freeway, then a maximum freeway efficiency can be achieved and a traffic jam avoided.
Data transfer rates are the rates at which data is transferred from one device or place to another. They are often used as a measure of the performance of microprocessors, memory chips, ports, and peripheral hardware devices such as disk drives, modems, and printers (“peripherals”). If the sending device has a data transfer rate that is too fast, the receiving device may be overrun with data. This is a common problem when a central processing unit (“CPU”) sends data to a peripheral such as a printer. The opposite problem may occur when a peripheral sends data to a CPU. The data transfer rate of a CD-ROM drive, for instance, must be matched closely enough that video, music, or animation data keeps pace with the CPU and plays back without pauses.
CPUs typically transfer data to and from a disk drive or memory chip for data storage or buffering. These data storage devices have input mechanisms for receiving data and output mechanisms for transmitting data. It is well known that as a disk drive or memory chip approaches its limit capacity for holding data, it becomes greatly inefficient. This is because all data transactions must be swapped with preexisting stored data using the limited free space available. Near the limit capacity, the amount of data being internally swapped becomes greater than the amount of data that can be transmitted out of the device, resulting in an extremely slow disk drive or a frozen memory chip. In the case of a full disk drive, the problem is caused by an accumulation of data that the disk drive is not necessarily trying to transmit out in order to make room for more incoming data. But in the case of a memory chip acting as a dynamic buffer for a peripheral, the problem is caused by an accumulation of data waiting to be transmitted out to make room for new incoming data. The output is relatively slow because a peripheral often converts data into a physical manifestation using relatively slow electromechanical steps. This causes an output bottleneck for a memory chip acting as a dynamic buffer.
As a dynamic buffer stores data waiting to be transmitted out to a peripheral, there is wasted data overhead in the temporary storage and manipulation of the data. Instead of immediately transmitting out newly received data, a dynamic buffer must keep track of the data by batching, storing, mapping, and swapping stored data. Tracking more data than can be transmitted out when a storage device is near its capacity requires a great deal of overhead that makes a slow throughput problem even worse.
Data storage buffers with expandable capacity are one way to solve the problem of an output bottleneck. But these are of little use when the input rate is always faster than the output rate. The capacity to expand the buffer has a limit that is soon reached and the buffer is overrun with data.
Another way to solve the problem of an output bottleneck is to modify the rate at which a transmitting device transmits data to efficiently match the rate at which a receiving device can receive data. For instance, a CPU may adjust the speed at which it transmits data to a peripheral device. The peripheral device may receive data only at a limited speed, usually related to its ability to pass the data out of the system to make room in the buffer to receive more data.
From a mathematical standpoint, the optimum speed of data input for a receiving device relative to optimized data output of a transmitting device is a problem of “related rates” solvable by differential equations. But mathematical solutions are often oversimplified and may have other disadvantages. It is noteworthy that the optimum data input speed is not always obvious from a theoretical treatment. A very slow data input rate may yield the best results depending on the characteristics of the data, the data handling scheme (e.g., a 32-bit versus a 64-bit bus), and unforeseen factors inherent in the specific hardware that are difficult to capture in a mathematical model.
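As a minimal illustration of the related-rates view, assume a buffer of capacity C holding B(t) units of data, filled at a constant input rate r_in and drained at a constant output rate r_out. These symbols and the constant-rate assumption are introduced here for illustration only.

```latex
\frac{dB}{dt} = r_{\text{in}} - r_{\text{out}}, \qquad
B(t) = B_0 + \left(r_{\text{in}} - r_{\text{out}}\right) t, \qquad
t_{\text{full}} = \frac{C - B_0}{r_{\text{in}} - r_{\text{out}}} \quad \left(r_{\text{in}} > r_{\text{out}}\right)
```

Under these assumptions the buffer level is steady only when r_in equals r_out, and any buffer of finite capacity fills after the time t_full whenever the input rate exceeds the output rate. The simplified model ignores the storage overhead that arises near capacity, which is one reason such a model may mispredict the behavior of real hardware.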
The disadvantages inherent in a mathematical model for optimizing data transfer may be circumvented by other methods of determining the optimum rate of data transfer between a transmitting device and a peripheral. For example, experimental trial and error is one possible method. The transmitting device performs a plurality of data transmitting trials and selects the trial with the best results as a model for future input/output (“I/O”) configuration. The trial and error method is often superior to theoretical and mathematical methods for determining an optimum data transfer rate since it is based on actual trials carried out on a specific system having a specific CPU and a specific peripheral.
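The trial and error approach described above may be sketched as follows. This is a minimal illustration and not a definitive implementation: the names `select_best_rate`, `run_trial`, and `simulated_trial` are hypothetical, and the simulated peripheral (a drain rate of 100 units per second, with throughput collapsing when it is overrun) is an assumption standing in for a timed transfer on real hardware.

```python
from typing import Callable, Dict, Iterable, Tuple


def select_best_rate(
    candidate_rates: Iterable[float],
    run_trial: Callable[[float], float],
) -> Tuple[float, Dict[float, float]]:
    """Run one trial per candidate input rate and return the best rate.

    run_trial(rate) is expected to return the throughput measured
    when data is sent at the given input rate.
    """
    results: Dict[float, float] = {}
    for rate in candidate_rates:
        results[rate] = run_trial(rate)
    if not results:
        raise ValueError("at least one candidate rate is required")
    # Keep the rate whose trial produced the highest measured throughput.
    best_rate = max(results, key=results.get)
    return best_rate, results


def simulated_trial(rate: float) -> float:
    """Hypothetical stand-in for a timed transfer to a real peripheral.

    Assumption: the peripheral drains 100 units per second, and
    throughput falls off sharply once the input rate overruns it.
    """
    drain_rate = 100.0
    if rate <= drain_rate:
        return rate
    return max(0.0, drain_rate - 2.0 * (rate - drain_rate))


if __name__ == "__main__":
    best, table = select_best_rate([50.0, 100.0, 150.0, 200.0], simulated_trial)
    print(best, table)
```

In this sketch the fastest candidate is not the one selected, consistent with the observation that a faster input rate does not always yield the optimum output rate. In practice `run_trial` would time an actual transfer on the specific CPU and peripheral at hand.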
An example of a run-time trial and error method for optimizing data transfer for specific hardware is included in a SPINRITE® software program for optimizing the number of sectors per track on a hard disk. The SPINRITE® software builds a table of data transfer rates achieved by writing and reading data to a hard disk using a different number of sectors per track for each trial. Since sectors are the smallest units of data that the computer can read into memory in a single step, the optimum data transfer rate should theoretically be calculable using a mathematical model. But the theoretical calculation can be faulty because so many parameters, such as the rotational speed of the disk and percentage of data errors, can vary from the mathematical model. The SPINRITE® trial and error method circumvents the shortcomings of a theoretical calculation by running trials in real time on the physical hardware at hand and simply selecting an optimized number of sectors per track from the trial yielding the highest throughput.
Known methods also adopt various schemes for adjusting the data transfer rates between various parts of a computer by attempting to match I/O rates by changing the speed of the data. U.S. Pat. No. 3,648,247 to Guzak, entitled “Data Handling System,” for example, discloses the use of a buffer between a record reader and a data processor. A circuit varies the speed of the record reader to maintain the system in continuous operation without emptying or filling the buffer. U.S. Pat. No. 5,991,835 to Mashimo et al., entitled “Peripheral Data Storage Device in Which Time Interval Used for Data Transfer From Relatively Fast Buffer Memory to Relatively Slower Main Memory Is Selected in View of Average of Time Intervals During Which Data Blocks Were Recently Received From Host” (the “Mashimo reference”), seeks an efficient data transfer rate from buffer to disk by averaging the time intervals of delivery of the most recent 3–8 data blocks from the host. The Mashimo reference is directed to optimizing a hard disk cache using a set scheme. These references do not solve problems of unmatched data transfer rates that may exist between a CPU and its peripherals. They rely on a set scheme such as a static circuit or algorithm that is derived from a theoretical expectation of how the hardware should act.
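The interval-averaging idea attributed to the Mashimo reference may be illustrated with a short sketch. This is not the implementation disclosed in that patent: the class name `IntervalAverager`, its methods, and the default window of four blocks (chosen from within the 3–8 range mentioned above) are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Optional


class IntervalAverager:
    """Average the time intervals between the most recent block arrivals."""

    def __init__(self, window: int = 4) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque discards the oldest interval automatically.
        self._intervals: Deque[float] = deque(maxlen=window)
        self._last_arrival: Optional[float] = None

    def record_arrival(self, timestamp: float) -> None:
        """Record the arrival time, in seconds, of one data block."""
        if self._last_arrival is not None:
            self._intervals.append(timestamp - self._last_arrival)
        self._last_arrival = timestamp

    def average_interval(self) -> Optional[float]:
        """Return the mean recent interval, or None before two arrivals."""
        if not self._intervals:
            return None
        return sum(self._intervals) / len(self._intervals)


if __name__ == "__main__":
    averager = IntervalAverager(window=3)
    for arrival in [0.0, 1.0, 3.0, 6.0, 10.0]:
        averager.record_arrival(arrival)
    print(averager.average_interval())
```

A scheme of this kind applies a fixed rule to recent history. It does not run trials on the hardware to discover which transfer rate actually performs best.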
Other methods match I/O rates by adding data handling capacity. For example, U.S. Pat. No. 4,860,244 to Bruckert et al. is directed to a buffer system for the I/O hardware of a digital processing system. The Bruckert system includes additional buffer paths so that a controller or bus adapter for I/O hardware may operate at speeds closer to processor and memory speeds, or at least not slow down transfers between other faster parts of the system. U.S. Pat. No. 5,842,042 to Clark et al. is directed to a buffer between two processors for packetizing blocks of data having different sizes and rates. The Clark buffer includes a bus-to-bus adapter for coupling two processors. The adapter allows multiple independent data transfer operations to occur simultaneously. The adapter also includes a mechanism for prioritizing and allotting service time to data transfer operations with higher priority. The Bruckert and Clark methods add data processing capacity to improve unmatched I/O rates. It may be possible to solve the problem of unmatched I/O rates by redesigning hardware to have greater data handling capacity, or by adding more memory. But this is expensive, and it does not solve the problem of how to match a CPU having a fast data transfer rate to a slow peripheral, such as a printer, without adding or redesigning hardware. The Bruckert and Clark methods also do not address the need to develop inexpensive peripherals having data transfer rates that are slower than a CPU.
Some methods use a limit switch scheme to keep a data buffer at an optimum level. U.S. Pat. No. 4,258,418 to Heath is entitled “Variable Capacity Data Buffer System.” The Heath system includes a data storage device disposed between a processor and an I/O device. A circuit establishes a threshold storage capacity for the data storage device and maintains temporary data storage in the device at this desirable threshold level. U.S. Pat. No. 5,117,486 to Kodama et al. is directed to a data transfer method for a buffer that prevents the buffer from emptying or overflowing by using a temporary holding circuit. The Heath and Kodama methods are directed to keeping a buffer from overflowing, not to maximizing data throughput. Keeping a data buffer at a predetermined level may be completely independent of optimizing a data transfer rate between a CPU and a peripheral. The optimum data transfer rate may occur without any buffer.
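A limit switch scheme of the general kind described above may be sketched as a simple simulation. The function name `simulate_threshold_buffer`, its parameters, and the discrete time-step model are hypothetical and are not taken from the Heath or Kodama patents.

```python
from typing import List, Tuple


def simulate_threshold_buffer(
    input_rate: int,
    output_rate: int,
    threshold: int,
    steps: int,
) -> Tuple[int, List[int]]:
    """Simulate a buffer that accepts input only below a threshold level.

    Returns the total units delivered to the peripheral and the buffer
    level recorded at the end of each time step.
    """
    level = 0
    delivered = 0
    history: List[int] = []
    for _ in range(steps):
        # Limit switch: admit new data only up to the threshold level.
        if level < threshold:
            level += min(input_rate, threshold - level)
        # The peripheral drains at most output_rate units per step.
        drained = min(output_rate, level)
        level -= drained
        delivered += drained
        history.append(level)
    return delivered, history


if __name__ == "__main__":
    total, levels = simulate_threshold_buffer(
        input_rate=5, output_rate=2, threshold=10, steps=6
    )
    print(total, levels)
```

In this sketch the buffer level settles just below the threshold, yet the amount delivered is still bounded by the output rate multiplied by the number of steps. Holding the buffer at a set level therefore does not, by itself, increase throughput.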
A combination of dividing and scheduling tasks has also been used to manage different I/O rates. U.S. Pat. Nos. 6,012,136 and 6,029,239 to Brown are directed to communications systems with a configurable data transfer architecture. The Brown system architecture includes an operating system and data protocol that allow tasks to be efficiently partitioned and scheduled. This prevents a memory access bottleneck in devices with embedded digital signal processors. The Brown systems rely on a theoretical model for configuring the architecture. This may not give an optimum data transfer rate for specific hardware.