The present invention relates to a data processing system having at least two logical devices and, more particularly, to an improved memory access management mechanism for controlling data transfer between logical devices in a data processing system.
In the field of data processing, sophisticated high speed computers often incorporate large memories or data storage devices. While the speed of the engines or processors within such computer systems has consistently increased over the years, so too have computer applications continued to demand ever greater speeds.
Among the many variables to be considered in an attempt to increase the performance of data processing systems, two considerations are the speed of the system processor and the speed with which data can be transferred between the main storage and the processor. In general, when two or more logical devices are incorporated in a computer system, one of the devices operates at a slower rate of speed than do the remaining devices. Overall system performance is of course dependent upon the slowest logical device.
As sophisticated computer systems develop, memory storage capacity often increases. Although the operating speed of the smallest components may increase, overall system performance may, in fact, degenerate when memory capacity is extremely large. The speed of a memory device is inversely proportional to the time required to access data stored therein.
Historically, it was common for a processor to communicate with main storage by means of individual connections thereto. The great increase in processing power provided by modern processors, however, resulted in a prodigious amount of data constantly being requested by the processor, exceeding the capacity of the main storage to transfer data to the processor at optimal rates. The size of the memory required for use also increased at a faster rate than that of processor improvement. It would have been uneconomical to continue building non-volatile memories of ever increasing size and speed.
An approach to maximizing performance of a computer system was to develop a temporary memory storage mechanism called a cache. The cache is a relatively high speed memory that tends to be more expensive than conventional data storage devices.
The cache is a limited storage capacity memory that is usually local to the processor and that contains a time-varying subset of the contents of main storage. This subset of data stored in the cache is that data that was recently used by the processor.
The purpose of a cache memory is to reduce the cost of a system while minimally affecting the average effective access time for a memory reference. A very high proportion of memory reads can be satisfied out of the high speed cache.
The cache contains a relatively small high speed buffer and control logic situated between two logical devices, such as a processor and main storage. The cache matches the high speed of one of the devices (the processor) to the relatively low speed of the other device (the main storage).
The data most often used is temporarily stored in the high speed buffer. The most recent information requested by one logical device from another logical device is stored in the cache memory simultaneously with its transfer to the first device. Subsequent requests for such information result in the transfer of data directly from the cache to the first device without need for accessing the second device.
When a processor, for example, requests data, a cache first searches its buffer. If the data is stored in the cache, a so-called hit occurs. The data is returned in one cycle. Often, of course, the data sought is not stored in the cache. Consequently, a so-called miss occurs and the cache must retrieve the data from main storage.
A main storage line fetch occurs when the cache accesses data from main storage. A line castout occurs when a line of modified data is returned to main storage from the cache to make room for a new line of data.
As the cache retrieves data from main storage, it often retrieves several more words of data in anticipation that they will soon be requested by the processor. Cache designs include decisions on how much data to fetch for each cache miss and how to decide which data to replace on a miss.
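The hit, miss, line fetch and line castout behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the mechanism of any particular machine: the class name LineCache, the four-word line size, and the least-recently-used replacement choice are all assumptions introduced for the example. Because a miss brings in a whole line, the words adjacent to the requested word arrive in the cache along with it.

```python
from collections import OrderedDict

LINE_WORDS = 4  # hypothetical line size, in words


class LineCache:
    """Toy write-back cache that holds whole lines (LRU replacement assumed)."""

    def __init__(self, main_storage, capacity_lines):
        self.main = main_storage      # a list of words standing in for main storage
        self.capacity = capacity_lines
        self.lines = OrderedDict()    # line number -> [words, modified flag]
        self.hits = self.misses = self.fetches = self.castouts = 0

    def _line(self, addr):
        number = addr // LINE_WORDS
        if number in self.lines:
            self.hits += 1
            self.lines.move_to_end(number)   # mark as most recently used
            return self.lines[number]
        self.misses += 1
        if len(self.lines) >= self.capacity:
            # Make room: evict the least recently used line.
            old, (words, modified) = self.lines.popitem(last=False)
            if modified:
                # Line castout: return the modified line to main storage.
                base = old * LINE_WORDS
                self.main[base:base + LINE_WORDS] = words
                self.castouts += 1
        # Main storage line fetch: copy the whole line into the cache.
        base = number * LINE_WORDS
        entry = [self.main[base:base + LINE_WORDS], False]
        self.fetches += 1
        self.lines[number] = entry
        return entry

    def read(self, addr):
        return self._line(addr)[0][addr % LINE_WORDS]

    def write(self, addr, value):
        entry = self._line(addr)
        entry[0][addr % LINE_WORDS] = value
        entry[1] = True               # line now differs from main storage
```

With a two-line cache over a 32-word main storage, reading address 0 misses and fetches a line, reading address 1 then hits, and a later miss that evicts a written line triggers a castout before the new line is fetched.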
Conventionally, servicing a miss requires many machine cycles. A key factor affecting overall system performance is the amount of time required to service a cache miss (i.e., the amount of time required to move data from main storage to the cache upon issuance of a storage access request). While the amount of time required to perform such a memory access operation is only on the order of microseconds or even fractions of a microsecond in large computer systems, over a significant period of time these data access operations accumulate and can degrade system performance.
Caches derive their performance from the principle of locality. According to this principle, over short periods of time processor memory references tend to be clustered in both time and space. Data that will be in use in the near future is likely to be currently in use. Similarly, data that will be in use in the near future is likely to be located near data currently in use. The degree to which systems exhibit locality determines the benefits of the cache. A cache can contain a small fraction of the data stored in main storage, yet still have hit rates that are extremely high under normal system loads.
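The effect of locality on hit rate can be illustrated with a small simulation. The sketch below is a simplified model under stated assumptions: the function name hit_rate, the eight-line capacity, the four-word line size, the least-recently-used replacement, and both reference traces are hypothetical and chosen only to contrast clustered references with scattered ones.

```python
from collections import OrderedDict


def hit_rate(trace, capacity_lines=8, line_words=4):
    """Fraction of references in `trace` that find their line already cached."""
    resident = OrderedDict()          # line numbers currently in the cache
    hits = 0
    for addr in trace:
        line = addr // line_words
        if line in resident:
            hits += 1
            resident.move_to_end(line)        # most recently used
        else:
            if len(resident) >= capacity_lines:
                resident.popitem(last=False)  # evict least recently used
            resident[line] = True
    return hits / len(trace)


# Clustered references: the same 16 words are revisited 16 times.
clustered = [addr for _ in range(16) for addr in range(16)]

# Scattered references: 256 references, each to a different line.
scattered = [index * 4 for index in range(256)]

print("clustered:", hit_rate(clustered))
print("scattered:", hit_rate(scattered))
```

Both traces contain the same number of references, yet the clustered trace misses only on its first touch of each line, whereas the scattered trace never reuses a line and so never hits.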
A cache generally has an operating cycle of the same length as the processor memory operation microinstruction cycle. For large systems, access time to main storage occurs on the order of 80-120 nanoseconds, whereas access time to cache generally is on the order of 10-20 nanoseconds.
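The access times quoted above can be combined into a rough average effective access time. The sketch below uses the common weighted-average model, in which a hit costs the cache access time and a miss costs the main storage access time. The function name, the 95 percent hit rate, and the use of midpoint values (15 and 100 nanoseconds) are assumptions made for illustration only; the model also ignores the additional machine cycles that servicing a miss may require.

```python
def effective_access_time(hit_rate, cache_ns, main_ns):
    """Weighted average access time in nanoseconds (simplified model)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * main_ns


# Assumed values: midpoints of the ranges in the text, hypothetical hit rate.
average_ns = effective_access_time(hit_rate=0.95, cache_ns=15, main_ns=100)
print(f"average effective access time: {average_ns:.2f} ns")
```

Under these assumed figures the average stays close to the cache access time, which is why even a small cache with a high hit rate can mask most of the main storage latency.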
Certain prior art devices divide the cache operating cycle into two or more subcycles dedicated to mutually exclusive operations. U.S. Pat. No. 4,169,284 issued to Hogan, et al., for example, discloses a mechanism by which access to a cache by main storage and by a processor can occur concurrently by means of a cache control which provides two cache access timing cycles during each processor storage request cycle. The cache is accessible to the processor during one of the cache timing cycles and is accessible to main storage during the other cache timing cycle. Thus, each of the machine cycles is divided in two to allow the processor and the main storage to access the cache memory alternately, in equal machine cycle portions.
U.S. Pat. No. 4,439,829 issued to Tsiang discloses a data processing machine in which the cache operating cycle is divided into two subcycles dedicated to mutually exclusive operations. The first subcycle is dedicated to receiving a central processor memory read request with its address and the second subcycle is dedicated to every other kind of cache operation. Such other operations include receiving an address from a peripheral processor after a write to main memory or writing data to the cache after a cache check match condition. After a cache miss, the central processor is stopped to permit updating.
In the general field of logical devices operating at different speeds, "Data Processing System Clock Control" by S. Pitkowsky, et al., IBM Technical Disclosure Bulletin, Vol. 7, No. 9, February 1965, pp. 754-755, discloses a system by which a slower device and a faster device can be made compatible merely by suspending clock operation of the faster device.
In "Ternary Logic Memory Elements" by A. W. Maholick, IBM Technical Disclosure Bulletin, Vol. 26, No. 3A, August 1983, pp. 1196-1197, the ternary equivalents of commonly used memory elements are described. A master-slave flip-flop is set and reset at inverted clock time delayed pulses. The clock pulse is divided into two parts and both clock edges are used.
It would be advantageous to provide a system by which two or more logical devices in a data processing system, all operating at different rates, could be made to transfer data optimally.
It would further be advantageous to shorten data transfer time by reducing the number of machine cycles required to transfer such data.
It would also be advantageous to provide a system for increasing data access performance between two logical devices in a data processing system without increasing the clock system oscillator rate or transferring data on a half cycle basis.
It would further be advantageous to transfer, during the time interval of a given number of machine cycles, the same amount of data previously transferred during a greater number of such machine cycles.
It would also be advantageous to shorten the time required to transfer data from main storage to high speed buffer or cache.