Processor storage arrays, ranging from smaller high-speed caches to large, comparatively low-speed random access memories (RAM), are commonly organized on an n-word basis where a word is some number of consecutive bytes representing the basic unit of computation for a processor, and n is a positive integer. A storage access, by definition, references one n-word. This organization allows for efficient busing structures between the processor and storage and simplifies addressing the storage arrays, which may themselves be composed of many individually addressable arrays. For example, in the S/370 architecture, as illustrated by the IBM Corporation publication entitled "The ESA/370 CPU Architecture", published 1989, SA22-7200, the 8-bit byte is the smallest unit of addressable storage, and 4 consecutive bytes constitute a word--the basic unit of computation. Present-day S/370 processors have storage arrays organized on 2-word (doubleword or DW) and 4-word (quadword or QW) boundaries.
One example of a possible S/370 processor development dealing with operands which are not aligned on n-word boundaries is the operand fetch logic illustrated by U.S. Pat. No. 4,189,772 to Liptay issued Feb. 19, 1980 entitled "Operand Alignment Controls for VFL Instructions".
Another patent in the general area is U.S. Pat. No. 3,602,896, issued Aug. 31, 1971 to D. Zaheb entitled "Random Access Memory with Flexible Data Boundaries" which discloses a random access memory where an accessed data word may overlap one memory word boundary into an adjacent memory word. The initial byte location is provided along with a number of bytes (up to one word length). The partitioning of the cache required by the disclosure imposes unacceptable circuit delay in the cache access critical path.
Yet another patent in the general area is U.S. Pat. No. 4,435,792 issued Mar. 6, 1984 to Bechtolsheim entitled "Raster Memory Manipulation Apparatus" wherein a computer can access memory over word boundaries. A shifter and offset data (i.e., length of access and boundary) are used to align the data, but again the partitioning necessitates placing an incrementer, multiplexer, and decoder in the main memory address path, which imposes unacceptable delays.
Similar additions to the main memory address path are required by U.S. Pat. No. 4,520,439 issued May 28, 1985 to Liepa entitled "Variable Field Partial Write Data Merge", which discloses accessing memory and crossing over word boundaries by providing a starting address, read/write information, start location and access length. Words across word boundaries are merged, with unneeded bits masked, using a write data interface. This bit masking approach is unrelated to our work.
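The general idea of a masked partial-write merge can be sketched in a few lines. The following Python is an illustrative model only, not the circuitry of the cited patent: the 4-byte word width, the name `masked_write`, and the use of a byte-granular mask (rather than a bit-granular one) are all assumptions made for the example.

```python
WORD_BYTES = 4  # assumed word width in bytes (hypothetical, for illustration)

def masked_write(memory: bytearray, addr: int, data: bytes) -> int:
    """Store `data` at byte address `addr` using whole-word read-modify-write.

    Returns the number of word accesses that were needed.
    """
    if len(memory) % WORD_BYTES != 0:
        raise ValueError("memory must hold a whole number of words")
    if addr < 0 or addr + len(data) > len(memory):
        raise ValueError("write falls outside memory")

    accesses = 0
    pos = 0  # number of bytes of `data` already stored
    while pos < len(data):
        word_base = (addr + pos) // WORD_BYTES * WORD_BYTES
        offset = (addr + pos) - word_base
        count = min(WORD_BYTES - offset, len(data) - pos)

        old = memory[word_base:word_base + WORD_BYTES]  # read the full word
        # Byte mask: True where new data replaces the old contents.
        mask = [offset <= i < offset + count for i in range(WORD_BYTES)]
        merged = bytes(
            data[pos + i - offset] if mask[i] else old[i]
            for i in range(WORD_BYTES)
        )
        memory[word_base:word_base + WORD_BYTES] = merged  # write it back

        pos += count
        accesses += 1
    return accesses
```

In this model a two-byte store at byte address 3 touches two words: the last byte of the first word and the first byte of the second, with all other bytes preserved by the mask.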
U.S. Pat. No. 4,814,976 issued Mar. 21, 1989 disclosed a "RISC Computer with Unaligned Reference Handling and Method for the Same", in which accesses across boundaries of a cache memory are performed using a shift/merge unit that requires explicit coding in order to handle off-boundary accesses, while U.S. Pat. No. 4,814,553 issued Sep. 19, 1989 to Kawamata related to a "Raster Operation Device" which shows a way of crossing word boundaries based on the shift width and bit width of the data of a raster screen display. No provision was made here for data fetching, for retaining a residue of the second word accessed in the read-modify-write operation, or for other features required for cross storage boundary accesses of a computer memory.
Other art in the general field but thought unrelated to our own developments includes U.S. Pat. No. 4,449,185, issued May 15, 1984 to Oberman et al, which related to the "Implementation of Instructions for a Branch which can Cross One Page Boundary"; U.S. Pat. No. 4,502,115 issued Feb. 26, 1985 to Eguchi which related to a "Data Processing Unit of a Microprogram Control System for Variable Length Data"; and U.S. Pat. No. 4,888,687 issued Dec. 19, 1989 to Allison et al relating to a "Memory Control System" which is not directed to high speed accesses which can handle block concurrent stores and accesses.
Within IBM, as shown by the IBM Technical Disclosure Bulletin, Vol. 25 No. 7A, December 1982, A. Y. Ngai and C. H. Ngai proposed "Boundary Crossing with a Cache Line". The Ngais' publication included a byte shifter for data alignment, pp. 3540. This technical disclosure facilitates cross-boundary fetching by partitioning the cache memory into two segments, A and B, which are basically even and odd addressed arrays. This partitioning necessitates placing an incrementer and multiplexer on the segment A cache address and multiplexers on the outputs of the cache arrays. As we have said, such partitioning runs counter to our developments since additional circuit delay is added to the cache critical path.
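To make the cost of the even/odd partitioning concrete, the arrangement can be modeled roughly as follows. This Python sketch is our own illustration and is not taken from the cited disclosure: the 8-byte doubleword width, the names `split_even_odd` and `fetch`, the zero-filling of rows beyond the end of a segment, and the requirement that the memory image be a multiple of 8 bytes are all assumptions.

```python
DW_BYTES = 8  # assumed doubleword width in bytes (hypothetical)

def split_even_odd(memory: bytes):
    """Split a flat image into segment A (even DWs) and segment B (odd DWs)."""
    dws = [memory[i:i + DW_BYTES] for i in range(0, len(memory), DW_BYTES)]
    return dws[0::2], dws[1::2]

def _read(segment, row: int) -> bytes:
    # Assumption: rows past the end of a segment read as zeros.
    return segment[row] if row < len(segment) else bytes(DW_BYTES)

def fetch(seg_a, seg_b, addr: int, length: int) -> bytes:
    """Fetch up to one doubleword starting at any byte address."""
    if not 1 <= length <= DW_BYTES or addr < 0:
        raise ValueError("length must be 1..8 and addr non-negative")
    dw_index, offset = divmod(addr, DW_BYTES)
    row = dw_index // 2
    starts_odd = dw_index % 2 == 1

    # Incrementer and multiplexer on the segment A address: when the access
    # starts in an odd doubleword, the following even doubleword is needed.
    row_a = row + 1 if starts_odd else row
    data_a = _read(seg_a, row_a)
    data_b = _read(seg_b, row)

    # Output multiplexers put the two doublewords in address order.
    first, second = (data_b, data_a) if starts_odd else (data_a, data_b)

    # Byte shifter aligns the requested bytes.
    return (first + second)[offset:offset + length]
```

Both segments are read for every access, and the extra address increment and output selection sit directly in the access path, which is the added delay referred to above.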
"Mark Bit Generator" was a topic covered in another TDB, Vol. 20. No. 9, of February 1978, by C. D. Holtz and K. J. Parchinski; while also in the data storage general field, the TDB included the item "Storage Byte Mark Decode With Boundary Recognition" by L. J. LaBalbo, W. L. Mostowy and A. J. Ruane Vol. 29 No. 12, May 1987, p. 5264 and the item by G. F. Grohoski and C. R. Moore entitled "Cache Organization to Maximize Fetch Bandwidth" in Vol. 32 No. 2 in July 1989, p. 62.
Other internal IBM developments which dealt with cross-boundary buffers, in addition to the Ngai publication, include a proposed product called "RACETRACK 11", as illustrated by U.S. Ser. No. 07/291,510, filed Dec. 29, 1988, now abandoned, entitled "Hardware Implementation of Complex Data Transfer Instructions", p. 24. For the LM (Load Multiple) instruction, this machine prototype provided a register for storing an entire doubleword, called a cross-boundary buffer (20-66 in that application), which saved the data destined for a general purpose register (GPR). A mask could be set with the data saved in this cross-boundary buffer and later used for merging with fetched data. For the LM instruction, the cross-boundary buffer was controlled by a combination of "mini-instructions" and a baroque hardware control mechanism to handle various circumstances of GPR loading and storage boundary alignment. Alternatively, the cross-boundary buffer could be controlled by microcode for microcoded execution of instructions with multi-word storage operands. Looping controls were provided to execute microwords repeatedly until the storage operand was consumed; however, if the length of the storage operand in doublewords was not an integral multiple of the number of storage read microwords in the loop, machine cycles were wasted issuing nullified read microwords. For the STM (Store Multiple) instruction, the prototype also provided a register for storing an entire doubleword, called a save register (pp. 28-29), which saved the data destined for storage. A mask could be set with the data saved in this save register and later used for merging with data fetched from a GPR and destined for storage. Controls for STM were provided by means analogous to those for LM. The save register could not be controlled by storage write microwords and was therefore limited in use to the STM instruction.
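The role such a buffer plays when a multi-doubleword operand is not aligned on a doubleword boundary can be illustrated with a loose software analogy. The Python below is only a sketch under our own assumptions and does not describe the controls of the cited prototype: the name `load_operand`, the 8-byte doubleword width, and the modeling of the buffer as a byte string holding the unused residue of the previous fetch are all hypothetical.

```python
DW_BYTES = 8  # assumed doubleword width in bytes (hypothetical)

def load_operand(memory: bytes, addr: int, n_dw: int):
    """Read `n_dw` doublewords of operand data starting at byte `addr`.

    Each storage doubleword is fetched once; the residue of a fetch that is
    not yet needed is held in a buffer and merged with the next fetch.
    Returns (list of 8-byte operand chunks, number of storage fetches).
    """
    if addr < 0 or n_dw < 0 or addr + n_dw * DW_BYTES > len(memory):
        raise ValueError("operand falls outside memory")

    offset = addr % DW_BYTES
    dw_addr = addr - offset      # address of the first doubleword touched
    buffer = b""                 # models the cross-boundary buffer
    chunks = []
    fetches = 0

    while len(chunks) < n_dw:
        fetched = memory[dw_addr:dw_addr + DW_BYTES]
        dw_addr += DW_BYTES
        fetches += 1
        if fetches == 1:
            fetched = fetched[offset:]   # mask off bytes before the operand
        combined = buffer + fetched      # merge saved residue with new data
        if len(combined) >= DW_BYTES:
            chunks.append(combined[:DW_BYTES])
            buffer = combined[DW_BYTES:]  # save the new residue
        else:
            buffer = combined
    return chunks, fetches
```

In this model an aligned operand of two doublewords needs two fetches, while the same operand starting at byte 3 needs three, rather than the four that would result from treating each operand doubleword as an independent cross-boundary access.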
A corresponding European Patent Application has been published as of the date of filing of the present application, claiming U.S. Ser. No. 07/291,510, filed Dec. 29, 1988 as a priority document.
Generally in a data processing system where the processor can access a memory that is organized on multi-word boundaries, the storage address is sent to memory along with the kind of access (read or write, and length of access). A doubleword memory organization is used in many systems.
In S/370 and similar architectures, the disparity between the smallest unit of addressable storage (a byte) and the basic unit of computation (the 4-byte word) on which the storage organization is based gives rise to the cross-boundary storage access phenomenon. A cross-boundary storage access requires two n-words to be accessed to complete the storage reference, and therefore takes twice the amount of time to process as a non-cross-boundary or on-boundary access. These problems give rise to other possibilities, some examples of which are contained in the detailed description of our inventions, to provide a further background to the developments which we have achieved.
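Whether a given reference crosses a boundary follows directly from its byte address and length. The short Python sketch below shows the arithmetic; the function names and the default 8-byte (doubleword) organization are assumptions chosen for illustration.

```python
def n_word_accesses(addr: int, length: int, n_word_bytes: int = 8) -> int:
    """Number of n-words touched by `length` bytes starting at byte `addr`."""
    if addr < 0 or length < 1 or n_word_bytes < 1:
        raise ValueError("addr must be >= 0, length and n_word_bytes >= 1")
    first = addr // n_word_bytes                # n-word holding the first byte
    last = (addr + length - 1) // n_word_bytes  # n-word holding the last byte
    return last - first + 1

def crosses_boundary(addr: int, length: int, n_word_bytes: int = 8) -> bool:
    """True when the reference cannot be satisfied by a single n-word access."""
    return n_word_accesses(addr, length, n_word_bytes) > 1
```

For example, with a doubleword organization a 4-byte word at byte address 6 occupies bytes 6 through 9 and so spans two doublewords, whereas the same word at byte address 4 lies entirely within one.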