The present invention relates to an apparatus that performs multiplexed processing of a plurality of requests where access patterns cannot be predicted, such as a disk array system, and especially relates to controlling the reading of operand data to be accessed by a processor.
A disk array system reads and stores data corresponding to a plurality of magnetic disk units (hereafter also referred to as hard disk drives). In a disk array system, a processor usually performs data control and controls the entire system. As typified by the disk array system, a system with a processor uses it to execute a program stored in memory, or in other words, to sequentially execute the instruction codes stored in memory. Operand data from memory or registers is used in arithmetic. Usually, the system comprises a processor that performs arithmetic, memory, a memory controller that controls the memory and a plurality of control LSIs.
FIG. 9 is an example of a disk array system. Broadly speaking, a disk array system contains hard disk drives 1020 and a disk array controller 1000 that controls the hard disk drives. The hard disk drives 1020 and disk array controller 1000 are connected by drive IF 1103. Disk array controller 1000 and host computer 1050 are connected by host IF 1102. Disk array controller 1000 contains: channel IF unit 1011 that controls the connection to host computer 1050, disk IF unit 1012 that controls the connection to the hard disk drives, shared memory unit 1015 that contains all shared memory for the entire system, and cache memory unit 1014 that contains the cache memory. Channel IF unit 1011 and shared memory unit 1015 as well as disk IF unit 1012 and shared memory unit 1015 are connected by access path 2 1137. Channel IF unit 1011 and cache memory unit 1014 as well as disk IF unit 1012 and cache memory unit 1014 are connected by access path 0 1135 and access path 1 1136 via selector unit 1013. Channel IF unit 1011 is provided with host IF 1102, microprocessor 1101 (hereafter referred to as processor), SM access controller 1105, CM access controller 1104 and internal bus 1106 that connects them. Disk IF unit 1012 is provided with drive IF 1103, processor 1101, SM access controller 1105, CM access controller 1104 and internal bus 1106 that connects them. The channel IF unit and cache memory unit contain CM controller 1107 that controls memory and memory module 1109. The shared memory unit contains SM controller 1108 that controls memory and memory module 1109. The processors in the channel IF unit and disk IF unit process data write and read instructions from the host while recognizing the state of the disk array system by accessing operand data in the memory of the shared memory unit or in the registers of each controller.
With this type of processor system, in addition to the computational performance of the processor, the performance of reading operand data from memory or registers into the processor is important. The delay from when the processor issues an access request until data is received is known as access latency. In recent years, the processor's core performance has improved, but there has not been much improvement in the performance of accessing and reading operand data that accompanies an external I/O access. Due to these differing performance characteristics, if access latency becomes an issue, the processor will stall, processor performance will deteriorate, and consequently the memory system will create a system-wide bottleneck.
Basically, there are two ways to enhance the operand data access performance. The first is to improve performance by reducing access time, and the second is to conceal the access time. However, in order to reduce the access time, it is necessary to increase the operating frequency of the access path. This results in package noise problems such as cross talk, and makes improvement difficult. In particular, with a disk array system as shown in FIG. 9, several LSIs lie between the processor and the operand data, and the system is constructed with a long distance between the processor and operand data. Consequently, it is difficult to reduce the access time below a specific value. The "read ahead" of data can be given as an example of the second method, to conceal the access time. One conventional example in which the processor uses a dedicated instruction for read ahead is the dcbt (Data Cache Block Touch) instruction of the PowerPC instruction set, listed in the "PowerPC Microprocessor Family Programming Environment." The dcbt instruction is a dedicated instruction that reads operand data into the processor's internal cache.
However, when using a dedicated instruction, because in some cases an external I/O access time on the order of microseconds is required with a large scale system such as the aforementioned disk array system, it may be impossible to verify that the data is in the cache at the point in time when the data is actually required. Further, because some cache memories are occupied for a time on the "order of microseconds", the execution of a plurality of read ahead instructions will decrease the usage efficiency of the cache. With the PowerPC, the dcbt instruction is effective for the main memory, but cannot be executed for external I/O. In addition, the relatively inexpensive embedded processors used for so-called embedded applications are not provided with this type of dedicated instruction.
As has been described above, with the increased speed of processors in recent years, the relative performance of operand access that accompanies an external I/O access, typically an access of external memory or external registers, has decreased. Consequently, this creates a bottleneck for system performance. In other words, the internal processing performance of a processor increases with increased operating frequency of the processor core unit, but on the other hand, the external I/O access speed is insufficient. Therefore, the performance of a system that issues a plurality of external I/O accesses, such as an embedded type system, depends upon the performance of the external I/O access.
The main problem the present invention intends to resolve is improvement of the operand access performance. Especially for external I/O control, the object is to inexpensively and easily realize improved operand data access performance of the processors.
One factor causing the aforementioned problems common to operand data access is that, for the operand access that accompanies conventional external I/O, there is no operation until after an external I/O request is generated. Improving the speed at which operand data is read requires reducing the access latency (the time from when the processor issues a request to read from the memory or a register until there is a response), and the external IF must be made high-speed. High-speed memory, such as high-speed SRAM or dedicated memory for each processor, is expensive and therefore leads to higher priced systems. Further, it is difficult to achieve a large reduction in the operand access time.
Another problem due to increased access latency is that, as the occupancy percentage of the system bus increases, the effective performance of the system bus decreases.
In short, the present invention resolves the above problems by reading operand data in advance (hereafter referred to as reading ahead) from the memory or registers into a register within the external access controller, before the operand data is required.
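The benefit of reading ahead can be illustrated with simple arithmetic. The sketch below is not part of the described apparatus; the function name `stall_time` and the microsecond figures used with it are hypothetical, chosen only to show how the time a processor waits shrinks when useful work overlaps the external access.

```python
def stall_time(access_latency_us: float, overlap_us: float) -> float:
    """Time (in microseconds) the processor waits for operand data.

    access_latency_us: delay from issuing the external access to receiving data.
    overlap_us: other work the processor performs between requesting the
                read ahead and actually using the data (hypothetical figure).
    """
    # The processor only stalls for the part of the latency that is not
    # covered by other work; the stall can never be negative.
    return max(0.0, access_latency_us - overlap_us)


# Hypothetical figures: a 2.0 us external access.
no_read_ahead = stall_time(2.0, 0.0)       # request issued only when data is needed
partial_overlap = stall_time(2.0, 1.5)     # read ahead issued 1.5 us early
full_overlap = stall_time(2.0, 3.0)        # read ahead issued 3.0 us early
```

With these assumed figures, the full 2.0 microseconds are spent stalled when no read ahead is performed, and the stall disappears once the read ahead is issued at least one access latency before the data is used.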
The specifics are described below.
A read ahead controller is provided in the external access control LSI that controls external access of the processor.
The read ahead controller is provided with an access control circuit that controls read ahead by using a read ahead register circuit comprising: one or more pre-fetch register sets provided with an address register that specifies the memory or register address that will be pre-fetched, an address register valid flag that indicates validity of data in said address register, a data register that stores pre-fetched data, and a data register valid flag that indicates validity of data in said data register; and an address checker circuit that evaluates whether the address of the access destination matches the value in said address register.
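The register layout described above can be pictured with a small software model. This is only an illustrative sketch: the names `PrefetchRegisterSet` and `find_matching_set` are assumptions introduced here, and an actual implementation would be hardware registers and a comparator circuit rather than Python objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrefetchRegisterSet:
    """One pre-fetch register set (field names are illustrative)."""
    address: int = 0             # address register: location to be pre-fetched
    address_valid: bool = False  # address register valid flag
    data: int = 0                # data register: holds the pre-fetched data
    data_valid: bool = False     # data register valid flag


def find_matching_set(sets: List[PrefetchRegisterSet],
                      access_address: int) -> Optional[int]:
    """Model of the address checker: return the index of the register set
    whose valid address register matches the access destination, or None."""
    for index, register_set in enumerate(sets):
        if register_set.address_valid and register_set.address == access_address:
            return index
    return None


# Two register sets; only the first holds a valid address.
register_sets = [PrefetchRegisterSet(address=0x1000, address_valid=True),
                 PrefetchRegisterSet(address=0x2000, address_valid=False)]
hit = find_matching_set(register_sets, 0x1000)   # matches set 0
miss = find_matching_set(register_sets, 0x2000)  # address flag not set, no match
```

Note that in this sketch a match requires the address register valid flag to be set, so a stale value left in an address register is never treated as a hit.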
The read ahead controller operates such that, upon detecting a write access to the read ahead register, it stores the data of said write access in the address register and sets the address register valid flag. Further, the controller performs a read access at the address indicated by the data stored in said address register, stores the read data in the data register and sets the data register valid flag. In the case where said address register valid flag is set and a read access is detected at the address matching the data stored in said address register, if said data register valid flag has already been set, the data stored in the data register is transmitted immediately. If said data register valid flag has not been set, the data stored in said data register is transmitted after said data register valid flag is set. Said address register valid flag and said data register valid flag are then reset.
If a write access occurs at the address stored in the address register of the read ahead register, the read ahead controller sets the data of said write access in the data register and sets the data register valid flag.
If the read ahead controller detects a write access to the read ahead register, and if all address register valid flags of the pre-fetch register set have been set, read ahead will not be performed.
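The operating rules above can be summarized as a behavioral model. The following is a minimal sketch under stated assumptions, not the circuit itself: the class and method names, the constant `READ_AHEAD_REGISTER` and its address value are all hypothetical; the external read is performed synchronously, whereas the hardware would fetch in the background and make a matching read wait until the data register valid flag is set; and the write to the destination memory is assumed to go through as usual.

```python
from dataclasses import dataclass

READ_AHEAD_REGISTER = 0xFFFF0000  # hypothetical address of the read ahead register


@dataclass
class Slot:
    """One pre-fetch register set (names are illustrative)."""
    address: int = 0
    address_valid: bool = False
    data: int = 0
    data_valid: bool = False


class ReadAheadController:
    def __init__(self, backing: dict, num_slots: int = 2):
        self.backing = backing            # stands in for external memory/registers
        self.slots = [Slot() for _ in range(num_slots)]
        self.external_reads = 0           # counts slow external accesses

    def _external_read(self, address: int) -> int:
        self.external_reads += 1
        return self.backing[address]

    def _match(self, address: int):
        # Address checker: only slots with a valid address can match.
        for slot in self.slots:
            if slot.address_valid and slot.address == address:
                return slot
        return None

    def write(self, address: int, value: int) -> bool:
        """Returns True only if this write started a read ahead."""
        if address == READ_AHEAD_REGISTER:
            free = next((s for s in self.slots if not s.address_valid), None)
            if free is None:
                return False              # all address valid flags set: no read ahead
            free.address, free.address_valid = value, True
            free.data_valid = False
            free.data = self._external_read(value)  # synchronous in this model
            free.data_valid = True
            return True
        self.backing[address] = value     # assumption: the write itself proceeds
        slot = self._match(address)
        if slot is not None:              # keep the pre-fetched copy current
            slot.data, slot.data_valid = value, True
        return False

    def read(self, address: int) -> int:
        slot = self._match(address)
        if slot is None:
            return self._external_read(address)     # ordinary external access
        value = slot.data                 # data_valid is already set in this model
        slot.address_valid = slot.data_valid = False  # reset both flags
        return value
```

In use, a program would write the target address to the read ahead register, continue with other work, and later read the target address; in this model the later read is served from the data register without a further external access, and the register set is freed for reuse.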