1. Field of the Invention
The present invention relates to a memory LSI with arithmetic logic processing capability which is used in a processor system, and a main memory system using the same, and a method of controlling this main memory system.
2. Description of the Related Art
A computer or computer system such as a personal computer or a workstation is generally called a processor system. Such a conventional processor system is described in detail in, for example, "Computer Architecture: A Quantitative Approach", by John L. Hennessy and David A. Patterson (published by Morgan Kaufmann Publishers Inc.), and "Design of Microprocessor-Based Systems", by Nikitas Alexandridis (published by Prentice Hall).
FIG. 1 shows the general structure of the conventional processor system. Referring to FIG. 1, the processor system is composed of a processor 31 including a primary cache memory 35, a system controller 32, a secondary cache memory 34, a main memory system 9, and an I/O subsystem 33.
The processor 31 is usually realized as a microprocessor integrated on one LSI. The system controller 32 controls the main memory system 9 and the secondary cache memory 34 in response to a main memory access from the processor 31, and controls the I/O subsystem 33 in response to an I/O access from the processor 31. Also, the system controller 32 transfers an interrupt request from the I/O subsystem 33 to the processor 31.
The processor 31 and the system controller 32 are connected by a control signal line 36-1, an address signal line 36-2, and a data signal line 36-3. Also, the system controller 32 and the main memory system 9 are connected by a memory bus 16. In the field of personal computers, the system controller 32 is realized by a plurality of different LSIs. Accordingly, it is generally called a chip set or a peripheral chip set.
FIG. 2 shows a first example of the structure of the conventional main memory system 9. The main memory system 9 stores input data used in arithmetic logic processing by the processor system, intermediate data produced during the arithmetic logic processing, output data of the arithmetic logic processing, a program used to perform the arithmetic logic processing, and so on. A read/write operation on memory data in the main memory space of the main memory system 9 is executed when the processor 31 issues a load/store instruction.
Referring to FIG. 2, the main memory system 9 is composed of a plurality of DRAM LSIs 11, and each of the DRAM LSIs 11 contains a memory section 13 which is composed of a DRAM cell array, a sense amplifier, a decoder, and so on. The respective DRAM LSIs 11 are connected to a memory bus 16 composed of a control signal line 16-1, an address signal line 16-2, and a data signal line 16-3. As shown in FIG. 1, the memory bus 16 is used for connection between the main memory system 9 and the system controller 32. The data signal line 16-3 is a bi-directional signal line because it is used for both read data and write data.
Also, in order to widen the data band width of the main memory system 9, i.e., the bus band width of the memory bus 16, the data signal line 16-3 has a bit width which is wider than the number of data input/output terminals of each DRAM LSI 11. Accordingly, the structure is generally constructed such that a part of the data signal line 16-3 is connected to each DRAM LSI 11. For example, a structure is often adopted in which eight DRAM LSIs 11, each having 16-bit data input/output terminals, are connected to the data signal line 16-3 which has a 128-bit width. As the DRAM LSI 11 which is used in the structure of such a main memory system 9, there are known, for example, a fast page mode DRAM, an extended data out (EDO) DRAM, a synchronous DRAM, and so on.
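The arithmetic of this example can be checked with a minimal sketch. The function names and parameters below are illustrative only and do not appear in the original description. The sketch assumes that each DRAM LSI 11 is connected to its own separate part of the data signal line 16-3.

```python
def bus_width_bits(num_chips: int, data_pins_per_chip: int) -> int:
    """Total data-bus width when each chip drives its own slice of the bus."""
    if num_chips <= 0 or data_pins_per_chip <= 0:
        raise ValueError("chip count and pin count must be positive")
    return num_chips * data_pins_per_chip


def chips_needed(target_width_bits: int, data_pins_per_chip: int) -> int:
    """Minimum number of chips required to fill a bus of the given width."""
    if target_width_bits <= 0 or data_pins_per_chip <= 0:
        raise ValueError("bus width and pin count must be positive")
    # Ceiling division: a partly used chip still counts as one chip.
    return -(-target_width_bits // data_pins_per_chip)


# Figures from the example in the text: eight DRAM LSIs with 16 data pins each.
print(bus_width_bits(8, 16))   # 128
print(chips_needed(128, 16))   # 8
```

The second function shows the cost noted later in the description: with a fixed number of data terminals per chip, widening the bus requires proportionally more DRAM LSIs.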
FIG. 3 shows a second example of the structure of the conventional main memory system 9. Referring to FIG. 3, each DRAM LSI 11 is connected to a memory bus 16 composed of a control signal line 16-1 and a bi-directional data/address signal line 16-4. In this case, unlike the data signal line 16-3 in the structure example shown in FIG. 2, the data/address signal line 16-4 has the same bit width as the data/address input/output terminals of each DRAM LSI 11.
Such a structure is devised to solve the problem that many DRAM LSIs 11 must be used for the main memory system 9 in order to widen the memory bus band width in the structure of the main memory system 9 shown in FIG. 2. A known example of the DRAM LSI 11 used in the structure of such a main memory system 9 is the Rambus DRAM.
This structure of the main memory system 9 is aimed at reducing the number of signal lines which constitute the memory bus 16 and the number of input/output terminals of the DRAM LSI 11. At the same time, the structure is aimed at increasing the bus band width by driving the signal lines at high speed. In this case, such high-speed drive is made possible because problems such as noise generation due to the high-speed drive and variation in delay times among the signal lines can be reduced by decreasing the number of signal lines.
In the structure of the main memory system 9 shown in FIG. 2, the memory bus band width is provided by arranging the DRAM LSIs 11 in parallel. Accordingly, there is a problem in that many DRAM LSIs 11 must be used for the main memory system 9 to widen the memory bus band width. On the other hand, in the structure of the main memory system 9 shown in FIG. 3, such a problem does not occur because the memory bus band width is provided by driving the memory bus 16 at high speed. A known example of the DRAM LSI 11 used in the structure shown in FIG. 3 is the Rambus DRAM.
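The trade-off between the two structures can be illustrated with a minimal sketch in which the peak band width is the bus width multiplied by the transfer rate. The function name and the transfer rates (50 and 400 million transfers per second) are hypothetical values chosen for illustration and are not taken from the original description. The sketch also ignores the fact that the signal line 16-4 of FIG. 3 carries addresses as well as data.

```python
def peak_bandwidth_bytes_per_s(width_bits: int, transfers_per_s: float) -> float:
    """Peak bus band width = bus width (bits) x transfer rate / 8 bits per byte."""
    if width_bits <= 0 or transfers_per_s <= 0:
        raise ValueError("width and transfer rate must be positive")
    return width_bits * transfers_per_s / 8


# FIG. 2 style: wide bus built from many chips, driven at a modest rate (assumed).
wide_slow = peak_bandwidth_bytes_per_s(width_bits=128, transfers_per_s=50e6)

# FIG. 3 style: bus as narrow as one chip's terminals, driven much faster (assumed).
narrow_fast = peak_bandwidth_bytes_per_s(width_bits=16, transfers_per_s=400e6)

print(wide_slow)    # 800000000.0
print(narrow_fast)  # 800000000.0
```

Under these assumed figures, the narrow bus reaches the same peak band width as the wide bus while using one-eighth of the data signal lines.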
For the Rambus DRAM, specific techniques have been developed for structuring and driving the memory bus 16 in order to realize the high-speed drive of the memory bus 16. However, because they are not related to the present invention, their description is omitted here. Note that the Rambus DRAM is described in detail in the Rambus technology guide published by Rambus in the USA.
On the other hand, a method has been proposed of producing an LSI in which a memory, especially a DRAM, and an arithmetic logic processing circuit are merged on one LSI chip so that a certain type of arithmetic logic processing can be executed using the memory or DRAM in the chip. Such a technique is generally called a merged logic-DRAM technique. Typical examples of the conventional merged logic-DRAM technique are described in "A Multimedia 32b RISC Microprocessor with 16 Mb DRAM", by Toru Shimizu et al. (1996 IEEE International Solid-State Circuits Conference, pp. 216 to 217), and "A 7.68 GIPS, 3.84 GB/s, 1 W, Parallel Image-Processing RAM Integrating a 16 Mb DRAM and 128 Processors", by Yoshiharu Aimoto et al. (1996 IEEE International Solid-State Circuits Conference, pp. 372 to 373). These are referred to as merged logic-DRAM conventional technique 1 and merged logic-DRAM conventional technique 2, respectively, in the following description.
In the merged logic-DRAM conventional technique 1, a processor and a part of the main memory system are integrated in one LSI chip. The LSI occupies the position of the processor 31 in the processor system shown in FIG. 1. The LSI has the advantage that no main memory system 9 needs to be installed outside it at all when the DRAM in the chip alone provides sufficient main memory capacity.
On the other hand, in the merged logic-DRAM conventional technique 2, parallel processors dedicated to image processing and a DRAM for supplying the parallel processors with image data are integrated in one LSI chip. The LSI occupies the position of the I/O subsystem 33 in the processor system shown in FIG. 1, and it has a function of performing only the image processing at high speed in the I/O subsystem 33.
However, the conventional technique for the main memory system described above has a problem in that it is difficult to provide the necessary memory bus band width. When a sufficient memory bus band width cannot be provided, the effective performance of the processor system is limited by the insufficient memory bus band width, even if the processor has high performance. There is another problem in that the conventional merged logic-DRAM technique described above is not an effective solution to the problem of providing the memory bus band width of the main memory system. Hereinafter, these problems will be described.
Generally, it is known that the data band width of the memory bus 16 of the main memory system 9 which is required to fully draw out the processing capability of the processor 31, that is, the required memory bus band width, is proportional to that processing capability. This is because the number of accesses to the main memory system 9 required to process the whole of an arbitrary program is fixed. If the processing is to be executed at higher speed, more accesses to the main memory must be executed per unit time. As semiconductor technology develops, the processing capability of the processor 31 continues to improve geometrically. However, it is very difficult to increase the memory bus band width in step with such improvement in processing capability.
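The proportionality described above can be illustrated with a minimal sketch. The function name, the access count, the access size, and the run times below are hypothetical values chosen for illustration and are not taken from the original description. The sketch assumes that the number of main memory accesses of a program stays fixed regardless of how fast the processor executes it.

```python
def required_bandwidth_bytes_per_s(num_accesses: int,
                                   bytes_per_access: int,
                                   run_time_s: float) -> float:
    """Band width needed to complete a fixed number of accesses within run_time_s."""
    if run_time_s <= 0:
        raise ValueError("run time must be positive")
    return num_accesses * bytes_per_access / run_time_s


# Assumed workload: one million main memory accesses of 32 bytes each.
accesses, size = 1_000_000, 32

baseline = required_bandwidth_bytes_per_s(accesses, size, run_time_s=1.0)
twice_as_fast = required_bandwidth_bytes_per_s(accesses, size, run_time_s=0.5)

print(baseline)                  # 32000000.0
print(twice_as_fast)             # 64000000.0
print(twice_as_fast / baseline)  # 2.0
```

Halving the run time of the same program doubles the required band width, which is why a memory bus that does not scale with the processor limits the effective performance.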
One reason is that the memory bus 16 consists of wirings which connect a plurality of LSIs on a printed circuit board. Therefore, the load capacitance per wire is large, so that high-speed operation is difficult compared to the internal wiring of an LSI. Another reason is that the internal circuit of the LSI is connected to the memory bus 16, which is wiring on the printed circuit board, through external I/O terminals of the LSI. Therefore, the number of signal lines of the memory bus is limited compared to the internal wiring of the LSI. In this way, it is very difficult to provide the necessary memory bus band width, both from the viewpoint of the operation speed of the signal lines of the memory bus 16 and from the viewpoint of the number of signal lines.
Generally, when attempting to increase the data transfer band width between two circuit blocks, the most effective method is to merge these circuit blocks on one LSI chip. This is because, inside the LSI, substantial improvement can be expected in both the operation speed of the signal lines and the number of signal lines, compared to the wiring on the printed circuit board. Accordingly, the merged logic-DRAM technique has the potential to solve the problem of providing the memory bus band width of the main memory system 9 described above. However, because the conventional merged logic-DRAM technique is applied to the inside of the processor 31 or the I/O subsystem 33, it is not a satisfactory solution for improving the memory bus band width of the main memory system 9. The reasons are as follows.
The above-mentioned merged logic-DRAM conventional technique 1 is an effective solution in terms of the provision of the memory bus band width, if the capacity of the DRAM merged with the processor 31 is larger than the capacity required by the processor 31 or by an application program which runs on the processor 31.
However, it is extremely important that the memory capacity of the main memory system 9 be extendable. The necessary memory capacity is often larger than the memory capacity of a DRAM which can be merged in the LSI chip. Extendability is necessary because the required memory capacity depends on the kind of application, so that it is important from the viewpoint of cost to support main memory systems 9 having various memory capacities. Also, the necessary memory capacity of the main memory system 9 is, for example, from about 16 megabytes to about 256 megabytes, which is larger than the memory capacity of a DRAM which can be merged in the LSI chip. For this reason, when an external main memory system 9 must be connected to the processor 31 which is based on the merged logic-DRAM conventional technique 1, it is very difficult to provide the necessary memory bus band width between the processor 31 and the external main memory system 9.
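The size of this capacity gap can be illustrated with a minimal sketch. The 16 Mb on-chip figure is taken from the titles of the two 1996 papers cited above and is used here only as an assumed representative on-chip DRAM capacity. The function name is hypothetical.

```python
BITS_PER_BYTE = 8


def capacity_ratio(required_mbytes: int, on_chip_mbits: int) -> float:
    """How many times larger the required main memory is than the on-chip DRAM."""
    if required_mbytes <= 0 or on_chip_mbits <= 0:
        raise ValueError("capacities must be positive")
    # Convert the requirement from megabytes to megabits before dividing.
    return required_mbytes * BITS_PER_BYTE / on_chip_mbits


ON_CHIP_MBITS = 16  # assumed: on-chip DRAM size of the cited 1996 devices

print(capacity_ratio(16, ON_CHIP_MBITS))   # 8.0
print(capacity_ratio(256, ON_CHIP_MBITS))  # 128.0
```

Under this assumption, the stated range of 16 to 256 megabytes is 8 to 128 times the on-chip DRAM capacity, so an external main memory system 9 remains necessary.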
On the other hand, the merged logic-DRAM conventional technique 2 can utilize high band width data transfer only when specific processing is performed in the I/O subsystem 33. However, the merged logic-DRAM conventional technique 2 cannot be a solution in terms of the provision of the memory bus band width of the main memory system 9. If the specific processing is executed in the I/O subsystem 33 instead of the processor 31, it is possible, as a secondary effect, to reduce the load on the processor 31 and the memory bus band width required for that load. However, there is also a problem in that the superior performance of the processor 31, which is continuously being improved, cannot be fully utilized.
This is because such a method merely transfers the processing to be performed by the processor 31 to the I/O subsystem 33. Also, there is a problem in terms of the extendability of the memory capacity, as in the above-mentioned conventional technique 1. This is because high band width data transfer is not possible when a memory other than the DRAM within the I/O subsystem 33 provided based on the merged logic-DRAM conventional technique 2 is accessed.