The present invention relates to an information processing system including a CPU and memories, and more particularly to addressing of the main memory where a cache is used.
References cited herein are listed below, and they will be referred to hereinafter by the respective reference numbers. Reference 1 (Ref. 1) corresponds to the following article: Nikkei Microdevices, February 1998, pp. 134-141 (in Japanese). Reference 2 (Ref. 2) corresponds to the following book: David A. Patterson and John L. Hennessy, Computer Architecture: A Quantitative Approach, Second Edition, Morgan Kaufmann Publishers, Inc. (1996), pp. 375-384.
Reference 1 discloses an example of address mapping in a DRAM on p. 141. In this example, two chips of four-bank 64-Mbit DRAMs are used. If, in this example, accesses to 64-bit consecutive addresses are assigned in the order of column, row, device and bank from the lowest position upward as illustrated in FIG. C(a) of that reference, 16-MB data can be stored continuously on the two memory banks of the first chip and the second chip. FIG. C(b) of the same reference illustrates an instance in which the assignment is made in the order of column, bank, device and row from the lowest position upward. It is stated that this assignment stores data in such a way that accesses are distributed among the eight banks.
Before filing this application, the present applicant studied address mapping which would take account of relationships between a central processing unit (CPU), a cache and a main memory constituting an actual information processing system. As a result, it was found that address conversion (address mapping) should be determined by taking account of the relationship between the cache and the main memory. This is because the addresses issued by the CPU are transferred to the main memory when required data are not found in the cache.
FIG. 2 is a diagram illustrating address management by the cache, which was studied before filing this application. In this diagram, which is cited from p. 378 of Ref. 2, a physical address is divided into areas for management by the cache. The cache broadly divides each physical address into two areas, a block offset and a block address. The block address identifies a block, and the block offset locates data within that block. Some caches, known as direct-mapped caches and set-associative caches, use a management system under which a block address is further divided into a trailing part known as an index and a leading part known as a tag.
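The division of a physical address described above can be sketched as follows. This is a minimal illustration only: the function name split_address and its parameters are hypothetical, and the default values (a 32-byte block and 256 entries) are assumed from the direct-mapped example of FIG. 3.

```python
def split_address(addr, block_size=32, num_entries=256):
    """Split a physical address into (tag, index, block_offset).

    block_size and num_entries are assumed to be powers of two.
    """
    offset_bits = block_size.bit_length() - 1   # 32 B  -> 5 bits
    index_bits = num_entries.bit_length() - 1   # 256   -> 8 bits
    block_offset = addr & (block_size - 1)      # trailing bits: position within the block
    block_address = addr >> offset_bits         # remaining bits: identifies the block
    index = block_address & (num_entries - 1)   # trailing part of the block address
    tag = block_address >> index_bits           # leading part of the block address
    return tag, index, block_offset


tag, index, block_offset = split_address(0x12345)
print(tag, index, block_offset)
```

Reassembling the three fields as ((tag * 256) + index) * 32 + block_offset gives back the original address, which is a convenient check on the chosen field widths.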
FIG. 3, cited from p. 381 of Ref. 2, illustrates how the cache manages addresses and data. Herein, "CM" stands for cache, in this case a direct-mapped cache of which the capacity is 8 KB and the block size is 32 B. The circled numbers 1-4 in FIG. 3 are denoted in this specification by (*1)-(*4); for example, (*1) stands for a circled 1. First will be described a case in which, in an access request from the CPU to the memory system, a read access hits the cache. A request address from the CPU is transmitted to the cache via address lines ((*1) in FIG. 3). After that, the entry of the cache to be referenced is determined on the basis of index information, which is part of the request address. Since a direct-mapped cache is taken up as an example here, the total number of index addresses (in this case 2 to the eighth power, or 256) is identical with the number of entries in the cache. Accordingly, the index and the entry are in one-to-one correspondence ((*2) in FIG. 3). After an entry number in the cache is selected on the basis of the index information of the address, the tag stored in the entry indicated by that entry number is compared with the tag of the request address ((*3) in FIG. 3). This comparison is made only when the entry in the cache is valid (confirmed by "valid", indicating a valid bit). If the tag of the request address and the tag stored in the cache are found identical, the block offset is used to select the desired 8-byte data through a 4:1 multiplexer and transmit them to the CPU ((*4) in FIG. 3). If, on the contrary, the request address and the address of the data held by the cache are not identical, the main memory will be accessed.
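The read sequence (*1) to (*4) may be illustrated by the following sketch of a direct-mapped cache with the stated capacity of 8 KB and 32-byte blocks. The class Entry, the function read and the hand-installed test block are hypothetical and serve only to make the steps concrete; the sketch is not the circuit of FIG. 3.

```python
BLOCK_SIZE = 32      # bytes per block (5-bit block offset)
NUM_ENTRIES = 256    # 8 KB / 32 B, hence an 8-bit index
WORD_SIZE = 8        # bytes delivered to the CPU per access


class Entry:
    """One cache entry: valid bit, tag and one block of data."""
    def __init__(self):
        self.valid = False
        self.tag = 0
        self.data = bytes(BLOCK_SIZE)


def read(cache, addr):
    """Return the requested 8-byte word on a hit, or None on a miss."""
    offset = addr % BLOCK_SIZE
    index = (addr // BLOCK_SIZE) % NUM_ENTRIES        # (*2) the index selects one entry
    tag = addr // (BLOCK_SIZE * NUM_ENTRIES)
    entry = cache[index]
    if entry.valid and entry.tag == tag:              # (*3) tag compare, valid entries only
        word = offset // WORD_SIZE                    # (*4) 4:1 multiplexer select
        return entry.data[word * WORD_SIZE:(word + 1) * WORD_SIZE]
    return None                                       # miss: the main memory is accessed


cache = [Entry() for _ in range(NUM_ENTRIES)]

# Install one block by hand so that a hit can be demonstrated.
addr = 0x12345
e = cache[(addr // BLOCK_SIZE) % NUM_ENTRIES]
e.valid = True
e.tag = addr // (BLOCK_SIZE * NUM_ENTRIES)
e.data = bytes(range(BLOCK_SIZE))

hit = read(cache, addr)                               # same tag: hit
miss = read(cache, addr + BLOCK_SIZE * NUM_ENTRIES)   # same index, different tag: miss
```

The second access differs from the first only in its tag, so it selects the same entry and fails the tag comparison.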
Next will be described how writing into the memory system is accomplished. The description here will refer to a case in which a write access hits a cache of a write-back type (to be explained below). When a write request is generated by the CPU, the request address is communicated to the cache as in the aforementioned case of read access. After that, the above-described procedure is taken to judge whether or not the request address is identical with the address of any of the data held by the cache. Hereupon, if the tag of the request address is found identical with a tag held by the cache, the pertinent data held by the cache are altered, and a dirty bit (not shown) indicating that the data no longer match the corresponding data in the main memory is set. As the CPU has only to update the contents of the cache and can continue processing, high speed accessing is made possible.
Now will be described a case in which, the contents of the cache having varied in this way, the next access is a cache miss. In this case, the contents of the cache should be replaced with newly requested contents. This is because of the utilization of the corollary of locality (i.e., recently written contents are more likely to be used again). In the course of this replacement, the old contents of the cache are written back into the main memory. This operation is known as write back, and cache memories of this type are called write-back type cache memories. Since the cache manages data by the index section, which is a part of an address, the address replaced here has the same index section as the request address and differs in the tag section (the whole block, i.e. every block offset, is replaced, and this is known as cache replacement).
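The write-hit and write-back behavior described above may be sketched as follows. The class WriteBackCache and the helper split are hypothetical; the sketch assumes that a write miss allocates the entry for the newly requested block, and it tracks only the valid bit, dirty bit and tag, not the data.

```python
BLOCK_SIZE = 32
NUM_ENTRIES = 256


def split(addr):
    """Return (tag, index) of the block containing addr."""
    block = addr // BLOCK_SIZE
    return block // NUM_ENTRIES, block % NUM_ENTRIES


class WriteBackCache:
    def __init__(self):
        self.entries = [{"valid": False, "dirty": False, "tag": 0}
                        for _ in range(NUM_ENTRIES)]

    def write(self, addr):
        """Perform a write; return the block address written back, or None."""
        tag, index = split(addr)
        e = self.entries[index]
        written_back = None
        if not (e["valid"] and e["tag"] == tag):
            # Miss: the old block is replaced. A dirty block is written back first.
            if e["valid"] and e["dirty"]:
                written_back = (e["tag"] * NUM_ENTRIES + index) * BLOCK_SIZE
            e.update(valid=True, tag=tag)   # assumption: allocate on write miss
        e["dirty"] = True                   # cached copy now differs from main memory
        return written_back


c = WriteBackCache()
first = c.write(0x12345)             # miss on an empty entry: nothing to write back
second = c.write(0x12348)            # hit in the same block: only the cache is updated
third = c.write(0x12345 + 0x2000)    # same index, different tag: old block written back
```

In the last call the block written back and the block requested share the index and differ only in the tag, which is the relationship the following discussion relies on.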
Considering such operations of the cache, two instances should be taken into account in accessing the main memory, i.e. accessing which utilizes the corollary of locality and accessing an address with the same index but a different tag in a writing-back operation. In the case of Ref. 1, accessing in the first instance, dependent on the corollary of locality, can be accomplished at high speed because the access is diverted to a different bank, but no consideration is given to accessing according to the second instance, i.e. accessing at the time of writing back. Thus, once access to a different word line on the same bank is necessitated by writing back (bank conflict), high speed accessing is made difficult. Consequently, this creates a problem in executing an application (program) involving frequent writing back.
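The bank conflict at the time of writing back can be illustrated with a simplified mapping. All parameters below are assumptions made for the illustration (eight banks, 1 KB of column data per row, and bank bits lying within the cache index); the mapping is not taken from Ref. 1.

```python
BLOCK_SIZE = 32
NUM_ENTRIES = 256
NUM_BANKS = 8        # assumption: e.g. two four-bank devices
ROW_BYTES = 1024     # assumption: bytes covered by the column bits of one row


def bank_and_row(addr):
    """Hypothetical mapping ordered column, bank, row from the lowest position upward."""
    bank = (addr // ROW_BYTES) % NUM_BANKS
    row = addr // (ROW_BYTES * NUM_BANKS)
    return bank, row


def is_bank_conflict(addr_a, addr_b):
    """True when both addresses fall on the same bank but on different rows."""
    bank_a, row_a = bank_and_row(addr_a)
    bank_b, row_b = bank_and_row(addr_b)
    return bank_a == bank_b and row_a != row_b


fill = 0x14340               # newly requested block: tag 10, index 26
victim = 0x12340             # block written back: tag 9, same index 26
neighbor = fill + ROW_BYTES  # a nearby access of the kind favored by locality

print(is_bank_conflict(fill, victim))    # write back versus fill
print(is_bank_conflict(fill, neighbor))  # local access
```

Because the bank bits here lie entirely within the 13 bits shared by the two addresses (index plus block offset), the write-back address always lands on the same bank as the fill address, on a different row.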
An object of the present invention, therefore, is to avoid bank conflicts, where a DRAM is to be used as the main memory of an information processing apparatus having a cache, by taking account of both accessing according to the corollary of locality and accessing at the time of writing back, and directing these accesses to different banks of the DRAM with a high probability. Another object of the invention is to increase the speed of accessing immediately following write-back processing.
A typical procedure according to the invention is as follows. When assigning request addresses from a CPU to different banks of a DRAM, bank addresses in the DRAM are generated by operation on the index section and the tag section of each request address so that local accesses and write-back accesses can be assigned to different banks. More specifically, there is provided an address mapping circuit for generating bank addresses in the DRAM by performing operation on the index section and the tag section of each request address issued by the CPU. A typical operation performed on the index section and the tag section is addition.
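A minimal sketch of generating a bank address by addition is given below. The choice of bits (the upper three index bits and the lower three tag bits), the discarding of the carry, and the number of banks are assumptions made for illustration; the text above specifies only that an operation, typically addition, is performed on the index section and the tag section.

```python
BLOCK_SIZE = 32
NUM_ENTRIES = 256
INDEX_BITS = 8
NUM_BANKS = 8     # assumption
BANK_BITS = 3     # log2(NUM_BANKS)


def split(addr):
    """Return (tag, index) of the block containing addr."""
    block = addr // BLOCK_SIZE
    return block // NUM_ENTRIES, block % NUM_ENTRIES


def bank_address(addr):
    """Hypothetical mapping: add part of the index to part of the tag."""
    tag, index = split(addr)
    index_part = index >> (INDEX_BITS - BANK_BITS)   # assumed: upper index bits
    tag_part = tag % NUM_BANKS                       # assumed: lower tag bits
    return (index_part + tag_part) % NUM_BANKS       # carry out of the sum is discarded


fill = 0x14340            # newly requested block: tag 10, index 26
victim = 0x12340          # block written back: tag 9, same index 26
neighbor = fill + 1024    # nearby access: same tag, different upper index bits

print(bank_address(fill), bank_address(victim), bank_address(neighbor))
```

With these assumptions a write-back address shares a bank with the fill address only when the two tags agree in their lower three bits, for example when they differ by a multiple of eight, which is consistent with the separation being achieved with a high probability rather than always.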