The invention relates to a data processor having a translation lookaside buffer and, more particularly, a data processing system using such a data processor. For example, the invention relates to a technique which is effective when it is applied to the realization of a high data processing speed.
In a virtual storage system, a virtual memory space which is sufficiently larger than a physical memory is prepared and a process is mapped into the virtual memory space. Here, “process” means a program which is being executed under management of an OS (Operating System). It is, therefore, sufficient for the process to consider only operations on the virtual memory. An MMU (Memory Management Unit) is used for mapping from the virtual memory to the physical memory. The MMU is usually managed by the OS and exchanges the physical memory so that the virtual memory which is needed by the process can be mapped into the physical memory. The exchange of the physical memory is performed between the MMU and a secondary storage or the like. The MMU generally also has a function to protect the storage so that a certain process doesn't erroneously access a physical memory of another process.
When an address translation from an address (virtual address) in the virtual memory to an address (physical address) in the physical memory is performed by using the MMU, there is a case where the address translation information is not registered in the MMU or a virtual memory of another process is erroneously accessed. In this instance, the MMU generates an exception, changes the mapping of the physical memory, and registers new address translation information.
Although the function of the MMU can be realized by software alone, if the translation is performed by software each time the process accesses the physical memory, the efficiency is low. To prevent this, a translation lookaside buffer for address translation is provided in hardware and address translation information which is frequently used is stored in the translation lookaside buffer. That is, the translation lookaside buffer is constructed as a cache memory for the address translation information. A point of difference from an ordinary cache memory is that when the address translation fails, the replacement of the address translation information is performed mainly by software.
Various cache memories are widely used to realize a high speed of data and instruction access.
The present inventors have examined the translation lookaside buffer and cache memory from a viewpoint of realizing a high speed of the memory access. As a processor which divides the translation lookaside buffer into a buffer for an instruction and a buffer for data, for example, there is a processor disclosed in PowerPC 603 RISC Microprocessor User's Manual (MOTOROLA, 1994). The processor further individually has a data cache memory and an instruction cache memory. From pages 7 to 15 of this literature, it will be understood that an instruction TLB miss and a data TLB miss are separately treated in the PowerPC. According to the examination of the present inventors, even if the translation lookaside buffers are separately provided, since there is no interrelation between them, if the address translation fails, necessary address translation information has to be obtained from an external memory. It has thus been found that there is a limitation in realization of a high memory accessing speed.
As for the cache memory, when a cache miss occurs, a cache entry is newly read out from the external memory by only an amount of one entry. In this instance, if there is no invalid cache entry, a valid cache entry is swept out from the cache memory in accordance with a logic such as LRU (Least Recently Used) or the like. The cache entry which was swept out as mentioned above may include data or instruction to be subsequently used. Therefore, it is desirable that an instruction to specify a processing routine such that a high speed or the like is required is always held in the cache memory. In such a case, it is also considered to enable the cache memory to be used as a random access memory. However, if all of the areas in the cache memory are constructed as mentioned above, all of the functions as a cache memory are lost, so that a case where an inconvenience is caused in dependence on an application is also presumed.
It is an object of the invention to provide a data processor which can realize a high memory accessing speed. In more detail, it is an object to provide a technique for realizing a high memory accessing speed from a viewpoint of address translation and to provide a technique for realizing a high memory accessing speed from a viewpoint of a cache memory.
The above and other objects and novel features of the present invention will be clarified from the description of the specification and the annexed drawings.
An outline of a typical invention among the inventions disclosed in the present application will be briefly described as follows.
That is, according to a first aspect of the invention, a translation lookaside buffer is separately used for data and for an instruction, address translation information for instruction is also stored into the translation lookaside buffer for data, and when a translation miss occurs in the translation lookaside buffer for instruction, new address translation information is fetched from the translation lookaside buffer for data.
In detail, a data processor (1) comprises: a central processing unit (2); a first translation lookaside buffer (4) in which a part of address translation information to translate a virtual address that is treated by the central processing unit into a physical address is stored and which associatively retrieves, from the address translation information, a physical address corresponding to the virtual address that is outputted by the central processing unit; and a second translation lookaside buffer (3) in which address translation information regarding an instruction address in address translation information possessed by the first translation lookaside buffer is stored and which associatively retrieves, from the address translation information, a physical address corresponding to the virtual address that is outputted by the central processing unit upon instruction fetching, and which, when a result of the associative retrieval indicates a retrieval miss, associatively retrieves the first translation lookaside buffer by a virtual address according to the retrieval miss and obtains the address translation information retrieved by that associative retrieval.
Another data processor according to such an aspect comprises: a central processing unit; a first translation lookaside buffer in which a part of address translation information to translate a virtual address that is treated by the central processing unit into a physical address is stored and which associatively retrieves, from the address translation information, a physical page number corresponding to a virtual page number that is outputted by the central processing unit; a second translation lookaside buffer in which address translation information regarding an instruction address in address translation information possessed by the first translation lookaside buffer is stored and which associatively retrieves, from the address translation information, a physical page number corresponding to the virtual page number that is outputted by the central processing unit upon instruction fetching; and a buffer control circuit (320) for, when a result of the associative retrieval by the second translation lookaside buffer indicates a retrieval miss, associatively retrieving the first translation lookaside buffer by a virtual page number according to the retrieval miss, and for supplying the address translation information retrieved by the associative retrieval to the second translation lookaside buffer.
According to the above means, when the translation miss occurs in the translation lookaside buffer for instruction, the new address translation information is fetched from the translation lookaside buffer for data. Therefore, a high speed of the address translating operation can be realized as compared with a case of obtaining the address translation information from an external address translation table every time a translation miss occurs. Thus, a high memory accessing speed is accomplished. In particular, the reason the translating speed of the instruction address is made high is that an operand fetch is performed in accordance with a decoding result of the fetched instruction, or that the capacity of the translation lookaside buffer for instruction is reduced (the number of entries is small) as compared with that of the translation lookaside buffer for data.
When the result of the associative retrieval by the second translation lookaside buffer indicates the retrieval miss and the result of the associative retrieval of the first translation lookaside buffer by the virtual page number according to the retrieval miss indicates the retrieval miss, the central processing unit reads out the address translation information including the virtual page number according to the retrieval miss from an external memory provided out of the data processor by an exceptional process and writes the read-out address translation information into the first translation lookaside buffer. After completion of the exceptional process, the interrupted address translating operation is continued.
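The lookup order described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the circuit of the invention: the class name `TwoLevelTlb`, the dictionary-based buffers, the oldest-first replacement, the entry counts, and the `page_table` argument standing in for the external address translation table are all hypothetical.

```python
from collections import OrderedDict


class TwoLevelTlb:
    """Toy model: a small instruction TLB refilled from a larger first TLB."""

    def __init__(self, page_table, itlb_entries=4, utlb_entries=64):
        self.page_table = page_table      # stands in for the external table
        self.itlb = OrderedDict()         # second TLB (instruction addresses)
        self.utlb = OrderedDict()         # first TLB (data and instruction)
        self.itlb_entries = itlb_entries  # assumed sizes, not from the text
        self.utlb_entries = utlb_entries
        self.external_reads = 0           # counts exceptional processes

    @staticmethod
    def _store(buf, capacity, vpn, ppn):
        if vpn not in buf and len(buf) >= capacity:
            buf.popitem(last=False)       # evict the oldest entry (assumed policy)
        buf[vpn] = ppn

    def _utlb_lookup(self, vpn):
        if vpn not in self.utlb:
            # Miss in the first TLB: the exceptional process reads the
            # translation from the external table and registers it.
            self.external_reads += 1
            self._store(self.utlb, self.utlb_entries, vpn, self.page_table[vpn])
        return self.utlb[vpn]

    def translate_data(self, vpn):
        return self._utlb_lookup(vpn)

    def translate_fetch(self, vpn):
        if vpn not in self.itlb:
            # Instruction TLB miss: refill from the first TLB before
            # falling back to the external table.
            self._store(self.itlb, self.itlb_entries, vpn, self._utlb_lookup(vpn))
        return self.itlb[vpn]
```

In this model, an instruction page that has been pushed out of the small instruction buffer but is still held in the first buffer is translated again without another external read, which is the saving the text attributes to the arrangement.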
According to another aspect of the invention, only a partial area in the cache memory is selectively made operative as a random access memory. In other words, the cache function is suppressed for only the partial area.
In detail, the data processor further comprises a data cache memory (6) in which a cache entry of data is stored in correspondence to the physical page number and to which the physical page number which was associatively retrieved by the first translation lookaside buffer is supplied and which associatively retrieves the cache entry corresponding to the physical page number. In this instance, a part of the data cache memory is mapped into a predetermined area (E1) that is specified by the virtual address. The data processor further comprises first RAM area discrimination control means (605) for detecting the access to the predetermined area and allowing the data cache memory to perform a random accessing operation.
The data processor further includes an instruction cache memory (5) in which a cache entry of an instruction is stored in correspondence to the physical page number and to which the physical page number which is associatively retrieved by the second translation lookaside buffer is supplied and which associatively retrieves a cache entry corresponding to the physical page number. In this instance, a part of the instruction cache memory is mapped into the predetermined area (E1) that is specified by the virtual address. The data processor further comprises second RAM area discrimination control means (505) for detecting the access to the predetermined area and for allowing the instruction cache memory to perform a random accessing operation.
According to the above means, the predetermined areas in the data cache memory and the instruction cache memory are accessed at random and the remaining areas in both of the cache memories are made operative as cache memories to be associatively retrieved. Therefore, particularly, a condition that desired instruction and data which need a high accessing speed are always held in the cache memory and a condition that the instruction and data used recently are held in the cache memory can be satisfied. It contributes to the improvement of a data processing speed.
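The split between a randomly accessed area and an associatively retrieved area can be illustrated with a small model. It is a sketch under assumptions: the class name `PartialRamCache`, the half-and-half division, one word per line, word addressing, and the base address `0x7C00` for the predetermined area are hypothetical and are not taken from the specification.

```python
class PartialRamCache:
    """Toy direct-mapped cache whose upper half of lines serves as plain RAM."""

    def __init__(self, backing, lines=8, ram_base=0x7C00):
        self.backing = backing                 # stands in for external memory
        self.cache_lines = lines // 2          # lines still used as a cache
        self.ram = [0] * (lines - self.cache_lines)
        self.ram_base = ram_base               # hypothetical start of area E1
        self.tags = [None] * self.cache_lines
        self.data = [0] * self.cache_lines
        self.misses = 0

    def _in_ram_area(self, addr):
        return self.ram_base <= addr < self.ram_base + len(self.ram)

    def read(self, addr):
        if self._in_ram_area(addr):
            # RAM area: direct access, no tag comparison and no replacement.
            return self.ram[addr - self.ram_base]
        index, tag = addr % self.cache_lines, addr // self.cache_lines
        if self.tags[index] != tag:
            # Cache miss: replace the line with the word from external memory.
            self.misses += 1
            self.tags[index], self.data[index] = tag, self.backing[addr]
        return self.data[index]

    def write_ram(self, addr, value):
        if not self._in_ram_area(addr):
            raise ValueError("address is outside the RAM area")
        self.ram[addr - self.ram_base] = value
```

Values written to the RAM area stay resident regardless of how often the remaining lines are replaced, while ordinary addresses continue to benefit from caching in the other half.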
According to still another aspect of the invention, as the index address used to select a cache line of the cache memory, a bit position of the virtual address is switched to a higher bit position than that used in the ordinary operation. Thus, the cache memory is divided for each large address space and is allocated to the virtual memory space.
In more detail, index mode designating means (630) for selectively using a bit on the upper side of the virtual address for the selection of the cache line of the data cache memory is further provided.
Index mode designating means (530) for selectively using a bit on the upper side of the virtual address for the selection of the cache line of the instruction cache memory is further provided.
According to the above means, the bit on the upper side of the virtual address can be used for an index of the cache. Therefore, the cache memory is divided for each large address space and can be allocated to the virtual memory space. Thus, a direct-mapped cache can be treated in a pseudo manner as a set-associative cache. The invention can contribute to the improvement of the data processing speed.
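The effect of switching the index bit can be shown with a short calculation. The sketch below assumes a 32-byte line, 256 lines, and bit 25 as the upper virtual-address bit; these figures and the function name `cache_index` are hypothetical choices for illustration and are not specified in the text.

```python
LINE_BITS = 5        # assumed 32-byte cache line
INDEX_BITS = 8       # assumed 256 cache lines
UPPER_BIT = 25       # assumed upper virtual-address bit used in index mode


def cache_index(vaddr, index_mode=False):
    """Return the cache line selected by a virtual address."""
    index = (vaddr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    if index_mode:
        top = 1 << (INDEX_BITS - 1)
        upper = (vaddr >> UPPER_BIT) & 1
        # Replace the most significant index bit with the upper address bit,
        # so each half of the cache serves one large address region.
        index = (index & ~top) | (upper * top)
    return index
```

With these assumed parameters, the addresses `0x00000040` and `0x02000040` select the same line in the ordinary mode and therefore evict each other, whereas in the index mode they select lines in different halves of the cache and can be held at the same time.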
Still another aspect of the invention is to improve the use efficiency of the data processor.
First, an I/O register area (I/O register space) is mapped from a virtual address space (address space on the virtual memory) to a physical address space (address space on the physical memory). That is, there is further provided detecting means (606) for inputting the physical page number which is outputted by an associated hit by the associative retrieval due to the first translation lookaside buffer, for detecting whether the inputted physical page number coincides with the page number allocated to the I/O register space in the data processor or not, for suppressing the associative retrieving operation of the data cache memory by the detection of the coincidence, and for allowing the I/O register to be directly accessed. In this instance, the translation information which is stored into the first translation lookaside buffer has protection information to specify an access privilege to a page and there is provided access protecting means (405) for discriminating an access privilege for the relevant page on the basis of the protection information of translation information according to the associated hit. Thus, the storage protection can be also performed for the I/O register space.
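The routing decision described above can be sketched as a small function. This is an illustrative model only: the page number `IO_PAGE`, the two-field translation entry, the single user/privileged protection flag, and the names `access` and `ProtectionError` are assumptions introduced for the example.

```python
IO_PAGE = 0x1F000    # hypothetical physical page number of the I/O register space


class ProtectionError(Exception):
    """Raised when the protection information forbids the access."""


def access(tlb_entry, privileged, io_registers, cache, offset):
    """Route a read either to an I/O register or to the data cache.

    tlb_entry is a (physical_page_number, user_allowed) pair obtained
    from a hit in the first translation lookaside buffer.
    """
    ppn, user_allowed = tlb_entry
    # The protection check applies to every page, including the I/O page.
    if not privileged and not user_allowed:
        raise ProtectionError("page is not accessible in user mode")
    if ppn == IO_PAGE:
        # Coincidence detected: the cache lookup is suppressed and the
        # register is accessed directly.
        return "io", io_registers[offset]
    return "cache", cache[(ppn, offset)]
```

Because the check on the protection information precedes the page comparison, an unprivileged access to the I/O register page is rejected in the same way as an access to any other protected page.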
Second, the translation information which is stored into the first translation lookaside buffer has cache write mode specifying information (WT) for specifying which one of write-through and write-back is used as a write control mode for the data cache memory, and there is provided cache write control means (614) for controlling a cache write mode for the relevant page on the basis of the cache write mode information included in the translation information regarding the associated hit. In the write-through mode, the writing operation is performed to both the cache memory and the external memory in case of a cache hit and is performed to only the external memory in case of a cache miss. In the write-back mode, data is written into the cache entry (cache line) regarding the hit in case of a cache hit; in case of a cache miss, one cache entry is read out from the external memory (cache fill), the tag address is updated, and the data is written to the cache line. The dirty bit of the cache line which was filled in this way is set. When the cache line which is swept out from the cache memory by the cache fill is dirty, the cache line is written back to the external memory. Thus, in the write-through mode, although the contents of the cache memory and the external memory always coincide, the number of accesses to the external memory increases. In the write-back mode, although the number of accesses to the external memory is small, there is a period of time during which the contents of the cache memory and the external memory don't coincide, and in the case where a plurality of cache memories share the external memory, consistency between the cache memory and the external memory may not be maintained.
If the write-through mode and the write-back mode can be selected on a page unit basis, the relation between the consistency of the cache memory and the external memory and the accessing speed can be optimized in accordance with the system construction or the contents of the process.
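The trade-off between the two write modes on a page unit basis can be illustrated with a small model. It is a sketch under assumptions: the class name `PagedWriteCache`, one word per cache line, unlimited capacity with explicit eviction, and the 16-word page size are hypothetical simplifications and do not describe the cache write control means itself.

```python
class PagedWriteCache:
    """Toy cache whose write policy is chosen per page by a WT flag."""

    def __init__(self, memory, page_size=16):
        self.memory = memory          # stands in for external memory
        self.page_size = page_size    # assumed page size in words
        self.lines = {}               # addr -> [value, dirty]; one word per line
        self.wt_pages = set()         # pages whose translation info has WT set
        self.memory_writes = 0

    def _mem_write(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1

    def write(self, addr, value):
        if addr // self.page_size in self.wt_pages:
            # Write-through: always update external memory; update the
            # cache only when the line is already present (cache hit).
            if addr in self.lines:
                self.lines[addr][0] = value
            self._mem_write(addr, value)
        else:
            # Write-back: write only to the cache line and mark it dirty.
            # With one-word lines the cache fill on a miss is trivial.
            self.lines[addr] = [value, True]

    def evict(self, addr):
        value, dirty = self.lines.pop(addr)
        if dirty:
            self._mem_write(addr, value)   # write a dirty line back on replacement
```

In this model, repeated writes to a write-back page reach the external memory only once, when the line is evicted, while every write to a write-through page reaches the external memory immediately and keeps it consistent with the cache.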
A data processing apparatus to which the above data processor is applied has an external memory connected to the data processor and its secondary storage.