1. Field of the Invention
The present invention relates to technology for use with a data processing device, and more specifically to technology for storing data in a storage unit that temporarily holds data.
2. Description of the Related Art
There are two byte orders for storing data in a storage unit of a data processing device: the big endian form and the little endian form. In the big endian form, the high-order byte is stored at the lowest address in the storage area. For example, when the hexadecimal number “12345678” is stored in memory in the big endian form, the values “12”, “34”, “56”, and “78” are stored in order from the lowest address. In the little endian form, on the other hand, the low-order byte is stored at the lowest address in the storage area. That is, in the above-mentioned example, the values “78”, “56”, “34”, and “12” are stored in order from the lowest address.
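The two byte orders can be illustrated with a short Python sketch. It is given only as an illustration of the example above and is not part of the device described; the variable names are chosen here for convenience.

```python
# Illustration only: byte order of the 4-byte value 0x12345678.
value = 0x12345678

# Big endian: the high-order byte is placed at the lowest address (index 0).
big = value.to_bytes(4, byteorder="big")

# Little endian: the low-order byte is placed at the lowest address (index 0).
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # bytes listed from the lowest address upward
print(little.hex(" "))
```

Index 0 of each `bytes` object corresponds to the lowest address in the storage area.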
Some of the currently marketed data processing devices support both of the above-mentioned endians in accessing the internal cache memory. With such a data processing device, the memory is accessed in the little endian form, for example, in the following cases.
    1. When memory access in the little endian form is set in the register in which the operation state of a processor is set.
    2. When an instruction to access memory in the little endian form is described in a program executed in the data processing device.
    3. When access to memory in the big endian form is specified in 1 and 2 above, and switching of the endian is specified in the address conversion buffer (also referred to as a TLB (translation lookaside buffer)) provided for managing addresses of the cache memory.
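The three cases above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the logic of any particular processor: the function name and parameters are hypothetical, and the treatment of the TLB indication as an inversion in both directions is an assumption, since the text above states only that a big endian access is switched to little endian.

```python
def effective_endian(register_little: bool,
                     instruction_little: bool,
                     tlb_switch: bool) -> str:
    """Return "little" or "big" for one memory access (hypothetical model)."""
    # Cases 1 and 2: little endian is requested by the processor state
    # register or by the instruction itself.
    requested_little = register_little or instruction_little

    # Case 3: the TLB entry specifies switching the endian.
    # ASSUMPTION: the switch inverts the requested endian in both directions;
    # the source only describes big endian being switched to little endian.
    little = requested_little != tlb_switch
    return "little" if little else "big"
```

Because `tlb_switch` is known only after the TLB has been searched, the result of this function is not available at the time the store data leaves the arithmetic unit.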
Because case 3 above must be considered, the endian to be adopted in aligning data for storage in the cache memory cannot be determined until the retrieval result from the TLB is obtained.
On the other hand, the data to be stored (store data) in the cache memory is issued immediately after the completion of the arithmetic process in the arithmetic unit. Normally, however, the store data is temporarily held in the store buffer memory before being stored in the cache memory. Since the store data is written to the store buffer memory independently of the retrieval in the TLB, a data processing device that supports both endians normally cannot select an endian at the time of storage in the store buffer memory.
Described below is the configuration shown in FIG. 1. FIG. 1 shows the configuration of the conventional data processing device. In FIG. 1, the device includes: an arithmetic unit 101; a store buffer 102; data selectors 103, 107, and 108; alignment circuits 104 and 109; cache memory 105; and a buffer 106. The data processing device shown in FIG. 1 supports both endians described above in accessing the cache memory 105.
When a request to store data is issued, the data obtained as a result of the arithmetic process performed by the arithmetic unit 101 is first stored directly in the store buffer 102. The data stored in the store buffer 102 is selected and read by the data selector 103, and then stored in the cache memory 105. The cache memory 105 is provided in the data processing device, and is larger in storage capacity than the store buffer 102. When the data read from the store buffer 102 is stored in the cache memory 105, the alignment circuit 104 realigns the data.
The alignment circuit 104 realigns the store data in byte units based on the store length, which indicates the word length of the data stored in the cache memory 105; the alignment code, which indicates whether the data is right-adjusted or left-adjusted in the storage area when stored in the cache memory 105; and the above-mentioned endian. Assuming that the word length of the data in the data processing device is 8 bytes, the alignment circuit 104 must have a circuit configuration in which 8-way selection is possible for each byte. That is, each of the 0-th through 7-th bytes output from the alignment circuit 104 must be selectable from all 8 bytes (the 0-th through 7-th bytes) input into the alignment circuit 104.
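The 8-way selection for each byte can be modeled in software as one source index per output byte. The following is a minimal sketch, not the circuit of FIG. 1: the names `build_selects` and `align` are hypothetical, the input value is assumed to be right-adjusted in big endian order within the 8-byte word, and output bytes outside the store length are assumed to pass through unchanged.

```python
WORD_BYTES = 8  # word length assumed in the description above

def build_selects(store_length: int, left_adjust: bool,
                  little_endian: bool) -> list:
    """Return, for each output byte 0..7, the index of the input byte it selects."""
    if not 1 <= store_length <= WORD_BYTES:
        raise ValueError("store_length must be between 1 and 8")
    # ASSUMPTION: output bytes outside the stored data select the input byte
    # at the same position (identity).
    selects = list(range(WORD_BYTES))
    # ASSUMPTION: the input value occupies the last store_length bytes of the
    # input word, high-order byte first.
    src_base = WORD_BYTES - store_length
    dst_base = 0 if left_adjust else WORD_BYTES - store_length
    for k in range(store_length):
        if little_endian:
            src = WORD_BYTES - 1 - k   # low-order byte goes to the lowest position
        else:
            src = src_base + k         # high-order byte goes to the lowest position
        selects[dst_base + k] = src
    return selects

def align(word: bytes, selects: list) -> bytes:
    """Apply the per-byte selection: each output byte picks one of the 8 input bytes."""
    if len(word) != WORD_BYTES or len(selects) != WORD_BYTES:
        raise ValueError("word and selects must both have 8 entries")
    return bytes(word[s] for s in selects)
```

For example, `align(bytes.fromhex("0000000012345678"), build_selects(4, False, True))` places the 4-byte value right-adjusted in little endian order. Since any output byte may need any input byte depending on the store length, the alignment code, and the endian, a hardware equivalent needs an 8-input selector for each of the 8 output bytes.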
When the data stored in the cache memory 105 is fetched at a fetch request issued after a request to store data, the data is read from the cache memory 105 and temporarily stored in the buffer 106. The stored data is then selected and read by the data selector 108, realigned in byte units by the alignment circuit 109 in the same manner as by the alignment circuit 104, and input into the arithmetic unit 101. At this time, if the data to be processed at the fetch request has not yet been stored in the cache memory 105, the target data is selected by the data selector 107, read from the store buffer 102, and input into the arithmetic unit 101 through the data selector 108 and the alignment circuit 109.
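The store and fetch paths described above can be sketched as a simple software model. This is only an illustrative sketch: the class and method names are hypothetical, addresses are treated as whole-entry keys, and the realignment performed by the alignment circuits 104 and 109 is deliberately not modeled.

```python
class StoreFetchModel:
    """Hypothetical model of the store buffer 102 and the cache memory 105."""

    def __init__(self):
        self.store_buffer = []  # pending (address, data) pairs, oldest first
        self.cache = {}         # address -> data already written to the cache

    def store(self, address, data):
        # Store data from the arithmetic unit goes to the store buffer first.
        self.store_buffer.append((address, data))

    def drain_one(self):
        # Move the oldest pending store from the store buffer into the cache.
        if self.store_buffer:
            address, data = self.store_buffer.pop(0)
            self.cache[address] = data

    def fetch(self, address):
        # If the data has not yet reached the cache, read it from the store
        # buffer (the role of the data selector 107), newest entry first.
        for pending_address, data in reversed(self.store_buffer):
            if pending_address == address:
                return data, "store_buffer"
        # Otherwise read it from the cache; raises KeyError if absent.
        return self.cache[address], "cache"
```

A fetch issued immediately after a store is served from the store buffer; once the entry has been drained, the same fetch is served from the cache.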
As described above, with the configuration shown in FIG. 1, the store data is realigned in the endian order by the alignment circuit 104 provided immediately before the cache memory 105. With this configuration, the store data can be realigned after the retrieval result from the TLB is obtained.
Conventionally, as described above, the data to be stored in the cache memory 105 has been collectively aligned, immediately before being stored, by the alignment circuit 104 directly connected to the cache memory 105. Therefore, the alignment circuit 104 is complicated in circuit configuration, and the circuit is very large. On the other hand, the requirement for the delay time allowed when data is written to the cache memory 105, and the requirement for the variations in delay time among data simultaneously written to the cache memory 105, are normally strict. Because the alignment circuit 104 is directly connected to the cache memory 105, these requirements are imposed on the alignment circuit 104.