The present invention relates to the field of memories. More particularly, this invention relates to an apparatus and method that support non-aligned memory accesses.
Computer systems include memory which is organized as a number of words. Each word includes a collection of bits which can generally be accessed at the same time. For example, each word may be 16, 32, 64, 128, etc. bits wide. In big endian order, the bits can be numbered with the most significant bit on the left, and the least significant bit on the right. For example, a 32-bit wide word can be numbered from bit 31 (i.e., the most significant bit) on the left, to bit 0 (i.e., the least significant bit) on the right. Often, each word is divided into F uniform fields, with each field having B bits. A common width for each field is eight bits (i.e., B=8), which comprises a byte or an "octet". Thus, a 32-bit wide word is often organized as four fields (i.e., F=4) having eight bits (i.e., B=8) each. The fields are stored in memory from lower addresses to higher addresses. In big endian order, addresses of fields in a full-width word can be numbered starting with zero on the left, with increasing addresses to the right. With a word organized into F fields of B bits each, the alignment or offset of an address is defined as the remainder when the address is divided by F. When the alignment of the starting address of a word is zero, the word is aligned in memory, and any access (e.g., read or write) of that word is aligned. When, however, the alignment of the starting address is non-zero, the word is non-aligned, and any access of that word is non-aligned.
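By way of illustration only, the definition of alignment given above can be sketched in Python as follows. The function names `alignment` and `is_aligned` and the parameter name `fields_per_word` are hypothetical and are introduced solely for this sketch; the default of four fields per word is taken from the 32-bit example.

```python
def alignment(address: int, fields_per_word: int = 4) -> int:
    """Return the alignment (offset) of an address.

    The alignment is the remainder when the address is divided by F,
    the number of fields per word (hypothetical parameter name).
    """
    return address % fields_per_word


def is_aligned(address: int, fields_per_word: int = 4) -> bool:
    """An address is aligned when its alignment is zero."""
    return alignment(address, fields_per_word) == 0
```

With F=4, for example, `alignment(8)` evaluates to 0 (aligned), while `alignment(5)` evaluates to 1 (non-aligned).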
FIG. 1, for example, provides a graphical representation 100 of three 32-bit words stored in a memory. Each 32-bit word is divided into four fields (i.e., F=4) of eight bits (i.e., B=8) each, with the first, second, third and fourth fields including bits 31-24, 23-16, 15-8 and 7-0, respectively. In this example, the first word stores data ABCD from a starting address "a+0" to an ending address "a+3", the second word stores data EFGH from a starting address "a+4" to an ending address "a+7", and the third word stores data IJKL from a starting address "a+8" to an ending address "a+11". Thus, all three of these 32-bit words are stored with a 0 boundary alignment (i.e., a 0 offset) since 0 is the remainder when the starting addresses "a+0", "a+4" and "a+8" are divided by 4. Accordingly, these three words are aligned in memory, and any access will be an aligned access.
In contrast, assume that the 32-bit words stored in the memory are non-aligned. For example, assume that a first word stores the data BCDE from a starting address "a+1" to an ending address "a+4", and a second word stores the data FGHI from a starting address "a+5" to an ending address "a+8". In this case, both 32-bit words are stored with a "+1" boundary alignment (i.e., a "+1" offset) since 1 is the remainder when the starting addresses "a+1" and "a+5" are divided by 4. Thus, any access of either word will be a non-aligned access. Similarly, a first 32-bit word storing CDEF and a second 32-bit word storing GHIJ would be stored with a "+2" alignment since 2 is the remainder when starting addresses "a+2" and "a+6" are divided by 4, and a first 32-bit word storing DEFG and a second 32-bit word storing HIJK would be stored with a "+3" alignment since 3 is the remainder when starting addresses "a+3" and "a+7" are divided by 4. In these cases, an access of any of these non-aligned words would require a non-aligned memory access.
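The aligned and non-aligned words discussed above can be reproduced with a short Python sketch, offered as an illustration only. The sketch assumes that the base address "a" is 0 (or any multiple of 4) and models the memory as a byte string holding the data ABCDEFGHIJKL; the names `MEMORY` and `word_at` are hypothetical.

```python
F = 4  # fields per 32-bit word, as in the example above

# Fields at addresses a+0 .. a+11, with the base address a assumed to be 0.
MEMORY = b"ABCDEFGHIJKL"


def word_at(start: int) -> tuple[str, int]:
    """Return the F fields beginning at `start` and the alignment of `start`."""
    if start < 0 or start + F > len(MEMORY):
        raise IndexError("word lies outside the modeled memory")
    data = MEMORY[start:start + F].decode("ascii")
    return data, start % F
```

Under these assumptions, `word_at(1)` yields `("BCDE", 1)`, `word_at(6)` yields `("GHIJ", 2)`, and `word_at(3)` yields `("DEFG", 3)`, matching the "+1", "+2" and "+3" alignments described above, while `word_at(4)` yields `("EFGH", 0)`.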
Generally, memories support only aligned memory accesses in a single clock cycle, and are not configured to support non-aligned memory accesses. For example, in a memory organized as four fields (i.e., F=4) of eight bits (i.e., B=8) each, only accesses of aligned 32-bit words with starting addresses "a+0", "a+4", "a+8", etc. (i.e., starting addresses where the remainder of the address divided by 4 is 0) can take place in a single clock cycle, while accesses of non-aligned 32-bit words with starting addresses "a+1", "a+2", "a+3", "a+5", etc. (i.e., starting addresses where the remainder of the address divided by 4 is non-zero) are not supported. Thus, using the data of FIG. 1, accesses to the aligned 32-bit word storing ABCD can take place in a single clock cycle, while accesses to the non-aligned 32-bit words that store BCDE, CDEF or DEFG are not supported.
One solution to the problem of performing non-aligned memory accesses involves translating a single non-aligned memory access into two aligned accesses, and properly merging the results of the two aligned accesses. For example, a single access of the non-aligned 32-bit word that stores data BCDE starting at address "a+1" could be translated into a first aligned access starting at address "a+0" and a second aligned access starting at address "a+4", followed by a merger of the results of these two aligned accesses. This scheme, unfortunately, requires two aligned memory accesses plus additional processing, and cannot be completed in a single clock cycle. Thus, a non-aligned memory access performed using this scheme will take longer to complete than an aligned memory access.
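The conventional two-access scheme described above can be sketched in Python as follows, as a software illustration only and not as a description of any particular hardware. The sketch assumes a base address "a" of 0 and the data of FIG. 1; the names `MEMORY`, `aligned_read` and `unaligned_read` are hypothetical.

```python
F = 4  # fields per word

# Data of FIG. 1 at addresses a+0 .. a+11, with the base address a assumed to be 0.
MEMORY = b"ABCDEFGHIJKL"


def aligned_read(address: int) -> bytes:
    """Model a memory that supports only aligned, full-word reads."""
    if address % F != 0:
        raise ValueError("only aligned accesses are supported")
    if address < 0 or address + F > len(MEMORY):
        raise IndexError("word lies outside the modeled memory")
    return MEMORY[address:address + F]


def unaligned_read(address: int) -> bytes:
    """Translate one read into at most two aligned reads and merge the results."""
    offset = address % F        # alignment of the requested address
    base = address - offset     # aligned address at or below the request
    if offset == 0:
        return aligned_read(base)     # already aligned: a single access
    low = aligned_read(base)          # first aligned access
    high = aligned_read(base + F)     # second aligned access
    # Merge: the last F - offset fields of the first word, followed by
    # the first offset fields of the second word.
    return low[offset:] + high[:offset]
```

Under these assumptions, `unaligned_read(1)` performs aligned reads at addresses 0 and 4 and merges them into `b"BCDE"`. Every non-aligned request costs two calls to `aligned_read` plus the merge step, which mirrors the extra time the scheme described above requires.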
Therefore, it would be desirable to provide an apparatus and method that support non-aligned memory accesses without needing translation into multiple aligned accesses. Such an apparatus and method may be less complex and may be performed more quickly than the conventional solution described above for performing the non-aligned accesses.