1. Field of the Invention
This invention relates to the field of microprocessor-based computers and more particularly to a method and circuit for detecting potential limit violations (PLVs), the method and circuit resulting in faster detection of PLVs by generating a PLV signal in parallel with the generation of a definite limit violation (DLV) signal.
2. Description of the Relevant Art
In many microprocessor-based computing systems, including the popular X86-based microprocessors, more than one type of memory address is used. For example, X86 processors use three different memory address formats: physical addresses, linear addresses, and virtual addresses. Application software programs written for X86-based systems generally use virtual addresses to reference memory locations. Virtual addresses (also known as logical addresses) are addresses containing two parts: a base address and an offset from the base address. This two-part address must be translated, or mapped, into a physical memory address by an address translator.
Virtual addresses are useful because they enable the concept of virtual memory. Virtual memory refers to the ability of the software to reference more memory locations than are present in the system's physical memory. In a 32-bit based system, for example, the physical memory address space is equal to 2^32 bytes, or 4 gigabytes. This is the maximum amount of system memory that can be accessed by the microprocessor. In contrast, the virtual memory address space in a 386/486-type system is much larger. In such a system, the base address (also known as the selector) is a 14-bit number while the offset is a 32-bit number. The virtual address space, therefore, is 2^46 bytes, or 64 terabytes.
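The address-space figures above follow directly from the quoted field widths; a quick illustrative check (the variable names are ours):

```python
# Address-space sizes implied by the field widths quoted above:
# 32-bit physical addresses, and virtual addresses formed from a
# 14-bit selector plus a 32-bit offset.
physical_bytes = 2 ** 32          # 2^32 bytes of physical space
virtual_bytes = 2 ** (14 + 32)    # 2^46 bytes of virtual space

assert physical_bytes == 4 * 2 ** 30    # 4 gigabytes
assert virtual_bytes == 64 * 2 ** 40    # 64 terabytes
```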
Programming with virtual memory addresses is beneficial because the programmer is not constrained by the amount of physical memory on the user's system. When the computing system translates the virtual address into a linear or physical address, a check is performed to determine if the translated address is currently residing in the physical memory of the system. If the computing system determines that the translated address is not currently residing in the physical memory, the system retrieves the information associated with the translated address from a storage device, typically a hard disk, and stores the information into the system memory. Using this translation method, software programs can be written without regard to the amount of physical memory residing on the system that is executing the program. In addition, the use of virtual memory and address translation facilitates the protection of specified physical memory addresses. Imagine, for example, a system in which the operating system software resides in the first two megabytes of system memory. In such a system, it is desirable to restrict application programs from accessing these first two megabytes of physical memory. This result can be achieved by ensuring that no virtual addresses are mapped into physical addresses 0 through 2 megabytes (2^21).
The virtual address scheme further facilitates the protection of certain areas of physical memory by checking to ensure that every address referenced by the processor is within certain permissible boundaries. These boundaries are generally defined as a maximum offset from a particular base value. Virtual memory systems divide the address space into segments. The definition of each memory segment includes a limit value which, together with the base address (the address associated with the first memory location within the segment), defines the size of the segment and the range of permissible virtual addresses within that segment. Once the boundaries of a segment have been defined, virtual memory references to that segment may be easily checked to determine whether the referenced address is within the limits of the segment, simply by comparing the offset of the referenced location to the limit value. For this reason, address limit checking is typically done using the virtual representation of a memory address.
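The limit check just described can be sketched as follows. This is a simplified illustration only: the function name and return convention are ours, and details such as expand-down segments and access-rights checks are ignored.

```python
def segment_check(base, limit, offset):
    """Segment limit check on a virtual address (base, offset):
    the reference is permissible only if the offset does not
    exceed the segment's limit value."""
    if offset > limit:
        return None          # limit violation: reference is outside the segment
    return base + offset     # otherwise, the corresponding linear address
```

For example, a segment with limit 0xFF permits offset 0x10 but rejects offset 0x100.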
In microprocessor systems that utilize a cache memory array to enhance performance, limit checking becomes somewhat more complicated, partly because linear addresses, rather than virtual addresses, are commonly used to access the cache. In cached systems, not only must every memory reference be compared to a limit value, but each reference must also be compared to an array of addresses stored in the cache to determine if the information associated with the referenced address is currently residing in the cache. A cache memory is a high-speed memory unit interposed in the memory hierarchy of a computer system, between a relatively slow system memory and a central processing unit, to improve effective memory transfer rates and accordingly improve system performance. The name refers to the fact that the small cache memory unit is essentially hidden and appears transparent to the user, who is aware only of a larger system memory.
The cache is usually implemented with semiconductor memory devices, such as static RAMs, having speeds comparable to that of the processor, while the system memory utilizes less costly, lower-speed devices, such as dynamic RAMs. The cache concept anticipates the likely reuse by the microprocessor of selected data in system memory by storing a copy of the selected data in the cache memory. A cache memory typically includes a plurality of memory sections, wherein each memory section stores a block or line of two or more words of data. For example, an 8 Kbyte cache could be arranged as 512 lines wherein each line contains 16 bytes of information. Each line has an address tag associated with it. When the processor initiates an access to system memory, a comparison is made between the memory address and the array of address tags to determine whether a copy of the requested information resides in the cache memory. This address comparison is commonly made using an address format other than virtual (i.e., linear or physical).
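The tag comparison described above can be sketched for the example geometry, i.e., an 8 Kbyte direct-mapped cache of 512 lines of 16 bytes each. The helper names are ours, and real caches may of course be set-associative rather than direct-mapped.

```python
NUM_LINES, LINE_BYTES = 512, 16      # the 8 Kbyte example geometry

def split_address(addr):
    """Split an address into (tag, line index, byte offset)."""
    offset = addr % LINE_BYTES                  # byte within the 16-byte line
    index = (addr // LINE_BYTES) % NUM_LINES    # which of the 512 lines
    tag = addr // (LINE_BYTES * NUM_LINES)      # bits kept in the tag array
    return tag, index, offset

def cache_hit(tag_array, addr):
    """A hit occurs when the stored tag matches the address tag."""
    tag, index, _ = split_address(addr)
    return tag_array[index] == tag
```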
A problem exists in microprocessor-based computing systems utilizing cache memories because limit checking is typically accomplished by comparing virtual addresses while comparisons between a memory address and the tags of a cache memory array are typically done with linear or physical addresses. Because it is highly desirable to detect limit violations as early as possible, it would be desirable to effect a method for checking limit violations in which the limit detection is accomplished during the time when the processor is accessing the cache memory array.
Detecting limit violations typically includes the step of comparing the offset of a virtual memory address with a limit value. If the offset is greater than the limit value, a limit violation has occurred and the processor is informed so that it may take appropriate action. When combined with the utilization of a cache memory array, however, the limit checking process becomes more complex because of the address format distinction noted above, and further because each cache memory tag typically includes only the most significant bits of a memory address. A line of cache memory, as noted above, may contain multiple consecutive memory locations. The cache array tags, therefore, are generally required to contain only the most significant bits of the address field. For example, if the cache memory array is organized into 512 lines wherein each line contains 16 sequential bytes, then the tags need not contain the least significant four bits of the memory address. If a limit value falls in the middle of one of these 16-byte blocks, the processor will not be able to definitively determine whether a limit violation has occurred during the comparison with the cache array tags, because the tags are less precise than the limit addresses (i.e., the tags do not utilize the least significant bits of the limit value). As a result, a comparison of a requested memory address and the tag field of a cache memory array can produce three limit violation outcomes.
The first outcome, known as a definite limit violation (DLV), occurs when the most significant address bits of the requested memory address exceed the corresponding most significant bits of the limit value. In this case, a limit violation has definitely occurred because the address of the referenced memory location will exceed the limit value regardless of the values of the least significant bits. The CPU should be signaled so that it can take appropriate action. A second outcome occurs when the most significant bits of a requested memory address are less than the most significant bits of the logical limit value. In this case, a limit violation has definitely not occurred, for analogous reasons. The third situation, known as a potential limit violation (PLV), occurs when the most significant bits of the requested memory address are equal to the corresponding most significant bits of the limit value. When this condition occurs, it is not definitely known whether the requested memory address exceeds the limit value, and the processor must be so informed so that it can perform additional operations to determine if a limit violation has occurred.
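The three outcomes above can be sketched by comparing only the tag (most significant) bits, as a comparison against cache tags must. The sketch assumes 16-byte lines, so the four least significant bits are dropped; the names are ours.

```python
LINE_BITS = 4    # 16-byte lines: tags omit the 4 least significant bits

def tag_compare(offset, limit):
    """Classify a reference using only the tag bits of the offset
    and of the limit value."""
    tag_off, tag_lim = offset >> LINE_BITS, limit >> LINE_BITS
    if tag_off > tag_lim:
        return "DLV"     # violation regardless of the dropped low bits
    if tag_off < tag_lim:
        return "OK"      # definitely no violation
    return "PLV"         # equal tags: the low bits must still be examined
```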
FIG. 1 is a block diagram of a conventional circuit for generating potential and definite limit violation signals in a system in which the linear address format is used when accessing the cache. Linear address 4 is an n-bit signal that represents the linear address of a memory location. Limit checking is accomplished by comparing the offset of the linear address against a limit value. Therefore, to determine whether a given linear address represents a PLV or a DLV, it is necessary to convert the linear address to its virtual address equivalent for direct comparison with the logical limit. To convert linear address 4 to its virtual address, the base address must be subtracted from the linear address. To accomplish this task, a base address signal 6 is provided. Subtracting base address 6 from linear address 4 yields the virtual address offset of linear address 4. This offset can then be directly compared against logical limit 8. Thus, logical limit 8 is provided to the circuit so that it may be subtracted from the offset address.
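The conversion step can be written arithmetically; in n-bit hardware the subtraction is performed modulo 2^n. The function name is ours.

```python
def linear_to_offset(linear, base, n=32):
    """Recover the virtual-address offset of a linear address by
    subtracting the segment base, modulo 2^n as in n-bit hardware."""
    return (linear - base) & ((1 << n) - 1)
```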
Typically, linear address 4, complemented base address 6, and complemented logical limit 8 are routed to a full adder circuit 12. Full adder circuit 12 includes n 3-to-2 adders for combining the three inputs. As is well known in the field of digital logic, a full adder generates a sum bit and a carry bit that are dependent upon its inputs. Full adder circuit 12 comprises n full adders in parallel, and therefore produces a sum signal 16 comprising n sum bits and an n-bit carry signal 14.
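The 3-to-2 stage can be modeled bit-wise. The key property is that the three operands are compressed into a sum vector and a carry vector with no carry propagation between bit positions; the carry vector has weight two, i.e., it is shifted left by one before the final addition. The function name is ours.

```python
def full_adder_3to2(a, b, c, n):
    """One row of n independent full adders: compress three n-bit
    operands into per-bit sum and carry vectors."""
    mask = (1 << n) - 1
    s = (a ^ b ^ c) & mask                         # sum bit of each full adder
    carry = ((a & b) | (a & c) | (b & c)) & mask   # carry bit of each full adder
    return s, carry
```

The invariant a + b + c == s + 2*carry (mod 2^n) is what allows the carry lookahead adder to finish the addition from just the two output vectors.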
Sum signal 16 and carry signal 14 are routed to carry lookahead adder 18. Carry lookahead adders are well-known circuits for performing fast addition operations. A generalized carry lookahead adder is described in John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach (Morgan Kaufmann 1990), pp. A-32 through A-36. Carry lookahead adder 18 includes a generate and propagate bits circuit 20, a carry bits circuit 26, and a sum bits circuit 32. Generate and propagate bits circuit 20 receives carry signal 14 and sum signal 16 and produces propagate signal 22 and generate signal 24. Propagate signal 22 is referred to as p(n-1:0) and generate signal 24 as g(n-1:0), where p_i is equal to (carry_i) OR (sum_i) and g_i is equal to (carry_i) AND (sum_i). After generate and propagate bits circuit 20 has computed propagate signal 22 and generate signal 24, those signals are routed to carry bits circuit 26.
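The generate and propagate definitions above are one bit-parallel operation each, applied across the whole sum and carry vectors at once; the function name is ours.

```python
def generate_propagate(sum_v, carry_v):
    """p_i = carry_i OR sum_i; g_i = carry_i AND sum_i,
    computed for all n bits at once."""
    p = sum_v | carry_v    # propagate signal 22
    g = sum_v & carry_v    # generate signal 24
    return p, g
```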
As its name implies, carry bits circuit 26 is responsible for producing carry signal 30 in response to receipt of propagate signal 22 and generate signal 24. Carry signal 30 is then routed to sum bits circuit 32, where it is combined with carry signal 14 and sum signal 16 to produce result signal 34. Result signal 34 is simply the digital representation of linear address 4 minus base address 6 minus logical limit 8. In addition to carry signal 30, carry bits circuit 26 produces carry out signal 27. Carry out signal 27 is indicative of whether linear address 4 is greater than the sum of base address 6 and logical limit 8. If carry out signal 27 indicates that linear address 4 is greater than the sum of base address 6 and logical limit 8, then a DLV has occurred. If, on the other hand, carry out signal 27 indicates that linear address 4 is not greater than (i.e., is less than or equal to) the sum of base address 6 and logical limit 8, then no DLV has occurred. Accordingly, it is seen that carry out signal 27 can be used as a DLV signal.
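The recurrence realized by carry bits circuit 26 is c[i+1] = g[i] OR (p[i] AND c[i]). The sketch below evaluates it iteratively for clarity; a real carry lookahead adder evaluates these terms in parallel tree fashion, and the exact carry-in constant required for the two-fold subtraction is omitted here. The function name is ours.

```python
def carry_bits(p, g, n, c_in=0):
    """Evaluate c[i+1] = g[i] | (p[i] & c[i]) for i = 0..n-1.
    Returns (internal carries, final carry out); in FIG. 1 the
    final carry out plays the role of the DLV signal."""
    carries, c = [], c_in
    for i in range(n):
        c = ((g >> i) & 1) | (((p >> i) & 1) & c)
        carries.append(c)
    return carries, c
```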
Because carry out signal 27 is generated prior to result signal 34, the circuit shown in FIG. 1 indicates whether a DLV has occurred before it can indicate whether a PLV has occurred. Additional operations must be performed on carry signal 30, carry signal 14, and sum signal 16 before a PLV signal is available. In particular, carry signal 30, carry signal 14, and sum signal 16 must be operated upon by sum bits circuit 32 to obtain result signal 34. Result signal 34 is then routed to a comparator circuit 36, which determines whether each bit within result signal 34 is equal to 0 and generates output signal 38 accordingly. If each bit within result signal 34 is 0, then linear address 4 is equal to the sum of base address 6 and logical limit 8, meaning that a PLV has occurred. If, on the other hand, one or more of the bits within result signal 34 is equal to 1, then linear address 4 is not equal to the sum of base address 6 and logical limit 8, and therefore no PLV has occurred. It can thus be seen that output signal 38 of comparator circuit 36 is indicative of whether a PLV has occurred.
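Putting the pieces together, a behavioral (not gate-level) model of the FIG. 1 outputs shows why the PLV decision lags the DLV decision: the DLV corresponds to a single carry out, while the PLV requires the fully formed result followed by a zero detect. The names are ours, and the operands are assumed small enough that modular wraparound does not occur.

```python
def limit_check(linear, base, limit, n=32):
    """Behavioral model of the FIG. 1 outputs: `result` plays the
    role of result signal 34, `dlv` of carry out signal 27, and
    `plv` of comparator output signal 38."""
    mask = (1 << n) - 1
    result = (linear - base - limit) & mask   # linear - base - limit, mod 2^n
    dlv = linear > base + limit               # definite limit violation
    plv = result == 0                         # potential limit violation
    return dlv, plv
```

For example, with base 0x1000 and limit 0x0FFF, a linear address of 0x2000 is a DLV, 0x1FFF is a PLV, and 0x1500 is neither.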
It will be appreciated by one skilled in the art that the circuit shown in FIG. 1 generates DLV signal 27 prior to the time that it generates PLV signal 38. Because the computer system requires both signals to fully determine whether a limit violation has occurred, the system must await the generation of PLV signal 38 before it can resume processing. The time that elapses between the generation of DLV signal 27 and the generation of PLV signal 38 therefore represents a limitation on system performance. It is accordingly highly desirable to minimize, or eliminate entirely, the delay between the generation of DLV signal 27 and PLV signal 38.