1. Field of the Invention
This invention relates to the field of data processing systems. More particularly, this invention relates to watchpointing within data processing systems whereby memory accesses to particular memory addresses being watched are detected.
2. Description of the Prior Art
It is known in the field of data processing systems to provide watchpointing mechanisms whereby a user can configure the system to detect when one or more particular memory addresses are subject to a memory access. This type of mechanism is particularly useful when debugging software or hardware and when analysing the operation of a system. One known mechanism for performing such watchpointing is illustrated in FIGS. 1 and 2 of the accompanying drawings.
FIG. 1 shows a sequence of memory addresses 2 with the memory address for each byte being expressed with the upper 30 bits within square brackets and the two least significant bits separately enumerated. As is common within data processing systems for a variety of reasons, the memory accesses provided in the illustration of FIG. 1 are aligned memory accesses; that is, the address of the first byte accessed is a multiple of the number of bytes accessed. The individual bytes of memory are arranged into four byte words, each byte being eight bits in length and hence each word being 32 bits in length. The boundaries 4, 6 between words are shown by double lines in FIG. 1. Since the largest memory access permitted is a four byte word access, and all accesses are aligned, it follows that memory accesses cannot span a boundary between words, i.e. memory accesses are aligned with the word boundaries.
In the example illustrated in FIG. 1, the four bytes marked with a “*” are to be read. This is achieved by performing a first memory access which includes the byte 8 followed by a second memory access which includes the bytes 10, 12, 14. It will be appreciated that when fewer than the full four bytes within a word are actually required, the other bytes may be returned in the aligned access, but simply ignored. It is also possible that only the required bytes may be returned. In either case, the memory access will have an associated length corresponding to the number of bytes within that access that are actually required.
FIG. 1 illustrates that byte 12 is subject to watchpointing. The memory address of this byte is [A]:0:1. As previously explained, [A] represents the upper 30 bits of the address, with the remaining bits of the address being separately enumerated. The lower two bits can be regarded as specifying a particular byte lane within the aligned memory word spanning between the word boundaries 4, 6.
FIG. 2A illustrates a watchpoint comparator which may be used to detect memory accesses to byte 12 of FIG. 1. A request for a memory access will be generated, typically by hardware such as a processor core, a DSP unit, a peripheral device or some other bus master, and issued to the memory system. The memory access specifies a memory address HADDR, which is the start address of the memory access, together with the number of bytes HSIZE for that memory access. In the system illustrated in FIG. 2A, HSIZE may take the values 1, 2 or 4. HSIZE is typically encoded in a few bits, for example as log2(HSIZE); that is, a 2-bit binary value can be used with “00” encoding HSIZE=1 (2^0), “01” encoding HSIZE=2 (2^1) and “10” encoding HSIZE=4 (2^2). In the case of aligned memory accesses, HADDR must be aligned to HSIZE, that is, HADDR is a multiple of HSIZE, and as such the access will not span a word boundary 4, 6. However, for unaligned accesses HADDR is not aligned to HSIZE and as such the access may span a word boundary.
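By way of illustration only, the size encoding and the alignment rule described above may be sketched in software as follows. The function names `encode_hsize`, `is_aligned` and `spans_word_boundary` are hypothetical and do not appear in the figures; the sketch assumes four byte words as in FIG. 1.

```python
def encode_hsize(hsize: int) -> int:
    """Return log2(hsize) as a 2-bit encoding; hsize must be 1, 2 or 4."""
    if hsize not in (1, 2, 4):
        raise ValueError("HSIZE must be 1, 2 or 4 bytes")
    return hsize.bit_length() - 1  # 1 -> 0b00, 2 -> 0b01, 4 -> 0b10


def is_aligned(haddr: int, hsize: int) -> bool:
    """An access is aligned when HADDR is a multiple of HSIZE."""
    return haddr % hsize == 0


def spans_word_boundary(haddr: int, hsize: int) -> bool:
    """True when the first and last bytes accessed lie in different 4-byte words."""
    first_word = haddr >> 2
    last_word = (haddr + hsize - 1) >> 2
    return first_word != last_word
```

Under these assumptions an aligned access never spans a word boundary, whereas an unaligned four byte access always does.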
The upper 30 bits HADDR[31:2] of the memory access address are stored within a register 16. These are compared with corresponding bits stored as a watchpoint address COMP[31:2] within a register 18. An XNOR gate 20 indicates if the contents of the register 16 and the contents of the register 18 match. The byte lane(s) of the one or more bytes of data being accessed are compared with the byte lane of the watchpoint within the byte lane portion 22 of the watchpoint comparator 15. In particular, the two least significant bits of the memory access address HADDR[1:0] and the access length HSIZE are used by a mask generator 24 to generate a 4-bit mask value in which the bit values for the non-accessed byte lanes are zero and the bit values for the accessed byte lanes are one. For aligned accesses, as these do not span a word boundary, this mask value represents all the bytes of memory accessed by the memory access. FIG. 2B shows how the mask generator 24 generates the byte lane mask, HBL[3:0]. HBL[0] corresponds to the first byte of the word, HBL[1] to the second byte, and so on. FIG. 2B includes entries for generating byte lane masks for unaligned accesses. These are not relevant in a system that only supports aligned accesses.
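As a non-authoritative sketch, the behaviour of the mask generator 24 may be modelled as below. The contents of FIG. 2B are not reproduced in the text, so the handling of unaligned accesses here is an assumption: lanes that would fall beyond the end of the word addressed by HADDR[31:2] are simply dropped. The function name `access_byte_lane_mask` is hypothetical.

```python
def access_byte_lane_mask(haddr_low: int, hsize: int) -> int:
    """Model HBL[3:0]: bit i is one when byte lane i of the word is accessed.

    haddr_low is HADDR[1:0] and hsize is the access length in bytes.
    Assumption: bytes past the end of the word are not represented.
    """
    mask = 0
    for lane in range(haddr_low, min(haddr_low + hsize, 4)):
        mask |= 1 << lane  # HBL[0] is the first byte of the word
    return mask
```

For example, an aligned halfword access to the upper half of a word gives a mask of binary 1100, and an aligned word access gives binary 1111.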
A corresponding byte lane mask is generated for the watchpoint address by mask generator 26 using the least significant two bits of the watchpoint address COMP[1:0] 27 together with a programmed watchpoint size value CSIZE 28. In the system illustrated in FIG. 2A, CSIZE may take the values 1, 2, or 4. CSIZE is typically encoded within register 28 in a small number of bits, for example as log2(CSIZE). The full watchpoint address, COMP[31:0] (that is, the compound of COMP[31:2] and COMP[1:0]) is required to be aligned to the value of CSIZE. Watchpointed locations are required to be aligned. FIG. 2C shows how the mask generator 26 generates the byte lane mask, CBL[3:0]. The unaligned cases in FIG. 2C show a byte lane generated of #UNP, meaning the output is not defined as this is not a permitted case. FIG. 2C also includes an entry for CSIZE>=4; this is not permitted in this embodiment but will be described in terms of other embodiments later.
If the outputs of the mask generators 24 and 26 have a common bit which is set to one in both generated masks, then the AND gates 30 and the OR gate 32 detect this and pass an output signal together with the output of the XNOR gate 20 to AND gate 34. The output from the AND gate 34 is a match signal which is asserted when the memory access specified by the memory access address HADDR[31:0] (that is, the compound of HADDR[31:2] and HADDR[1:0]) and HSIZE matches the watchpoint insofar as the top 30 bits of the address HADDR[31:2] match the top 30 bits of the watchpoint address COMP[31:2], and at least one of the watchpointed bytes within the aligned word, as indicated by COMP[1:0] and CSIZE, is being accessed. The match signal output from the AND gate 34 can be used for a variety of different purposes, such as, for example, triggering entry into a debug mode, triggering an interrupt, triggering the generation of trace or profiling information, or some other action.
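The comparison described above may be sketched, purely as an illustrative software model, as follows. The names `lane_mask` and `watchpoint_match` and the example address 0x1000 standing in for the word [A] are assumptions made for this sketch only.

```python
def lane_mask(addr_low: int, size: int) -> int:
    """Byte lane mask for `size` bytes starting at lane `addr_low` of a word."""
    mask = 0
    for lane in range(addr_low, min(addr_low + size, 4)):
        mask |= 1 << lane
    return mask


def watchpoint_match(haddr: int, hsize: int, comp: int, csize: int) -> bool:
    """Model the match signal of FIG. 2A for a watchpoint of 1, 2 or 4 bytes."""
    upper_equal = (haddr >> 2) == (comp >> 2)  # XNOR gate 20 on bits [31:2]
    hbl = lane_mask(haddr & 0b11, hsize)       # mask generator 24
    cbl = lane_mask(comp & 0b11, csize)        # mask generator 26
    lanes_overlap = (hbl & cbl) != 0           # AND gates 30 and OR gate 32
    return upper_equal and lanes_overlap       # AND gate 34


# Watchpoint on a single byte in lane 1 of the word at 0x1000 (hypothetical [A]).
WATCH = 0x1001
```

With this model, an aligned word or halfword access starting at 0x1000 matches the watchpoint, whereas a byte access to 0x1000 or a halfword access to 0x1002 does not.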
In other embodiments, the byte lane mask generated by mask generator 26 in respect of the watchpointed location would instead be explicitly defined in the watchpoint register as CBL[3:0]. Only COMP[31:2] therefore needs to be defined in the watchpoint register and COMP[1:0] and CSIZE are not defined in the watchpoint register.
In other embodiments, CSIZE can be extended to support watchpointed locations larger than 4 bytes. In this case CSIZE can take values ranging from 1 (2^0) byte to 2^31 bytes. CSIZE can advantageously be encoded as a 5-bit field which defines the number of bits of the address comparison to ignore, that is log2(CSIZE). The corresponding bits of COMP[31:0] must be programmed as zero. In such an embodiment, where CSIZE is greater than 4 (word), the function of XNOR comparator 20 in FIG. 2A is modified to not compare those low-order bits of HADDR specified by log2(CSIZE), and the function of mask generator 26 is modified such that it generates a mask with all bits set, as shown in FIG. 2C.
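A minimal sketch of the modified address comparison for CSIZE greater than 4 is given below, under the assumption that the 5-bit field holds log2(CSIZE) directly. The function name `address_match_extended` is hypothetical; because the byte lane mask is all ones in this case, only the address comparison is modelled.

```python
def address_match_extended(haddr: int, comp: int, log2_csize: int) -> bool:
    """Compare HADDR with COMP while ignoring the low log2_csize address bits."""
    if not 0 <= log2_csize <= 31:
        raise ValueError("log2(CSIZE) must fit in a 5-bit field")
    ignored = (1 << log2_csize) - 1
    if comp & ignored:
        raise ValueError("ignored bits of COMP must be programmed as zero")
    # Clear the ignored low-order bits of HADDR before comparing.
    return (haddr & ~ignored & 0xFFFFFFFF) == comp
```

For instance, with COMP set to 0x2000 and log2(CSIZE) equal to 4, any access whose start address lies in the 16 byte range 0x2000 to 0x200F matches.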
In other embodiments, CBL[3:0] and the extended CSIZE are both defined in the watchpoint register. In such an embodiment, the minimum value for CSIZE is 4 (word), indicating that bits [1:0] of the address are not to be compared by XNOR comparator 20; the CBL field defines smaller-than-word watchpoints. For CSIZE>=4, it is defined that CBL[3:0] must be programmed as all ones “1111”; for example, if CSIZE=8, indicating bits [2:0] are ignored, COMP[2] must be zero and CBL[3:0] must be all ones. It does not make sense to define sub-word matches (byte lanes) when an address mask of more than a word is defined.
This embodiment is more flexible than the previous embodiment because CBL[3:0] can, for example, be programmed as “0101”, setting a watchpoint on 2 bytes within a word which do not make up an aligned halfword. This is not possible where CSIZE and COMP[1:0] are used to generate the byte lane mask.
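The additional flexibility of an explicitly programmed CBL[3:0] may be illustrated by the following sketch, which assumes a word-sized watchpoint region and uses the hypothetical names `lane_mask` and `match_with_cbl`. The value “0101” is read as CBL[3]=0, CBL[2]=1, CBL[1]=0, CBL[0]=1, so byte lanes 0 and 2 are watched.

```python
def lane_mask(addr_low: int, size: int) -> int:
    """Byte lane mask for `size` bytes starting at lane `addr_low` of a word."""
    mask = 0
    for lane in range(addr_low, min(addr_low + size, 4)):
        mask |= 1 << lane
    return mask


def match_with_cbl(haddr: int, hsize: int, comp_word: int, cbl: int) -> bool:
    """Match using an explicit byte lane mask instead of COMP[1:0] and CSIZE."""
    upper_equal = (haddr >> 2) == (comp_word >> 2)
    return upper_equal and (lane_mask(haddr & 0b11, hsize) & cbl) != 0


CBL = 0b0101  # watch byte lanes 0 and 2 of the word at 0x1000 (example address)
```

Byte accesses to 0x1000 and 0x1002 match, whereas byte accesses to 0x1001 and 0x1003 do not, a pattern that no single aligned CSIZE of 1, 2 or 4 can express.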
FIG. 3 of the accompanying drawings illustrates a problem which arises when the approach of FIGS. 1 and 2 is extended to systems supporting unaligned memory accesses. Within such systems memory accesses can span word boundaries 36, 38. FIG. 3 illustrates six different 4 byte memory accesses starting from respective memory locations. Memory access 40 is entirely within the word boundaries 36, 38. Memory accesses 42, 44 and 46 all span the word boundary 38. Memory access 50 does not span the word boundary 38.
In FIG. 3, byte address A is subject to watchpointing. The system of FIG. 2, in which both the upper portion HADDR[31:2] and the byte lane must match in order to register a watchpoint hit, produces false negative results in respect of the memory accesses 42, 44 and 46 of FIG. 3. In the case of these three false negatives 42, 44, 46, the upper portion of the memory address HADDR[31:2] differs from that of the address A and accordingly the XNOR gate 20 of FIG. 2 produces a low output and the match signal is not generated. In FIG. 3, it is only the memory access 50 which triggers the match signal, since the upper portion of the address HADDR[31:2] and the byte lane both match. Memory access 48 matches in its upper portion, but not in its byte lane, and is a “true” negative.
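The false negatives described above can be reproduced with a simple software model. The sketch below assumes, for illustration only, that address A is 0x1004, the first byte of its word, and that the comparator behaves as the hypothetical `comparator_match` function; start addresses are used in place of the reference numerals of FIG. 3.

```python
def lane_mask(addr_low: int, size: int) -> int:
    """Byte lane mask for `size` bytes starting at lane `addr_low` of a word."""
    mask = 0
    for lane in range(addr_low, min(addr_low + size, 4)):
        mask |= 1 << lane
    return mask


def comparator_match(haddr: int, hsize: int, comp: int, csize: int) -> bool:
    """Aligned-access comparator: upper 30 bits equal and byte lanes overlap."""
    upper_equal = (haddr >> 2) == (comp >> 2)
    overlap = lane_mask(haddr & 0b11, hsize) & lane_mask(comp & 0b11, csize)
    return upper_equal and overlap != 0


def truly_accesses(haddr: int, hsize: int, watched: int) -> bool:
    """Ground truth: does the access actually touch the watched byte?"""
    return haddr <= watched < haddr + hsize


A = 0x1004  # assumed watched address, first byte after a word boundary
false_negatives = [
    start
    for start in range(A - 4, A + 2)  # six 4-byte accesses, as in FIG. 3
    if truly_accesses(start, 4, A) and not comparator_match(start, 4, A, 1)
]
```

Under these assumptions the three accesses starting at A-3, A-2 and A-1 all touch address A, yet none of them asserts the match signal because their upper 30 address bits differ from those of A.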
The consequences of the false negatives of FIG. 3 can be severe within a debugging or analysis system. For example, if a programmer is trying to determine which portion of the system is incorrectly writing to a particular memory address A then if the errant portion is making its inappropriate write by a memory access corresponding to one of the three false negatives, this will not be detected by the watchpoint comparator 15 programmed in respect of byte A.
FIG. 4 illustrates an example of a technique to address the problem of false negatives illustrated in respect of FIG. 3. In FIG. 4 both the original byte A and the immediately preceding byte A-1 are subject to watchpointing, the latter serving as a guard watchpoint. In this example, the memory accesses 40, 42, 44 and 46 will all trigger match signals from the guard watchpoint on address A-1. The match signals in respect of the memory accesses 42, 44 and 46 are correct in that those memory accesses 42, 44 and 46 do extend into the next word and access byte A, but the memory access 40 is a false positive in that whilst it triggers the guard watchpoint by virtue of accessing address A-1, it does not in fact extend beyond the word boundary and access address A. Thus, whilst the approach of FIG. 4 does not suffer from the problem of false negatives, it introduces its own problem of a false positive.
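The effect of the guard watchpoint may likewise be modelled in software. The following sketch again assumes that address A is 0x1004 and that each watchpoint comparator behaves as the hypothetical `comparator_match` function; a hit is reported when either the watchpoint on A or the guard watchpoint on A-1 matches.

```python
def lane_mask(addr_low: int, size: int) -> int:
    """Byte lane mask for `size` bytes starting at lane `addr_low` of a word."""
    mask = 0
    for lane in range(addr_low, min(addr_low + size, 4)):
        mask |= 1 << lane
    return mask


def comparator_match(haddr: int, hsize: int, comp: int, csize: int) -> bool:
    """Aligned-access comparator: upper 30 bits equal and byte lanes overlap."""
    upper_equal = (haddr >> 2) == (comp >> 2)
    overlap = lane_mask(haddr & 0b11, hsize) & lane_mask(comp & 0b11, csize)
    return upper_equal and overlap != 0


def guarded_match(haddr: int, hsize: int, watched: int) -> bool:
    """Hit when either the watchpoint on A or the guard on A-1 matches."""
    return (comparator_match(haddr, hsize, watched, 1)
            or comparator_match(haddr, hsize, watched - 1, 1))


def truly_accesses(haddr: int, hsize: int, watched: int) -> bool:
    """Ground truth: does the access actually touch the watched byte?"""
    return haddr <= watched < haddr + hsize


A = 0x1004  # assumed watched address, first byte after a word boundary
starts = range(A - 4, A + 2)
false_positives = [s for s in starts
                   if guarded_match(s, 4, A) and not truly_accesses(s, 4, A)]
false_negatives = [s for s in starts
                   if truly_accesses(s, 4, A) and not guarded_match(s, 4, A)]
```

In this model the false negatives disappear, but the aligned word access starting at A-4 is reported as a hit although it never touches address A.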
Whilst it would be possible for whatever mechanism was triggered by the match signal to conduct further analysis and identify false positives, such processing can be significantly disadvantageous. For example, a system may arise where memory access 40 is very common and correct in normal operation, with the erroneous accesses to the address A being highly infrequent. In this circumstance there would be a high number of false positives for each genuine access of interest to address A. Not only would this slow down the proper identification of a potential bug, it could also distort the processing activity by the repeated interruption to deal with false positives in a manner which masked or changed the errant behaviour.
The present technique both recognises the above problem and provides a solution to that problem.