Field of the Invention
The present invention is concerned with a data processing apparatus configured to perform an element comparison process between a first vector of data elements and a second vector of data elements. In particular, the present invention is concerned with generating a hazard vector indicative of matches found by the element comparison process.
Description of the Prior Art
It is known to provide data processing apparatus comprising a comparison unit which is configured to perform an element comparison process between a first vector of data elements and a second vector of data elements. This may for example be done in the situation where the first and second vectors are vectors of memory addresses representing the storage locations of data elements on which a set of data processing operations are to be performed. In order to determine whether this set of data processing operations can be performed in parallel with one another (vector processing), an initial comparison process can be performed on the two vectors of memory addresses to determine whether any memory address occurs in both vectors, in which case the parallel performance of the set of data processing operations could result in a data hazard condition.
For example, US Patent Application Publication 2008/0288754 A1 describes an instruction (“CheckHazard”) which can compare two vectors of memory addresses to detect if there are one or more critical memory hazards between memory items referenced by the elements of each vector. Similar disclosures are made by the commonly assigned US Patent Application Publications 2008/0288744, 2008/0288745 and 2008/0288759. Whilst the CheckHazard instruction enables memory-carried dependencies between the two vectors of addresses to be determined, this is done by comparing each address in one vector with all addresses at lower vector indices in the other vector, which may require a significant number of comparison operations. For example, for 8-element vectors, this comprises 28 comparisons (M*(M−1)/2 = 28 where M = 8).
The operation of the CheckHazard instruction is schematically illustrated in FIGS. 1A, 1B and 1C. FIG. 1A illustrates the comparison of two 8-element vectors A and B, the elements of which are memory addresses which will be used in a subsequent set of data processing operations. In order to determine if the set of data processing operations on the respective elements of vector A and vector B can be carried out in parallel with one another, the CheckHazard instruction is executed, taking vector A and vector B as its input and generating a result vector R. As shown in FIG. 1A, the occurrences of memory addresses 101, 103 and 105 in both of vectors A and B result in these matches being indicated within the result vector R. For example, the “3” in the fourth index position of vector R indicates that a matching memory address occurs for the fourth index position of vector A at the third index position of vector B. Based on the result vector R, the vectors A and B can be subdivided into partitions, which define subsets of the two vectors which can be processed in parallel without a data hazard occurring.
FIG. 1B shows a pseudo-code example implementing the CheckHazard procedure. The 28 comparisons performed for the index positions X and Y of 8-element vectors A and B are shown in FIG. 1C.
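By way of illustration only, the comparison described above may be sketched in Python as follows. This sketch is not the pseudo-code of FIG. 1B. The function name check_hazard, the use of 0 to indicate "no match", the choice to record the highest matching lower index where several exist, and the example addresses are all assumptions made for the purpose of the illustration.

```python
def check_hazard(a, b):
    """Compare each element of a with the elements of b at lower indices.

    Returns (result, comparisons). result[x] holds the 1-based index
    position in b of a matching address, or 0 if none was found
    (the 0 sentinel is an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    result = [0] * len(a)
    comparisons = 0
    for x in range(len(a)):
        for y in range(x):  # only lower index positions of b
            comparisons += 1
            if a[x] == b[y]:
                # Later matches overwrite earlier ones, so the highest
                # matching lower index is kept (an assumption).
                result[x] = y + 1
    return result, comparisons


# Hypothetical addresses, not those of FIG. 1A: the fourth element of
# vector_a matches the third element of vector_b.
vector_a = [10, 11, 12, 13, 14, 15, 16, 17]
vector_b = [20, 21, 13, 23, 24, 25, 26, 27]
result, comparisons = check_hazard(vector_a, vector_b)
print(result)       # [0, 0, 0, 3, 0, 0, 0, 0]
print(comparisons)  # 28
```

For these hypothetical 8-element vectors the sketch performs 28 comparisons, in agreement with the count given above, and places a "3" in the fourth index position of the result.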
Whilst it is beneficial to be able to perform an element comparison process on two vectors, such as by means of the CheckHazard instruction, the number of comparison operations which must be carried out can be prohibitively large. This is because the number of comparisons grows quadratically with the vector length: as mentioned above, 28 comparisons are necessary for a pair of 8-element vectors, whilst 120 comparisons are necessary for a pair of 16-element vectors. The potentially large number of comparison operations means that the system designer is faced with a choice between providing a large number of comparators, or providing a smaller number of comparators which must iteratively perform the comparisons over many cycles.
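The trade-off between the number of comparators and the number of cycles can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that each comparator performs one comparison per cycle and that the comparisons can be distributed evenly across the comparators.

```python
import math


def comparison_count(m):
    """Number of comparisons for a pair of m-element vectors: M*(M-1)/2."""
    return m * (m - 1) // 2


def cycles_needed(m, comparators):
    """Cycles required if each comparator performs one comparison per
    cycle (an assumption of this sketch)."""
    return math.ceil(comparison_count(m) / comparators)


print(comparison_count(8))   # 28
print(comparison_count(16))  # 120
print(cycles_needed(16, 8))  # 15
```

Under these assumptions, doubling the vector length from 8 to 16 elements more than quadruples the number of comparisons, so that either the number of comparators or the number of cycles must grow correspondingly.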
Whichever of the multiple comparator approach and the multiple cycle approach the system designer chooses, scaling the comparison process to operate on longer vectors can become difficult. Accordingly, it would be desirable to provide an improved technique for enabling such comparison operations to be carried out.