1. Field of the Invention
The present invention relates to a data processing apparatus and method for performing speculative vector access operations.
2. Description of the Prior Art
One known technique for improving performance of a data processing apparatus is to provide circuitry to support execution of vector operations. Vector operations are performed on at least one vector operand, where each vector operand comprises a plurality of vector elements. Performance of the vector operation then involves applying an operation repeatedly across the various vector elements within the vector operand(s).
In typical data processing systems that support performance of vector operations, a vector register bank will be provided for storing the vector operands. Hence, by way of example, each vector register within a vector register bank may store a vector operand comprising a plurality of vector elements.
In high performance implementations, it is also known to provide vector processing circuitry (often referred to as SIMD (Single Instruction Multiple Data) processing circuitry) which can perform the required operation in parallel on the various vector elements within the vector operands. In an alternative embodiment, scalar processing circuitry can still be used to implement the vector operation, but in this instance the vector operation is implemented by iterative execution of an operation through the scalar processing circuitry, with each iteration operating on different vector elements of the vector operands.
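By way of illustration only, the two implementation styles described above can be modelled in software as follows. This is a minimal sketch in which Python lists stand in for vector registers; the function names are hypothetical, and the list comprehension merely stands in for parallel lanes rather than executing in parallel.

```python
# Illustrative model only: Python lists stand in for vector registers.


def vector_add(a, b):
    """Model of a vector operation on SIMD circuitry: the same operation is
    applied across every pair of vector elements."""
    if len(a) != len(b):
        raise ValueError("vector operands must have the same number of elements")
    return [x + y for x, y in zip(a, b)]


def vector_add_iterative(a, b):
    """Model of the same vector operation implemented through scalar
    circuitry, with each iteration operating on a different vector element."""
    if len(a) != len(b):
        raise ValueError("vector operands must have the same number of elements")
    result = [0] * len(a)
    for i in range(len(a)):
        result[i] = a[i] + b[i]
    return result
```

Both functions produce the same result vector for the same operands; they differ only in how the per-element work is scheduled.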
Through the use of vector operations, significant performance benefits can be realised when compared with the performance of an equivalent series of scalar operations.
One type of vector operation is a vector access operation, which may take the form of a vector load operation used to load at least one vector operand from cache/memory into the vector register bank, or a vector store operation used to store at least one vector operand from the vector register bank into the cache/memory (the cache/memory also being referred to herein as a data store).
In order to gain the performance benefits of vector processing, it is known to vectorise a series of scalar operations by replacing them with an equivalent series of vector operations. For example, for a loop containing a series of scalar instructions, it may be possible to vectorise that loop by replacing the series of scalar instructions with an equivalent series of vector instructions, with the vector operands containing, as vector elements, elements relating to different iterations of the original scalar loop.
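As a hedged illustration of such vectorisation, the following sketch shows a scalar loop with a predetermined number of iterations alongside a vectorised equivalent. The vector length of four elements, the arithmetic performed and all names are assumptions chosen purely for the example.

```python
VECTOR_LENGTH = 4  # assumed number of vector elements per vector operand


def scalar_loop(a, b):
    """Original scalar loop: one element is processed per iteration."""
    out = []
    for i in range(len(a)):
        out.append(a[i] * 2 + b[i])
    return out


def vectorised_loop(a, b):
    """Vectorised form: each pass handles up to VECTOR_LENGTH iterations of
    the original scalar loop as the elements of one vector operand."""
    out = []
    for base in range(0, len(a), VECTOR_LENGTH):
        va = a[base:base + VECTOR_LENGTH]       # models a vector load
        vb = b[base:base + VECTOR_LENGTH]       # models a vector load
        vr = [x * 2 + y for x, y in zip(va, vb)]  # models the vector arithmetic
        out.extend(vr)                          # models a vector store
    return out
```

Because the iteration count is known in advance, the number of vector elements needed in the final, possibly partial, vector operand is also known in advance.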
However, whilst such an approach can work well when the number of iterations required through the original scalar loop is predetermined, it is more difficult to vectorise such loops when the number of iterations is not predetermined. In particular, since the number of iterations is not predetermined, it cannot be predetermined how many vector elements will be required in each vector operand.
In some situations of the above type, it is possible to perform speculative vector processing, where a speculation is made as to the required number of vector elements, and remedial action is taken later when the exact number of vector elements required is determined. Considering the earlier-mentioned vector access operations, it is known to perform such speculation in association with vector load operations, since over-speculation merely results in surplus data being stored in the vector register bank, and that data can later be discarded as part of the remedial action once the exact number of vector elements required is determined. However, for vector store operations, such speculation is problematic, since vector store operations cause the contents of cache/memory to be updated, which may prevent the required remedial action from being taken.
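The asymmetry between speculative loads and speculative stores can be sketched as follows. This is an illustrative software model only: the zero-terminated copy loop, the vector length of four and all function names are assumptions, and the source list is assumed to contain a terminating zero.

```python
VECTOR_LENGTH = 4  # assumed number of vector elements per vector operand


def copy_until_zero_scalar(src):
    """Scalar loop whose iteration count is not predetermined: it runs until
    a zero element is read."""
    out = []
    i = 0
    while src[i] != 0:
        out.append(src[i])
        i += 1
    return out


def copy_until_zero_speculative(src):
    """Speculative form: a full vector is loaded before it is known how many
    of its elements are required; the surplus is then discarded."""
    out = []
    base = 0
    while True:
        loaded = src[base:base + VECTOR_LENGTH]  # speculative vector load
        if not loaded:
            raise ValueError("no terminating zero found in src")
        # Determine the exact number of vector elements actually required.
        needed = loaded.index(0) if 0 in loaded else len(loaded)
        out.extend(loaded[:needed])  # remedial action: drop surplus elements
        if needed < len(loaded):
            return out
        base += VECTOR_LENGTH


def naive_speculative_store(memory, base, values):
    """Naive speculative vector store: every element is written to memory,
    including any later found not to be required."""
    memory[base:base + len(values)] = values
```

In this model the over-speculated load leaves the source untouched, so discarding surplus elements is sufficient. The naive store, by contrast, overwrites the previous contents of memory, and nothing in the model retains those contents for later restoration.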
The Ph.D. thesis entitled “Vector Microprocessors” by K Asanovic, Berkeley, 1998, pp. 116-121, teaches that one limited approach to providing speculative memory loads is to provide a read-ahead buffer area after every memory segment. This read-ahead buffer would guarantee that reads to some region after a valid pointer would not cause address errors. However, this software technique only provides speculation for unit-stride and small-stride memory loads, and so it is not suitable for use when vectorising programs with more complex memory access patterns. Further, it does not enable speculative vector store operations to be performed.
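The limitation of the read-ahead buffer approach can be illustrated with the following sketch. It is not taken from the cited thesis: the buffer size of one vector length less one element, the use of an IndexError to model an address error, and all names are assumptions made for the example.

```python
VECTOR_LENGTH = 4  # assumed number of vector elements per vector operand


def allocate_with_read_ahead(data, pad=VECTOR_LENGTH - 1):
    """Model of a memory segment followed by a read-ahead buffer area
    (the pad size is an assumption made for this sketch)."""
    return list(data) + [0] * pad


def strided_load(memory, base, stride):
    """Model of a vector load. An IndexError models an address error when
    any element address falls outside the segment and its buffer area."""
    return [memory[base + i * stride] for i in range(VECTOR_LENGTH)]
```

With a five-element segment, a unit-stride load starting at the last valid element (index 4) reads only into the buffer area and succeeds, whereas the same load with a stride of three reaches index 10, beyond the buffer area, and faults.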
Accordingly, when loops of scalar instructions include one or more store instructions, and the number of iterations of the loop is not predetermined, it has traditionally been considered that such loops cannot be subjected to speculative vectorisation.