Vector operations, including matrix operations, are widely used in fields such as machine learning, pattern recognition, image processing, and graph computation. For example, deep learning algorithms developed in recent years offer high recognition accuracy and good parallelizability.
On the one hand, a vector read instruction may be followed by a vector write instruction whose accessed address range overlaps that of the read instruction, i.e., a write-after-read conflict; on the other hand, a vector write instruction may be followed by a vector read instruction whose accessed address range overlaps that of the write instruction, i.e., a read-after-write conflict.
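The two conflicts described above can be illustrated with a minimal sketch. It assumes that each vector access covers a contiguous range of addresses given by a start address and a length; the names `VectorAccess`, `overlaps`, and `classify_conflict` are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VectorAccess:
    """One vector memory access (hypothetical model for illustration)."""
    kind: str    # "read" or "write"
    start: int   # address of the first element accessed
    length: int  # number of consecutive addresses accessed


def overlaps(a: VectorAccess, b: VectorAccess) -> bool:
    """Return True if the half-open ranges [start, start + length) intersect."""
    if a.length <= 0 or b.length <= 0:
        return False  # an empty access cannot overlap anything
    return a.start < b.start + b.length and b.start < a.start + a.length


def classify_conflict(earlier: VectorAccess, later: VectorAccess) -> Optional[str]:
    """Classify the conflict between an earlier and a later instruction.

    Only the two conflicts named in the text are covered:
    write-after-read ("WAR") and read-after-write ("RAW").
    """
    if not overlaps(earlier, later):
        return None
    if earlier.kind == "read" and later.kind == "write":
        return "WAR"
    if earlier.kind == "write" and later.kind == "read":
        return "RAW"
    return None


if __name__ == "__main__":
    read_a = VectorAccess("read", start=0, length=8)
    write_b = VectorAccess("write", start=4, length=8)
    print(classify_conflict(read_a, write_b))  # read then overlapping write
    print(classify_conflict(write_b, read_a))  # write then overlapping read
```

In this sketch, a later instruction that conflicts with an earlier one would have to wait until the earlier instruction completes; how that ordering is enforced is outside the scope of the example.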
Conventional graphics processors can support various complex vector operations by executing general-purpose SIMD (Single Instruction Multiple Data stream) instructions with a general-purpose register file and a general-purpose stream processing unit. However, conventional graphics processors mainly execute graphics and image operations; they cache the read data in a large number of additional registers and perform reading and computation in parallel through a large number of computing components. Therefore, RAM (Random Access Memory) supporting a plurality of read-write channels is preferred, but the size of the on-chip RAM is limited. Meanwhile, when the number of computing components is limited and a large amount of data must be processed, a large number of instructions are still required, which increases the overhead for front-end encoding.
General-purpose processors use SISD (Single Instruction Single Data stream) instructions for complex vector operations. As a result, a large number of instructions are required to process vector operations. Similarly, when read-after-write and write-after-read conflicts are handled, more register groups are required to record the relevant information, which greatly increases the overhead for vector operations.
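The difference in instruction count can be made concrete with a rough counting model. This is a simplified sketch, not a description of any particular processor: it assumes an element-wise vector addition that costs four instructions per step (two loads, one add, one store), ignores loop-control overhead, and assumes a hypothetical vector instruction that handles `vector_length` elements at once.

```python
OPS_PER_STEP = 4  # assumed: load a, load b, add, store


def scalar_instruction_count(n: int) -> int:
    """Instructions needed when each instruction handles one element (SISD)."""
    return OPS_PER_STEP * n


def vector_instruction_count(n: int, vector_length: int) -> int:
    """Instructions needed when each instruction handles vector_length elements."""
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")
    steps = (n + vector_length - 1) // vector_length  # ceiling division
    return OPS_PER_STEP * steps


if __name__ == "__main__":
    n = 1024
    print(scalar_instruction_count(n))      # 4 * 1024 = 4096
    print(vector_instruction_count(n, 64))  # 4 * 16 = 64
```

Under these assumptions, adding two 1024-element vectors takes 4096 scalar instructions but only 64 vector instructions of length 64, which is the kind of gap the paragraph above refers to.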