The rapid development of the mobile Internet, cloud computing, and the Internet of Things has been accompanied by a constant increase in data volume and computational complexity in engineering applications and scientific computing, and the computing capability of a general-purpose central processing unit (CPU) can hardly satisfy the computing requirements in these fields. To satisfy the computing requirements of algorithms in these fields, various accelerators have emerged and are widely applied in many computing fields. An accelerator is a dedicated hardware device used to perform a particular function. It is a part of a general-purpose computing system and cannot operate without a general-purpose processor. The accelerator directly targets an application algorithm, uses a dedicated processor architecture, can strike a good balance among the performance, area, and power consumption of a processor, and may be considered a special processor. Compared with a conventional general-purpose processor, the accelerator has significant advantages. For example, in terms of both computing performance and storage bandwidth, the accelerator is far better than the general-purpose processor.
Heterogeneous convergence of “general-purpose processor + dedicated accelerator” is a development direction of application-driven processor architectures. On the one hand, the general-purpose processor can handle scalar computing and provide a general computing capability, so that the heterogeneous system is applicable to various application fields. On the other hand, the dedicated accelerator can provide strong computing performance for applications in particular fields, so that the heterogeneous system achieves high performance with relatively low energy consumption.
However, for applications in different fields, whether on the general-purpose processor or on the dedicated accelerator, the actual performance of the processor is far lower than its peak performance, and is usually lower than 50% of the peak performance. In terms of hardware, the computing and storage performance of the processor or the accelerator continues to improve. In terms of software, how to efficiently utilize the computing capability of the processor or the accelerator while reducing the programming burden on a programmer has become a key problem in program development.
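As an illustration only, the ratio of actual performance to peak performance described above may be expressed as a simple quotient. The following is a minimal sketch: the function name `utilization` and the numeric figures are hypothetical, chosen for illustration, and are not taken from any measurement.

```python
def utilization(achieved_flops: float, peak_flops: float) -> float:
    """Return achieved throughput as a fraction of peak throughput."""
    if peak_flops <= 0:
        raise ValueError("peak_flops must be positive")
    return achieved_flops / peak_flops


# Hypothetical figures, assumed only for this example.
peak = 1.0e12      # theoretical peak throughput, in FLOP/s
achieved = 3.5e11  # throughput observed on some workload, in FLOP/s

ratio = utilization(achieved, peak)
print(f"utilization: {ratio:.0%}")
```

A ratio below 0.5 corresponds to the case described above, in which less than half of the peak computing capability of the hardware is actually used.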
A programming model mainly addresses the foregoing problem. The programming model is an abstraction of a computer hardware system structure, and establishes a relationship between the computer hardware system structure and an application programming interface, so that an application program can be executed in a computer hardware system. The programming model focuses on improving program performance, development efficiency, and scalability to other system designs. A higher level of hardware abstraction in a programming model means a smaller programming burden on a programmer and higher compilation complexity. Therefore, the programming model directly affects the hardware utilization of the processor.
High performance is the main objective of the dedicated accelerator. The architecture of the dedicated accelerator is usually closely coupled to the algorithms of a particular field, and its instruction set is extremely complex. This has two main consequences. First, the complex and special instructions in the dedicated accelerator cannot be expressed as logical combinations of basic arithmetic instructions, and consequently code written in a high-level language cannot be mapped directly to these instructions by using a compilation technology. Second, the architectures of accelerators oriented towards different application algorithms differ greatly, so a compiler needs to be modified for each processor instruction set structure, which requires a massive amount of work. Consequently, programs developed for the dedicated accelerator are usually optimized manually, and development efficiency is extremely low.