Embodiments of the present invention relate to a processor-based system, and more particularly to a system including multiple sequencers of different instruction set architectures.
Computer systems include various components to process and communicate data. Typical systems include one or multiple processors, each of which may include multiple cores, along with associated memories, input/output (I/O) devices and other such components. To improve computation efficiencies, computation accelerators, special-purpose I/O devices and other such specialized units may be provided via one or more specialized components, referred to generically herein as helper units. However, inefficiencies may occur in using such helper units because, in a typical computing environment that implements a general-purpose processor and an industry-standard operating system (OS), the software stack can impede efficient usage. That is, in a typical OS environment, system software is isolated from application software via different privilege levels, and operations in each of these different privilege levels are subject to OS context save and restore operations, among other limitations. Further, helper units typically lack the exception and fault handling capabilities that allow robust handling of certain events during execution.
Classic examples of computation accelerators are coprocessors, such as math coprocessors like the so-called x87 floating point coprocessors for early x86 processors. Typically, such coprocessors are coupled to a main processor (e.g., a central processing unit (CPU)) via a coprocessor interface, and share a common instruction set architecture (ISA) with the main processor. More recently, separate resources having different ISAs have appeared in systems.
Compilers generally are translation tools that convert a program written in a high level source language, such as C or C++ among many others, into object code executable on a processor. Compilers may include functionality that allows the definition of inline, hand written assembly code within a main program that is executable on the ISA of a main processor, or on a coprocessor. Such assembly code may be translated by an assembler dynamically linked to the compiler for the high level language in which the main program is written.
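As one illustration of inline assembly code within a high level program, the following is a minimal sketch in C. It assumes a GCC-compatible compiler and its extended `__asm__` syntax targeting an x86 processor, and falls back to plain C elsewhere. The function names `add_u32` and `add_u32_self_check` are hypothetical and chosen only for this example; other compilers use different syntax to demarcate inline code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: adds two 32-bit values. On x86 with a GCC-compatible
   compiler (an assumption of this sketch), the addition is written as inline
   assembly; on any other target it falls back to ordinary C. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    uint32_t result = a;
    /* AT&T operand order: "addl src, dst" computes dst += src.
       Operand %0 (result) is read and written; operand %1 (b) is read only.
       The "cc" clobber tells the compiler the flags register is modified. */
    __asm__ ("addl %1, %0"
             : "+r" (result)
             : "r" (b)
             : "cc");
    return result;
#else
    return a + b;
#endif
}

/* Compares the inline assembly result with compiler-generated addition,
   including the unsigned wrap-around case. */
void add_u32_self_check(void)
{
    assert(add_u32(2u, 3u) == 2u + 3u);
    assert(add_u32(UINT32_MAX, 1u) == 0u);
}
```

Here the compiler hands the text of the `__asm__` statement to its assembler while still allocating registers for the C variables named in the operand lists, which is what allows hand written instructions to sit inside the main program.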
In some compilation systems, directives are available for a programmer to specify parallel execution of threads. For example, an OpenMP parallel pragma may be used to demarcate a program region for fork-join parallel thread execution. When such a construct, denoted by the parallel directive, is encountered, a number of threads (the thread team) are spawned to execute the dynamic extent of the parallel region. This team of threads, including the main thread that spawned them, participates in the parallel computation. At the conclusion of the parallel region, the main thread waits at an implied barrier until all threads in the thread team complete execution. The main thread then resumes serial execution. Through the use of additional clauses, the programmer can specify attributes for the thread team; for example, the num_threads clause indicates the number of threads to create.
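The fork-join behavior described above can be sketched in C as follows. This is a minimal example, assuming a compiler with OpenMP support enabled (for instance via a flag such as `-fopenmp`); when OpenMP is not enabled, the pragmas are ignored and the same code runs serially with the same result. The function name `parallel_sum` and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: sums the integers 1..n inside a fork-join
   parallel region with a caller-chosen team size. */
long parallel_sum(int n, int requested_threads)
{
    assert(requested_threads > 0);  /* num_threads requires a positive value */
    long total = 0;

    /* Fork: the parallel directive spawns a thread team. The num_threads
       clause requests the team size, and the reduction clause gives each
       thread a private partial sum that is combined when the region ends. */
    #pragma omp parallel num_threads(requested_threads) reduction(+:total)
    {
        /* Divide the loop iterations among the threads of the team. */
        #pragma omp for
        for (int i = 1; i <= n; ++i)
            total += i;
    }
    /* Join: the main thread waits at the implied barrier above, then
       resumes serial execution here. */

    return total;
}
```

For example, `parallel_sum(100, 4)` requests a team of four threads and returns 5050 regardless of how the iterations are divided among them.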
As is well known in the art, many different sets of syntax may be used to demarcate inline code and threads for parallel execution.