A reconfigurable architecture is an information processing structure that can include, for example, several hardware processing units, also called “processing cores”, whose constituent elements and organization may be adapted so as to carry out a particular processing operation effectively. Such an architecture uses a reconfigurable circuit for processing information. The reconfigurable circuit may be coarse-grained, for example of the type of the “eXtreme Processing Platform” architectures produced by the company PACT. A coarse-grained reconfigurable circuit includes processing units that perform complex operations such as arithmetic multiplication. The reconfigurable circuit can also be fine-grained, such as an FPGA, the acronym standing for “Field Programmable Gate Array”. A fine-grained reconfigurable circuit includes processing units that perform logical operations, for example “or” or “and”.
Several modes of reconfiguration exist. A reconfigurable architecture may be reconfigured completely in one go, or else partially, so that a part of the application can be carried out effectively. It can also be reconfigured dynamically during processing, that is to say it can employ a mechanism that allows it to modify its configuration before the processing has finished. In certain cases, several reconfigurations may be necessary in order to carry out the complete application. A problem with implementing dynamic reconfiguration is the recovery and transfer of the configuration information when passing from one configuration to another, notably because of the time overhead incurred. The set formed by this information can have an extremely variable structure and content. Dynamic reconfigurable architectures suffer from reconfiguration latency, which slows down the execution of an application because the configuration data must be loaded into memory prior to the execution of a processing operation.
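By way of illustration only, the effect of reconfiguration latency on execution time can be sketched with a simple model in which each configuration is loaded in full before the corresponding processing operation starts and nothing overlaps. The names `Task`, `load_seconds`, `total_time` and `reconfiguration_overhead`, the sizes, the durations and the transfer bandwidth are all assumptions chosen for the example; they do not describe any particular architecture.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One processing operation and the configuration it needs (hypothetical)."""
    name: str
    bitstream_bytes: int  # assumed size of the configuration data
    exec_seconds: float   # assumed processing time once the configuration is loaded


def load_seconds(task, bandwidth_bytes_per_s):
    """Time needed to transfer the configuration data of one task."""
    return task.bitstream_bytes / bandwidth_bytes_per_s


def total_time(tasks, bandwidth_bytes_per_s):
    """Total run time when every load strictly precedes its execution."""
    return sum(load_seconds(t, bandwidth_bytes_per_s) + t.exec_seconds
               for t in tasks)


def reconfiguration_overhead(tasks, bandwidth_bytes_per_s):
    """Fraction of the total run time spent loading configurations."""
    loading = sum(load_seconds(t, bandwidth_bytes_per_s) for t in tasks)
    return loading / total_time(tasks, bandwidth_bytes_per_s)


if __name__ == "__main__":
    # Illustrative values only.
    tasks = [Task("filter", 1_000_000, 0.004), Task("transform", 2_000_000, 0.006)]
    print(reconfiguration_overhead(tasks, bandwidth_bytes_per_s=100e6))
```

In this simplified model the overhead grows with the size of the configuration data and shrinks with the transfer bandwidth, which is why the loading step is the natural target for latency reduction.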
Systems on chip, commonly referred to by the acronym “SoC” standing for “System-on-Chip”, allow several relatively complex computational elements to be integrated on a single chip. SoCs can thus include several reconfigurable cores, optionally of different granularities. Such an architecture is termed “heterogeneous” when the programming formats for the computational elements are not mutually compatible. The programming formats recognized by the various types of computational element may belong to various categories: they may make it possible to carry out iterative programs in the form of instruction streams, in the form of configurations describing the hardware organization of the processing units, or in mixed forms. Different pieces of software, for example several compilers, must then be used to prepare the formats in question. Thus, heterogeneous dynamic reconfigurable architectures have no inherent programming model.
Solutions exist which rely on services provided by the operating system. These services are, however, limited in complexity, since they are implemented in the form of software drivers which slow down the scheduler or which entail non-compliance with real-time constraints. Other solutions use a reconfiguration manager, but they are better suited to homogeneous multi-core architectures. Finally, tools exist which analyze off-line, that is to say before execution, the behavior of the applications as a function of the dynamics of the inputs. They deduce therefrom a fixed scheduling of the processing operations to be performed and therefore an appropriate order of loading. These solutions do not take account of the random events inherent in execution, which often render all predefined static preloading strategies ineffective.
After analysis, it is apparent that the prior art cited above proposes essentially three basic techniques for decreasing or masking the reconfiguration latencies. These techniques are independent of whether the decision making is static or dynamic. The first technique consists in compressing and decompressing the bitstreams representing the configuration information. Indeed, to each configuration there corresponds a bitstream which describes it, and this bitstream has to be transferred in order to load the corresponding configuration. The second technique consists in preloading bitstreams into cache memory. The third technique consists in reusing already executed configurations by storing the corresponding bitstreams in cache memory. None of these techniques, however, provides a really significant improvement, and dynamic reconfigurable architectures continue to suffer from reconfiguration latencies.
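Purely as an illustration, the three basic techniques described above (compression, preloading and reuse through a cache) can be combined in a minimal software sketch. The class `BitstreamCache`, its methods, the least-recently-used eviction policy and the use of the `zlib` compression library are assumptions made for this example; they are not taken from any of the cited solutions.

```python
import zlib
from collections import OrderedDict


class BitstreamCache:
    """Toy cache of configuration bitstreams (hypothetical, for illustration)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity      # maximum number of bitstreams kept in the cache
        self.fetch = fetch            # callable: config_id -> compressed bitstream (slow path)
        self.entries = OrderedDict()  # config_id -> decompressed bitstream, oldest first
        self.slow_fetches = 0         # number of transfers from the slow memory

    def _load(self, config_id):
        # Technique 1: the bitstream is transferred compressed, then decompressed.
        compressed = self.fetch(config_id)
        self.slow_fetches += 1
        self.entries[config_id] = zlib.decompress(compressed)
        # Evict the least recently used entries when the cache is full
        # (assumed policy; other replacement policies are possible).
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

    def preload(self, config_id):
        # Technique 2: bring a bitstream into the cache before it is needed.
        if config_id not in self.entries:
            self._load(config_id)

    def get(self, config_id):
        # Technique 3: reuse a bitstream already present in the cache.
        if config_id not in self.entries:
            self._load(config_id)
        self.entries.move_to_end(config_id)  # mark as most recently used
        return self.entries[config_id]


if __name__ == "__main__":
    # Illustrative store of three compressed bitstreams.
    store = {cid: zlib.compress(bytes([i]) * 4096)
             for i, cid in enumerate(["A", "B", "C"])}
    cache = BitstreamCache(capacity=2, fetch=store.__getitem__)
    cache.preload("A")      # slow transfer happens ahead of time
    cache.get("A")          # served from the cache, no further transfer
    print(cache.slow_fetches)
```

The sketch also shows the limitation noted above: a preload only helps when the configuration that was fetched in advance is the one actually requested next, and a cache of limited capacity eventually evicts bitstreams that may be needed again.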
American patent application U.S. 2004/0260884 A1 describes a reconfigurable system including several reconfigurable processors, one of which includes a preloading unit that can read and write data in a shared memory. However, this patent application does not describe the preloading of the bitstreams representing the system configuration information.