Microcontroller units (MCUs) are increasingly used in systems for providing automated control and for sensing applications. Example applications for MCUs include industrial controls, medical instruments and medical technologies, metering including remote metering such as utility and network metering, automotive applications, telecommunications including cellular base stations, and use on a variety of portable computing platforms including tablet computers, smart watches, smart phones, and the like. Additional applications include remote sensing and equipment monitoring, RF tag sensing such as used in toll systems, retail security and asset location, and in enabling “Internet of Things” or “IoT” applications. Demand for portable and battery powered implementations of MCUs is increasing. Because these applications often require receiving analog signals as inputs from sensing devices, mixed signal processors (MSPs) have also been introduced. Prior known MSP devices often include embedded analog to digital converters and analog comparison functions along with microprocessor units. The analog circuitry is used to receive analog input signals and to convert these to digital representations for use in performing computations. Example analog sensors include pressure, temperature, speed and rotation sensors, gyroscopes, accelerometers, optical sensors and the like.
While embedded microprocessors are currently used in MCUs and MSPs to perform various functions, these devices are increasingly used in applications where both stand-by and active device power consumption are of great importance. While adding functionality to increase computational performance of a microcontroller unit is always desirable, and demand for these added computation features is always increasing, the need for reduced power consumption is also increasing. Reducing power consumption results in longer battery life, extending time between battery charges or between battery replacements, and increases the time between needed services of remote sensing equipment, for example. For a portable consumer device, a battery life of at least one day in very active use is particularly desirable so that the consumer does not have to find a charging location while using the device away from home or office locations, for example.
Data processing tasks that are commonly performed by such mixed signal control and sensing devices typically include vector operations. Vector operations are often used in signal processing applications. Typical operations using vector computations include Fourier transforms such as Fast Fourier Transforms (FFT), Finite Impulse Response (FIR) filtering, Infinite Impulse Response (IIR) filtering, cryptanalysis computations, and similar vector functions. While the microprocessor embedded within a microcontroller device needs to be able to perform general purpose computing functions such as controlling memory accesses, data input and output functions, display and user input, communications, data transmission and the like, the need for performing these vector arithmetic functions creates a challenge for efficient computation in most general purpose microprocessors. In order to achieve high computation performance for these vector operations, a variety of prior known approaches have been used. In one approach, a digital signal processor (DSP) can be added to an integrated circuit MCU or to an integrated circuit or module that includes a microprocessor unit (MPU). While the added DSP can perform certain signal processing functions such as vector operations much faster than can be achieved by using software running instructions on the MPU, the added DSP also substantially increases the number of transistors (increases gate count) and silicon area used to implement the integrated microcontroller device, and the corresponding costs for device production also rise. Further, the addition of a DSP to a microcontroller device adds functionality, and increases silicon area, for certain features of the DSP which are not necessary just for performing the vector operations.
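As an illustration of why such vector operations burden a general purpose microprocessor, a direct-form FIR filter can be sketched in C as shown below. This is only a minimal sketch under stated assumptions: the function name `fir_direct`, the 16-bit sample type, and the 64-bit accumulator are hypothetical choices made for the example and are not taken from any particular device or library. Each output sample requires one multiply-accumulate per filter tap, so the work grows with the product of the input length and the number of taps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration only):
 * direct-form FIR filter,
 *   y[i] = sum over k of h[k] * x[i - k],
 * where samples before the start of x are treated as zero.
 * A 64-bit accumulator is assumed so the sums cannot overflow
 * for realistic tap counts. */
void fir_direct(const int16_t *x, size_t n,
                const int16_t *h, size_t taps,
                int64_t *y)
{
    assert(x != NULL && h != NULL && y != NULL && taps > 0);

    for (size_t i = 0; i < n; i++) {
        int64_t acc = 0;
        /* One multiply-accumulate per tap for every output sample;
         * k <= i skips input samples that lie before x[0]. */
        for (size_t k = 0; k < taps && k <= i; k++) {
            acc += (int64_t)h[k] * (int64_t)x[i - k];
        }
        y[i] = acc;
    }
}
```

When executed as ordinary software, the inner loop is repeated for every output sample, which suggests why repetitive multiply-accumulate work of this kind is a common target for DSPs and hardware accelerators.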
In addition, because the power consumed by integrated circuit devices in the CMOS semiconductor technology currently in use is roughly proportional to the number of transistors (or gates) on the device, active device power consumption tends to increase in roughly direct proportion with increasing device performance when this approach is used. This is problematic for any integrated circuit design and is particularly undesirable for the applications considered here, where in fact a substantial decrease in power consumption is needed.
Additional prior known approaches include the use of dedicated hardware accelerators specifically designed to perform certain vector operations. While performance will be increased using these dedicated hardware accelerators for each vector operation to be computed, this approach also tends to increase silicon area, as a separate hardware function is added for each type of vector computation to be accelerated. Further, the time to market and the integrated circuit design process can be quite lengthy when using a dedicated hardware solution, as the dedicated hardware needs to be changed to address different applications. While computational performance will be increased when a dedicated hardware block is used to execute certain vector computations, the disadvantages of inflexibility and an inability to modify the computations outweigh the potential benefits. Further, dedicated hardware accelerators are not used when operations other than the particular dedicated function are being performed, so integrated circuit designs with dedicated hardware accelerators can be an inefficient use of silicon area, depending on how often the particular function is performed.
A continuing and increasing need thus exists for an accelerator processor architecture that is compatible with current and future CMOS integrated circuit technology, that is optimized for commonly used vector arithmetic operations, and that provides excellent computational performance with reduced silicon area and reduced gate count and, correspondingly, reduced power consumption when compared to the prior known solutions.