The present invention relates, in general, to the field of customizable integrated circuit devices incorporating non-volatile memory. More particularly, the present invention relates to a stack processor and method, implemented using a ferroelectric random access memory (F-RAM) for code and a portion of the stack memory space, and having an instruction set optimized to minimize processor stack accesses.
Current nonvolatile memory technologies include, among others, electrically erasable programmable read only memory (EEPROM) and Flash memory. Despite continuing improvements to these technologies, the endurance rate of Flash memory is still multiple orders of magnitude below the endurance rate of F-RAM. Therefore, for applications using Flash memory that require high endurance, some products will actually include a large Flash memory array with the associated user/program ensuring that data is stored in specific memory locations (e.g. banks of memory). Once a memory bank approaches its endurance limit, the user/program would then enable the movement of all data to a new memory bank, marking the previous memory bank as worn out and indicating it should not be used again. The typical endurance of such floating gate devices is approximately between 100,000 and 1 million write cycles.
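The bank rotation scheme described above can be illustrated with a minimal sketch. The class names, the per-bank write counter, and the choice to count the data migration as one write to the new bank are all assumptions made for illustration; they are not taken from any particular product, and a real Flash device wears per erase block rather than per logical write.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    """One Flash bank, modeled (hypothetically) as a single wear unit."""
    endurance_limit: int
    writes: int = 0
    worn_out: bool = False
    data: dict = field(default_factory=dict)


class BankedStore:
    """Keeps all data in one active bank and rotates banks as they wear."""

    def __init__(self, num_banks, endurance_limit):
        self.banks = [Bank(endurance_limit) for _ in range(num_banks)]
        self.active = 0  # index of the bank currently holding the data

    def write(self, key, value):
        bank = self.banks[self.active]
        if bank.writes >= bank.endurance_limit:
            bank = self._retire_active_bank()
        bank.data[key] = value
        bank.writes += 1

    def read(self, key):
        return self.banks[self.active].data[key]

    def _retire_active_bank(self):
        """Move all data to the next bank and mark the old one worn out."""
        if self.active + 1 >= len(self.banks):
            raise RuntimeError("all banks have reached their endurance limit")
        old = self.banks[self.active]
        self.active += 1
        new = self.banks[self.active]
        new.data = dict(old.data)
        new.writes += 1       # assumption: the migration costs one write
        old.worn_out = True   # never used again
        old.data = {}
        return new


store = BankedStore(num_banks=2, endurance_limit=3)
for value in (1, 2, 3):
    store.write("a", value)   # exhausts the first bank
store.write("b", 4)           # triggers migration to the second bank
```

The sketch makes the cost of the approach visible: total capacity must be a multiple of the working data size, and each migration itself consumes write cycles in the new bank.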
It is also well known that writes to EEPROM and Flash are relatively slow compared to those to F-RAM. While an F-RAM write cycle is completed almost immediately, EEPROM and Flash writes take meaningfully longer. Still further, writes to an F-RAM memory cell occur at a relatively low voltage and very little current is required to change the data in the cell.
A currently implemented architecture for a stack processor is the b16 Processor described in Paysan, B., “A Forth Processor in an FPGA”, Feb. 2, 2003; Paysan, B., “b16—small—Less is More”, Jul. 9, 2006; and Paysan, B., “b16: Modern Processor Core”, Apr. 29, 2005 and disclosed at http://www.jwdt.com/~paysan/b16.html. The b16 stack-based processor has the top of the stacks maintained in volatile registers and the bottom of the stacks in two complementary metal oxide semiconductor (CMOS) memories. Such an architecture will lead to the possibility of the data and return stacks and code space being accessed simultaneously. Moreover, a stack processor architecture which provides for maintaining the stacks in volatile memory would cause it to suffer from very long and power-demanding power-down times as the contents of a relatively large number of registers would have to be saved to nonvolatile memory on power-down. Placing some of the registers in nonvolatile Flash memory in an attempt to ameliorate this situation would, of course, lead to the endurance issues inherent in Flash. Still further, a stack processor architecture which utilizes a different memory type for code and the stacks would suffer from high power consumption peaks since all of the memories are likely to be accessed simultaneously in normal operation.
In the b16 stack processor, each 16-bit word is mapped as three 5-bit instructions and one extra 1-bit instruction which can only be a “no operation” (NOP) or CALL. In practice, this means that in the majority of cases the fourth instruction will be a NOP and the instruction set is, therefore, wasting one bit per word along with a clock cycle (needed to execute the NOP) every three instructions. Still further, the b16 stack processor does not share code and data space so its architecture is even more power demanding as it can access code space and the data and return stacks all simultaneously.
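The word packing described above can be illustrated with a short sketch. The bit positions (1-bit slot in the most significant bit, followed by the three 5-bit slots), the NOP/CALL encodings, and the one-clock-cycle-per-slot cost model are assumptions chosen for illustration; the text above specifies only the slot widths and does not fix their order within the word.

```python
NOP, CALL = 0, 1  # hypothetical encodings for the 1-bit slot


def pack_word(one_bit, slots):
    """Pack a 1-bit instruction and three 5-bit instructions into 16 bits."""
    if one_bit not in (NOP, CALL):
        raise ValueError("1-bit slot can only hold NOP or CALL")
    if len(slots) != 3 or any(not 0 <= s <= 0x1F for s in slots):
        raise ValueError("expected three 5-bit instruction codes")
    a, b, c = slots
    return (one_bit << 15) | (a << 10) | (b << 5) | c


def unpack_word(word):
    """Split a 16-bit word back into (one_bit, [slot0, slot1, slot2])."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit word")
    one_bit = (word >> 15) & 0x1
    slots = [(word >> shift) & 0x1F for shift in (10, 5, 0)]
    return one_bit, slots


def cycles_with_nop_slot(n_instructions):
    """Return (total_cycles, cycles_spent_on_nop).

    Assumes every 1-bit slot holds a NOP and every slot costs one cycle.
    """
    words = -(-n_instructions // 3)  # ceiling division: 3 instructions/word
    return words * 4, words


word = pack_word(NOP, [1, 2, 3])
decoded = unpack_word(word)
cost = cycles_with_nop_slot(9)
```

Under these assumptions, nine 5-bit instructions occupy three words and take twelve cycles, three of which execute only the NOP in the 1-bit slot.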