FIG. 1 illustrates the Harvard computer architecture, which is a processor architecture having physically separate storage and signal pathways for instructions and data. FIG. 1 shows the traditional Harvard architecture with a central processing unit (CPU) having two separate buses: an instruction bus connected to instruction memory and a data bus connected to data memory. Because the buses are physically separate, operations on each bus can proceed in parallel, so that an instruction fetch may occur at the same time as data is read from or written to the data memory.
In practice, many processors implement a modified Harvard architecture, as illustrated in FIG. 2. An arbiter is inserted between the processor and the instruction memory to allow access to the memory from either the instruction bus or data bus (the arbiter will allow access from either, but not both at the same time). In an Application Specific Integrated Circuit (ASIC) implementation, a modified Harvard Architecture ASIC includes one or more CPU cores. Each CPU core typically has two Static Random Access Memories (SRAMs) instantiated, one for instructions (‘instruction RAM’) and one for data (‘data RAM’). Additionally, the ASIC has a memory interface (not shown) to access external memory, such as DRAM (Dynamic RAM) memory. An example of a commercial modified Harvard Architecture Processor is the Tensilica Xtensa™ architecture, where Tensilica™ is part of Cadence Design Systems of San Jose, Calif.
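By way of illustration only, the arbitration described above may be modeled in software as follows. The sketch is not taken from any particular processor. The function name `arbitrate` and the policy of granting the instruction bus when both buses request access in the same cycle are assumptions made for the example; the description above states only that access is allowed from either bus but not both at the same time.

```python
def arbitrate(ifetch_request: bool, data_request: bool) -> str:
    """Decide which bus may access instruction memory in one cycle.

    Hypothetical model: at most one bus is granted per cycle.
    ASSUMPTION: the instruction bus wins when both buses request access;
    the tie-break policy is not specified in the description.
    """
    if ifetch_request:
        return "instruction"
    if data_request:
        return "data"
    return "idle"


# Four example cycles, each given as (instruction fetch request, data request).
requests = [(True, False), (True, True), (False, True), (False, False)]
grants = [arbitrate(ifetch, data) for ifetch, data in requests]
```

In the second cycle both buses request access and only one is granted, so under this assumed policy the data-side access must wait for a later cycle.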
The modified Harvard architecture allows the contents of the instruction memory to be accessed as if it were data. As is well known, a modified Harvard Architecture has the characteristics that 1) instruction and data memories occupy different address spaces; and 2) instruction and data memories have separate hardware pathways to the central processing unit that allow instructions to be fetched and data to be accessed at the same time.
In a modified Harvard Architecture processor, the instruction RAM is used only for code during normal runtime operation, while options are provided to perform initial loading of program code into the instruction RAM. As shown in FIG. 3, one advantage of a modified Harvard Architecture is that it permits convenient initial or run-time loading of program code into the instruction RAM (I-RAM in the Figure) using data memory store instructions, as opposed to having fixed program code in ROM. Additionally, reading back of program code using data memory load instructions is provided in order to test and verify that the program code has been stored correctly. For example, the Tensilica Xtensa™ architecture has an “Instruction Memory Access Option” which, when enabled, allows certain load and store instructions to address instruction RAM or ROM for testing and verification purposes. Tensilica™ teaches that this option is to be used only for testing and verification because such accesses are significantly slower and result in a large drop in performance.
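The loading and read-back verification described above may be sketched in software as follows. This is a simplified model under stated assumptions and does not represent the Xtensa™ instruction set or any vendor interface: the class `InstructionRam`, its `store` and `load` methods, the function `load_and_verify`, the 32-bit word width, and the example image values are all hypothetical names and values chosen for illustration.

```python
class InstructionRam:
    """Hypothetical word-addressed instruction RAM reachable from the data side."""

    def __init__(self, size_words: int) -> None:
        self.words = [0] * size_words

    def store(self, addr: int, value: int) -> None:
        # Models a data memory store instruction that targets instruction RAM.
        # ASSUMPTION: 32-bit words.
        self.words[addr] = value & 0xFFFFFFFF

    def load(self, addr: int) -> int:
        # Models a data memory load instruction that targets instruction RAM.
        return self.words[addr]


def load_and_verify(iram: InstructionRam, image: list, base: int = 0) -> list:
    """Write a program image with stores, then read it back with loads.

    Returns the addresses whose read-back value differs from the image;
    an empty list means the program code was stored correctly.
    """
    for offset, word in enumerate(image):
        iram.store(base + offset, word)
    return [
        base + offset
        for offset, word in enumerate(image)
        if iram.load(base + offset) != (word & 0xFFFFFFFF)
    ]


# Example: load a three-word image (arbitrary values) at word address 2.
iram = InstructionRam(size_words=8)
mismatches = load_and_verify(iram, [0x00000001, 0xDEADBEEF, 0x12345678], base=2)
```

In this model an empty `mismatches` list corresponds to a successful verification, after which normal instruction fetches could begin.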
FIG. 4 shows that normal Harvard-style operation can proceed once the program code is loaded and running. Instruction fetch cycles on the instruction bus can proceed in parallel with data load/store cycles on the data bus.
Careful design can fit the majority of local data structures in the on-chip data RAM, but when hundreds or thousands of these structures need to be instantiated at any given time, the 64 KB of available space is quickly consumed, and alternate storage is required.
As indicated by the dashed box, external Dynamic Random Access Memory (DRAM) is typically employed when more data space is needed than can be handled by the on-chip data RAM. The standard method to manage this situation is to swap out inactive data structures to external DRAM via a DRAM interface. On-chip data RAM is generally implemented using Static RAM (SRAM), which is much faster than DRAM. A considerable performance penalty is therefore incurred when accessing external DRAM instead of on-chip SRAM.
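The conventional swapping approach described above may be illustrated with the following simplified software model. It is a sketch under stated assumptions rather than a description of any actual implementation: the class `DataRamWithSwap`, the least-recently-used choice of which inactive structure to swap out, the 16 KB structure size, and the transfer counter are hypothetical. Only the 64 KB capacity is taken from the description above.

```python
from collections import OrderedDict


class DataRamWithSwap:
    """Hypothetical model of on-chip data RAM backed by external DRAM."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.sram = OrderedDict()  # name -> bytes, least recently used first
        self.dram = {}             # structures swapped out to external DRAM
        self.dram_transfers = 0    # each transfer stands for a slow DRAM access

    def _used(self) -> int:
        return sum(len(data) for data in self.sram.values())

    def _make_room(self, needed: int) -> None:
        # ASSUMPTION: the least recently used structure is swapped out first.
        while self._used() + needed > self.capacity:
            victim, data = self.sram.popitem(last=False)
            self.dram[victim] = data
            self.dram_transfers += 1

    def put(self, name: str, data: bytes) -> None:
        if len(data) > self.capacity:
            raise ValueError("structure larger than on-chip data RAM")
        self.sram.pop(name, None)
        self.dram.pop(name, None)
        self._make_room(len(data))
        self.sram[name] = data

    def get(self, name: str) -> bytes:
        if name in self.sram:
            self.sram.move_to_end(name)  # mark as recently used
            return self.sram[name]
        data = self.dram.pop(name)       # KeyError if the name is unknown
        self.dram_transfers += 1         # swap in from external DRAM
        self._make_room(len(data))
        self.sram[name] = data
        return data


# Five 16 KB structures do not fit in 64 KB, so one is swapped out.
ram = DataRamWithSwap(64 * 1024)
for name in "abcde":
    ram.put(name, bytes(16 * 1024))
ram.get("a")  # "a" was swapped out; fetching it forces another swap-out
```

In this example, re-activating a single swapped-out structure costs two external DRAM transfers (one swap-in and one swap-out), which illustrates the performance penalty noted above.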
The present invention was developed in view of the shortcomings of conventional modified Harvard Architecture processors.