Complex embedded processor chips now use a wide variety of system configurations and are customized for a broad range of applications. Each design typically includes cache memory, but even the complexity of the cache memory varies significantly from design to design.
Many processors have complex memories that are partitioned into levels. The designation local memory is used for the innermost level of memory most tightly connected to the processor core. This local memory must have the best possible performance because it strongly affects processor throughput. Larger cache memories interfacing with the local memory and the processor itself are sometimes split into level 1 program cache memory and level 1 data cache memory. These caches have aggressive performance requirements as well, but generally are slightly lower in performance than the local memory portion.
Finally, at the outer extreme of the embedded processor core, memory is designated the level 2 cache. Level 2 memory is sometimes quite large and often has more moderate speed performance specifications.
The process of designing an embedded processor or customizing a given embedded processor design for specific applications involves much analysis of all parts of the device. In addition, the computer program that is stored in the device must be developed and debugged in simulation or emulation. Conventional design has one or more controllers that must function in harmony to realize an efficient processor. These controllers are often divided into core control, memory control and peripheral control.
FIG. 1 illustrates an example of a conventional high-level processor block diagram. The processor core 100 operates under control of the core control logic block 101. The processor core 100 accesses its most critical data from the local memory 102, and receives its program instructions from level 1 program cache 103 and additional data from level 1 data cache 104. The task of the memory control block 105 is to drive the level 1 program cache 103, level 1 data cache 104, level 2 cache 106 and local memory 102 in a coherent fashion. Level 2 cache 106 is the buffer memory for all external data transactions. Large external memories normally have performance specifications not qualifying them for direct interface with the core processor. Program or data accesses having target information available only in external memory are buffered primarily through level 2 cache 106, avoiding possible system performance degradation.
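The buffering role of level 2 cache 106 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each memory level is modeled as a dictionary keyed by address and that a data read searches the nearest level first; the function name `data_access` and the search order are hypothetical and are not taken from FIG. 1.

```python
def data_access(address, local_memory, l1_data, l2, external):
    """Return (value, source) for a data read, searching the nearest level first.

    Each level is modeled as a plain dict {address: value}. This is an
    illustrative assumption, not the behavior of memory control block 105.
    """
    for name, level in (("local", local_memory), ("L1D", l1_data), ("L2", l2)):
        if address in level:
            return level[address], name
    # Target information is available only in external memory:
    # buffer it through the level 2 cache before returning it.
    value = external[address]  # raises KeyError if the address is unmapped
    l2[address] = value
    return value, "external"


# Hypothetical contents for each level.
local_memory = {0: "a"}
l1_data = {1: "b"}
l2 = {}
external = {2: "c"}

first = data_access(2, local_memory, l1_data, l2, external)
second = data_access(2, local_memory, l1_data, l2, external)
```

In this sketch the first read of address 2 is served from external memory and leaves a copy in the level 2 model, so the second read of the same address is served from level 2.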
It is becoming increasingly common to process all other transactions through an integrated transaction processor here designated as enhanced direct memory access (EDMA) and peripheral control 112. Data can be brought in through the external memory interface 107 which can be externally connected to very high storage capacity external memories. These external memories have performance specifications not qualifying them for direct interface with the core processor.
The state machine and system control logic 113 is the highest level of control on this example processor. It interfaces with each of the other control elements, the memory control unit 105, the processor core control logic 101 and the enhanced direct memory access (EDMA) peripheral control element 112.
Complex multi-level cache memories are defined at a high level by a memory protocol. The memory protocol can be a set of statements about the modes of operation in which the overall memory must be placed to accomplish a requested task while it is in a pre-determined state. This protocol is typically reduced to a set of flow diagrams that define a state machine function. The designer often begins at this point with a well-defined protocol and an existing state machine, both of which may need to undergo some modification. The designer's first step is to write the computer program that will drive the system/memory operations. Analysis tools designed to debug new designs are available from a number of software vendors. Typically a number of simulation runs must be made, with some adjustments made to the code between runs.
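As a rough illustration of how a memory protocol reduces to a state machine function, the following sketch encodes a protocol as a table mapping a (state, event) pair to a next state and a mode of operation. The state names, event names, and modes are hypothetical and chosen only for illustration; they do not describe any particular protocol.

```python
# Hypothetical protocol table: (current state, event) -> (next state, mode).
PROTOCOL = {
    ("IDLE", "read"): ("LOOKUP", "tag_compare"),
    ("LOOKUP", "hit"): ("IDLE", "return_data"),
    ("LOOKUP", "miss"): ("FILL", "fetch_from_next_level"),
    ("FILL", "fill_done"): ("IDLE", "return_data"),
}


def run_protocol(events, state="IDLE"):
    """Step the state machine through a sequence of events.

    Returns the final state and the list of modes the memory was placed in.
    Raises ValueError if the protocol does not define the transition.
    """
    modes = []
    for event in events:
        key = (state, event)
        if key not in PROTOCOL:
            raise ValueError(
                f"protocol does not define {event!r} in state {state!r}"
            )
        state, mode = PROTOCOL[key]
        modes.append(mode)
    return state, modes


final_state, modes = run_protocol(["read", "miss", "fill_done"])
```

Keeping the protocol in a table rather than in branching code makes a modification to the protocol a change to data, which is one plausible way to handle the adjustments described above.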
FIG. 2 illustrates the flow diagram of a conventional cache memory computer program analysis-synthesis process. Code development 200 is normally carried out using a conventional word processor text editor 210, and the trial code is compiled using one of several possible conventional toolsets. The compile step 201, assemble step 202, and link step 203 are normal processes preliminary to the simulation 204. Because of the many possible system configurations and the wide variety of applications being analyzed, the engineer must analyze the simulator output results manually in step 205. The engineer must then decide in step 207 whether the results are satisfactory. The standard of satisfactory behavior could be a standard of computer timing behavior, computer processor loading or any other parameter crucial to acceptable performance of the computer program. If the results are not satisfactory, the engineer initiates code changes and a new simulation. In FIG. 2, path 208 indicates a positive outcome: the results are acceptable. This leads to the finish state. Path 209, on the other hand, indicates the negative outcome: the simulator results are not acceptable. This leads to a loop in the flow chart through manual adjust and edit step 206, and flow returns for another trial. Typically the second trial pass will begin at the re-compile step 201 and will proceed through all the previous conventional toolset steps.
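The loop of FIG. 2 can be summarized in a short sketch. The function and parameter names (`analysis_loop`, `build_and_simulate`, `is_acceptable`, `adjust`, `max_trials`) are hypothetical stand-ins for the toolset steps and the engineer's manual decisions, and the toy model at the end is invented only to exercise the loop.

```python
def analysis_loop(code, build_and_simulate, is_acceptable, adjust, max_trials=10):
    """Repeat build, simulate, and adjust until the results are acceptable.

    Returns (final code, final results, number of trials used).
    """
    for trial in range(1, max_trials + 1):
        results = build_and_simulate(code)  # steps 201-204
        if is_acceptable(results):          # steps 205 and 207, path 208
            return code, results, trial
        code = adjust(code, results)        # path 209, step 206
    raise RuntimeError("no acceptable result within max_trials")


# Toy stand-ins: the "code" is a single tuning parameter and the
# "simulator" reports a cycle count that falls as the parameter grows.
def toy_simulate(code):
    return {"cycles": 1000 // code["unroll"]}


def toy_ok(results):
    return results["cycles"] <= 300


def toy_adjust(code, results):
    return {"unroll": code["unroll"] * 2}


final_code, final_results, trials = analysis_loop(
    {"unroll": 1}, toy_simulate, toy_ok, toy_adjust
)
```

In the conventional process the `is_acceptable` and `adjust` roles are filled manually by the engineer, which is why every trial repeats the full toolset pass starting at compile step 201.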