1. Field
The present invention relates to a signal processing system on chip (SoC) including a central processing unit (CPU) and multiple computing elements, and in particular to a methodology for implementing breakpoints and debugging during processing by the CPU and the multiple computing elements.
2. Related Art
Since the 1990s, integrated circuit (IC) design has evolved from a chip-set philosophy to an embedded-core-based system-on-chip (SoC) concept. An SoC integrated circuit includes various functional blocks, such as microprocessors, interfaces, memory arrays, and digital signal processors (DSPs). The resulting SoCs have become quite complex. Moreover, the techniques used in the design of these SoCs have not scaled with the complexity of chip designs. In addition to prior testing of the component functional blocks, the interfaces between the blocks are functionally verified by various well-known techniques. Preventive steps include writing many test vectors to check the functionality of a device and running code coverage tools to evaluate the test results. Scan chain testing is well known in the prior art and permits determining the internal states of various memories and registers contained in a functional block. Frequently, problems in the resulting SoC are encountered in spite of these levels of testing. Moreover, if there are problems in a design after the device has been fabricated, it may be extremely difficult to determine the cause of the problems. This difficulty can be attributed to the number of functional blocks that are potential sources of the problem and to the lack of visibility into the internal operation of the SoC device. Additionally, the operation of the device in actual use can differ significantly from the simple functional vectors that are typically used to verify the interfaces of the functional blocks.
In spite of such efforts, functional problems do occur in fabricated devices. The likelihood of functional problems occurring increases with the complexity of the SoC. For such complex systems, it is virtually impossible to write vectors to test all the different combinations of functional operation of the functional blocks. Moreover, there may be functional features that the designer did not consider testing. Further, a functional problem may occur only after extended periods of operation and accordingly cannot be easily detected by running simple test vectors.
When functional problems do occur with fabricated SoCs, designers attempt to determine the cause by observing the state of internal registers and internal memories, or by monitoring the outputs of the pins of the device (e.g., by various prior art means such as test probing of the device pins, as well as more sophisticated methods employing computer-driven debugging interfaces). Often, there is insufficient visibility into the internal state of the SoC device. In such cases, the designer must speculate as to the cause of the functional failure. As a result, it may take several revisions of the circuit design before the problem is corrected.
There is thus a need for, and it would be highly advantageous to have, a methodology for debugging a system-on-chip including multiple functional blocks, e.g., a CPU and multiple computing elements.
Reference is now made to FIG. 1, which illustrates a conventional system on chip (SoC) 10 including a CPU 101 and multiple computing elements 109 connected by a crossbar matrix 111. System 10 includes shared memory 103 and a shared direct memory access (DMA) unit 105 for accessing memory 103. Alternatively, conventional system 10 may be configured with a bus and bus arbiter instead of crossbar matrix 111. When CPU 101 runs a task on one of computing elements 109, CPU 101 transfers to computing element 109 a task descriptor including various parameters, namely a desired operation (opcode) and operands specifying the task, and then instructs computing element 109 to start processing the task. The specific opcode is preferably supplied within a command word which also includes various control bits. CPU 101 then monitors the completion status of each computing element 109 in order to obtain the respective results, and prepares further tasks, on a task-by-task basis, for each computing element 109.
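By way of illustration only, the conventional task flow described above (transfer of a task descriptor, start of processing, and monitoring of completion status) may be sketched in software as follows. The sketch is a simplified model and not a description of any actual hardware: the 8-bit opcode field, the opcode values 0x01 and 0x02, and all names such as `TaskDescriptor`, `ComputingElement`, and `run_tasks` are assumptions introduced for the example and do not appear in FIG. 1.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple

# Assumed command-word layout (hypothetical): the low 8 bits carry the
# opcode and the bits above them carry the control bits.
OPCODE_BITS = 8
OPCODE_MASK = (1 << OPCODE_BITS) - 1


def pack_command_word(opcode: int, control_bits: int) -> int:
    """Combine an opcode and control bits into one command word."""
    if not 0 <= opcode <= OPCODE_MASK:
        raise ValueError("opcode does not fit in the opcode field")
    if control_bits < 0:
        raise ValueError("control bits must be non-negative")
    return (control_bits << OPCODE_BITS) | opcode


def unpack_command_word(word: int) -> Tuple[int, int]:
    """Split a command word back into (opcode, control_bits)."""
    return word & OPCODE_MASK, word >> OPCODE_BITS


@dataclass
class TaskDescriptor:
    command_word: int      # opcode plus control bits
    operands: List[int]    # operands specifying the task


class Status(Enum):
    IDLE = 0
    BUSY = 1
    DONE = 2


class ComputingElement:
    """Toy model of one computing element; the opcodes are invented."""

    OPERATIONS = {0x01: sum, 0x02: max}

    def __init__(self) -> None:
        self.status = Status.IDLE
        self.descriptor = None
        self.result = None

    def load(self, descriptor: TaskDescriptor) -> None:
        # Models the CPU transferring the task descriptor.
        self.descriptor = descriptor

    def start(self) -> None:
        # Models the CPU instructing the element to start processing.
        self.status = Status.BUSY

    def step(self) -> None:
        # Models the element completing its task.
        if self.status is Status.BUSY:
            opcode, _control = unpack_command_word(self.descriptor.command_word)
            self.result = self.OPERATIONS[opcode](self.descriptor.operands)
            self.status = Status.DONE


def run_tasks(elements: List[ComputingElement],
              descriptors: List[TaskDescriptor]) -> List[int]:
    """CPU side: dispatch one task per element, then poll for completion."""
    for element, descriptor in zip(elements, descriptors):
        element.load(descriptor)
        element.start()
    results = []
    for element in elements:
        while element.status is not Status.DONE:
            element.step()  # stands in for the hardware making progress
        results.append(element.result)
        element.status = Status.IDLE  # element is free for a further task
    return results
```

In this model the CPU has no view of a computing element beyond its completion status and final result, which reflects the limited visibility into internal operation discussed above.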