1. Field of the Invention
The invention relates to debug support in operating systems, and more particularly to an operating system providing on-chip trace support.
2. Description of the Related Art
The availability of greater integration, lower costs, higher performance and product innovation has fueled rapid expansion of products based on embedded microprocessors. At the same time, the growth in software complexity, coupled with the increasing processor clock speeds, has placed an increasing burden on application software developers. The cost of developing and debugging new software products is now a significant factor in processor selection. In response, a tools industry has evolved to provide a range of often incompatible tools to satisfy hardware and software development requirements.
A processor's failure to adequately facilitate software debug results in longer customer development times and reduces the processor's attractiveness for use within industry. The need to provide software debug support is particularly acute within the embedded microprocessor industry, where specialized on-chip circuitry is often combined with a processor core.
In addition to the software engineer, other parties are also affected by the type and availability of debug tools or involved in their development. These parties include: the "trace" algorithm developer who must search through captured software trace data that reflects instruction execution flow in a processor; the in-circuit emulator hardware developer who deals with problems of signal synchronization, clock frequency and trace bandwidth; and the processor manufacturer who does not want a solution that results in increased processor cost or design and development complexity.
With desktop systems, complex multitasking operating systems are currently available to support debugging. However, the initial task of getting these operating systems running reliably often requires special development equipment. While not the standard in the desktop environment, the use of such equipment is often the approach taken within the embedded industry.
Traditionally, the most powerful piece of debug equipment available to an embedded project has been the in-circuit emulator (ICE). ICEs are most frequently (but not exclusively) used during the early stages of "bringing up the hardware". In many cases, ICE equipment is too expensive to be widely available to all project members. In fact, typically only software engineers who are somewhat hardware-friendly have the skills required to drive an ICE.
The availability of an ICE gives project engineers the confidence that they can rapidly resolve any difficult development problem they encounter. For this reason, many project teams insist that an ICE be available or they may select an alternative processor. Unfortunately, rising processor complexity, higher clock speeds, the use of on-chip instruction and data caches, and packaging problems have reduced the availability of ICE. All too often it is quite some time after a processor's introduction before an ICE becomes available, and only then if the processor is widely accepted.
In-circuit emulators do provide certain advantages over other debug environments by offering complete control and visibility over memory and register contents, as well as overlay and trace memory in case system memory is insufficient. Use of traditional in-circuit emulators, which involves interfacing a custom emulator back-end with a processor socket to allow communication between emulation equipment and the target system, is becoming increasingly difficult and expensive in today's age of exotic packages and shrinking product life cycles.
Assuming full-function in-circuit emulation is required, there are several known processor manufacturing techniques able to offer the required support for emulation equipment. Most processors intended for personal computer (PC) systems utilize a multiplexed approach in which existing pins are multiplexed for use in software debug. This approach is not particularly desirable in the embedded industry, where it is more difficult to overload pin functionality.
Other more advanced processors multiplex debug pins in time. In such processors, the address bus is used to report software trace information during a BTA (Branch Target Address) cycle. The BTA cycle, however, must be stolen from the regular bus operation. In debug environments where branch activity is high and cache hit rates are low, it becomes impossible to hide the BTA cycles. The resulting conflict over access to the address bus necessitates processor "throttle back" to prevent loss of instruction trace information. In the communications industry, for example, software typically makes extensive use of branching and suffers poor cache utilization, often resulting in 20% throttle back or more. That amount of throttle back is unacceptable for embedded products, which must accommodate real-time constraints.
In another approach, a second "trace" or "slave" processor is combined with the main processor, with the two processors operating in-step. Only the main processor is required to fetch instructions. The second, slave processor monitors the fetched instructions on the data bus and keeps its internal state in synchronization with the main processor. The address bus of the slave processor functions to provide trace information. After power-up, via a JTAG (Joint Test Action Group) input, the second processor is switched into a slave mode of operation. Free from the need to fetch instructions, its address bus and other pins provide the necessary trace information.
Another existing approach involves building debug support into every processor, but only bonding-out the necessary signal pins to support e.g., trace capability, in a limited number of packages. These specially packaged versions of the processor are used during debug and replaced with the smaller package for final production. That bond-out approach suffers from the need to support additional bond pad sites in all fabricated devices. That can be a burden in small packages and pad limited designs, particularly if a substantial number of extra pins are required by the debug support variant. Additionally, the debug capability of the specially packaged processors is unavailable in typical processor-based production systems.
The rising cost of ICE and its increasing unavailability have led to a search for alternatives. The use of general purpose logic analyzers, with support software, has provided one alternative. However, these tool combinations are generally considered even harder to drive than ICE. The primary reason engineers select an ICE solution is its program trace capability. The trace capability of a logic analyzer is likewise the reason engineers resort to one when an ICE is unavailable.
In yet another debug approach (the "Background Debug Mode" by Motorola, Inc.) limited on-chip debug circuitry is provided for basic run control. Through a dedicated serial link requiring additional pins, this approach allows a debugger to start and stop the target system and apply basic code breakpoints by inserting special instructions in system memory. Once halted, special commands are used to inspect memory variables and register contents.
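The breakpoint mechanism described above, in which a special instruction is written into system memory and the displaced contents are later restored, can be illustrated with a minimal sketch. It is not the Background Debug Mode implementation: the one-byte trap opcode `BKPT_OPCODE`, the table size `MAX_BKPTS`, and the functions `bkpt_set` and `bkpt_clear` are all hypothetical names chosen for illustration, and target memory is modeled as a plain byte array rather than being accessed over a serial link.

```c
#include <stdint.h>
#include <stddef.h>

/* Placeholder one-byte trap opcode (assumption); real values are
   architecture specific. */
#define BKPT_OPCODE 0xCCu
#define MAX_BKPTS   4

typedef struct {
    size_t  addr;    /* offset of the breakpoint in target memory */
    uint8_t saved;   /* original byte displaced by the trap opcode */
    int     in_use;  /* nonzero while the slot holds a breakpoint  */
} bkpt_t;

static bkpt_t bkpt_table[MAX_BKPTS];

/* Save the original byte at addr, then patch in the trap opcode.
   Returns the table slot used, or -1 if addr is out of range or
   the table is full. */
int bkpt_set(uint8_t *mem, size_t mem_size, size_t addr)
{
    if (addr >= mem_size)
        return -1;
    for (int i = 0; i < MAX_BKPTS; i++) {
        if (bkpt_table[i].in_use && bkpt_table[i].addr == addr)
            return i;                    /* already set here */
    }
    for (int i = 0; i < MAX_BKPTS; i++) {
        if (!bkpt_table[i].in_use) {
            bkpt_table[i].addr   = addr;
            bkpt_table[i].saved  = mem[addr];
            bkpt_table[i].in_use = 1;
            mem[addr] = BKPT_OPCODE;
            return i;
        }
    }
    return -1;
}

/* Restore the original byte and free the slot.
   Returns 0 on success, -1 if no breakpoint exists at addr. */
int bkpt_clear(uint8_t *mem, size_t mem_size, size_t addr)
{
    if (addr >= mem_size)
        return -1;
    for (int i = 0; i < MAX_BKPTS; i++) {
        if (bkpt_table[i].in_use && bkpt_table[i].addr == addr) {
            mem[addr] = bkpt_table[i].saved;
            bkpt_table[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

Because the breakpoint lives in memory, this style of run control works only where the debugger can write to the code being executed, which is one reason it is described here as basic.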
Typically, a project engineer will utilize a ROM monitor when an ICE solution is too expensive or unavailable. These monitors consist of relatively small programs which are located in the target system's ROM or Flash memory. They also typically have a small RAM requirement. The monitor program supports control and visibility into the program's register and memory contents, but no trace of program execution. Often projects will be supported with one or two ICEs, with the rest of the software engineers working with a target monitor.
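The kind of memory visibility a monitor offers can be sketched as a small command handler. This is an illustrative assumption, not any particular monitor's protocol: the command syntax (`r <addr>` to read a byte, `w <addr> <value>` to write one, both in hexadecimal), the function `monitor_handle`, and the array `target_mem` standing in for target RAM are all hypothetical. Register access and the communication link to the host are omitted.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define TARGET_MEM_SIZE 256u

/* Stand-in for the target system's RAM (assumption). */
static uint8_t target_mem[TARGET_MEM_SIZE];

/* Handle one command line received from the host and write the
   reply into out. Returns 0 on success, -1 on a malformed command
   or an out-of-range address or value. */
int monitor_handle(const char *line, char *out, size_t out_size)
{
    unsigned addr, val;

    /* "r <addr>": report the byte stored at addr. */
    if (sscanf(line, "r %x", &addr) == 1 && addr < TARGET_MEM_SIZE) {
        snprintf(out, out_size, "%02X", (unsigned)target_mem[addr]);
        return 0;
    }
    /* "w <addr> <value>": store a byte at addr. */
    if (sscanf(line, "w %x %x", &addr, &val) == 2 &&
        addr < TARGET_MEM_SIZE && val <= 0xFFu) {
        target_mem[addr] = (uint8_t)val;
        snprintf(out, out_size, "OK");
        return 0;
    }
    snprintf(out, out_size, "ERR");
    return -1;
}
```

A handler of this shape inspects and modifies state only when the host asks; it records nothing about which instructions executed, which is the missing trace capability noted above.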
An additional tool available to the embedded project is the ROM emulator. This enables system ROM or RAM to be replaced with a dual ported memory which can be accessed by both the target and host processors. The use of a ROM emulator does provide for fast data and program transfer, which is the primary reason for its selection.
The low cost of ROM monitors makes them popular, but their use has several drawbacks. They require ROM and RAM resources to be reserved within the target system's memory. They require an on-chip or off-chip peripheral, such as a Universal Asynchronous Receiver Transmitter (UART), to support communication with the controlling (host) platform. Subsequent updating of the monitor program is often an arduous process.
In recent years there has been greater use of sophisticated tools such as multitasking operating systems, library resources and source-level debuggers, to name only a few. As discussed, complex multitasking operating systems are currently available to support debugging with desktop systems. In general, tools for use with PC software development have reached a high level of functionality and simplicity of use. That has not gone unnoticed by engineers in the embedded industry, and there is now a demand for a similar level of tool capability.
Several studies have shown that presently only about 50% of 32-bit embedded systems make use of a multitasking operating system, although this number is growing. One deterrent to use of multitasking operating systems has certainly been cost, but more important has been the perceived complexities of getting the system running. Often there are difficult tool transitions required as debugging proceeds from kernel-mode to application-mode debug. There is also the burden of often having to first get a ROM monitor running before commencing kernel and driver configuration for the particular system.
Thus, the current solutions for software debugging suffer from a variety of limitations, including increased packaging and development costs, circuit complexity and processor throttle back. Further, there is currently no adequate low-cost procedure for providing trace information. Also, debugging embedded applications utilizing multitasking operating systems can result in difficult tool transitions from kernel to application debug. The limitations of the existing solutions are likely to be exacerbated in the future as internal processor clock frequencies continue to increase, software complexity continues to grow and expensive ICE solutions become more and more prohibitive.