A computer system can be broken into three basic blocks: a central processing unit (CPU), memory, and input/output (I/O) units. These blocks are interconnected by means of a bus. An input device, such as a keyboard, mouse, disk drive, or analog-to-digital converter, is used to input instructions and data to the computer system via the I/O unit. These instructions and data can be stored in memory. The CPU retrieves the data stored in the memory and processes the data as directed by the stored instructions. The results can be stored back into memory or output via the I/O unit to an output device such as a printer, cathode-ray tube (CRT) display, or digital-to-analog converter.
Traditionally, the CPU consisted of a single semiconductor chip known as a microprocessor. This microprocessor executed the programs stored in the main memory by fetching the instructions, examining them, and then executing them one after another. Due to rapid advances in semiconductor technology, faster, more powerful and flexible microprocessors were developed to meet the demands imposed by ever more sophisticated and complex software.
Presently, the state of the art in microprocessor design has reached a point where designing the next generation of microprocessors is extremely costly, labor-intensive, and time-consuming. However, new applications such as multimedia, which integrates text, audio, speech, video, data communications, and other time-correlated data to create a more effective presentation of information, require a large amount of processing power to handle in a real-time environment. Moreover, with the explosion in network and file server applications, there is a need to process vast amounts of data quickly and efficiently. The trend is toward ever more complex and lengthy software programs. The computational requirements associated with operating these applications in real time are starting to overwhelm even the most powerful microprocessors.
One solution is to implement multiple processors. A single complex task can be broken into sub-tasks, and each sub-task is processed individually by a separate processor. This use of multiple processors allows various tasks or functions to be handled by more than one CPU, so that the computing power of the overall system is enhanced. Depending on the complexity of a particular job, additional processors may be added. Furthermore, utilizing multiple processors has the added advantage that two or more processors may share the same data stored within the system. To further boost the performance of the system, specialized controllers, graphics accelerators, digital signal processors, co-processors, and similar devices are being implemented.
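The decomposition described above can be illustrated with a minimal software sketch. It is only an analogy, not a description of any particular hardware: worker threads stand in for separate processors, and the function names (`split_into_subtasks`, `process_subtask`, `run_parallel`) and the sum-of-squares workload are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(data, num_workers):
    """Divide data into at most num_workers contiguous chunks (hypothetical helper)."""
    chunk = max(1, -(-len(data) // num_workers))  # ceiling division, at least 1
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]


def process_subtask(chunk):
    """Work done by one 'processor' on its own sub-task: a sum of squares."""
    return sum(x * x for x in chunk)


def run_parallel(data, num_workers=4):
    """Break one task into sub-tasks, process each separately, combine the results."""
    subtasks = split_into_subtasks(data, num_workers)
    # Each worker thread stands in for a separate processor in this analogy.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_subtask, subtasks))
    return sum(partials)


if __name__ == "__main__":
    print(run_parallel(list(range(10)), num_workers=3))
```

All workers read from the same input list, which loosely mirrors the point that multiple processors may share the same data stored within the system.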
Consequently, the interface between all these various devices can be rather complex. Devices interfaced to the microprocessor need to be informed of the microprocessor's status. For example, these devices need to be informed whether the microprocessor has stopped running and, if so, why. In other instances, the microprocessor needs to send an acknowledgment that a particular signal has been asserted or that it has been synchronized to a particular signal. Furthermore, branch trace messages are used by devices external to the microprocessor to trace execution for debugging and performance monitoring purposes. Status information of this kind is conveyed to external devices by issuing special bus cycles.
In addition to the problem of issuing special bus cycles, there is a need for a mechanism to request cache and data translation lookaside buffer (DTLB) flushes. In the case of the DTLB, both single-line flushes and flushes of the entire TLB are required. Only single-line flushes are required for the data cache unit (DCU).
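The flush requirements just stated can be summarized in a small model. This is a sketch under stated assumptions rather than the mechanism itself: the `Target` and `Scope` enumerations, the `SUPPORTED` table, and the `flush` function are hypothetical names, and a dictionary stands in for the DTLB or DCU contents.

```python
from enum import Enum


class Target(Enum):
    DTLB = "dtlb"  # data translation lookaside buffer
    DCU = "dcu"    # data cache unit


class Scope(Enum):
    SINGLE_LINE = "single_line"
    ENTIRE = "entire"


# Scopes required for each unit, as stated in the text:
# the DTLB needs both kinds of flush, the DCU only single-line flushes.
SUPPORTED = {
    Target.DTLB: {Scope.SINGLE_LINE, Scope.ENTIRE},
    Target.DCU: {Scope.SINGLE_LINE},
}


def flush(entries, target, scope, line=None):
    """Apply a flush request to a dict that models the unit's contents (hypothetical)."""
    if scope not in SUPPORTED[target]:
        raise ValueError(f"{scope.value} flush is not supported for {target.value}")
    if scope is Scope.ENTIRE:
        entries.clear()  # discard every entry
        return
    if line is None:
        raise ValueError("a single-line flush needs a line address")
    entries.pop(line, None)  # discard one entry; absent lines are ignored


if __name__ == "__main__":
    dtlb = {0x1000: 0x8000, 0x2000: 0x9000}
    flush(dtlb, Target.DTLB, Scope.SINGLE_LINE, line=0x1000)
    print(dtlb)
```

Keeping the supported scopes in a table rather than in branching logic makes it straightforward to admit further operations later without restructuring the request handling.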
One method for handling the additional interfacing would be to add extra hardware. However, this solution is relatively expensive, consumes circuit board space, and is not easily expandable.
Thus, there is a need in the prior art for an apparatus and method for issuing special bus cycles without incorporating extra, dedicated hardware. It would be preferable if such an apparatus and method were to have the flexibility of accepting additional functions, such as performing particular cache and TLB functions (e.g., flush, invalidate, etc.).