1. Technical Field
The present invention relates generally to computer architecture and, more specifically, to methods for dynamically configuring bus byte lanes.
2. Description of Related Art
In typical computer systems utilizing processors, system developers desire optimization of execution software for more effective system design. Usually, studies of a program's access patterns to memory and interaction with a system's memory hierarchy are performed to determine system efficiency. Understanding the memory hierarchy behavior aids in developing algorithms that schedule and/or partition tasks, as well as distribute and structure data for optimizing the system.
Performance monitoring is often used in optimizing the use of software in a system. A performance monitor is generally regarded as a facility incorporated into a processor to monitor selected characteristics to assist in the debugging and analyzing of systems by determining a machine's state at a particular point in time. Often, the performance monitor produces information relating to the utilization of a processor's instruction execution and storage control. For example, the performance monitor can be utilized to provide information regarding the amount of time that has passed between events in a processing system. The performance monitor can also be used to provide counts of the number of occurrences of selected events in a processing system. The information produced usually guides system architects toward ways of enhancing performance of a given system or of developing improvements in the design of a new system.
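The two kinds of information described above, event counts and elapsed time between events, can be illustrated with a small software model. This is only a sketch: the class name `PerformanceMonitor`, the `record` method, and the use of a cycle number as the time base are assumptions made for illustration, and the model measures the gap between successive occurrences of the same event, which is one simple reading of "time between events".

```python
from collections import Counter


class PerformanceMonitor:
    """Toy model (hypothetical): counts events and tracks cycle gaps."""

    def __init__(self):
        self.counts = Counter()   # occurrences seen per event name
        self.last_cycle = {}      # cycle of the most recent occurrence

    def record(self, event, cycle):
        """Count one occurrence of `event` at `cycle`.

        Returns the number of cycles since the previous occurrence of
        the same event, or None if this is the first occurrence.
        """
        self.counts[event] += 1
        previous = self.last_cycle.get(event)
        self.last_cycle[event] = cycle
        return None if previous is None else cycle - previous


monitor = PerformanceMonitor()
first_gap = monitor.record("cache_miss", cycle=100)
second_gap = monitor.record("cache_miss", cycle=140)
```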
Current architectures for performance monitors utilize a method whereby all signals are simultaneously routed to the central performance monitor unit. However, this increases the chip area required to implement the performance monitor and increases the wiring congestion. Furthermore, having all signals simultaneously routed to the performance monitor unit limits the number of signals delivered to the performance monitor. Therefore, a circuit architecture for a performance monitor that decreases chip area and wiring congestion is desirable. Furthermore, it is desirable to have a performance monitor bus that can potentially provide a larger number of signals to a performance monitor unit than is possible with current performance monitor signal routing designs.
The present invention provides a byte lane selectable performance monitor bus. In a preferred embodiment, the performance monitor bus includes a plurality of byte lanes and a selection unit. The selection unit selects, from a plurality of signals, a smaller subset of these signals, which are desired to be monitored, and places this subset of signals on the byte lanes. The number of the plurality of signals that potentially may be monitored is greater than the number of byte lanes.
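The role of the selection unit can be sketched in software as follows. The function name `drive_byte_lanes`, the representation of each candidate signal as one 8-bit integer, and the per-lane select indices are assumptions made for illustration and are not taken from the embodiment.

```python
def drive_byte_lanes(signals, selects):
    """Place one selected 8-bit signal on each byte lane.

    signals: candidate 8-bit values that could be monitored.
    selects: one index into `signals` per byte lane (hypothetical
             control input standing in for the selection unit).
    """
    # The bus is useful precisely because there are more candidate
    # signals than lanes, so the sketch insists on that condition.
    if len(signals) <= len(selects):
        raise ValueError("expected more candidate signals than byte lanes")
    return [signals[index] & 0xFF for index in selects]


candidates = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66]
lanes = drive_byte_lanes(candidates, selects=[5, 0, 3, 1])
```

Here six candidate signals share four byte lanes; changing `selects` changes which signals reach the monitor without adding wires.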
In one preferred embodiment, four selection stages are utilized to select a 32-bit input for a performance monitor unit from multiple 64-bit signal groups. Each selection stage utilizes four multiplexers. The first stage of multiplexers selects four 64-bit signals from a plurality of sources. Each of the four 64-bit signals is divided into an upper 32 bits and a lower 32 bits, and each of the four second-stage multiplexers chooses either the upper or the lower 32 bits. Each 32-bit output from the second stage is divided into four 8-bit components, which are fed into a third selection stage.
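The first two selection stages and the split into 8-bit components can be modeled as below. This is a sketch under stated assumptions: the function names, the list-based select inputs, and the ordering of the four components (most significant byte first) are illustrative choices, not details given in the text.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF


def stage1(sources, selects):
    """Four multiplexers, each picking one 64-bit signal from `sources`."""
    return [sources[index] & MASK64 for index in selects]


def stage2(words64, take_upper):
    """Four multiplexers, each keeping the upper or lower 32 bits."""
    return [
        (word >> 32) & MASK32 if upper else word & MASK32
        for word, upper in zip(words64, take_upper)
    ]


def split_bytes(word32):
    """Split a 32-bit value into four 8-bit components.

    Assumption: component 0 is the most significant byte.
    """
    return [(word32 >> shift) & 0xFF for shift in (24, 16, 8, 0)]


sources = [0x0011223344556677, 0x8899AABBCCDDEEFF, 0x0123456789ABCDEF]
picked64 = stage1(sources, selects=[2, 0, 1, 0])
picked32 = stage2(picked64, take_upper=[True, False, True, False])
components = [split_bytes(word) for word in picked32]
```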
The third selection stage comprises four multiplexers. The inputs to the first multiplexer are the first 8 bits from each of the outputs of the second selection stage. The inputs to the second multiplexer are the second 8 bits from each of the outputs of the second stage. The inputs to the third and fourth multiplexers are chosen similarly. Each of the four multiplexers in the third selection stage selects one of its four inputs as an 8-bit output.
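The wiring of the third stage, in which multiplexer i sees only the i-th 8-bit component of each second-stage output, can be sketched as follows. The function name `stage3` and the nested-list representation of the second-stage outputs are assumptions made for illustration.

```python
def stage3(byte_groups, selects):
    """Model of the third selection stage.

    byte_groups: four lists, one per second-stage output, each holding
                 that output's four 8-bit components in order.
    selects:     for each of the four multiplexers, which second-stage
                 output (0 to 3) supplies its byte.
    Multiplexer `lane` can only choose among component `lane` of the
    four groups, so a byte never changes position.
    """
    return [byte_groups[source][lane] for lane, source in enumerate(selects)]


groups = [
    [0x00, 0x01, 0x02, 0x03],
    [0x10, 0x11, 0x12, 0x13],
    [0x20, 0x21, 0x22, 0x23],
    [0x30, 0x31, 0x32, 0x33],
]
stage3_out = stage3(groups, selects=[3, 0, 2, 1])
```

In the example values the high hex digit identifies the second-stage output and the low digit identifies the byte position, which makes it easy to see that each lane keeps its own position while the source varies.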
A fourth selection stage also comprises four multiplexers. These multiplexers select either the 8-bit output from the third selection stage or an 8-bit signal from the memory system. The chosen output is placed on four byte lanes, which are the input to a performance monitor unit.
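The fourth stage can be sketched as a per-lane two-way choice followed by packing the four lanes into the 32-bit value seen by the performance monitor unit. The function names, the assumption that the memory system supplies one 8-bit signal per lane, and the packing order (lane 0 as the most significant byte) are illustrative assumptions.

```python
def stage4(stage3_bytes, memory_bytes, use_memory):
    """Four two-input multiplexers, one per byte lane.

    For each lane, choose the memory-system byte when the matching
    `use_memory` flag is set, otherwise the third-stage byte.
    Assumption: the memory system provides one byte per lane.
    """
    return [
        (mem if use else sel) & 0xFF
        for sel, mem, use in zip(stage3_bytes, memory_bytes, use_memory)
    ]


def pack_lanes(lanes):
    """Combine four byte lanes into one 32-bit word, lane 0 first."""
    word = 0
    for byte in lanes:
        word = (word << 8) | (byte & 0xFF)
    return word


final_lanes = stage4(
    stage3_bytes=[0x30, 0x01, 0x22, 0x13],
    memory_bytes=[0xA0, 0xA1, 0xA2, 0xA3],
    use_memory=[False, True, False, True],
)
monitor_input = pack_lanes(final_lanes)
```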