The present invention pertains to the field of computer system bus architectures. More particularly, the present invention relates to a method and system for optimizing host buses that directly interface to 16-bit PCMCIA host bus adapters.
A bus architecture of a computer system conveys much of the information and signals involved in the computer system's operation. In a typical computer system, one or more busses are used to connect a central processing unit (CPU) to a memory and to input/output elements so that data and control signals can be readily transmitted between these different components. When the computer system executes its program, it is imperative that data and information flow as fast as possible in order to make the computer as responsive as possible to the user. With many peripheral devices, such as graphics adapters, full motion video adapters, small computer systems interface (SCSI) host bus adapters, and the like, it is imperative that large block data transfers be accomplished expeditiously. These applications are just some examples of subsystems that benefit substantially from a fast efficient bus transfer rate.
Much of a computer system's functionality and usefulness to a user is derived from the functionality of the peripheral devices. For example, the speed and responsiveness of the graphics adapter is a major factor in a computer system's usefulness as an entertainment device. Or, for example, the speed with which video files can be retrieved from a hard drive and played by the graphics adapter determines the computer system's usefulness as a training aid. Hence, the rate at which data can be transferred among the various peripheral devices often determines whether the computer system is suited for a particular purpose. The electronics industry has, over time, developed several types of bus architectures. Recently, the PCMCIA bus architecture has become one of the most widely used, widely supported bus architectures in the industry, particularly with respect to mobile computer devices such as "laptop" computers. The PCMCIA bus was developed to provide a moderate speed, low latency bus architecture from which a large variety of removable devices could be developed.
Referring now to prior art FIG. 1, a typical PCMCIA bus system 100 in accordance with the prior art is shown. As depicted in FIG. 1, system 100 includes a PCMCIA bus 115 coupled to a 16-bit host bus adapter 104 and a PCMCIA device 120. PCMCIA device 120 is coupled to PCMCIA bus 115 via a socket 116. PCMCIA device 120 is a typical removable device, such as, for example, a removable modem, hard disk drive, ethernet adapter, or the like. In a typical implementation, system 100 includes two or more sockets 116 configured to accept removable PCMCIA devices. Host bus adapter 104 is coupled to a Southbridge 103 which is in turn coupled to a Northbridge 102. The Northbridge 102 is coupled to the host x86 processor 101 via the processor local bus 110.
The PCMCIA specification encompasses two bus standards. The standards are referred to as the PC Card 16 standard and the Cardbus standard. The PC Card 16 standard is a 16-bit "ISA-like" interface (industry standard architecture) and is the most widely implemented. The Cardbus standard is more recent and closely follows the PCI (peripheral component interconnect) protocol. System 100 of FIG. 1 is a PC Card 16 compatible system.
Referring still to FIG. 1, with PC Card 16 compliant systems (e.g., system 100), the host processor 101 communicates with PCMCIA device 120 through the host bus adapter 104. The host bus adapter supports the PCMCIA standard protocols and functions as a bridge between the typical x86 architecture of system 100 and the PCMCIA compliant devices. PCMCIA HBAs (host bus adapters) are typically found in laptop computers where power, portability, and throughput are the main concerns. System 100 of FIG. 1 is one such system. Components 101-115 are located within the laptop computer, while the external PCMCIA device 120 is coupled to the laptop computer via the socket 116. Within the laptop computer, the Northbridge 102 usually contains high-speed, high-performance devices and the Southbridge 103 acts as an interface to slower peripheral devices.
It should be noted that in the architecture of system 100, the speed at which the PCMCIA devices (e.g., PCMCIA device 120) operate, and hence the activities on the HBA 104, do not impact the performance of the processor local bus 110. 16-bit PC Card devices such as PCMCIA device 120 are typically slow devices, and hence, the associated "PC-16 card" interface is typically a slow interface. Due to legacy reasons (e.g., compatibility with legacy systems), the PCMCIA cards run at ISA frequencies ranging from 4.77 MHz to 8 MHz. Based on the speed of the coupled external device (e.g., PCMCIA device 120), the HBA 104 holds the host bus (e.g., the bus between HBA 104 and Southbridge 103) or retries the host (or any other bus master) until the transaction can be completed on the PCMCIA bus 115.
With advancements in semiconductor technology, the use of embedded microprocessors inside ASICs (application-specific integrated circuits) is very common. In order to have access to the large number of PCMCIA devices on the market, it is imperative that these embedded microprocessor based ASICs support PCMCIA HBAs so that any PCMCIA device can be "plugged in" and used.
There is a problem, however, in that embedded microprocessor based systems do not typically support multiple levels of bus hierarchy. In order to minimize access latency, the bridges required to implement multiple levels of busses are eliminated. A typical implementation of a PCMCIA host bus adapter in an embedded system is illustrated in prior art FIG. 2.
Prior art FIG. 2 shows a typical embedded PCMCIA bus system 200 in accordance with the prior art. Prior art system 200 includes a host processor 201, a processor local bus 210, and a host bus adapter 204. Components 201, 210, and 204 of system 200 function in a substantially similar manner as their corresponding components 101, 110, and 104 of system 100 of FIG. 1. System 200 is an example of an embedded system implemented as a single ASIC. For example, processor 201 can be an ARM (Advanced RISC Machines) processor and the local bus 210 can be an ASB bus (Advanced System Bus).
Thus system 200 shows one problem associated with the design of host bus adapters that interface directly to the processor local bus in embedded microprocessor ASICs. For example, in the case where HBA 204 interfaces with device 120 that supports I/O transactions and houses an I/O device, HBA 204 can occupy and monopolize the processor local bus 210 while waiting for the I/O transactions with device 120 to complete. For example, assuming that the access latency of the device 120 is such that it inserts wait states on accesses from the PCMCIA HBA 204, and assuming the processor local bus 210 operates at 100 MHz while the HBA 204 runs at 8 MHz, when the host processor 201 reads or writes to an I/O device 120, the device interface asserts the WAIT# signal for 250 ns. This extends the host bus adapter's cycle to one more cycle until WAIT# is deasserted. If the host bus is held off by the HBA 204 until this transaction is complete, the entire transaction takes at least 500 ns, which translates to the processor local bus 210 not being available for other devices/masters for 50 clock cycles. This has a very significant adverse impact on performance and the throughput on the local bus 210. It should be noted that WAIT# signals can be asserted for a maximum of 12 µs. Thus, if the HBA 204 reflects the transaction on the PCMCIA bus 115, then the processor local bus 210 is unavailable to other resources for an excessively long time.
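By way of illustration only, the arithmetic of the preceding example can be sketched as follows. The function name local_bus_cycles_held is hypothetical and does not appear in the described system; the sketch simply converts a hold time in nanoseconds into processor local bus clock cycles at a given bus frequency.

```python
def local_bus_cycles_held(hold_ns: int, bus_mhz: int) -> int:
    """Return the number of local bus clock cycles that elapse during hold_ns.

    Hypothetical helper for illustration; not part of the described system.
    """
    # One clock period is 1000 / bus_mhz nanoseconds,
    # so the cycle count is hold_ns * bus_mhz / 1000.
    return hold_ns * bus_mhz // 1000


# 500 ns transaction on a 100 MHz local bus, as in the example above.
print(local_bus_cycles_held(500, 100))
# Maximum WAIT# assertion of 12 microseconds on the same bus.
print(local_bus_cycles_held(12_000, 100))
```

Under these figures, the 500 ns transaction corresponds to 50 local bus clock cycles, and a WAIT# assertion of the maximum 12 µs would correspond to 1200 clock cycles during which the local bus is unavailable.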
Referring still to prior art FIG. 2, typical solutions to this problem would include a retry mechanism by which the HBA 204 retries the processor local bus 210 until the transaction can be completed. However, simply retrying the processor local bus 210 on every access until the device 120 is ready does not improve the situation significantly. For example, every time a master on bus 210 comes back to access device 120 through HBA 204, the processor local bus 210 is still being used by the master and the HBA 204, as HBA 204 has to respond with retry cycles. Further, retry mechanisms typically include arbitration latency and additional switching activity on the processor local bus 210, thereby causing unnecessary power utilization. Additionally, in some prior art implementations, HBA 204 is configured to start a data transaction, sample the WAIT# signal to see if device 120 is ready, and then assert a response on processor local bus 210.
Thus, what is required is a method and system which minimizes the local bus monopolization problems that occur where a faster device on the processor local bus of an embedded microprocessor ASIC continually attempts to access a slower PC Card device on a PCMCIA bus. The required system should minimize local bus monopolization problems without adding additional switching activity or latency to the operation of the local bus. The required system should significantly increase the local bus data transfer bandwidth available to embedded microprocessor applications. The present invention provides a novel solution to the above requirements.
The present invention provides a method and system that minimizes the local bus monopolization problems that occur where a faster device on the processor local bus of an embedded microprocessor ASIC continually attempts to access a slower PC Card device on a PCMCIA bus. The system of the present invention minimizes local bus monopolization problems without adding additional switching activity or latency to the operation of the local bus. In addition, the system of the present invention significantly increases the local bus data transfer bandwidth available to the embedded microprocessor.
In one embodiment, the present invention is implemented as a data transaction access system for an embedded microprocessor coupled to a PCMCIA bus device. In this embodiment, the data transaction access system is integrated with the embedded microprocessor as a single ASIC. The system includes a processor local bus, at least one bus master (e.g., the microprocessor) coupled to the local bus, and a host bus adapter coupled to the local bus for enabling communication between the bus master and a PCMCIA device. The PCMCIA device is coupled to the host bus adapter via a PCMCIA bus. The bus master uses the local bus to communicate with the PCMCIA device via the host bus adapter. A wait register is coupled to the host bus adapter to receive a delay input from the PCMCIA device describing a latency period of the device when completing a data transaction. Where the latency period described by the delay input is less than a predetermined amount, the host bus adapter is configured to insert wait states into the data transaction of the bus master. Where the latency period is greater than the predetermined amount, the host bus adapter is configured to retry the data transaction of the bus master.
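The wait-or-retry decision of this embodiment can be modeled, purely as a sketch, in a few lines of software. The names Response, hba_response, latency_ns, and threshold_ns are hypothetical and chosen for illustration. The treatment of a latency exactly equal to the predetermined amount is an assumption, since the description only addresses the less-than and greater-than cases.

```python
from enum import Enum


class Response(Enum):
    """Possible host bus adapter responses (hypothetical names)."""
    WAIT = "insert wait states"
    RETRY = "retry transaction"


def hba_response(latency_ns: int, threshold_ns: int) -> Response:
    """Choose the response from the latency held in the wait register.

    latency_ns models the delay input from the PCMCIA device;
    threshold_ns models the predetermined amount.
    """
    if latency_ns < threshold_ns:
        # Short latency: hold the local bus and insert wait states.
        return Response.WAIT
    # Long latency: release the local bus and retry the bus master.
    # Assumption: a latency equal to the threshold is also retried.
    return Response.RETRY


print(hba_response(50, 100).value)
print(hba_response(250, 100).value)
```

In this sketch a short latency keeps the transaction on the local bus, while a long latency frees the local bus for other masters. The threshold value of 100 ns in the usage lines is arbitrary and is not taken from the description.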
In a second embodiment, the wait register is adapted to couple the delay input to the bus master such that the bus master initiates a subsequent access to the PCMCIA device at the expiration of the latency period. After the initial access is retried, the bus master reads the delay input stored in the wait register and attempts the subsequent access in accordance with the delay specified. In so doing, the data transaction is completed without numerous wasted retries being imposed on the processor local bus.
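As a rough illustration of why the second embodiment reduces wasted retries, the following sketch counts local bus accesses for a single data transaction under two policies. The function names and the fixed retry interval are assumptions made for illustration only. The sketch also assumes that reading the wait register costs one local bus access, which the description does not state.

```python
import math


def accesses_with_blind_retry(latency_ns: int, retry_interval_ns: int) -> int:
    """Count accesses when the master retries at a fixed interval.

    Assumption: the first access occurs at time 0 and is repeated every
    retry_interval_ns until the device latency has elapsed.
    """
    if latency_ns <= 0:
        return 1
    return 1 + math.ceil(latency_ns / retry_interval_ns)


def accesses_with_wait_register(latency_ns: int) -> int:
    """Count accesses when the master reads the delay and waits it out.

    Assumption: one retried initial access, one read of the wait register,
    and one final access after the latency period expires.
    """
    if latency_ns <= 0:
        return 1
    return 3


print(accesses_with_blind_retry(12_000, 100))
print(accesses_with_wait_register(12_000))
```

With an assumed retry interval of 100 ns and the maximum 12 µs latency, the blind policy issues 121 accesses, while the policy guided by the wait register issues 3 under the stated assumptions.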
In a third embodiment, the embedded system includes an arbiter for controlling access to the processor local bus. The arbiter is coupled to read the delay input stored in the wait register. The arbiter ensures efficient utilization of the processor local bus by not granting the local bus to the requesting bus master for a subsequent access until the expiration of the latency period. Once the latency period expires, the arbiter grants the local bus to the bus master and the bus master completes the data transaction.
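The arbiter behavior of the third embodiment can likewise be sketched in software. The class Arbiter and its methods note_retry and grant are hypothetical names. The fixed-priority ordering of requesting masters is an assumption, since the description does not specify an arbitration scheme.

```python
class Arbiter:
    """Minimal model of an arbiter that honors the wait register delay."""

    def __init__(self):
        # Maps each retried bus master to the time its latency period expires.
        self.blocked_until = {}

    def note_retry(self, master, now_ns, delay_ns):
        """Record the delay read from the wait register for a retried master."""
        self.blocked_until[master] = now_ns + delay_ns

    def grant(self, requesting, now_ns):
        """Return the first requesting master that is not being held off.

        Assumption: 'requesting' is ordered by fixed priority.
        Returns None if every requesting master is still held off.
        """
        for master in requesting:
            if now_ns >= self.blocked_until.get(master, 0):
                return master
        return None


arbiter = Arbiter()
arbiter.note_retry("cpu", now_ns=0, delay_ns=500)
print(arbiter.grant(["cpu", "dma"], now_ns=100))
print(arbiter.grant(["cpu", "dma"], now_ns=500))
```

In this usage example the retried master "cpu" is passed over at 100 ns, so the local bus goes to "dma", and "cpu" is granted the bus again once its 500 ns latency period has expired.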