The present invention relates to the field of computer networking. More particularly, the invention describes a method for accessing a network with extremely low latency using programmed I/O in a paged, multi-tasking computer.
In a networked computer system, there is often a need for information to be transmitted from application software in one computer across the network and to be received and used by application software in a different computer. During this transmission and receipt process, the information must pass through many hardware and software layers. For example, in a typical networked computer 210, as shown in FIG. 2, for information to be transmitted from the application software 211 to the network 200, it must travel through software layers such as a library 212, an operating system (O/S) 213, and a driver (in the O/S) 214, as well as other hardware 215. Similarly, in order for the application software 221 of the second computer 220 to retrieve the information, the information must travel through components such as hardware 225, a driver 224, an operating system (O/S) 223 and a library 222 before reaching the destination application software 221. Passing the information through these various hardware and software layers takes a significant amount of time, typically around 100 μs or more.
In the past, data transmission over the network was slow compared to the transmission of data from the application software to the network (as described above, this takes around 100 μs or more). However, the speed at which data can be transmitted over the network has been increasing. As the network speed becomes faster, the overhead time associated with the data traveling between the application software and the network has become proportionally greater. Therefore, decreasing this overhead time has become of increasing concern.
In order to share hardware devices among tasks on a multitasking computer, the operating system kernel is typically the only entity allowed to directly interface with them. User tasks interface with hardware devices indirectly by invoking kernel software functions. Over time, the performance of network hardware devices has increased relative to the overhead of the kernel software, to the point that this overhead prevents tasks from taking full advantage of the increasing speed of the network. In order to provide shared access to a single network hardware device from a number of tasks, certain problems must be overcome. First, a message from one task must not interfere with a message from another task. Second, tasks must not be able to receive into or send out of memory regions that are not their own. Third, on a computer which supports paged virtual memory, tasks must not be able to receive into or send out of virtual memory regions that are not currently resident in physical memory.
Solutions to these problems have been proposed for network hardware devices which employ DMA (Direct Memory Access). FIG. 3 shows one known method of decreasing the overhead time using a DMA transfer. Using DMA, a device (such as hardware, floppy disk drive, CD-ROM, etc.) can transfer data directly to the computer's memory 310, thereby bypassing the CPU 340. In general, DMA is performed by a specialized processor (a DMA controller 330) that transfers data between memory 310 and an I/O device 360 while the CPU 340 goes on with other tasks. Thus, the DMA controller 330 is external to the CPU 340 and must act as a master on the bus. To use DMA, a program only needs to tell the DMA controller 330 how many bytes should be transferred (length) 332, the address location to transfer from (source address) 333, and the address location to transfer to (destination address) 334. The DMA controller 330 then retrieves the information, or message, out of the memory 310.
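The three parameters handed to the DMA controller can be illustrated with a small simulation. This is only a sketch: the names DmaDescriptor and dma_transfer are hypothetical, and a single bytearray stands in for both the memory 310 and the device side of the transfer.

```python
from dataclasses import dataclass


@dataclass
class DmaDescriptor:
    """Hypothetical descriptor holding the three values a program supplies."""
    length: int       # number of bytes to transfer (length 332)
    source: int       # address to copy from (source address 333)
    destination: int  # address to copy to (destination address 334)


def dma_transfer(memory: bytearray, desc: DmaDescriptor) -> None:
    """Stand-in for the controller: copy desc.length bytes within `memory`."""
    src_end = desc.source + desc.length
    dst_end = desc.destination + desc.length
    if src_end > len(memory) or dst_end > len(memory):
        raise ValueError("transfer runs past the end of memory")
    # The right-hand slice is copied before assignment, so overlap is safe.
    memory[desc.destination:dst_end] = memory[desc.source:src_end]


# Assumed example layout: the message sits at address 0 and the
# "device" region starts at address 32.
memory = bytearray(64)
memory[0:5] = b"hello"
dma_transfer(memory, DmaDescriptor(length=5, source=0, destination=32))
```

In this model the program does no per-byte work once the descriptor is filled in, which is the property that lets the CPU continue with other tasks.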
One issue in using DMA is whether the DMA controller 330 should transfer data using virtual addresses or physical addresses. If the DMA uses physically mapped I/O, then transferring a buffer that is larger than one page will cause a problem because the pages in the buffer will not usually be mapped to sequential pages in physical memory. In addition, suppose a DMA transfer is ongoing between memory and a frame buffer, and the operating system removes some of the pages from memory, or relocates them. The DMA would then be transferring data to or from the wrong page of memory. A typical solution to these problems is to use a virtual DMA controller 331. A virtual DMA controller 331 allows the use of virtual addresses that are mapped to physical addresses during the DMA. Thus, a buffer must be sequential in virtual memory, but its pages can be scattered in physical memory. With virtual DMA, the operating system can update the address tables of the DMA controller if a process is moved, or the operating system can “lock” the pages in memory until the DMA is complete. However, keeping the DMA address tables up to date is quite difficult. Therefore, one problem with using DMA is that it is quite complicated and requires a great deal of hardware support. In addition, the DMA controller in a computer is usually inflexible and slow. Therefore, a simpler and faster solution is desirable.
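The difference between a buffer that is sequential in virtual memory and its scattered physical pages can be made concrete with a short sketch. The 4096-byte page size, the dictionary page table, and the function name physical_segments are all assumptions chosen for illustration; they are not taken from the virtual DMA controller 331 itself.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def physical_segments(page_table, vaddr, length):
    """Split a virtually contiguous buffer into (physical_address, size) runs.

    `page_table` maps virtual page numbers to physical page numbers.
    A missing entry models a page that is not resident in physical memory.
    """
    segments = []
    while length > 0:
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage not in page_table:
            raise LookupError(f"virtual page {vpage} is not resident")
        # Never cross a page boundary within one segment.
        chunk = min(length, PAGE_SIZE - offset)
        segments.append((page_table[vpage] * PAGE_SIZE + offset, chunk))
        vaddr += chunk
        length -= chunk
    return segments


# Virtual pages 0 and 1 are adjacent, but map to physical pages 7 and 2.
page_table = {0: 7, 1: 2}
segments = physical_segments(page_table, vaddr=4000, length=200)
# segments == [(32672, 96), (8192, 104)]
```

A 200-byte buffer that straddles a page boundary becomes two physical runs that are nowhere near each other, and every entry in the table must stay correct for the whole duration of the transfer. That bookkeeping is the burden described above.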
FIG. 4 shows another known method of decreasing the overhead time. This method is called programmed I/O, and is often used in single-tasking computers. In this method, the application software 411 bypasses software layers, such as the library 412, O/S 413, and the driver 414, and sends the data directly to the hardware 415, using the CPU (not shown) for data transfers. In a system using programmed I/O, the application software 411 essentially pushes the message to be sent directly into the hardware 415. This method requires that the CPU (not shown) first check whether the I/O port needing a data transfer has the data ready. If the I/O port is ready, then the data is transferred to the memory. One advantage of programmed I/O over DMA is that it is not necessary to keep track of virtual and physical memory locations. In programmed I/O, since the CPU is used, the synchronization required in DMA is not necessary, and the implementation is therefore simpler. If an application attempts to retrieve data out of memory belonging to another task, the CPU's built-in safety measures, such as an address fault, come into play.
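The check-then-transfer pattern of programmed I/O can be sketched with a simulated port. The SimulatedPort class, its FIFO capacity, and the pio_send function are hypothetical stand-ins for the hardware 415; on real hardware the ready check would be a read of a device status register, and the device would drain itself.

```python
class SimulatedPort:
    """Hypothetical I/O port with a small FIFO (not a real device API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.fifo = bytearray()

    def ready(self):
        # The port is ready while its FIFO has room for another byte.
        return len(self.fifo) < self.capacity

    def write_byte(self, value):
        if not self.ready():
            raise BufferError("port written while not ready")
        self.fifo.append(value)

    def drain(self):
        # Models the hardware emptying its FIFO onto the network.
        sent = bytes(self.fifo)
        self.fifo.clear()
        return sent


def pio_send(port, message):
    """Push `message` into the port byte by byte, using the CPU for each byte."""
    wire = bytearray()
    for value in message:
        while not port.ready():       # the CPU checks the port status first
            wire += port.drain()      # simulation only: let the device progress
        port.write_byte(value)        # then the CPU moves the byte itself
    wire += port.drain()
    return bytes(wire)


received = pio_send(SimulatedPort(capacity=4), b"programmed I/O")
```

Because the CPU performs every access, each read of the message goes through the normal address translation and protection checks, so no separate address table has to be kept in step with the operating system.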
In addition, programmed I/O is typically faster than DMA, as it reduces the overhead time needed to get from a user task to the network, but it has been limited to single-tasking computers. This is because, in multi-tasking computers, an application runs for a certain quantum of time and is then swapped out while another application runs. This typically occurs about ten times per second. When an application is in the middle of sending a message and this asynchronous halt occurs, there needs to be some way of dealing with the partially sent message.
A method is described for sending a message from an application in one networked multi-tasking computer to an application in another networked multi-tasking computer using programmed I/O. A communication link is first established between the two applications. When the communication link is available, the hardware associated with the sending application receives bytes of information until the application has been swapped out by the operating system or until the entire message has been received. If the entire message has been received, then it is sent to the other application. However, when the application is swapped out, the hardware sends the portion of the message that has already been received to the other application and continues retrieving information when the application is swapped back in.
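The behavior on swap-out can be sketched as follows. This is an illustrative simulation under stated assumptions, not the claimed implementation: the function name send_with_preemption is hypothetical, and the swap-out is modeled as occurring after a fixed number of bytes (bytes_per_quantum), whereas a real swap-out is asynchronous and driven by the operating system.

```python
def send_with_preemption(message, bytes_per_quantum):
    """Return the list of portions the simulated hardware sends.

    The hardware accumulates bytes pushed by the application. When the
    application is swapped out (modeled as every `bytes_per_quantum`
    bytes), the portion received so far is sent; the rest follows once
    the application is swapped back in.
    """
    if bytes_per_quantum < 1:
        raise ValueError("bytes_per_quantum must be at least 1")
    portions = []
    staged = bytearray()
    for index, value in enumerate(message):
        staged.append(value)
        swapped_out = (index + 1) % bytes_per_quantum == 0
        message_complete = index == len(message) - 1
        if swapped_out or message_complete:
            portions.append(bytes(staged))  # send what has been received
            staged.clear()
    return portions


portions = send_with_preemption(b"hello world", bytes_per_quantum=4)
# portions == [b"hell", b"o wo", b"rld"]
```

The receiving side can reassemble the message by concatenating the portions in order. A message that fits within one quantum is sent as a single portion.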