The present invention relates to transferring data files between geographically separated computers, and, in particular, to implementing a Parallel File Transfer Protocol to transfer such files more rapidly than current methods permit.
The prevailing method by which users transport information between geographically separated computers is to use the File Transfer Protocol ("FTP") to send the information over the Internet. This method, however, transfers large data files poorly across long distances over the Internet because of the small window size and the propagation delay.
Of the many protocols available for transferring files, TCP/IP (which stands for Transmission Control Protocol/Internet Protocol) is the one most widely accepted for the Internet. TCP ensures reliable transfers by transmitting data in separate packets. Each packet contains no more than 64 kilobytes ("KB"). A "sliding window" or "handshake" protocol partitions the transmission into three distinct phases. The first phase represents data ready to be sent. The second phase represents data that is either in transit or has arrived but has not yet been acknowledged. The third phase represents data that has arrived successfully and has been acknowledged (see A. S. Tanenbaum, Computer Networks, Englewood Cliffs, N.J., Prentice Hall, at 429, and J. Postel and J. Reynolds, "File Transfer Protocol (FTP), RFC 959", USC ISI, October 1988). Thus, for example, a 256 KB file is broken into four 64 KB packets, each of which passes sequentially through all three phases.
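The three phases described above can be illustrated with a minimal sketch. It assumes the 64-kilobyte packet size stated in the text and a strictly sequential sender; the names `Phase`, `split_into_packets`, and `send_in_lock_step` are hypothetical and chosen only for this illustration, and no real network traffic is involved.

```python
from enum import Enum

PACKET_SIZE = 64 * 1024  # 64 kilobytes per packet, as stated in the text


class Phase(Enum):
    READY = 1         # data ready to be sent
    IN_FLIGHT = 2     # sent, or arrived but not yet acknowledged
    ACKNOWLEDGED = 3  # arrived successfully and acknowledged


def split_into_packets(data: bytes, packet_size: int = PACKET_SIZE) -> list:
    """Break the data into consecutive packets of at most packet_size bytes."""
    return [data[i:i + packet_size] for i in range(0, len(data), packet_size)]


def send_in_lock_step(data: bytes) -> list:
    """Record the phase of every packet after each state change.

    Hypothetical model: each packet moves READY -> IN_FLIGHT -> ACKNOWLEDGED
    before the next packet is allowed to leave the READY phase.
    """
    packets = split_into_packets(data)
    phases = [Phase.READY] * len(packets)
    history = []
    for i in range(len(packets)):
        phases[i] = Phase.IN_FLIGHT
        history.append(tuple(phases))
        phases[i] = Phase.ACKNOWLEDGED
        history.append(tuple(phases))
    return history


packets = split_into_packets(bytes(256 * 1024))  # a 256-kilobyte file
history = send_in_lock_step(bytes(256 * 1024))
print(len(packets), "packets,", len(history), "state changes")
```

In this model a 256-kilobyte file yields four packets and eight state changes, two per packet, which mirrors the sequential passage through the phases described above.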
This mode supports transfers of small files. Though it transfers large files successfully between sites in close proximity, it degrades when transporting large files over long distances. The drop in throughput results from two limitations. First, the bandwidth on the lines is limited. Second, the "pipe" between sites is not fully utilized: the sending machine could transmit much more data in the time it loses waiting for acknowledgments. These acknowledgments are checksums returned from the receiver to verify correct delivery of each packet sent.
An ideal transmission would stream data, error free, so that it arrives as quickly as it is sent, keeping the line fully occupied. In reality, however, much transfer time is wasted, because Internet errors dictate that the operation must transfer the data in very small packets sent in lock step. Each packet has to wait until the previous packet has been transmitted successfully and the successful transmission acknowledged. Thus the transfer is drastically slowed down.
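The cost of waiting for acknowledgments can be estimated with a short calculation. If at most one window of data can be outstanding per round trip, throughput cannot exceed the window size divided by the round-trip time. The sketch below applies that bound; the 70-millisecond round-trip time and the 45 Mbit/s line rate are assumed figures chosen for illustration, not values taken from the text, and the function names are hypothetical.

```python
def window_limited_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound, in bits per second, when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


def line_utilization(window_bytes: int, rtt_seconds: float,
                     line_rate_bps: float) -> float:
    """Fraction of the line rate that the window-limited sender can use."""
    bound = window_limited_throughput(window_bytes, rtt_seconds)
    return min(1.0, bound / line_rate_bps)


WINDOW_BYTES = 64 * 1024    # 64 kilobytes, as in the text
RTT_SECONDS = 0.070         # assumed long-distance round-trip time
LINE_RATE_BPS = 45_000_000  # assumed line rate

bound = window_limited_throughput(WINDOW_BYTES, RTT_SECONDS)
used = line_utilization(WINDOW_BYTES, RTT_SECONDS, LINE_RATE_BPS)
print(f"bound: {bound / 1e6:.2f} Mbit/s, utilization: {used:.1%}")
```

Under these assumed figures the sender is limited to roughly 7.5 Mbit/s, about one sixth of the line, however fast the line itself may be. Over a short link with a round-trip time of a few milliseconds the same window would fill the line, which is consistent with the observation that the degradation appears only over long distances.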
Two prior-art implementations have tried to solve these problems. Multiple File Transfer Protocol ("MFTP"), by Iannucci and Lekashman at NASA, provides (1) multiple connections between the client and the server and (2) variable window sizes. This solution, available only by request, requires the server to have the MFTP daemon running continuously. Users must also execute MFTP "clients" to transport data to or from the server. Though the client can be compiled and executed by any system user, the server must be compiled and installed by a user who has system root privileges. As the MFTP server is an add-on to Unix Internet services, it requires that the Unix operating system be installed and running on the server.
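The general idea of using multiple connections can be sketched as dividing a file into contiguous byte ranges, one per connection, so that each connection has its own window in flight. The sketch below shows only that partitioning step. It is an illustration under stated assumptions, not a description of how MFTP itself divides the work, and the function name `partition_ranges` is hypothetical.

```python
def partition_ranges(file_size: int, connections: int) -> list:
    """Split [0, file_size) into contiguous half-open byte ranges.

    Returns one (start, end) pair per connection. Range lengths differ by
    at most one byte, and the ranges together cover the file exactly once.
    """
    if connections < 1:
        raise ValueError("at least one connection is required")
    base, extra = divmod(file_size, connections)
    ranges = []
    start = 0
    for i in range(connections):
        # The first `extra` connections each carry one additional byte.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


# Example: a 256-kilobyte file shared among four connections.
for start, end in partition_ranges(256 * 1024, 4):
    print(f"bytes {start}..{end - 1}")
```

With four connections each range is 64 kilobytes long, so four windows can be outstanding at once instead of one.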
A second solution to the problems of transferring large files over long distances is Starburst Multicast File Transfer Protocol™ ("SMFTP") from Starburst Communications. SMFTP provides both one-to-one and one-to-many (multicast) transmissions. SMFTP has two design differences from MFTP:
1. The transmission process is offloaded to a Multicast server that replicates and transmits the data.
2. The transmission window is extended to the entire length of the file. Error correction follows transmission of the complete file.
Like MFTP, SMFTP also requires a server to be installed (as root), along with client programs for the users who wish to transport data to or from the server.
Neither of these prior-art solutions is satisfactory. MFTP requires that an MFTP server exist on any machine from which a user wishes to transport a file. Moreover, MFTP is neither standardized nor embraced by vendors of Unix workstations. Until MFTP is standardized and accepted by the Unix community, it cannot be a general solution to the file-transfer problem, because it must be "root" installed into a system's Internet services. Any security-conscious network administrator will object to the installation of non-standardized third-party Internet services software for fear of hidden or unknown entry points that can compromise the system.
SMFTP also requires "root" installation into a system's Internet services. Thus both these prior-art solutions require that a separate server be installed at every site from or to which the user wishes to transfer data. Therefore they are acceptable only to a distributed commercial organization willing to make such an installation. Neither offers an acceptable solution for the mainstream of users, since the number of public sites running either MFTP or SMFTP servers is almost negligible.
Thus there exists a need for a solution to the problem of transferring large files over long distances that does not require a proprietary installation.