The present invention relates generally to data processing systems, and in particular, to bulk data distributions within networked data processing systems.
As network systems increase in size, efficient data transfer is becoming more important to the effective use of networking resources. Presently, when data is being transferred through the network, an interruption may occur during the transfer. Such an interruption might have a minimal effect on the system as a whole, especially if the data packet being transferred is small. However, if the transfer is a large bulk transfer of data through the network, the interruption could have a great effect on system resources.
This occurs because, during a data transfer within a network, there is presently no method of recording or keeping track of the amount of data that has been successfully received by the recipient. Thus, when an interruption occurs during the transfer, the sending machine has no method of knowing what portions of the transfer were received and where to properly restart the transmission once the interruption is cleared. Hence, the sending machine must assume that the recipient machine either failed to receive any portion of the transmission, or was unable to store any of the data packet that was being transferred prior to the interruption. Therefore, the source system must restart the data transmission at the beginning. If most of the data transfer was complete prior to the interruption, this might mean repeating a large portion of the data packet transfer that had previously been successfully received, albeit unrecorded.
As can be seen, if a large bulk data transmission is interrupted near the end of the transmission, the sender effectively sends the transmission twice to the same endpoint user upon recommencing the transmission. Multiplied over a large network, the inefficiencies due to interruptions to the network can cause significant slowdowns to either the transfer or the network itself.
Hence, a need exists in the art for a system that, upon an interruption to a data transfer, automatically resumes from the last data point transmitted, rather than at the beginning of the data packet.
The aforementioned needs are addressed by the present invention. Accordingly, there is provided a method of maintaining a consistent record of the data portions that are transferred from a sender and received by the endpoint machine. Thus, when a distribution is interrupted due to a network failure, machine reboot, power failure, or the like, the distribution is automatically resumed from the last data portion successfully transferred and stored on the receiver.
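By way of illustration only, resumption from the last stored data portion might be sketched as follows. This is a minimal in-memory sketch and not the claimed implementation: the names `Receiver`, `send_from`, `distribute`, and the `fail_after` parameter are hypothetical and exist only to simulate an interruption.

```python
def send_from(source: bytes, start_offset: int, portion_size: int):
    """Yield (offset, portion) pairs, beginning at start_offset."""
    for offset in range(start_offset, len(source), portion_size):
        yield offset, source[offset:offset + portion_size]


class Receiver:
    """Hypothetical endpoint that records how much data it has stored."""

    def __init__(self):
        self.stored = bytearray()  # stands in for data stored on the receiver

    @property
    def received_bytes(self) -> int:
        # The record of successfully received data is simply the stored length.
        return len(self.stored)

    def accept(self, offset: int, portion: bytes) -> None:
        # Reject a portion that does not continue exactly where storage ends.
        if offset != len(self.stored):
            raise ValueError("portion does not follow the last stored portion")
        self.stored += portion


def distribute(source: bytes, receiver: Receiver, portion_size: int,
               fail_after=None) -> int:
    """Send the portions the receiver has not yet recorded.

    fail_after is an assumption of this sketch: it simulates an interruption
    after that many portions have been sent in this attempt.
    Returns the number of portions sent in this attempt.
    """
    sent = 0
    start = receiver.received_bytes  # resume point reported by the receiver
    for offset, portion in send_from(source, start, portion_size):
        if fail_after is not None and sent == fail_after:
            raise ConnectionError("simulated interruption")
        receiver.accept(offset, portion)
        sent += 1
    return sent
```

In this sketch, if a 100-byte distribution sent in 10-byte portions is interrupted after seven portions, a second call to `distribute` with the same `Receiver` sends only the remaining three portions.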
After the end checkpoint is reached for the data portion, the receiving system flushes the file buffers associated with the current distribution to a nonvolatile storage medium. For each repeater and endpoint, it is possible to preselect the amount of data transferred between two checkpoints. The interval between checkpoints may be configured in accordance with the type of connection between an endpoint and the network. There may be a tradeoff between the time spent flushing disk buffers versus the time spent receiving data.
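The checkpoint behavior described above might be sketched, under stated assumptions, as follows. The class `CheckpointedWriter`, the `.ckpt` sidecar file, and the use of `os.fsync` to force data to nonvolatile storage are assumptions of this sketch; the source does not specify how the checkpoint record is kept.

```python
import os


class CheckpointedWriter:
    """Hypothetical receiver-side writer that flushes at each checkpoint."""

    def __init__(self, data_path: str, checkpoint_interval: int):
        self.data_path = data_path
        self.ckpt_path = data_path + ".ckpt"  # assumed sidecar record
        self.interval = checkpoint_interval   # bytes between two checkpoints
        exists = os.path.exists(data_path)
        # Without a data file there is nothing to resume from.
        self.committed = self._load_checkpoint() if exists else 0
        self.f = open(data_path, "r+b" if exists else "w+b")
        # Discard anything written after the last checkpoint; it was never
        # recorded as safely stored.
        self.f.truncate(self.committed)
        self.f.seek(self.committed)
        self.since_checkpoint = 0

    def _load_checkpoint(self) -> int:
        try:
            with open(self.ckpt_path) as fh:
                return int(fh.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def write(self, portion: bytes) -> None:
        self.f.write(portion)
        self.since_checkpoint += len(portion)
        if self.since_checkpoint >= self.interval:
            self.checkpoint()

    def checkpoint(self) -> None:
        # Flush file buffers for the distribution to nonvolatile storage.
        self.f.flush()
        os.fsync(self.f.fileno())
        self.committed = self.f.tell()
        # Record the checkpoint, replacing the old record atomically.
        tmp = self.ckpt_path + ".tmp"
        with open(tmp, "w") as fh:
            fh.write(str(self.committed))
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, self.ckpt_path)
        self.since_checkpoint = 0

    def close(self) -> None:
        self.f.close()
```

A smaller `checkpoint_interval` means less data is resent after an interruption but more time is spent in `os.fsync`; a larger interval reverses that balance, which is the tradeoff noted above. On reopening the same path, `committed` gives the offset from which the sender would resume.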
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention.