1. Technical Field
The present invention is generally related to network data transfers and in particular to multicast network data transfer in a Trivial File Transfer Protocol (TFTP) environment. Still more particularly, the present invention is related to a method and system for tracking missing packets in a multicast TFTP network environment.
2. Description of the Related Art
Data communications involves the transfer of data from one or more originating data sources (sender/server) to one or more destination data receivers (receiving client). The transfer is accomplished via one or more data links between the origination data sources and the destination data receivers according to a communication protocol. Thus, a data communications network comprises two or more communicating entities (e.g., origination data sources and destination data receivers) interconnected over one or more data links.
In data communications networks, messages or files are usually split into small packets to conform to the communications protocol being utilized. For typical message traffic between digital processing sender-receiver pairs in the network, an average sized message may be split into many smaller packets, and these packets are transmitted and then reassembled in the proper order at the receiving end. To keep track of the packets in such a transmission system, the packets are assigned sequence numbers at some point at the transmitting end, either by the host computer or sending terminal, or by a protocol processor downstream of the host. At the receiving end, the receiving processor keeps track of the sequence numbers and correlates incoming packet traffic by sequence numbers. The transfer system thus has to keep track of sequence numbers in a large number space (e.g., a 32-bit sequence number) so that it can track when the entire message has been transmitted and received or when packets are lost during transmission.
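The splitting and reassembly described above can be illustrated with a minimal sketch. The function names, the 8-byte payload size, and the numbering from 1 are assumptions chosen for illustration and are not taken from any particular protocol implementation.

```python
import random

def split_into_packets(message: bytes, payload_size: int):
    """Split a message into (sequence_number, payload) pairs, numbered from 1."""
    offsets = range(0, len(message), payload_size)
    return [(seq, message[off:off + payload_size])
            for seq, off in enumerate(offsets, start=1)]

def reassemble(packets):
    """Rebuild the message by ordering the payloads on their sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)

message = b"The quick brown fox jumps over the lazy dog"
packets = split_into_packets(message, payload_size=8)
random.shuffle(packets)            # simulate out-of-order arrival
restored = reassemble(packets)
print(restored == message)
```

Because each packet carries its own sequence number, the receiving end can restore the original order regardless of arrival order.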
Traditional network transfers typically involved a single sender (server) transmitting a file to a single receiving client. This is referred to as a unicast transfer. One quickly evolving form of file transfer in the new Internet-based networks is that of multicast transfer. With a multicast transfer, a single server is able to send one copy of each packet simultaneously to a group of (i.e., more than one) receiving clients. A multicast packet is a packet that is replicated and sent to multiple destination network ports. This replicated packet can be an exact duplicate of the original packet, a modified form of the original, or a combination of these.
The Trivial File Transfer Protocol (TFTP) is commonly utilized to support multicast transfer. TFTP is a simple Internet protocol utilized for transferring files where user authentication and directory visibility are not required. When transferring data utilizing TFTP, the packet or block of data is sent utilizing the User Datagram Protocol (UDP) (rather than the Transmission Control Protocol), and the sender then waits for an acknowledgment (ACK) from the receiving client. When an acknowledgment is received by the sender, the next packet/block of data is sent.
As a default with TFTP, the file is sent in fixed length blocks of 512 bytes. Each data packet contains one block of data. Receipt of a data packet of less than 512 bytes signals termination of a transfer. If a packet gets lost in the network, the intended recipient will time out and may retransmit its last packet (which may be data or an acknowledgment), thus causing the sender of the lost packet to retransmit that lost packet. The sender has to keep just one packet on hand for retransmission, since the lock step guarantees that all older packets have been received by the controlling client. The protocol's use of fixed length blocks makes allocation straightforward, and the lock step acknowledgment provides flow control and eliminates the need for the controlling client to reorder incoming data packets.
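The lock step behavior described above can be simulated with a short sketch. This is only a model under stated assumptions: no network is involved, the name lost_attempts is a hypothetical parameter listing which send attempts are dropped, and a file whose size is an exact multiple of 512 bytes is assumed to end with an empty block so that the short final block still signals termination.

```python
BLOCK_SIZE = 512  # default TFTP block size in bytes

def make_blocks(data: bytes) -> list:
    """Split data into 512-byte blocks, ending with a block shorter than 512 bytes."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if not blocks or len(blocks[-1]) == BLOCK_SIZE:
        blocks.append(b"")  # empty final block marks the end of the transfer
    return blocks

def lock_step_transfer(data: bytes, lost_attempts=frozenset()):
    """Send one block at a time; resend the same block until it is acknowledged.

    lost_attempts (hypothetical) holds the 1-based numbers of the send
    attempts that are lost and therefore end in a time-out.
    """
    received = bytearray()
    attempts = 0
    for block in make_blocks(data):
        acked = False
        while not acked:
            attempts += 1
            if attempts in lost_attempts:
                continue          # time-out: the same block is retransmitted
            received += block     # receiver stores the block and sends an ACK
            acked = True
    return bytes(received), attempts

data = bytes(range(256)) * 5 + b"xyz"          # 1283 bytes: blocks of 512, 512, 259
result, attempts = lock_step_transfer(data, lost_attempts={2})
print(result == data, attempts)
```

Only the block currently in flight is ever resent, which is why the sender needs to hold just one packet for retransmission.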
When a network client wishes to initiate a multicast transmission, the receiving client sends a first request with the multicast option/tag to the expected sender (the server). The server returns an acknowledgment and then begins to transmit the packets in sequential order. The first packet is transmitted and, when it is received by the recipient client, the recipient client generates an ACK that is transmitted to the sender. Receipt of the ACK by the sender enables the sender to then transmit the next (second) packet. In a multicast environment, several clients may simultaneously listen in on a session and receive a copy of the packets being transmitted. Typically the first client to submit a request to begin transmission, referred to as the Master Client, is responsible for sending ACKs. The other clients may commence listening at any time during the transmission. Upon detection of a last packet (i.e., a packet with fewer than 512 bytes) or occurrence of a time-out while waiting to receive the next packet, a new master client will re-open the file transfer, request specific packets that are missing, and begin sending ACKs for the packets received until it has received all the packets. Any subsequent client can start receiving blocks of a file during a transfer and then request any missing blocks when that client becomes the master client. Once the client no longer hears packets being transferred during a time-out period, the client re-opens the session and assumes the role of the new master client by requesting missed packets.
Thus, when transmitting a file in multicast, one receiving client operates as a controlling client or master. This is typically the client that issued the request to the sender to begin transferring the file. Any number of clients may simultaneously listen in on a transfer and receive a copy of the packet(s) being transmitted. Each client receives the packet; however, only the controlling client may send an ACK indicating the correct receipt of the packet. Thus, although the other clients are listening in, they are at the mercy of the controlling client. Each of the other clients may also start (and stop) listening in at a different time and thus may not receive all parts of an initial transmission. Additionally, if one of the other clients does not receive the packet (i.e., the packet is lost), that client cannot immediately re-request the packet, because it is not the controlling client. That is, the time-out condition described above occurs only for the controlling client.
When the packet is received, the client tracks the packet utilizing the sequence number included within the packet's header. This information is utilized to place the packet in its correct location in the file space so that the receiving client can track when it has received all the packets of a file and reassemble the file. To accommodate the tracking operation, current client systems are provided a 64 Kbit tracking array within their memory subsystems. When a packet is received, the sequence number is read from the packet's header and the location in the array space corresponding to the packet's sequence number is set to indicate receipt of the specific packet (e.g., the bit is set to 1).
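A one-bit-per-packet tracking array of the kind described above may be sketched as follows. The class and method names are hypothetical and are used for illustration only; the sketch assumes that the sequence number indexes the array directly.

```python
class ReceiptBitmap:
    """Tracks packet receipt with one bit per sequence number."""

    def __init__(self, capacity: int = 64 * 1024):
        self.capacity = capacity
        self.bits = bytearray((capacity + 7) // 8)   # 64 Kbit occupies 8192 bytes

    def mark_received(self, seq: int) -> None:
        """Set the bit for this sequence number to 1."""
        self.bits[seq >> 3] |= 1 << (seq & 7)

    def is_received(self, seq: int) -> bool:
        """Return True if the bit for this sequence number is 1."""
        return bool(self.bits[seq >> 3] & (1 << (seq & 7)))

bitmap = ReceiptBitmap()
for seq in (1, 2, 4, 65535):
    bitmap.mark_received(seq)
print(bitmap.is_received(2), bitmap.is_received(3), len(bitmap.bits))
```

Each sequence number maps to byte `seq >> 3` and bit `seq & 7`, so marking or testing a packet is a constant-time operation.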
After the file transfer is completed for the controlling client, each empty space in the array (i.e., a space with bit value 0) of the other receiving clients indicates that the corresponding packet identified by the sequence number has not been received. These empty spaces may result from the other clients beginning to listen in after the start of the transmission, ending their listening before the end of the transmission, or losing packets during transmission.
When the first controlling client ends its transmission, another receiving client then assumes the role of the controlling client and begins to request those packets that were not received, as indicated by the holes in the array of the new controlling client. The controlling client thus has to re-request the transfer of only these packets by referencing the array to determine which sequence numbers are missing. Since only the controlling client may re-request transfer of packets not received, only packets lost by a current controlling client are re-requested, and the other receiving clients have to wait their turn to gain control of the transmission.
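The way a new controlling client might derive its re-request list from the holes in its array can be sketched as follows. For readability the sketch assumes one flag byte per packet rather than one bit, and the function name and the 20-packet example are hypothetical.

```python
def missing_ranges(received: bytearray, first_seq: int, last_seq: int):
    """Return inclusive (start, end) runs of sequence numbers whose flag is 0."""
    ranges = []
    run_start = None
    for seq in range(first_seq, last_seq + 1):
        if received[seq]:
            if run_start is not None:          # a run of holes has just ended
                ranges.append((run_start, seq - 1))
                run_start = None
        elif run_start is None:                # a new run of holes begins
            run_start = seq
    if run_start is not None:                  # holes extend to the last packet
        ranges.append((run_start, last_seq))
    return ranges

# A client that joined at packet 5 of a 20-packet file and lost 9, 10, and 15.
received = bytearray(21)                       # index 0 is unused
for seq in range(5, 21):
    if seq not in (9, 10, 15):
        received[seq] = 1
to_request = missing_ranges(received, 1, 20)
print(to_request)
```

The result lists both the packets sent before the client began listening and the packets lost in transit, since the array does not distinguish between the two causes.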
Traditionally, multicast network transmission systems supported 64K packets. That is, the total number of packets within a file was typically smaller than 64K, and the sequence number for each packet ranged in value from 1 to 64K−1. Standard protocol operation provides that each packet carries 512 bytes of file content. Thus, each packet's data block is 512 bytes (half of a KByte) and the sequence number space within the header is 16 bits. Current networks can therefore support transfer of a file of up to 32 MBytes (a 500+ MByte file may be transmitted with larger block sizes). However, with today's increased utilization of file transfers and the proliferation of increasingly large files (e.g., image files) much larger than 32 MB, a need has developed to be able to transmit files in the gigabyte range. Creating larger packet sizes has been suggested to handle much larger files; however, many network routers can only handle a maximum transmission unit size of 1500 bytes. This forces the routers of the origination terminal to split the large packets into sub-packets before issuing them to the network. These sub-packets are then rejoined at the destination terminals. This process causes performance degradation, since fragmenting and recombining the packets requires additional processing, which takes time. Also, portions of each packet may be routed along different network paths and may arrive later than other portions or be lost during transmission.
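The file size limits quoted above follow from multiplying the number of available sequence numbers by the block size. The sketch below works through that arithmetic; the function name is hypothetical, the 8192-byte block size is only an assumed example of a larger block, and the result is an approximate upper bound because it counts all 2^16 sequence values even though the numbering described above runs from 1 to 64K−1.

```python
def max_file_bytes(block_size: int, seq_bits: int) -> int:
    """Approximate largest file: one block per available sequence number."""
    return (2 ** seq_bits) * block_size

MIB = 2 ** 20

print(max_file_bytes(512, 16) // MIB)    # 16-bit numbers, 512-byte blocks, in MBytes
print(max_file_bytes(8192, 16) // MIB)   # 16-bit numbers, assumed 8192-byte blocks
print(max_file_bytes(512, 32) // MIB)    # 32-bit numbers, 512-byte blocks
```

With 512-byte blocks the 16-bit limit is 32 MBytes, and widening the sequence number rather than the block raises the limit without exceeding the 1500-byte maximum transmission unit.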
Thus, in order to efficiently complete such a transfer without requiring the fragmentation and recombining of packets, a much larger number of packets, each having a unique sequence number, is required. To accommodate this larger number of packets, the size of the packet identification (sequence) number within the packet header has been expanded from 16 bits to 32 bits. With a 32-bit identifier, more than 4 billion packets can be uniquely identified.
For files that contain fewer than 2^16 packets (i.e., files smaller than 32 MB), the current 64 Kbit array is large enough to track every packet. However, with a file containing 4 billion packets, a 500 MB+ memory/storage space is required. This is inherently prohibitive in terms of hardware, real estate, and cost.
In the original Intel PXE MTFTP SDK implementation, which supported a maximum of 16384 packets, each packet was tracked individually in an array of 2048 bytes, mt-pkg. In this implementation, the transfer identifiers (sequence numbers) utilized by the software developer's kit (SDK) of TFTP are between 0 and 65,535 (i.e., 2^16−1 or 64K−1). The IBM LCCM MTFTP implementation, however, introduces 32-bit packet sequence numbers, so support for up to 4,294,967,296 packets is required. If the same mt-pkg scheme were utilized for these sequence numbers, a 500 MB array would be required.
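The array sizes quoted above follow from allotting one bit per trackable packet. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def bitmap_bytes(num_packets: int) -> int:
    """Bytes needed for a tracking array that holds one bit per packet."""
    return (num_packets + 7) // 8

for label, count in [("16384 packets", 16384),
                     ("16-bit sequence numbers", 2 ** 16),
                     ("32-bit sequence numbers", 2 ** 32)]:
    print(label, bitmap_bytes(count), "bytes")
```

The 32-bit case yields 536,870,912 bytes (512 MBytes), which corresponds to the 500 MB figure cited above.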
One approach proposed to the problem involves keeping track of the first packet received, the last packet received, and all packets missed in between up to a reasonable maximum (i.e., a threshold value). If the number of lost packets exceeds this maximum, then all data is thrown away and the transfer is restarted. This method becomes very tricky when multiple multicast clients all start listening in to a multicast at different points. For example, client 1 starts at packet 1, client 2 at packet 1000, client 3 at packet 10000, and client 4 at packet 12000. Assuming packet 14000 is the end of the file, client 4 receives packets 12000–14000, but misses packets 12200 and 12455. Thus, client 4 marks 12000 as its starting point and 14000 as its ending point and puts 12200 and 12455 into its missing array. If client 2 reopens the multicast and reads packets 1 to 999, then client 4 now marks 1 as its starting point but must add all packets between 1000 and 11999 to its array.
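The client 4 example above can be reproduced with a minimal sketch of the first/last/missing-list approach. The class name, the method names, and the threshold of 1000 entries are assumptions made for illustration; the proposal itself does not specify them.

```python
class FirstLastTracker:
    """Tracks the first packet, the last packet, and the packets missed between."""

    def __init__(self, max_missing: int):
        self.first = None
        self.last = None
        self.missing = set()
        self.max_missing = max_missing     # assumed threshold value

    def receive(self, seq: int) -> None:
        if self.first is None:
            self.first = self.last = seq
        elif seq > self.last:              # everything skipped going forward is missing
            self.missing.update(range(self.last + 1, seq))
            self.last = seq
        elif seq < self.first:             # everything between seq and old start is missing
            self.missing.update(range(seq + 1, self.first))
            self.first = seq
        else:
            self.missing.discard(seq)      # a previously missed packet has arrived

    def overflowed(self) -> bool:
        """True when the missing list exceeds the threshold (transfer must restart)."""
        return len(self.missing) > self.max_missing

client4 = FirstLastTracker(max_missing=1000)
for seq in range(12000, 14001):            # client 4 listens from 12000 to 14000
    if seq not in (12200, 12455):
        client4.receive(seq)
after_first_pass = len(client4.missing)

for seq in range(1, 1000):                 # client 2 reopens; packets 1 to 999 arrive
    client4.receive(seq)
print(after_first_pass, len(client4.missing), client4.overflowed())
```

After the second pass the missing list holds packets 1000 to 11999 in addition to 12200 and 12455, which exceeds the assumed threshold.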
As is clear, this leads to a large number of “lost” packets within the array. However, the method then allows for switching to a range approach and having an array of ranges. Now, if client 3 reopens the transfer and requests packets 1000 to 9999, client 4 receives these packets and, to make sure the packets are not re-requested later, client 4 needs to adjust its missing packet range. This requires a sequential search of all lost packet range entries and/or maintenance of an ordered binary tree to find if a new packet falls in one of the ranges. If the number of packet range entries is large, this overhead can be prohibitive. If the number is small, then the tracking method is likely to surpass its maximum and forces client 4 to restart the transfer from scratch.
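The range variant and its search overhead can be sketched as follows, using the sequential search mentioned above. The function name and the list-of-tuples representation are assumptions for illustration only.

```python
def remove_received(ranges: list, seq: int) -> bool:
    """Remove seq from a list of inclusive (start, end) missing ranges.

    Performs a sequential search; the matching range is shrunk or split.
    Returns False when seq is not in any missing range.
    """
    for index, (start, end) in enumerate(ranges):
        if start <= seq <= end:
            replacement = []
            if start < seq:
                replacement.append((start, seq - 1))
            if seq < end:
                replacement.append((seq + 1, end))
            ranges[index:index + 1] = replacement
            return True
    return False

# Client 4's missing ranges after client 2's pass, per the example above.
missing = [(1000, 11999), (12200, 12200), (12455, 12455)]
for seq in range(1000, 10000):             # client 3 reopens; packets 1000 to 9999 arrive
    remove_received(missing, seq)
print(missing)
```

Every received packet triggers a search of the range list, so the cost per packet grows with the number of range entries; a packet arriving in the middle of a range also adds an entry by splitting it.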
Another prior art approach is provided in U.S. Pat. No. 5,151,899, in which sliding windows are utilized with a hierarchical bitmap scheme. In this approach, an algorithm is utilized that discards packets outside the sliding window. However, this hierarchical mapping scheme is static (i.e., limited to 32 packets per group) and does not address the large memory required for the hierarchical group map (2**32/32 addressing). It also does not address the basic problem of balancing memory usage and re-requests to the host.
In light of the foregoing, the present invention recognizes that it would be desirable to provide a method, system, and program product for efficiently tracking lost packets of a file being transferred on a multicast network to a receiving client, where the tracking system is dynamically scalable to accommodate extremely large files. A method, system, and program product that enables the tracking of packets beyond the 64K sequence number limit without increasing the standard tracking array size would be a welcome improvement. It would be further desirable if such tracking features could be dynamically completed while the transmission is in progress, without requiring knowledge of the actual size of the file before the transfer begins. These and other benefits are provided by the present invention.