It is well known that big data technology covers all aspects of data handling, including data collection, transmission, processing, and management. Data transmission means that big data is transferred from one organization to another, usually by using the File Transfer Protocol (FTP). FTP relies on the Transmission Control Protocol (TCP) to implement data transmission. When TCP is used to transmit data, throughput is often affected by many factors, such as the packet loss rate and the round-trip time (RTT). For example, for 100 GB of data, about seven days are required to complete transmission at a rate of 1.32 Mbps (megabits per second), and about 30 days are required at a rate of 0.31 Mbps.
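The transfer times quoted above can be checked with a short calculation. The following is a minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and a constant transmission rate; the function name `transfer_days` is illustrative only.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_gb: float, rate_mbps: float) -> float:
    """Return the days needed to send size_gb gigabytes at rate_mbps.

    Assumes decimal units and a constant rate with no protocol overhead.
    """
    if size_gb < 0 or rate_mbps <= 0:
        raise ValueError("size must be non-negative and rate must be positive")
    total_bits = size_gb * 1e9 * 8          # gigabytes -> bits
    seconds = total_bits / (rate_mbps * 1e6)  # bits / (bits per second)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for rate in (1.32, 0.31):
        print(f"{rate} Mbps: {transfer_days(100, rate):.2f} days")
```

Under these assumptions, the two rates give roughly 7 days and roughly 30 days, which agrees with the figures stated above.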
Currently, to address the low TCP transmission performance for big data, many new technologies are used to improve transmission efficiency. One example is the Grid File Transfer Protocol (GridFTP). This technology establishes multiple TCP streams between a transmit end and a receive end to transmit data. The transmit end distributes the data across the TCP streams, and the receive end then combines the data received from the TCP streams. When packet loss caused by congestion occurs on one or more TCP streams, the transmit end halves the congestion window of each affected stream based on an acknowledgement (ACK) message returned from the receive end, which decreases the throughput of that stream. However, because multiple TCP streams transmit data simultaneously, the impact on average throughput is relatively small when packet loss occurs on only a small quantity of TCP streams, so overall TCP transmission performance is maintained.
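The effect of spreading data over several streams can be illustrated with a simplified model. The sketch below assumes that all streams have equal throughput before any loss, that a stream's throughput is proportional to its congestion window, and that halving the window instantly halves that stream's throughput. It ignores window recovery and is not a model of any particular TCP implementation; the function name is illustrative only.

```python
def aggregate_throughput_fraction(num_streams: int, num_halved: int) -> float:
    """Fraction of the original aggregate throughput that remains.

    Simplified assumption: every stream carries an equal share, and each
    stream that halves its congestion window delivers half of its share.
    """
    if num_streams <= 0:
        raise ValueError("num_streams must be positive")
    if not 0 <= num_halved <= num_streams:
        raise ValueError("num_halved must be between 0 and num_streams")
    unaffected = num_streams - num_halved
    return (unaffected + 0.5 * num_halved) / num_streams


if __name__ == "__main__":
    # One stream in total: a single loss halves the throughput.
    print(aggregate_throughput_fraction(1, 1))
    # Twenty streams, loss on one of them: only a small drop overall.
    print(aggregate_throughput_fraction(20, 1))
```

In this model, a loss on a single stream costs half of the throughput when only one stream exists, but only 2.5% of the aggregate throughput when 20 streams are used.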
However, although the foregoing solution can ensure data transmission throughput for big data, multiple TCP connections need to be established. For example, if 20 TCP streams are used and each TCP stream occupies 20 MB of memory, 400 MB of memory is required for transmitting one group of data. If multiple groups of data are transmitted simultaneously, and other applications on the system also consume memory, system memory becomes a bottleneck for data transmission.
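The memory cost described above grows linearly with both the number of streams and the number of groups transmitted at the same time. The following is a minimal sketch of that arithmetic; the figure of 10 concurrent groups in the usage example is a hypothetical value chosen for illustration and does not come from the text above.

```python
def stream_memory_mb(streams_per_group: int, mb_per_stream: int,
                     concurrent_groups: int = 1) -> int:
    """Total memory, in MB, held by all TCP streams across all groups."""
    if min(streams_per_group, mb_per_stream, concurrent_groups) < 0:
        raise ValueError("arguments must be non-negative")
    return streams_per_group * mb_per_stream * concurrent_groups


if __name__ == "__main__":
    # One group of data: 20 streams at 20 MB each.
    print(stream_memory_mb(20, 20))
    # Hypothetical case: 10 groups transmitted simultaneously.
    print(stream_memory_mb(20, 20, concurrent_groups=10))
```

With one group, the result is the 400 MB stated above. With the hypothetical 10 concurrent groups, the streams alone would occupy 4000 MB before any other application's memory use is counted.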