Content distribution systems have been developed to enable data such as software updates, critical patches, and multimedia content to be distributed to nodes in a network. Typically, these systems comprised many servers placed in the network, with nodes connecting directly to one of the servers to download the required file. However, such systems are constrained by the connection bandwidth to the servers and require considerable investment to increase the capacity of the system. Consequently, content distribution systems have been developed which rely on a fully distributed architecture with nodes in the network participating in the distribution process. Such systems may be referred to as peer-to-peer or peer-assisted content distribution systems. In such a system, the server may divide the file to be distributed into a number of blocks and provide these blocks to nodes in the network. As soon as a node has received one or more blocks, the node can act as a source of the received blocks for other nodes whilst concurrently receiving further blocks until it has received all the blocks of the file. Unless nodes are aware of which blocks are both required and held by other nodes in the network, such systems can experience problems including rare blocks and network bottlenecks.
More recently, cooperative content distribution systems have been developed to avoid the rare block problem and the requirement for a node to be aware of all the other nodes in the system. Such systems use network coding, which means that each node in the system generates and transmits encoded blocks of information, each newly encoded block being a linear combination of all the blocks currently held by the particular node. This contrasts with earlier systems, where the encoding of the blocks only occurred at the server.
The use of network coding can be described with reference to FIG. 1, which shows the flow of blocks between a server 102 and two clients (or nodes), client A 104 and client B 106. Initially all the blocks, B1-Bn, are held only by the server and not by any nodes. When client A contacts the server to get a block, the server produces an encoded block E1 which is a linear combination of all the blocks in the file, such that: E1 = α1B1 + α2B2 + ... + αnBn, where αi are random coefficients. So that an encoded block is no larger than an original block, these operations take place within a finite field, typically GF(2^16). The server then transmits to client A both the newly encoded block E1 and the coefficient vector (αi). Client A may also receive a second encoded block E2 from the server, created using a second set of random coefficients βi. When client A needs to transmit a block to client B, client A creates a third encoded block, E3, from a linear combination of E1 and E2 using random coefficients ωi.
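The encoding and re-encoding steps just described can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the text: it uses the smaller field GF(2^8) with the reduction polynomial 0x11B rather than GF(2^16), purely to keep the arithmetic short, and the names gf_mul, combine, originals, alpha, beta, omega and gamma are hypothetical. It shows the server forming E1 and E2, client A forming E3 from them, and the coefficient vector of E3 expressed in terms of the original blocks.

```python
import random

# Assumption: GF(2^8) reduced by x^8 + x^4 + x^3 + x + 1 (0x11B).
# The text names GF(2^16) as typical; a smaller field keeps the sketch short.
IRREDUCIBLE = 0x11B


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add with polynomial reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRREDUCIBLE
        b >>= 1
    return result


def combine(coeffs, blocks):
    """Return the linear combination sum_i coeffs[i] * blocks[i].

    Each block is a list of field symbols; addition in the field is XOR.
    """
    out = [0] * len(blocks[0])
    for c, block in zip(coeffs, blocks):
        for j, symbol in enumerate(block):
            out[j] ^= gf_mul(c, symbol)
    return out


rng = random.Random(0)          # seeded only so the sketch is repeatable
n, block_len = 3, 4             # hypothetical file of 3 blocks, 4 symbols each
originals = [[rng.randrange(256) for _ in range(block_len)] for _ in range(n)]

# Server: two encoded blocks, each sent with its coefficient vector.
alpha = [rng.randrange(256) for _ in range(n)]
beta = [rng.randrange(256) for _ in range(n)]
e1 = combine(alpha, originals)
e2 = combine(beta, originals)

# Client A: re-encode what it holds (E1 and E2) for client B.
omega = [rng.randrange(256) for _ in range(2)]
e3 = combine(omega, [e1, e2])

# Coefficient vector of E3 with respect to the ORIGINAL blocks:
# gamma_i = omega_1 * alpha_i + omega_2 * beta_i.
gamma = combine(omega, [alpha, beta])

# E3 is itself a linear combination of the original blocks, and it is
# the same length as an original block.
assert e3 == combine(gamma, originals)
assert len(e3) == block_len
```

Because every operation stays inside the field, each encoded symbol occupies the same number of bits as an original symbol, which is why the encoded block does not grow.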
When network coding is used as described above, a client can recover the original file after receiving n blocks that are linearly independent of each other, in a process similar to solving a set of linear equations. If the coefficients are chosen at random by each client, a client will be unlikely to receive a block which is not of use to that client. However, as a further check, clients may transmit the coefficient vector to a receiving client (client B in the example of FIG. 1) in advance of the block itself. The receiving client can then check whether the resultant block would provide it with any new information and only request the downloading of the block if it will be of use to the receiving client.
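The usefulness check and the recovery step can be sketched as follows, again under assumptions not drawn from the text: the field is GF(2^8) with reduction polynomial 0x11B instead of GF(2^16), the coefficient vectors are hand-picked rather than random so that the outcome is reproducible, and the function names (row_reduce, is_innovative, decode) are hypothetical. A block is treated as useful when its coefficient vector raises the rank of the vectors already held; recovery is Gauss-Jordan elimination on the coefficient vectors augmented with the block contents.

```python
IRREDUCIBLE = 0x11B  # assumption: GF(2^8), x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add with polynomial reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRREDUCIBLE
        b >>= 1
    return result


def gf_inv(a):
    """Inverse of a nonzero element: a^254, since a^255 = 1 in GF(2^8)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def combine(coeffs, blocks):
    """Linear combination sum_i coeffs[i] * blocks[i]; addition is XOR."""
    out = [0] * len(blocks[0])
    for c, block in zip(coeffs, blocks):
        for j, symbol in enumerate(block):
            out[j] ^= gf_mul(c, symbol)
    return out


def row_reduce(rows, n_cols):
    """Gauss-Jordan elimination, pivoting only in the first n_cols columns.

    Returns the pivot rows in reduced form; their count is the rank.
    """
    rows = [list(r) for r in rows]
    pivot_row = 0
    for col in range(n_cols):
        src = next((i for i in range(pivot_row, len(rows)) if rows[i][col]), None)
        if src is None:
            continue
        rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
        inv = gf_inv(rows[pivot_row][col])
        rows[pivot_row] = [gf_mul(inv, x) for x in rows[pivot_row]]
        for i in range(len(rows)):
            if i != pivot_row and rows[i][col]:
                f = rows[i][col]
                rows[i] = [x ^ gf_mul(f, p) for x, p in zip(rows[i], rows[pivot_row])]
        pivot_row += 1
    return rows[:pivot_row]


def is_innovative(held_vectors, offered_vector):
    """True if the offered coefficient vector adds new information."""
    n = len(offered_vector)
    before = len(row_reduce(held_vectors, n)) if held_vectors else 0
    return len(row_reduce(held_vectors + [offered_vector], n)) > before


def decode(coeff_vectors, payloads):
    """Recover the n original blocks from n independent encoded blocks."""
    n = len(coeff_vectors[0])
    reduced = row_reduce([c + p for c, p in zip(coeff_vectors, payloads)], n)
    if len(reduced) < n:
        raise ValueError("not enough linearly independent blocks")
    return [row[n:] for row in reduced]


# Hand-picked example: 3 original blocks of 4 symbols each.
originals = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]]
coeffs = [[3, 1, 7], [0, 5, 2], [0, 0, 9]]      # triangular, so independent
payloads = [combine(c, originals) for c in coeffs]

dependent = [3, 4, 5]                            # coeffs[0] XOR coeffs[1]
useless = is_innovative(coeffs[:2], dependent)   # False: no new information
useful = is_innovative(coeffs[:2], coeffs[2])    # True: rank goes from 2 to 3
recovered = decode(coeffs, payloads)
```

Only the short coefficient vector needs to be exchanged for the check, so a receiving client can decline a block before any of its contents are transferred.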
As described above, in order for a node to be able to generate a newly encoded block it must read all the blocks it has received into memory. This is processor intensive and introduces delays. Additionally, when decoding, a node faces the very complex problem of decoding the encoded blocks, which is again processor intensive and time consuming. Read/write operations into and out of memory at the node are particularly time consuming.