In recent years, various information processing apparatuses, for example, personal computers (PCs), large computers, servers and other communication machines, have been connected to communication networks such as the Internet, and contents such as video, image data, audio data and various programs, as well as various processed data, are transferred among the network-connected machines. The types of contents exchanged via a network are changing from text, still images and the like to multimedia contents such as moving images and audio.
Attention has been drawn to large-scale storage systems which distributively store data at a number of information processing terminals interconnected by a network. In such a distributed storage system, a server which records and manages data transmits the data to information processing terminals and other servers through multicasting, so that the data is recorded in local recording media installed in the information processing terminals and other servers.
In this case, in order to fetch data on demand, a large amount of data must be recorded in the recording media. For example, if a movie has a data size of about 2 gigabytes per film and five hundred films of such video data are to be recorded, a capacity of 1 terabyte is necessary.
In the case that data is supplied through streaming, if a server supplies data through unicast to a client requesting the data, a protocol with data re-transmission based on an acknowledge (ACK) signal, such as TCP/IP, is used in order to perform error-free transmission.
However, since this approach places a large load on the server side, even if one high-performance server is used, services can at present be provided to only several hundred clients. Even if a protocol that does not use ACK, such as UDP/IP, is used, the number of serviceable clients is about several thousand. As described above, if data is supplied through streaming, the cost on the server side increases and the number of clients is limited.
In order to deal with this, a method has recently been proposed which transmits data to a plurality of clients without requesting data re-transmission, by using FEC (Forward Error Correction) as a multicast technique. With this method, a server repetitively transmits a stream through multicast, and a client picks up the necessary signals from this stream, and decodes and reproduces the picked-up data.
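The idea of repairing a lost packet without asking the server for re-transmission can be illustrated with a minimal sketch. The text above does not say which FEC code is used; the single XOR parity packet below is a deliberately simple stand-in chosen for illustration, and the function names (`make_parity`, `recover`) are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Server side: the parity packet is the XOR of all data packets in a group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Client side: rebuild at most one lost packet (marked None) from the parity.

    Assumption: one parity packet per group, so only a single loss is repairable.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one lost packet")
    if not missing:
        return list(received)
    present = [p for p in received if p is not None]
    # XOR of the parity with every surviving packet yields the lost packet.
    repaired = reduce(xor_bytes, present, parity)
    out = list(received)
    out[missing[0]] = repaired
    return out


packets = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(packets)
restored = recover([b"abcd", None, b"ijkl"], parity)
```

In this sketch the client never contacts the server: the redundancy sent along with the stream is enough to restore the missing packet, which is what allows one multicast stream to serve many clients.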
If this method is used to transmit five hundred films of video data of 2 gigabytes per film in ten minutes, a transmission band of about 14.7 gigabits/sec becomes necessary. If the same amount of video data is transmitted in one minute, a transmission band of about 147 gigabits/sec becomes necessary. Although these are theoretical values, a server capable of such a capacity and transmission mode would be very costly, and even if such a server were realized, it would not be practical. Although there is a system which distributively records data at a plurality of hosts, realizing this system requires a plurality of servers to manage a huge amount of data, so that the number of processes for data management and data communication increases.
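As a rough check of the order of magnitude, the raw payload rate can be computed as in the following sketch. It assumes 1 gigabyte = 10^9 bytes and counts payload bits only; the function name is hypothetical. Under these assumptions the result is about 13.3 gigabits/sec for ten minutes and about 133 gigabits/sec for one minute. The figures quoted above are roughly 10 percent higher, and the text does not state how they were derived.

```python
def required_gbit_per_s(films: int, gb_per_film: float, seconds: float) -> float:
    """Raw payload rate in gigabits/sec, taking 1 gigabyte as 10**9 bytes.

    Assumption: no protocol or error-correction overhead is included.
    """
    total_bits = films * gb_per_film * 1e9 * 8
    return total_bits / seconds / 1e9


ten_minutes = required_gbit_per_s(500, 2, 10 * 60)
one_minute = required_gbit_per_s(500, 2, 60)
```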
Peer-to-Peer (P2P) network technologies, which provide direct communication processes among information processing apparatuses, have recently been developed and used. In the configuration of a P2P network, no server for performing processes in a concentrated manner is installed. Instead, the information processing apparatuses which the network clients have as resources communicate with one another via the network, allowing the network clients to share the resources. The information processing apparatuses include various machines, for example, a PC, a portable terminal, a PDA, a portable phone, and a disc apparatus serving as a storage means or a printer connected to a communication machine.
Peer-to-Peer (P2P) network technologies are considered to have been used first in APPN (Advanced Peer to Peer Networking) advocated by IBM of the United States. By using this network, it is not necessary to install the giant distribution server required for contents distribution in a conventional client-server type network, and many users can use contents distributed to the resources possessed by the network clients, allowing distributed storage and distribution of a large volume of contents.
The Peer-to-Peer (P2P) network has two network types: the “Pure Peer-to-Peer (Pure P2P) network” and the “Hybrid Peer-to-Peer (Hybrid P2P) network”.
The Pure Peer-to-Peer (Pure P2P) network is a network type in which each constituent element (peer) of the system has an equal function and role and performs equal communication. A typical service using this network is Gnutella. The Hybrid Peer-to-Peer (Hybrid P2P) network is a network type which, in addition to the configuration of the Pure P2P network, uses a control server for smoothing interaction between the constituent elements (peers) of the system. A typical service using this network is Napster.
In the Hybrid Peer-to-Peer (Hybrid P2P) system, typically Napster, when a network-connected terminal acquires contents, a central server first searches for the contents resources. In accordance with the search information, the terminal then accesses the node (another network-connected terminal) which possesses the resource, and acquires the contents. This system has the disadvantage that resource information of all nodes must be registered in the central server and that searches are concentrated upon the central server.
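The two-step acquisition described above (central search, then direct access to the holding node) can be sketched as follows. This is only an illustrative model, not the actual Napster protocol; the class and function names (`CentralIndex`, `Node`, `acquire`) and the `search_count` counter are hypothetical.

```python
class CentralIndex:
    """Hypothetical central server: maps a content name to the nodes holding it."""

    def __init__(self):
        self._index = {}
        self.search_count = 0  # every search in the network lands on this server

    def register(self, node_id, content_names):
        # Each node must register all of its resource information here.
        for name in content_names:
            self._index.setdefault(name, set()).add(node_id)

    def search(self, name):
        self.search_count += 1
        return sorted(self._index.get(name, set()))


class Node:
    """Hypothetical network-connected terminal holding some contents."""

    def __init__(self, node_id, contents):
        self.node_id = node_id
        self.contents = dict(contents)

    def fetch(self, name):
        return self.contents[name]


def acquire(index, nodes, name):
    """Step 1: search at the central server. Step 2: fetch directly from a holder."""
    holders = index.search(name)
    if not holders:
        return None
    return nodes[holders[0]].fetch(name)


nodes = {"n1": Node("n1", {"film-a": b"AAAA"}), "n2": Node("n2", {"film-b": b"BBBB"})}
index = CentralIndex()
for node in nodes.values():
    index.register(node.node_id, node.contents)
data = acquire(index, nodes, "film-b")
```

The contents themselves travel peer to peer, but `search_count` grows with every request from every terminal, which mirrors the concentration of searches noted above.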
To avoid this, a system has been proposed in which processes such as resource search are distributively executed by a plurality of apparatuses. In this distributed processing system, process execution judgment apparatuses are managed, for example, by arranging the apparatuses in a tree relation, and processes such as resource search are distributively executed by a plurality of apparatuses in accordance with the management information. This system also has several problems: as the number of process execution apparatuses becomes large, for example, several million, the amount of tree structure management information increases, the number of process commands for conveying an execution command to a plurality of processing apparatuses increases, and tree consistency must be guaranteed. In addition, since a judgment process by a plurality of process execution judgment apparatuses is necessary, there is a problem that a process delay occurs.
A system that mitigates these weak points sends all commands to all network-connected nodes and makes each node judge whether the received process command is to be executed at that node. This system is the Pure Peer-to-Peer (Pure P2P) system, typically Gnutella. Unlike the Hybrid Peer-to-Peer (Hybrid P2P) system, this system does not have a central server for executing a resource search process. Instead, a search request is directly transmitted and received at each node to perform the resource search, and the terminal at which a hit occurs is asked to perform the requested process, such as contents transmission.
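A search that is passed from node to node without a central server can be sketched as a simple flood over the neighbor graph. The hop limit `ttl`, the dictionary-based topology, and the function name `flood_search` are assumptions made for illustration and are not taken from the text above.

```python
from collections import deque


def flood_search(neighbors, holdings, start, wanted, ttl):
    """Flood a query from `start`; return (nodes with a hit, messages sent).

    neighbors: node -> list of directly connected nodes (hypothetical topology)
    holdings:  node -> set of content names held locally
    ttl:       assumed hop limit; each forward consumes one hop
    """
    seen = {start}
    hits = []
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        # Each node judges locally whether it can serve the request.
        if wanted in holdings.get(node, ()):
            hits.append(node)
        if hops_left == 0:
            continue
        for peer in neighbors.get(node, ()):
            messages += 1  # sent even if the peer has already seen the query
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return hits, messages


neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
holdings = {"d": {"movie"}}
full = flood_search(neighbors, holdings, "a", "movie", ttl=3)     # (["d"], 8)
limited = flood_search(neighbors, holdings, "a", "movie", ttl=1)  # ([], 2)
```

In this four-node example, reaching the single holder costs eight messages, while limiting the flood to one hop sends only two messages but misses the resource, which illustrates the trade-off between transmission load and search coverage.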
The configuration in which, when a search command is transferred, all nodes or as many nodes as possible are made to perform a search through routing based on, for example, a tree structure or a network structure is also effective for the Pure Peer-to-Peer (Pure P2P) system, typically Gnutella. However, this system also has the drawback that a load is placed upon the transmission route, because each node executes a command transfer process even for a process command that is not executed at that node.
For example, in order to search all network-connected nodes and make a process request arrive at all nodes, complicated routing management is necessary. On the other hand, if a node search of a best effort type is executed, it cannot be guaranteed that a command is passed to all nodes, and a necessary resource cannot be found in some cases. If communication for node search is frequently performed, there arises a problem of network congestion.
There are several data transmission modes. The first mode is a mode in which all data is acquired from a single node. With this mode, although the data can be acquired reliably, a pre-search becomes necessary to judge whether the data exists. There is also a problem that a load is concentrated upon the node having the contents. Moreover, if the connected node goes down, there arises a problem that data reproduction cannot continue. Data download from a single node is effective if only a very few reproduction instruction apparatuses, i.e., apparatuses which receive data from the node and reproduce it, use the data. This mode has a relatively small transmission loss such as packet duplication.
The second mode is a mode in which a single node transfers data through carousel transmission. Carousel transmission is a repetitive data transmission mode and is named after a carousel. If multicast is used with this mode, data can be transmitted to a large number of reproduction instruction apparatuses. However, since data cannot be acquired at an arbitrary timing, the only way to reduce a data delay or the like is to shorten the wait time by increasing the number of repetitions per unit time. Although this carousel mode is efficient if a number of users reproduce the data concurrently, it cannot satisfy both real-time performance and reproduction at an arbitrary timing. This mode also has a relatively small transmission loss such as packet duplication.
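The relation between the repetition period and the wait time can be sketched as follows. The sketch assumes, for illustration only, that a client can begin reproduction only at the start of a repetition and that requests arrive uniformly over time; the function names are hypothetical. Under these assumptions the mean wait is half the repetition period, so halving the wait requires doubling the number of repetitions per unit time.

```python
def carousel_wait(arrival: float, cycle: float) -> float:
    """Seconds a client arriving at time `arrival` waits for the next cycle start.

    Assumption: cycles start at times 0, cycle, 2*cycle, ...
    """
    return (-arrival) % cycle


def mean_wait(cycle: float, samples: int = 1000) -> float:
    """Average wait over evenly spaced arrival times within one cycle."""
    arrivals = [cycle * (i + 0.5) / samples for i in range(samples)]
    return sum(carousel_wait(a, cycle) for a in arrivals) / samples


wait_at_3s = carousel_wait(3, 10)   # 7 seconds until the next start
avg_10min = mean_wait(600)          # about 300 seconds
avg_1min = mean_wait(60)            # about 30 seconds
```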
The third mode is a scheme called chaining. In chaining, if there is a node which has received the data immediately before from another node, the data is requested not from that other node but from the node which received the data immediately before, and the data is transferred from this node. Although chaining does not ensure that the data exists, it can realize efficient data transmission without an excessive network load if only a few reproduction instruction apparatuses use particular data. However, if a number of reproduction apparatuses are connected, carousel transmission is more efficient.
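The source selection in chaining can be sketched as below. The text above does not specify what happens when the most recent receiver no longer holds the data; walking further back along the chain and finally falling back to the originating node are assumptions of this sketch, and the function names are hypothetical.

```python
def pick_source(recent_receivers, origin):
    """Return the node to request the data from under chaining.

    recent_receivers: list of (node_id, still_has_data), oldest first.
    Assumption: if no earlier receiver still holds the data, use `origin`.
    """
    for node_id, still_has_data in reversed(recent_receivers):
        if still_has_data:
            return node_id
    return origin


def simulate(requesters, origin="origin"):
    """Each requester fetches from the latest receiver, then joins the chain."""
    chain = []
    sources = {}
    for node in requesters:
        sources[node] = pick_source(chain, origin)
        chain.append((node, True))
    return sources


sources = simulate(["a", "b", "c"])  # {"a": "origin", "b": "a", "c": "b"}
```

In this model the originating node serves only the first requester, and each later requester places its load on the previous receiver instead.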
A mode of locally retaining a number of caches is conceivable as the fourth mode. Namely, the reproduction execution apparatus caches all data. However, such a cache mode is not practical unless the capacity of the local storage apparatus is fairly large. Although the network load during reproduction is practically zero, there is a problem that the network load during distribution is the largest of all the modes.