Until recently, corporate data and content within global or other large organizations were distributed by replicating them across centralized content repositories. That is, the data was globally distributed, but made available within a location or geographical area by a central server that was responsible for serving the content to clients located within that area.
The advent of peer-to-peer (P2P) computing has changed this approach. The emphasis has shifted from storing content in, and serving it from, centralized servers to storing and serving at least some of the content from the client side. In this P2P model, the content provider manages the content in a local client and shares it with anyone who accesses it. Content creation, storage and security thus reside on the client side.
There are several advantages to this P2P approach. By shifting the responsibility for content to the client side, server-side management of diverse resources can be vastly reduced. Server managers need not be responsible for the integrity of the content. Problems arising from centralized distribution of content could be averted.
There are at least three architectural approaches to peer-to-peer resource sharing systems: P2P with centralized control, pure P2P with no centralized control, and a hybrid approach that incorporates some aspects of the other two.
One example of P2P with a centralized controller is a system referred to as Napster. The Napster system uses a central server to maintain a list of connected clients. Every client connects to the central server, which scans the clients' disks for shared resources and maintains directories and indexes of the shared resources. A client searching for a resource performs the search on the directories and indexes maintained by the central server. Once a client knows where to find the resources that it is seeking (i.e., which client has the files it is searching for), it makes a direct connection to the appropriate client and transfers the resources.
Napster is not web-based, and does not run in a browser. It is a stand-alone application that runs on each individual client, and uses TCP/IP for its data-communication and data transfers. Since Napster depends on a central server that acts as a collector and regulator of information, the clients are not guaranteed anonymity. The Napster system is also vulnerable if the central server fails.
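The centralized-index model described above for Napster can be illustrated with a minimal Python sketch. It is not Napster's actual implementation or protocol; the class name `CentralIndex` and its methods are hypothetical, and the sketch assumes that clients report their shared resource names to the server rather than having their disks scanned. It shows only the division of labor: the server answers "who has this resource?", and the transfer itself would then take place directly between clients.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a central directory server."""

    def __init__(self):
        # resource name -> set of client ids currently sharing it
        self._holders = defaultdict(set)
        # client id -> set of resource names that client registered
        self._shared = {}

    def register(self, client_id, resources):
        """Record the resources shared by a newly connected client."""
        self.unregister(client_id)  # drop any stale entries first
        self._shared[client_id] = set(resources)
        for name in resources:
            self._holders[name].add(client_id)

    def unregister(self, client_id):
        """Remove a disconnected client from every index entry."""
        for name in self._shared.pop(client_id, set()):
            self._holders[name].discard(client_id)

    def search(self, name):
        """Return the ids of connected clients that share `name`."""
        return sorted(self._holders.get(name, set()))
```

Because every lookup passes through the single `CentralIndex` object, the sketch also makes the weaknesses noted above visible: the server knows which client holds which resource, and no search can be answered if the server is unavailable.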
A good example of pure P2P with no centralized control is a system referred to as Gnutella. Gnutella is a generic term used to identify those P2P systems that use the Gnutella protocol. There is no single agreed interpretation of what the protocol is. However, certain common elements appear in Gnutella-based systems. Chief among them is that Gnutella does away with the central server. In this system, each client continuously keeps track of other clients by pinging known clients in the system. Distributed searches are propagated from one client to its immediate neighbors in ever-increasing circles until answers are found or the search times out. Search responses are propagated back to the searcher in the same manner.
Like Napster, Gnutella-based systems are not web-based, and run as applications in client environments. Gnutella is a truly anonymous resource sharing system. Because no server is used to facilitate searches, clients must establish peer information on an ad hoc basis. The searcher does not know the identity of the responder, and vice versa. Thus there are no authentication or authorization checks; trust is implicitly assumed.
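The search propagation described above can be sketched as a hop-limited flood over a graph of peers. The following Python sketch is a simplification under stated assumptions, not the Gnutella protocol itself: the function name `flood_search`, the `neighbors` and `shared` dictionaries, and the use of a hop count `ttl` to stand in for the search timing out are all hypothetical, and responses are returned directly instead of being relayed back hop by hop.

```python
def flood_search(neighbors, shared, origin, wanted, ttl):
    """Search outward from `origin`, one ring of peers per hop.

    neighbors: dict mapping a peer id to the peer ids it knows about
    shared:    dict mapping a peer id to the set of resources it shares
    ttl:       maximum number of hops before the search gives up
    Returns the peers in the nearest ring that share `wanted`,
    or an empty list if none is found within `ttl` hops.
    """
    visited = {origin}   # peers that have already seen this query
    frontier = [origin]  # peers forwarding the query on this hop
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            for neighbor in neighbors.get(peer, ()):
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        # Check the newly reached ring for the wanted resource.
        hits = sorted(p for p in next_frontier if wanted in shared.get(p, ()))
        if hits:
            return hits
        if not next_frontier:
            break  # no unvisited peers remain
        frontier = next_frontier
    return []
```

A short usage example with a five-peer chain, where peer "a" starts the search:

```python
neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"],
             "d": ["b", "e"], "e": ["d"]}
shared = {"d": {"f.txt"}, "e": {"f.txt", "g.txt"}}

print(flood_search(neighbors, shared, "a", "f.txt", ttl=3))  # ['d']
print(flood_search(neighbors, shared, "a", "g.txt", ttl=2))  # []
```

No peer in the sketch holds a global view of the network, so a search succeeds only if a holder of the resource is reachable through live neighbors within the hop limit.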
A serious problem in Gnutella-based systems is their reputation for being unreliable. Lacking a central server that keeps track of which client is connected, and which is not, there is no way for a client to know if all its neighbors are alive and connected. This leads to less than reliable performance.
The third approach to P2P systems is referred to as Web Mk. This is more of an approach than an actual product, and is described in a Gartner report on the emergence of P2P computing entitled "The Emergence of Distributed Content Management and Peer-to-Peer Content Networks," January 2001. The report is hereby incorporated by reference. This is a web-based approach that uses web servers and web browsers. The web browsers would be configurable by users and would integrate resource-sharing features. The servers would maintain multiple indexes and allow access to different forms of data. This type of system would use software agents, or bots, to provide services such as extraction and consolidation of multiple resources, chat facilities, and notifications of changes. Search requests could be stored on the server and set to run in real time or as a batch process, with the appropriate clients alerted to the results.
What is needed is a system that retains the advantages of the P2P network while resolving the disadvantages of current P2P systems. What is needed is a P2P network that takes advantage of the reduced central server requirements of a pure P2P network without sacrificing the efficiencies of the central server. What is further needed is a P2P network that provides secure access to, and control of, client resources without requiring a central server.