Content Distribution Networks (CDNs) emerged in the late 1990s to address the fact that the Internet was not designed to handle large transmissions of Web content over long distances, especially when traffic is concentrated at a single source. Network congestion and traffic bottlenecks, exacerbated by burgeoning payloads of Web traffic, degrade individual Web site performance, compromise network performance and hinder the transfer of information.
CDNs are a means of offloading some or all of the content delivery burden (mainly for static content) from the origin server, as shown in FIG. 1. A replica server, which delivers content on behalf of the origin server, is typically called a CDN server. CDNs aim to reduce the latency perceived by clients (e.g. web browsers), provide capacity management for the server and provide additional caching.
CDNs address the attendant problems by storing and serving content from many distributed locations rather than from a few centralized origin points. The theory is, by bringing the content to the user, or as close to the user as possible, network congestion is reduced if not eliminated. This is accomplished by using caching technology, wherein CDNs store replicas of content near users, rather than repeatedly transmitting identical versions of the content from an origin server. The result accelerates and improves the quality of content delivered to end users, while lowering network congestion and bandwidth costs for ISPs (Internet Service Providers).
There are primarily two ways to redirect requests to the CDN servers. One is Domain Name System (DNS) redirection, wherein an authoritative DNS server is controlled by the CDN infrastructure and distributes the load to the various CDN servers according to the selected policy, such as round-robin, least loaded CDN server, geographical distance, etc. The other is URL rewriting, wherein the main page still comes from the origin server, but the URLs for any embedded objects, such as images and clip art, are rewritten to point to any of the CDN servers of the system.
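The two redirection approaches described above can be illustrated with a minimal sketch. It is not taken from any particular CDN: the host names, the load figures and the function names (`round_robin`, `least_loaded`, `rewrite_embedded_urls`) are assumptions chosen for illustration, and only `<img>` tags are rewritten to keep the example short.

```python
from itertools import cycle
import re

# Hypothetical CDN server host names, for illustration only.
CDN_SERVERS = ["cdn1.example.net", "cdn2.example.net", "cdn3.example.net"]


def round_robin(servers):
    """Return a picker that hands out servers in rotating order."""
    rotation = cycle(servers)
    return lambda: next(rotation)


def least_loaded(load_by_server):
    """Pick the server that currently reports the lowest load."""
    return min(load_by_server, key=load_by_server.get)


def rewrite_embedded_urls(html, origin_host, pick_server):
    """Point embedded image URLs at a CDN server.

    The main page and its ordinary links still refer to the origin server;
    only the src attribute of <img> tags is rewritten.
    """
    pattern = re.compile(
        r'(<img\s[^>]*src=")http://' + re.escape(origin_host) + r"/"
    )
    return pattern.sub(
        lambda m: m.group(1) + "http://" + pick_server() + "/", html
    )


if __name__ == "__main__":
    page = (
        '<a href="http://www.example.com/page2.html">next</a>'
        '<img src="http://www.example.com/logo.gif">'
    )
    print(rewrite_embedded_urls(page, "www.example.com", round_robin(CDN_SERVERS)))
    print(least_loaded({"cdn1.example.net": 0.9, "cdn2.example.net": 0.2}))
```

In a DNS-based deployment, the same selection policies would run inside the authoritative DNS server and determine which address is returned for a lookup, rather than which host name is written into the page.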
The wide-scale adoption of peer-to-peer systems, utilized for the distribution of digital content (e.g. Napster and others), has further exacerbated the inherent problems of Web congestion. These systems have heretofore been adopted based on grass-roots efforts for free file sharing of digital content. Much attention has been given to the use of P2P (peer-to-peer) technology for the mainstream distribution of digital content. When considering the implications of deploying such a system, the issues to consider once again are the availability of the network to potential end-users who make content requests (i.e. reach of the system), network transparency, ease of use, network security and privacy (including piracy protections), network quality of service (for example, speed of delivery and reliability), incentive structures for specific end-users who, for example, share resources with the network, and class structures based on business rules such as user participation in the network.
Despite the many advances, the prior art still has many deficiencies, problems and short-comings. For example, existing P2P systems require users to download a client application and become part of a community that shares content. Studies have shown that nodes in peer-to-peer systems are either mostly clients or mostly servers of content, even though the architecture and implementation of such systems allow nodes to act in either of these two roles. In this invention, a node, also called an edge machine, is any computing device that has installed a distributed software application for the primary purpose of making content requests from the network or delivering content to other nodes. Furthermore, these applications require significant storage and computing resources on the end-user's machine. These architectural approaches burden peers acting exclusively as clients, sometimes referred to as Requesters of content, by forcing them to have all of the machinery required for them to operate as distributors as well.
Additionally, many users, such as gamers, prefer to use their corporate or university connections to download large content blocks based on the available bandwidth provided by their employer. Systems that require a resource intensive application or that attempt to open outgoing connections to unknown hosts may be severely limited with respect to these corporate installations. By requiring a resource intensive application that uses unknown services for connections, severe reach limitations of the network are realized through the removal of those applications by IT departments.
Previous systems have been decentralized with respect to introducing and serving content, allowing end-users to control this process. This has resulted in content owners not adopting these systems due to security and copyright concerns.
The use of extensive computing resources and the forcing of users to join specific communities to access content, in combination with active filtering by IT departments, have severely restricted reach and created barriers to entry for existing software-based content distribution systems. For example, a typical P2P system employing digital content distribution may require a user to install a “thick client” application (in this example 16 MB in size). Additionally, running the application could require memory resources of 1 MB, even when the application is idle. The system, in this example, establishes outgoing connections, which may be made over private ports, to serve content to other users. Furthermore, content that resides in the network can be introduced by end-users and can be controlled to some level by end-users.
Since most PDAs, cell phones, pagers and other similar devices (i.e. “thin” client devices) have limited memory capacity, the above decentralization technique is not a viable option.
What is needed is a decentralization method and technique capable of being utilized by a thin client device architecture, wherein a download manager can be installed on a thin client device, with the download manager being device independent and yet where the Client can still communicate with and take advantage of the entire distributed network. What is further needed is a method to eliminate barriers to end-user participation in the network.
Thus, it is desirable to provide a content distribution network system and method and it is to this end that the present invention is directed.
Another shortcoming of the prior art involves the distributed control associated with existing systems, which can result in very poor quality of service to end users. FIG. 2 is a Prior Art schematic diagram of a P2P system where an inefficient method of cache management is utilized for data chunks. For example, the performance of a distributed CDN is greatly affected by content availability on edge machines. Without proper caching, high latencies in serving files to Requesters and high capacity utilization in transferring content to the edge machines for caching can adversely impact performance. Today, existing P2P systems use distributed decision-making. The edge machines self-select what content to cache, and Requesters randomly decide from which edge machines to request content. P2P edge machines utilize well-known algorithms to select what content to cache based on which content has been most recently requested. When a request is received for content the edge machine is not caching, the edge machine gets a copy of that content and removes the least recently accessed content from its cache. A deficiency of this distributed decision-making technique is that caches are populated based on partial system need rather than entire system need (i.e. from an individual edge machine's perspective rather than systemic need). As a result, edge machines may not cache some content while over-caching other content. This decision is made autonomously without consideration of system need. Edge machines may also thrash content as they constantly alter their caches. By design, the edge machines are unaware of each other and consequently cannot organize caches to optimally satisfy Requesters.
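The recency-based caching behavior described above resembles a least-recently-used (LRU) policy. The following is a minimal sketch under that assumption; the class name `EdgeCache`, its capacity parameter and the content identifiers are hypothetical and are not drawn from any specific P2P system.

```python
from collections import OrderedDict


class EdgeCache:
    """Least-recently-used cache holding at most `capacity` content items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest entry last
        self.fetches = 0            # times content had to be pulled into the cache

    def request(self, content_id):
        """Serve a request, fetching and evicting on a miss."""
        if content_id in self.items:
            self.items.move_to_end(content_id)  # mark as most recently used
            return "hit"
        self.fetches += 1
        self.items[content_id] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)      # evict the least recently used
        return "miss"


if __name__ == "__main__":
    cache = EdgeCache(capacity=2)
    # Three items cycling through a two-slot cache: every request misses.
    print([cache.request(c) for c in ["A", "B", "C", "A", "B", "C"]])
    print(cache.fetches)
```

With three items cycling through a two-slot cache, every request is a miss and each one triggers a fetch, which is the thrashing the passage refers to. Because each cache decides alone, nothing in this sketch can account for what other edge machines already hold.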
In existing P2P systems, Requesters first contact a broker to obtain an entry point into the CDN. The broker selects edge machines at random and notifies the Requester which specific edge machines to contact. This results in Requesters requesting content from edge machines without certainty that the edge machine has the content. If the edge machine does not have the content, the Requester waits for the edge machine to retrieve the content into its cache, thus turning the edge machine into a Requester. During the time the edge machine retrieves the content, the original Requester is forced to wait. This chain of turning edge machines into Requesters may occur many times as edge machines hunt for the content. As a result, infrequently requested content may be replicated onto more edge machines than is desirable, and Requesters encounter delays in getting the desired content. Alternatively, Requesters can contact another edge machine rather than waiting for the contacted edge machine to retrieve the missing content. This may avoid the delay of an edge machine retrieving the missing content. However, it does not prevent over-replication of the content, as each edge machine contacted while the Requester searches for the content will add the missing content to its cache. Even if Requesters find an edge machine with the desired content, the edge machine may be busy serving content to another Requester. This forces both Requesters to share the serving capacity of the edge machine or forces the second Requester to continue hunting for the content. Therefore, current systems are characterized by inefficient utilization of available cache for content storage, higher wait times to access content based on availability, and poor capacity utilization.
What is needed is a centrally managed edge machine cache and intelligent routing of Requesters, which together improve performance by reducing latency and bandwidth utilization.
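The over-replication effect of random routing can be seen in a small model. This is a simplified sketch, not a description of any real broker: the function names, the edge machine names and the injectable `choose` parameter are assumptions, and cache eviction is deliberately left out so that only the replication effect is visible.

```python
import random


def route_randomly(requests, edge_caches, choose=random.choice):
    """Send each request to a randomly chosen edge machine.

    An edge machine that lacks the content fetches and caches it, which
    models the edge machine itself becoming a Requester. Returns the number
    of requests that had to wait for such a fetch.
    """
    misses = 0
    for content_id in requests:
        edge = choose(sorted(edge_caches))
        if content_id not in edge_caches[edge]:
            misses += 1                        # the Requester waits here
            edge_caches[edge].add(content_id)  # one more replica is created
    return misses


def replica_count(edge_caches, content_id):
    """Count how many edge machines hold a copy of the content."""
    return sum(content_id in cached for cached in edge_caches.values())


if __name__ == "__main__":
    # Hypothetical starting state: only e1 holds the rarely requested item.
    edges = {"e1": {"rare"}, "e2": set(), "e3": set()}
    waits = route_randomly(["rare"] * 10, edges)
    print(waits, replica_count(edges, "rare"))
```

A broker that knew which edge machine already held the item could route every request to it, leaving the replica count at one and causing no waits, which is the motivation for central management of the caches.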
Another set of deficiencies, problems and short-comings with the prior art involves not associating user behaviors to rewards provided by the network, or network service provider. For example, existing P2P systems require users to download a client application and become part of a community that shares content. Studies have shown that nodes in peer-to-peer systems are either mostly clients or mostly servers (of files), even though the architecture and implementation of such systems allow nodes to act in either of these two roles. Furthermore, these applications require significant storage and computing resources on the end-user's machine. These architecture approaches burdened peers acting exclusively as clients by forcing them to have all of the machinery required for them to operate as distributors as well.
Additionally, current P2P systems force users to join a community, with new rules, relationships, and risks. These risks can include the introduction of viruses by untrusted users introducing content, liability associated with storing and serving content without the owner's permission, and storing and serving inappropriate content. This severely limits the reach of such systems when compared to standard download methods such as FTP and HTTP sites. The “value” provided by joining a community has heretofore been access to content in the community, usually free content that many times violates copyright law and accounts for significant losses to content owners/distributors. Previous systems have also been decentralized with respect to introducing and serving content, allowing end-users to control this process. This has resulted in content owners not adopting these systems due to security and copyright concerns.
Finally, current systems do not control class of service based on user behavior. For example, a user who provides resources to the network, be it storage or fees paid for access to exclusive content, is not rewarded in current systems with a better class of service, such as improved file delivery speed or priority access to the network.
The use of extensive computing resources, the lack of reward systems for specific user behavior, the risks associated with open systems that allow users to introduce and control content, and the forcing of users to join specific communities to access content, in combination with ceding control over content to users, have severely restricted reach and created barriers to entry for existing software-based Content Delivery Systems. For example, an existing P2P system used for digital Content Delivery may require a user to install an application.
A marked improvement over the Prior Art would utilize a new method and user interface, whereby a new value system is employed, in which a flexible rules based system is utilized to provide specific class of service and associated benefits to designated classes of end users, and where specific promotion and branding could be employed to influence user behavior into a specific class. In this new system, content owners would authorize the introduction of content into the network and to be available to specific classes of users. Another variation of this rules based system would be to specify class of service based on specific content, such that the normal business rules for class of service were overridden and all users would obtain the class of service for the specific content (i.e. a publisher wants the entire community to get a new demo as quickly as possible).
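The rules-based class of service described above, including the per-content override, might be sketched as follows. The class names (`basic`, `preferred`, `premium`), the `User` fields and the ordering of the rules are hypothetical choices made for illustration only; the text does not prescribe them.

```python
from dataclasses import dataclass

# Hypothetical class-of-service labels, from lowest to highest.
BASIC, PREFERRED, PREMIUM = "basic", "preferred", "premium"


@dataclass
class User:
    shares_resources: bool = False  # contributes storage or bandwidth
    subscriber: bool = False        # pays a fee for exclusive content


def class_of_service(user, content_id, content_overrides=None):
    """Return the class of service for one user requesting one content item.

    A content-specific override, when present, replaces the normal business
    rules so that every user receives the same class for that content.
    """
    overrides = content_overrides or {}
    if content_id in overrides:
        return overrides[content_id]
    if user.subscriber:
        return PREMIUM
    if user.shares_resources:
        return PREFERRED
    return BASIC


if __name__ == "__main__":
    # A publisher wants everyone to get the new demo as quickly as possible.
    overrides = {"new-demo": PREMIUM}
    print(class_of_service(User(), "new-demo", overrides))
    print(class_of_service(User(), "back-catalog", overrides))
    print(class_of_service(User(shares_resources=True), "back-catalog", overrides))
```

Keeping the override check ahead of the user rules is what lets a single entry change the treatment of a content item for the whole community.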
In one scenario, the user is provided value based on joining the community, which could include for example sharing resources or subscribing and paying a fee, and having access to content in the community. The concerns that have been raised by content owners include the unauthorized access and sharing of content within these communities. In addition, even though the user may or may not become a server of data within the community (based on bandwidth connection, preference, etc.), the entire application is loaded on their computer and becomes burdensome. The benefits of current systems include massive scalability, network optimization based on distributing content closer to endpoints, and higher fault tolerance through a high degree of redundancy.
What is needed is a distributed system method and technique where a value system, for example one in which a user is provided enhanced benefits such as improved bandwidth for either joining the community or paying a subscription fee for exclusive content in the community, can be employed. An integral feature of this value system could be based on the functional separation of the server and client functions as described above and which has the added value of expanding the overall network reach.
Still another set of deficiencies, problems and short-comings with the prior art includes the inability of existing systems to employ use policies that maximize network utilization. For example, the distributor software, which runs on the edge machines in the network, stores files or file components and uploads these to client machines on request. This distributor function uses disk space, CPU, and bandwidth resources on the computers that run the distributor software. Corporations, government institutions and educational institutions may have use policies that prevent a user's machine from acting as a server and uploading content to others, or that specifically limit the amount of data that can be uploaded in a time period. This can result in users being prevented from running the P2P distributor software or from participating in the P2P network.
What is needed is a governor system that allows an upload limit to be set in a number of ways.
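One possible form of such an upload limit is a byte budget per time period. The sketch below uses a simple fixed-window counter; the class name `UploadGovernor`, its parameters and the fixed-window approach are assumptions, since the text does not specify how the limit is enforced.

```python
class UploadGovernor:
    """Allow at most `limit_bytes` of upload in each `period_s`-second window."""

    def __init__(self, limit_bytes, period_s):
        self.limit_bytes = limit_bytes
        self.period_s = period_s
        self.window_start = 0.0
        self.used = 0

    def allow(self, nbytes, now):
        """Return True and record the upload if it fits in the current window.

        `now` is a timestamp in seconds supplied by the caller, for example
        the value of time.monotonic().
        """
        if now - self.window_start >= self.period_s:
            self.window_start = now  # start a fresh window
            self.used = 0
        if self.used + nbytes > self.limit_bytes:
            return False             # over budget: refuse or defer the upload
        self.used += nbytes
        return True


if __name__ == "__main__":
    governor = UploadGovernor(limit_bytes=1000, period_s=60)
    print(governor.allow(600, now=0))   # fits in the window
    print(governor.allow(600, now=10))  # would exceed the budget
    print(governor.allow(600, now=61))  # a new window has begun
```

Passing the timestamp in, rather than reading the clock inside the class, keeps the sketch easy to test; the limit itself could come from a user preference or an administrator-set policy.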
Also, the prior art still has many deficiencies, problems and short-comings with respect to properly securing content within the network. For example, splitting a content file into chunks provides the benefit that two or more Requesters can download the same chunk from the same edge machine at different times, and so avoid using the edge machine simultaneously. Also, splitting the file into chunks means that the system can take advantage of the asymmetrical nature of broadband connections and perform multiple downloads to increase realized bandwidth. However, in this scenario, encryption of content is generally only useful when the sender and receiver use a unique (private) key. This key is generally only shared between the sender and the receiver to reduce the possibility of key theft. With a Distributed CDN, where data chunks are stored statically on edge machines, there is an inability to uniquely encrypt content for each end-user, and thus the network becomes insecure. This insecurity results from the sharing of the same private keys on multiple edge machines, which may not be considered secure. The operator could resolve this problem by hosting a centralized system and performing dynamic encryption, but this means the content would need to be served centrally, which defeats the benefits of a distributed content delivery network.
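The chunking described above can be pictured with a short sketch covering only the splitting, scheduling and reassembly steps; encryption is intentionally not shown. The function names, the round-robin assignment of chunks to edge machines and the edge machine names are illustrative assumptions.

```python
def split_into_chunks(data, chunk_size):
    """Split a content file (as bytes) into fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def plan_downloads(num_chunks, edges):
    """Spread chunk indexes across edge machines.

    Fetching different chunks from different edge machines at the same time
    lets a Requester combine several limited upload links.
    """
    plan = {edge: [] for edge in edges}
    for index in range(num_chunks):
        plan[edges[index % len(edges)]].append(index)
    return plan


def reassemble(chunks_by_index):
    """Rebuild the file from chunks that may have arrived in any order."""
    return b"".join(chunks_by_index[i] for i in sorted(chunks_by_index))


if __name__ == "__main__":
    content = b"abcdefghij"
    chunks = split_into_chunks(content, 4)
    print(chunks)
    print(plan_downloads(len(chunks), ["e1", "e2"]))
    print(reassemble({2: chunks[2], 0: chunks[0], 1: chunks[1]}) == content)
```

Because every Requester receives the same stored bytes for a given chunk, any encryption applied before the chunks are placed on the edge machines is necessarily shared, which is the key-sharing problem the passage describes.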
What is needed for security in a Distributed CDN is a system which provides a high level of security while not overburdening the system resources such as processing and bandwidth.
Still another set of deficiencies, problems and short-comings in the prior art involves the inefficiency of digitally updating software on end users' computers. For example, software distribution strategies over networks generally recognize that updates to application software are typically sourced from one of a small number of central servers to potentially a large number of clients. Such an approach creates load and network congestion problems for popular software content as many clients connect simultaneously to these central servers. FIG. 3 depicts a Prior Art Domain Name System for receiving software updates from a central FTP or HTTP server. This can result in a very low quality of service (low speed, denial of access) to the end-user. One response to this problem is building out many dispersed data centers to serve the content, but this is very expensive and complex.
What is needed is a distributed method and technique whereby application software is updated via a distributed CDN that provides massive scalability, redundancy, and reach.
An innovative area for utilizing the technology is wireless content. Wireless networks today are transforming from dial-up based voice services with one direction data to full IP based interactive packet networks. In addition, mobile devices are becoming multi-function, with better processing and storage capability, offering the user a richer experience and added functionality. The integration of voice, data, and mixed media applications is now becoming a reality. As higher bandwidth, always on networks and rich, multi-function mobile devices emerge, a new group of end-user applications will be supported.
Examples of emerging mobile applications include video and games. Wireless games and music are among the top applications driving traffic for wireless data services. Despite the many advances, the prior art still has many deficiencies, problems and short-comings. For example, with existing devices and dial-up based networks, most digital media downloads consist of very small files, such as ring tones and graphics (files of less than 100 KB), to customize displays. Emerging wireless networks, such as GPRS and 3G, provide higher bandwidth IP packet capabilities, which enable the download of richer digital content including games and music. Wireless operators are building “walled garden” portals and also teaming with third parties to provide distribution and publishing facilities for games, music and other rich media.
As these networks are introduced to the public, the demand for downloads of digital content will increase dramatically. The performance of the network will be of paramount importance, both for the customer satisfaction of the end-user and for providing an efficient distribution mechanism to support the underlying business models. Creative models will need to be developed, where the user continues to pay for usage but the total cost of the value-added services is compelling to the end-user.
Currently, wireless operators are initiating roaming agreements that allow customers of one carrier to use the packet-based network of another carrier. The initial plan to support download of digital media is for a customer of operator A, who is roaming on operator B's network, to download digital content via operator A's portal or partner. The network path is therefore from operator B to operator A to the Internet and back. This is an extremely inefficient and costly method, especially when load is placed on the network from digital content downloads.
FIG. 4 is a schematic diagram of a Prior Art Wireless Content Delivery System whereby content is routed across networks to the service provider of origin. Suppose a customer of Operator A, based in Germany, is roaming onto Operator B's network in Switzerland. Under the current architecture, for a customer to access content based on preferences determined with Operator A, the customer downloads the file by accessing Operator B's wireless network, backhaul data network and data center, the inter-exchange network between Operator A and Operator B, and Operator A's data center and Internet access to the content provider (and content). In this scenario, latencies would be very high, the probability of download failures would increase with the number of disparate networks and pieces of equipment accessed, download quality of service would depend on both Operators' networks, and the cost of sending data across borders would be higher.
What is needed is a decentralization method and technique capable of being utilized by thin client device architecture, wherein a download manager can be installed on a thin client device, with the download manager being device independent. What is further needed is a method to route clients to content which is located in proximity to the client, even when roaming on partner networks.