1. Field of the Invention
The present invention relates to computer hardware and software, and more particularly to a system and method for chaining news servers for Usenet operations.
2. Description of the Prior Art
Usenet is a worldwide collaboration network of servers that support newsgroups. There are many thousands of newsgroups, each covering a particular topic of interest to users. Each server administrator can decide which newsgroups to support, usually based on the requests of the local users who wish to read and contribute material to particular groups. Postings to newsgroups can consist of any form of digital data, but are referenced herein generically as “articles.”
In contrast to the World Wide Web, Usenet is a forum that allows many people to collaborate with many of their peers in the same interest group. Instead of a single user downloading a website, a Usenet participant can observe a thread of discussion from many different people. When a Usenet participant posts material, that article will similarly be accessible to each of the participants in the newsgroup.
Usenet is also distinguishable from e-mail transmissions and chat rooms. Newsgroups allow readers to choose their topics of interest. Unwanted articles and messages do not clutter mail in-boxes. As people respond to messages, those responses get added below the prior messages to form a stream of discussion (also called a “thread” of discussion).
When a new article is posted on a newsgroup, the originating server sends a copy of the article to each networked server that has requested “newsfeeds” in that particular newsgroup. Since Usenet is a worldwide network, it is possible that a new article could be copied thousands of times and migrate to distant servers. Usenet is generally transmitted using a protocol called NNTP (Network News Transfer Protocol). Special newsreaders are required to post, distribute, and retrieve Usenet articles. Newsreaders are widely available in freeware, shareware, and commercial versions, and are included in certain versions of Microsoft Internet Explorer and Netscape Navigator.
Internet Service Providers (ISPs) have been under popular pressure to provide access to Usenet. The volume of news feeds to servers has increased dramatically, resulting in difficult technological challenges to maintain appropriate levels of service to users. High performance servers are now required, along with innovative algorithms, in order to handle the volume of articles that are posted on the various newsgroups.
One of the difficult challenges is the archiving of articles. With the explosive growth of Usenet, relatively large amounts of disk space are required to store new articles generated for a particular newsgroup. The archival period for some newsgroups may be only a few days, since large volumes of postings are submitted on a daily basis. Users will be unable to retrieve older articles because such articles will have been purged from the server to make room for more recent submissions.
Usenet servers are limited by the amount of computer storage that is available for Usenet articles. The growth of Usenet has made it difficult to offer both a wide variety of newsgroups and a long archival period for each newsgroup. Instead, each server administrator must be selective about the content that will be offered to users.
Due to cost constraints, many Internet Service Providers (ISPs) prefer to install local servers with a limited storage capacity. Most users will be interested in the latest articles within the most popular newsgroups which are stored on the local servers. Inevitably, certain users of the small-capacity servers will desire access to content that is not available locally, but might otherwise be available on another large-capacity server. The present invention overcomes the limitations of small-capacity servers by making them appear to users as if the server contained the union of articles stored locally and on the large-capacity server. As used herein, the terms union and intersection are given their mathematical meanings as applied to set theory.
The present invention is a system and method for connecting a high storage capacity server to several lower storage capacity servers. The lower storage capacity servers will appear to have the archival capacity of the union of themselves and the high storage capacity server. In this manner, an Internet Service Provider can locate multiple small servers in proximity to its users, while advertising the availability of a larger number of articles than could otherwise be stored on the local servers.
As will be described further herein, the chaining system of the present invention allows storage devices to be staged throughout a network so that the most requested Usenet articles are available on fast, local servers, thereby freeing larger storage devices to hold greater numbers of less-popular articles. The larger storage devices may be shared among several local servers. The overall system performance is enhanced because the load on storage devices can be advantageously balanced, with the fastest (and typically smallest) storage devices absorbing the greatest load, while the slower (and typically larger) storage devices have a reduced load and offer a larger selection of less-popular news articles. The net result for the user is a system with a large selection of articles in which most of the requests will be served quickly from local storage devices.
According to the present invention, the numerous small storage capacity servers can store a small archive of news articles comprising the latest or most popular news articles. However, those users desiring an older or less popular article will have seamless access to the larger archive. The end result is a higher-performance system capable of serving the needs of numerous users. Subscribers to ISPs will gain the benefit of having access to a larger number of news articles.
In one example, a local news server may support an archive of one week of the latest postings in a particular newsgroup. The larger-capacity remote server may store two months of articles from the same newsgroup. A user connected to the local server will have access to the entire range of articles from both servers. Since the latest postings are typically the most popular among users, the local server will bear the heaviest load, but its corresponding speed will still provide good service to users. In contrast, the remote server is likely to have a slower response time to requests due to its larger size, but the server will have a lighter load because of the less popular nature of its content. The overall system performance will be improved over earlier Usenet architectures in terms of speed and quality of articles available to users.
In another example, a local news server may only have enough storage capacity to support one dozen active newsgroups. Many thousands of newsgroups exist within Usenet, and there may be local subscribers to the server that wish to access less popular newsgroups that are not stored locally on the server. Under the present invention, the user will have access to the newsgroups that are stored on the larger server.
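By way of illustration only, the list of newsgroups that a local server could advertise in this example may be sketched as a set union. The function name and the newsgroup names below are hypothetical and are not taken from the specification.

```python
def advertised_groups(local_groups, master_groups):
    """Return the sorted union of the newsgroups held locally and on the larger server."""
    return sorted(set(local_groups) | set(master_groups))


# Hypothetical newsgroup lists, chosen only for illustration.
local = {"comp.lang.c", "rec.arts.books"}
master = {"comp.lang.c", "sci.math", "alt.folklore.computers"}

for name in advertised_groups(local, master):
    print(name)
```

A group that appears on both servers is listed once, consistent with the set-theoretic meaning of union used herein.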
The present invention involves chaining two servers together for the purpose of offering a larger variety of newsgroups and articles to end users. The two servers can be described as a master machine and slave machine. The slave machine may contain a relatively small library of Usenet articles, whereas the master machine has a relatively large library. Under the present invention, the slave machine can advertise the availability of articles contained in the union of itself and the master machine. The user may or may not be informed that most of the content is stored remotely on the master machine. In the event that the user requests an article that is not resident locally on the slave, the article is retrieved from the master machine.
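The retrieval behavior described above may be sketched as follows. This is a minimal sketch under stated assumptions: both article stores are modeled as in-memory dictionaries keyed by newsgroup name and article number, and the class and method names are hypothetical. An actual embodiment would typically retrieve the article from the master machine over a network connection.

```python
from typing import Dict, Optional, Tuple

ArticleKey = Tuple[str, int]  # (newsgroup name, article number)


class ChainedSlave:
    """Hypothetical slave machine that falls back to a master on a local miss."""

    def __init__(self, local: Dict[ArticleKey, str], master: Dict[ArticleKey, str]):
        self.local = local    # small library held on the slave
        self.master = master  # stand-in for the large library on the master

    def fetch(self, group: str, number: int) -> Optional[str]:
        key = (group, number)
        if key in self.local:
            # Serve from fast local storage when the article is resident.
            return self.local[key]
        # Otherwise retrieve from the master; None means neither server has it.
        return self.master.get(key)


slave = ChainedSlave(
    local={("comp.lang.c", 9001): "recent article"},
    master={("comp.lang.c", 42): "archived article"},
)
print(slave.fetch("comp.lang.c", 42))
```

The caller of `fetch` does not learn which machine supplied the article, which corresponds to the case in which the user is not informed that content is stored remotely.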
Each newsgroup has a name, and within each newsgroup the individual articles are indexed. If the newsgroup is stored wholly within a single server, then the indexing function can be performed locally. If the newsgroup is stored on more than one server, synchronization of the numbers is required so that like articles can be similarly identified on each server. Number synchronization can be accomplished in several ways, but in general the result is that each new article in a newsgroup is assigned a unique and permanent identifier. Typically, these identifiers are numerical and sequential. However, any form of unique identification scheme can be equivalently employed.
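One possible way of assigning numerical, sequential identifiers within each newsgroup is sketched below. The class name, the use of a message identifier string as the lookup key, and the choice to start numbering at 1 are assumptions made for illustration; as noted above, any unique identification scheme can be employed equivalently.

```python
from collections import defaultdict
from itertools import count


class GroupNumberer:
    """Hypothetical indexer that assigns a permanent number to each article per newsgroup."""

    def __init__(self):
        # One independent counter per newsgroup, starting at 1 (an assumption).
        self._counters = defaultdict(lambda: count(1))
        # Remembers prior assignments so that an identifier never changes.
        self._assigned = {}

    def assign(self, group: str, message_id: str) -> int:
        key = (group, message_id)
        if key not in self._assigned:
            self._assigned[key] = next(self._counters[group])
        return self._assigned[key]


numberer = GroupNumberer()
print(numberer.assign("sci.math", "<a@example.invalid>"))
```

Because a repeated request for the same article returns the number already assigned, servers that consult the same indexer will identify like articles by the same number.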
Chaining requires number synchronization whenever the same newsgroups exist on both the master server and the slave server. The sequence of synchronization numbers, or other index identifier, on each server is herein called a “range.” Chaining is accomplished through on-the-fly merges of ranges. For example, a newsgroup may have the most recent thousand articles available on a slave machine, whereas the next older nine thousand articles will be stored on the master machine. To the end user, it appears that the slave machine has ten thousand articles.
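The merging of ranges in the foregoing example may be sketched as follows. The sketch assumes that each range is an inclusive pair of low and high article numbers, that numbering is synchronized across both servers, and that the two ranges are contiguous or overlapping. The specific numbers (1 through 9000 on the master and 9001 through 10000 on the slave) and the function names are illustrative assumptions.

```python
from typing import Tuple

Range = Tuple[int, int]  # inclusive (lowest number, highest number)


def merge_ranges(slave: Range, master: Range) -> Range:
    """Combine two contiguous or overlapping ranges into the range shown to the user."""
    return (min(slave[0], master[0]), max(slave[1], master[1]))


def article_count(r: Range) -> int:
    """Number of article numbers covered by an inclusive range."""
    return r[1] - r[0] + 1


slave_range = (9001, 10000)  # most recent thousand articles
master_range = (1, 9000)     # next older nine thousand articles
merged = merge_ranges(slave_range, master_range)
print(merged, article_count(merged))
```

Since the merge is computed at the time of the request, neither server needs to store the combined range.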
Synchronization can be accomplished in several ways. For example, all news articles in a particular newsgroup could be routed through a master server where they are indexed. The master may retain certain articles and pass other articles to designated slave machines. Alternatively, servers that are chained together can be fed by the same source so that the servers can coordinate the retention range for each newsgroup. In yet another alternative, news articles may be fed to slave machines, wherein older articles in the slave are sent for archival to the master, with the slave retaining only the most recent articles for local availability.
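The last alternative above, in which the slave retains only the most recent articles and sends older articles to the master for archival, may be sketched as follows. The function name, the `keep` parameter, and the modeling of each server's storage for one newsgroup as a dictionary keyed by article number are assumptions made for illustration.

```python
from typing import Dict


def age_out(slave: Dict[int, str], master: Dict[int, str], keep: int) -> None:
    """Move all but the `keep` highest-numbered articles from the slave to the master."""
    numbers = sorted(slave)
    cutoff = max(len(numbers) - keep, 0)
    for number in numbers[:cutoff]:
        # The article keeps its number, so the ranges on both servers stay synchronized.
        master[number] = slave.pop(number)


slave_store = {1: "a", 2: "b", 3: "c", 4: "d"}
master_store: Dict[int, str] = {}
age_out(slave_store, master_store, keep=2)
print(sorted(slave_store), sorted(master_store))
```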
The present invention can also be practiced by linking a small, local server to more than one larger server. Such a configuration is herein called split chaining. The advantage of split chaining is that larger archives of news articles can be offered to end users. For example, a large archive of several newsgroups can be stored on a first server, while another large archive of different newsgroups can be stored on a second server. A small local server can be chained to both larger servers to provide access to the sum of the archives of the two larger servers. Split chaining can also be accomplished when two larger servers have overlapping newsgroups; however, it will be important to have an article synchronization scheme to keep track of the location of various articles.
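Split chaining with non-overlapping newsgroups may be sketched as follows. The sketch assumes that each newsgroup is archived on exactly one larger server and that the local server holds a table mapping newsgroup names to that server; the class name, the table, and the dictionary-based stores are hypothetical. The case of overlapping newsgroups, which requires tracking the location of individual articles, is not covered by this sketch.

```python
from typing import Dict, Optional, Tuple

ArticleKey = Tuple[str, int]  # (newsgroup name, article number)
Store = Dict[ArticleKey, str]


class SplitChainedServer:
    """Hypothetical local server chained to several larger servers, one per newsgroup."""

    def __init__(self, local: Store, archive_for_group: Dict[str, Store]):
        self.local = local
        self.archive_for_group = archive_for_group

    def fetch(self, group: str, number: int) -> Optional[str]:
        key = (group, number)
        if key in self.local:
            return self.local[key]
        # Route the miss to whichever larger server archives this newsgroup.
        archive = self.archive_for_group.get(group)
        return None if archive is None else archive.get(key)


first = {("sci.math", 10): "old math article"}
second = {("rec.arts.books", 3): "old books article"}
server = SplitChainedServer(
    local={("sci.math", 500): "new math article"},
    archive_for_group={"sci.math": first, "rec.arts.books": second},
)
print(server.fetch("rec.arts.books", 3))
```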