1. The Field of the Invention
The present invention relates to accessing content stored on a server. More specifically, the present invention relates to methods and computer program products for using a front-end server to access content stored on one or more back-end servers.
2. The Prior State of the Art
Day by day, the amount of content available through electronic sources grows to ever increasing proportions. To match this growth in content, hardware components, including disk drives, servers, and communication links, must be upgraded and expanded constantly. In many circumstances, hardware and software alike are strained nearly to the limits of their capabilities. Although new hardware provides additional capacity to meet the storage and processing demands users may impose, the addition of new hardware often disrupts how users interact with the content the users typically access. Typically, some existing content will be moved to the new hardware, and therefore, a change in how users access the existing content after the move may be required.
For example, private content, such as a single user's email, calendar, and tasks, usually is stored on a single server, but is subject to being moved. Often, the email accounts on a server operating near capacity will be divided between two servers. Moving mailboxes presents a problem because users who were accessing one server must now access another. If users explicitly had requested their email from one server, they now must remember to request email explicitly from a different server. For example, if a user accesses email using hypertext transfer protocol ("HTTP"), the uniform resource locator ("URL") must reflect the name of the new server. Alternatively, a user's email software may need to be reconfigured so that it reflects the new location of email storage.
A further problem email users may experience when their mailbox is moved from one server to another is related to caching. Often, mailbox data is cached at the user's local machine. The cache is linked to the user's mailbox so that changes in the mailbox data may be reflected in the cache. Likewise, if a user works offline for a time, changes made locally are queued up so that when the user is reconnected, the local changes will be reflected at the user's mailbox. This synchronization of the user's local machine and mailbox storage is a relatively quick process because only changes are transmitted, rather than the entire content of the mailbox. The increased performance from caching is particularly noticeable over slow communication links. However, moving a user's mailbox may invalidate the user's local cache, and require a complete exchange of mailbox data to reconcile differences between the local cache and server mailbox data. A similar problem exists for any content cached on a user's local machine that is subject to being moved from one server to another, such as calendaring data, task data, etc.
In order to isolate users from the movement of content between servers, some systems provide a front-end server and a back-end server. The front-end server receives requests for content and performs the content formatting required for sending the content to the client system making the content request. The front-end server requests content from the back-end server as the content is needed. However, these prior art systems suffer from at least two significant problems. First, applications must be custom written to take advantage of the division between front-end processing and back-end content storage. Second, the front-end server includes content that is vulnerable to attack because the front-end server is directly accessible by client systems making requests for content. If the front-end server is compromised, all content stored or cached at the front-end server is compromised. Furthermore, by being directly accessible, the front-end servers are subject to denial-of-service style attacks that are relatively common on the Internet.
The management of storage and processing hardware is compounded by the number of users accessing the content a server provides. Some content may be specific to a particular user, such as email, calendars, and tasks. Other content is shared among a group of users who collaborate on various projects or on day-to-day work assignments. Even without collaboration, content may be accessed by a large number of individual users. For example, contact data, standard forms, discussion groups, scheduling applications, etc., all may have a substantial number of individual users.
In other prior art implementations, front-end and back-end servers also may be useful in providing access to public content (i.e., content that is not specific to any particular user, like software or discussion groups). The content is replicated across a number of back-end servers and the front-end server routes requests for the content to the back-end servers as the requests are received. One significant problem with this approach is that the same content must be stored at each of the back-end servers. No allowance is made for customized or selective replication.
For example, front-end routing may include a geographical component so that users requesting content in one area are routed to a particular back-end server for that area. If a discussion group is dedicated to a local topic that is of little or no relevance outside of the area, the discussion group content nevertheless is replicated to all back-end servers. Moreover, because content is replicated across all back-end servers, the content is assumed to be available at all back-end servers. As a result, if a problem occurs at one back-end server, an error will be reported to the client that the content is unavailable, even though the request could have been routed to another back-end server after the error condition occurred. Therefore, the prior art lacks methods and computer program products for effectively using a front-end server to access content stored on one or more back-end servers.
These and other problems with the prior art are overcome by the present invention, which is directed toward using a front-end server to access content stored on one or more back-end servers. The front-end server acts as an agent or proxy for the client. No content is stored at the front-end server, but the front-end server receives requests for content that are generated by a client system, locates a back-end server that stores the content, routes the request to that back-end server, and returns the requested content back to the client system.
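The request flow described above (receive, locate, route, return) can be sketched as follows. This is a minimal illustration only, not the implementation of the invention; the class names `BackEndServer` and `FrontEndServer`, the `handle` method, the numeric status codes, and the use of a plain dictionary to locate content are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BackEndServer:
    """Hypothetical back-end server that holds content keyed by path."""
    name: str
    content: dict = field(default_factory=dict)

    def fetch(self, path):
        return self.content[path]


class FrontEndServer:
    """Hypothetical front-end server: it stores no content of its own and
    only locates a back-end server, relays the request, and returns the
    result to the client."""

    def __init__(self, locations):
        # locations maps a content path to the back-end server storing it
        # (an assumed stand-in for whatever lookup the system provides).
        self._locations = locations

    def handle(self, path):
        backend = self._locations.get(path)
        if backend is None:
            return 404, None          # no back-end server stores this path
        return 200, backend.fetch(path)
```

In this sketch the front-end object holds only references to back-end servers, so reading its state reveals where content lives but not the content itself.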
The front-end server has access to a global catalog that dynamically tracks the availability of content at the back-end servers. By examining the client system request and the global catalog, the front-end server identifies one or more back-end servers that appear to be capable of fulfilling the client system's request for content. For a request for private content that is stored by only one back-end server, the front-end server identifies the one back-end server that contains the content. Public content (content that is not specifically directed to any particular user) may be stored on multiple back-end servers. When public content is stored on multiple servers, the front-end server identifies a list of the back-end servers that store it. Then, using an authentication token associated with authentication credentials supplied by the client system, the front-end server selects one of the servers in the list.
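One way to picture the catalog lookup is the sketch below. It is an illustration under stated assumptions rather than the catalog of the invention: the class name `GlobalCatalog`, its methods, and the in-memory dictionaries are hypothetical, and the text above does not specify how the catalog is stored or updated.

```python
class GlobalCatalog:
    """Hypothetical in-memory catalog of which back-end servers hold content."""

    def __init__(self):
        self._private = {}   # content id -> the single server holding it
        self._public = {}    # content id -> set of servers holding a copy

    def set_private(self, content_id, server):
        # Moving private content to another server is just an overwrite.
        self._private[content_id] = server

    def add_public(self, content_id, server):
        self._public.setdefault(content_id, set()).add(server)

    def mark_unavailable(self, server):
        # Dynamic tracking: drop a server that has gone out of service.
        for holders in self._public.values():
            holders.discard(server)
        self._private = {c: s for c, s in self._private.items() if s != server}

    def candidates(self, content_id):
        """Return the servers that appear able to fulfil a request."""
        if content_id in self._private:
            return [self._private[content_id]]
        return sorted(self._public.get(content_id, ()))
```

A request for private content yields a one-element list, while a request for public content yields every server currently recorded as holding a copy.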
In one implementation, the front-end server selects a back-end server from the list by using the authentication token as a key to a hash operation performed on the list. This selects the same back-end server for a given client system each time the client system requests the same content. As a result, any user-specific status information that is stored with the requested content will be available to the client system for each request. Furthermore, as back-end servers are added or removed from service, the hash operation automatically distributes subsequent requests over the back-end servers that are available at the time of the request. When the set of available back-end servers changes, some users will be directed to a different back-end server than the one they have typically accessed, and for those users any user-specific status information will be lost.
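The hash-based selection can be sketched as below. The text does not name a hash function or say how the hash is applied to the list, so the use of SHA-256, the modulo step, the sorting of the list, and the function name `select_backend` are all assumptions chosen for illustration.

```python
import hashlib


def select_backend(auth_token, servers):
    """Pick one server from the list, keyed by the authentication token.

    Assumed scheme: hash the token, then take the result modulo the
    number of available servers.
    """
    if not servers:
        raise ValueError("no back-end servers available")
    ordered = sorted(servers)  # make the choice independent of list order
    digest = hashlib.sha256(auth_token.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]
```

With this simple modulo scheme, the same token and the same list always give the same server. When the list grows or shrinks, the index is recomputed over the servers then available, which spreads requests across them but can send some users to a different server than before, matching the trade-off described above.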
The front-end server also checks the validity of requests for content. Invalid requests are rejected to prevent denial of service attacks directed at the back-end servers. In a denial of service attack, a rogue computer swamps the system under attack with invalid requests. Certain invalid requests can require a substantial amount of processing. Eventually, the system under attack spends all of its time attempting to process the invalid requests and is unable to perform any other operations. Unable to keep up with the demands placed on it, the system under attack ultimately crashes with potentially severe consequences. Because the front-end server does not store any content, a denial of service attack presents a relatively minor threat.
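The text does not say which checks make a request valid, so the following sketch uses purely hypothetical rules (an allowed set of methods, a bounded path that starts with a slash, no parent-directory segments, and a non-empty authentication token) to show the general idea of rejecting a request before it reaches a back-end server. The names `is_valid_request` and `forward_if_valid` and the status codes are likewise assumptions.

```python
# Assumed validity rules; the actual checks are not specified in the text.
ALLOWED_METHODS = frozenset({"GET", "POST", "PUT", "DELETE"})
MAX_PATH_LENGTH = 2048


def is_valid_request(method, path, auth_token):
    """Return True only if the request passes every assumed check."""
    if method not in ALLOWED_METHODS:
        return False
    if not path.startswith("/") or len(path) > MAX_PATH_LENGTH:
        return False
    if ".." in path.split("/"):
        return False
    return bool(auth_token)


def forward_if_valid(method, path, auth_token, forward):
    """Call forward(path) only for valid requests; reject the rest."""
    if not is_valid_request(method, path, auth_token):
        return 400, None   # rejected at the front end, back end untouched
    return 200, forward(path)
```

The point of the sketch is the ordering: the inexpensive checks run at the front-end server, and the `forward` callable (standing in for a back-end request) is never invoked for a rejected request.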
Furthermore, the front-end server provides an additional level of security to the back-end servers. A hacker gaining access to the front-end server does not gain access to any content and therefore may not make any malicious changes to content. This is in contrast to prior art systems where front-end servers store content and are vulnerable to having the content modified by someone breaking into the front-end servers.
The present invention provides clients with a single front-end server for accessing content, where the content may move from one back-end server to another. Because clients access back-end servers through the front-end server, the front-end server hides any change in the location of content, providing users with a consistent server name for requesting content. The consistent name for requesting content is a significant benefit to client systems because it allows for changes in content location without invalidating content that is cached locally on the client system. Furthermore, users are not required to change requests for content to match the back-end server location where the content is stored.
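The benefit of a consistent server name can be illustrated with a short sketch. The host name `mail.example.com`, the URL layout, and the `mailbox_location` table are hypothetical and serve only to show that the address a client uses does not depend on where the content is stored.

```python
FRONT_END_HOST = "mail.example.com"  # hypothetical front-end server name

# Assumed location table consulted by the front-end server only.
mailbox_location = {"alice": "backend-1"}


def client_url(user):
    """URL the client requests; it names only the front-end server."""
    return f"http://{FRONT_END_HOST}/mail/{user}"


def resolve(user):
    """Back-end server the front-end server would route the request to."""
    return mailbox_location[user]


url_before = client_url("alice")
mailbox_location["alice"] = "backend-2"   # mailbox moved to a new server
url_after = client_url("alice")
```

Because `url_before` and `url_after` are identical, a client that keys its local cache on the URL has no reason to discard the cache when the mailbox moves; only the front-end server's routing changes.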
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.