1. Technical Field
Example embodiments of the present application relate in general to the field of data transfer over a network, and more particularly to data retrieval that uses tags and routing rules to optimize both the data retrieved and the response time of the retrieval, by decreasing response latency and by increasing the efficacy of dynamic content delivery.
2. Related Art
The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, hypermedia information systems. As the foundation of data communication for the World Wide Web, the HTTP protocol is designed to improve or enable communications between clients and servers using intermediate network elements, such as proxy servers. The standards development of HTTP was coordinated by the Internet Engineering Task Force (IETF) and the World Wide Web Consortium (W3C) and culminated in the publication of a series of Requests for Comments (RFCs).
Currently, when a user requests information from the Internet, the user typically submits an HTTP request message, using a client server, to a web server. The web server, which typically stores content, returns a response message to the client server in the form of a hypertext markup language (HTML) file, a cascading style sheets (CSS) file, a JavaScript (JS) file, an image, or another relevant data format. The web server may also perform other functions on behalf of the user based on the HTTP request message. The response message from the web server contains completion status information concerning the HTTP request message and may contain the content requested by the client server in the HTTP message body.
To improve system performance, an intermediate server, e.g., a local or a proxy server, between the requesting client server and the web server may be used to cache responses from the web server and to answer subsequent requests for the same content directly, without retransmitting the content from the web server. When a message request from the client server is received, the intermediate server checks with the web server to see if a cache entry associated with the content residing in the intermediate server is still valid. By checking with the web server, the intermediate server validates its cache entry associated with the message request.
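The validation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the `IntermediateCache` class, the `Origin` callable, and the generic "validator" token are hypothetical names introduced here, and the web server is modeled as a plain function rather than a network peer.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical web server interface (an assumption for this sketch): it takes a
# URL and the validator held by the cache (or None) and returns
# (status, validator, body). Status 304 means the cached copy is still valid.
Origin = Callable[[str, Optional[str]], Tuple[int, str, bytes]]


class IntermediateCache:
    """Caches responses and revalidates each entry with the web server."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        # url -> (validator, body)
        self.entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, url: str) -> bytes:
        cached = self.entries.get(url)
        validator = cached[0] if cached else None
        # Ask the web server whether the cache entry is still valid.
        status, new_validator, body = self.origin(url, validator)
        if status == 304 and cached:
            # Entry confirmed valid: serve the cached body.
            return cached[1]
        # No entry, or the entry is stale: store and serve the fresh body.
        self.entries[url] = (new_validator, body)
        return body
```

In this sketch the intermediate server contacts the web server on every request, as the paragraph describes; the saving comes from the web server not resending the body when the entry is still valid.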
The use of an entity tag (ETag) is specified as part of the HTTP protocol. The ETag can be used for validation of the cache entry. An ETag is an opaque identifier, typically assigned by a web server, wherein the ETag is unique to the computer generating the ETag. The ETag is assigned to a specific version of a resource found at a uniform resource locator (URL) location. Thus, a different computer generating an ETag for the same version of the same resource does not produce the same ETag. If the resource content at the URL changes, a new and different ETag is assigned for association with that resource content. Used in this manner, ETags can be quickly compared and used to determine if versions of a resource, located on different computers, are the same or are different.
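The server-specific nature of such tags can be illustrated with a short sketch. HTTP treats the ETag as opaque and leaves its construction to the server, so the scheme below is only an assumption made for illustration: the function `host_specific_etag` and the identifiers `server-a` and `server-b` are hypothetical, and the tag is derived by hashing a host identifier together with a resource version label.

```python
import hashlib


def host_specific_etag(host_id: str, version: str) -> str:
    """Hypothetical scheme: mix a host identifier into the tag for a version."""
    digest = hashlib.sha256(f"{host_id}:{version}".encode()).hexdigest()[:16]
    return f'"{digest}"'  # ETag values are quoted strings


a_v1 = host_specific_etag("server-a", "v1")
b_v1 = host_specific_etag("server-b", "v1")
a_v2 = host_specific_etag("server-a", "v2")

# Same server, same version: the tag is stable.
print(a_v1 == host_specific_etag("server-a", "v1"))
# Different server, same version: the tags differ.
print(a_v1 == b_v1)
# Same server, changed content: a new tag is assigned.
print(a_v1 == a_v2)
```

Under this assumed scheme, a tag issued by one server cannot be matched against a tag issued by another server, even when both hold the same version of the resource.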
The use of ETags in the HTTP header is optional as defined in version 1.1 of the HTTP specification. See, for example, RFC 2616, section 14 (https://tools.ietf.org/html/rfc2616). In HTTP, ETags typically are used for comparing two or more entities from the same requested resource. HTTP version 1.1 uses entity tags in the ETag, If-Match, If-None-Match, and If-Range header fields of the HTTP header. ETags are one mechanism that HTTP provides for cache validation, and they allow a client server to make conditional requests for files. The use of ETags allows caches to be more efficient by saving bandwidth. By using ETags, the web server is not required to send the full content requested in a response if the requested content has not changed since the last time the content was transmitted to the client server.
Typically, the web server assigns an ETag to the requested file and returns the file along with the corresponding ETag value to the requesting server. The ETag value may be placed in the HTTP ETag header field. The requesting client server or intermediate server may then cache the file along with the corresponding ETag. When a client server requests the same file as before, the client server may send a message request for the content together with the ETag associated with the requested content. The ETag from the client server is typically placed in the If-None-Match HTTP header field. On this subsequent request for information, the intermediate server and/or web server may compare the ETag received from the client server with the ETag associated with the current version of the file residing on the intermediate and/or web server. If the ETag values match, i.e., the content of the file has not changed, the intermediate server and/or web server may send back a very short response with an HTTP “not modified” status message (status code 304) instead of returning the file in a response. The status message tells the client server that a local or cached version of the file is current and can be used. This method of communication saves the bandwidth that would otherwise be required to send the file from the web server to the client server.
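The exchange described above can be sketched from the web server's side. This is a minimal illustration under stated assumptions: `make_etag` and `handle_get` are hypothetical names, the tag is assumed to be a truncated content hash, and headers are modeled as a plain dictionary rather than a real HTTP message.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Hypothetical tag scheme: a quoted, truncated hash of the file content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def strip_weak(tag: str) -> str:
    """Drop the weak-validator prefix so tags can be compared."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def handle_get(request_headers: Dict[str, str],
               body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a file whose current content is body."""
    current = make_etag(body)
    header = request_headers.get("If-None-Match", "")
    candidates = [strip_weak(t) for t in header.split(",")]
    if current in candidates or "*" in candidates:
        # Tags match: short "not modified" response with no file content.
        return 304, {"ETag": current}, b""
    # First request, or the file changed: return the file and its tag.
    return 200, {"ETag": current}, body


# First request: no tag is sent, so the file is returned with its ETag.
status, headers, payload = handle_get({}, b"hello")
# Repeat request: the cached tag is sent back in If-None-Match.
repeat = handle_get({"If-None-Match": headers["ETag"]}, b"hello")
```

In this sketch the repeat request yields status 304 with an empty payload, so only the headers cross the network; if the file content changes, the tags no longer match and the full file is returned with status 200.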
ETags, however, are unique to a file and to the server which generated the file. If the ETags associated with the requested file do not match and/or if the server cannot locate the requested digital content, an HTTP ‘not found’ message is returned to the client server. Locating content related to the file using ETags on another server is problematic in that the ETag is unique to the file and to the computer generating the ETag.
Thus, another method is required to efficiently locate and provide a requested file and/or content related to the requested file across alternate servers. The alternate method for retrieving digital content over the web and/or between servers still needs to save the bandwidth that would otherwise be required to send the file and/or content related to the requested file, assuming that the content has been previously transmitted to the client server. The alternate method should also verify that the file and/or content related to the requested file has not changed, or is still relevant, since the last occurrence of the content request.