A reverse proxy is a mechanism used to represent one or more secure content servers (e.g., web servers) to outside clients, preventing direct, unmonitored access from outside to the data within the content servers. Typically, the content servers are located behind a firewall and are part of a secure internal network. A reverse proxy may be placed outside the firewall to represent the secure content servers to external clients.
For example, if a content server containing sensitive information, such as a database of credit card numbers, must remain secure, then a proxy server in reverse proxy mode may be set up outside of the firewall protecting the content server. In this case, the proxy server appears as the content server to the client requesting information from the actual content server. Conventionally, when a client makes a request to the content server, the request is directed to the proxy server. The proxy server then sends the client's request through a specific passage in the firewall to the content server. Subsequently, the content server passes the result through the passage back to the proxy. At this stage, the proxy modifies the response headers and sends the retrieved information to the client, as if the proxy were the actual content server. Further, if the content server returns an error message, the proxy server can intercept the message and change any uniform resource locators (URLs) listed in the headers before sending the message to the client. This prevents external clients from receiving redirection URLs that point to the internal content server.
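The header rewriting described above can be illustrated with a minimal sketch. It assumes that the URLs to be hidden appear in the Location and Content-Location headers; the host names and the function name rewrite_response_headers are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical addresses, used only for illustration.
INTERNAL_HOST = "content.internal.example:8080"
PUBLIC_HOST = "www.example.com"

# Headers assumed to carry URLs that could expose the internal server.
URL_HEADERS = ("location", "content-location")


def rewrite_response_headers(headers):
    """Return a copy of the headers with internal hosts replaced by the public one."""
    rewritten = {}
    for name, value in headers.items():
        if name.lower() in URL_HEADERS:
            parts = urlsplit(value)
            if parts.netloc == INTERNAL_HOST:
                # Swap only the host; the scheme, path, and query are kept as-is.
                value = urlunsplit(parts._replace(netloc=PUBLIC_HOST))
        rewritten[name] = value
    return rewritten
```

A redirect issued by the content server to its own internal address would then reach the client with the proxy's public host in its place, while all other headers pass through unchanged.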
Reverse proxies are typically set up using one of the following two basic architectures. The first architecture employs a one-to-one mapping between the internal network IP addresses and some external network attribute. For example, a one-to-one mapping may be set up between the IP address of the content server and externally published port numbers. Alternate implementations may map internal IP addresses to external IP addresses, URL paths, etc. In some instances, this method of reverse proxying requires manual intervention when a server is added to the internal network, since a mapping for the new server must be defined.
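As a rough illustration of the first architecture, the sketch below uses a static table that maps externally published port numbers to internal content servers. The port numbers, IP addresses, and the name resolve_backend are assumptions made for the example only.

```python
# Hypothetical static table: externally published port -> internal content server.
PORT_MAP = {
    8001: ("10.0.0.11", 80),
    8002: ("10.0.0.12", 80),
}


def resolve_backend(external_port):
    """Return the (ip, port) of the internal server published on external_port."""
    try:
        return PORT_MAP[external_port]
    except KeyError:
        raise LookupError(
            f"no content server is published on port {external_port}"
        ) from None
```

Because the table is static, a server added to the internal network stays unreachable through the proxy until an administrator adds an entry for it, which is the manual step noted above.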
An alternate architecture used for reverse proxy involves a URL rewriter. The URL rewriter includes the functionality to intercept, identify, and replace each URL of the page (i.e., the information requested) served to the client. The original URL is rewritten such that any subsequent request from the client is directed back to the proxy server rather than to the internal network. Typically, the URL rewriter is self-learning, i.e., the URL rewriter is capable of learning new internal network URLs without manual intervention.
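A URL rewriter of this kind might be sketched as follows. This is a simplified illustration, not a description of any particular product: it assumes internal servers can be recognized by a hypothetical ".internal.example" domain suffix, and the proxy address, alias scheme, and class name UrlRewriter are likewise invented for the example.

```python
import re

PROXY_BASE = "https://proxy.example.com"  # hypothetical public proxy address

# Assumption: internal servers are recognizable by this domain suffix.
INTERNAL_URL = re.compile(
    r"(https?://[A-Za-z0-9.-]+\.internal\.example)(/[^\s\"'<>]*)?"
)


class UrlRewriter:
    """Rewrites internal URLs in a page and learns new internal origins as it goes."""

    def __init__(self):
        self.origins = {}  # alias -> internal origin, filled in automatically

    def _alias_for(self, origin):
        for alias, known in self.origins.items():
            if known == origin:
                return alias
        # "Self-learning": an unseen origin gets an alias with no manual setup.
        alias = f"s{len(self.origins) + 1}"
        self.origins[alias] = origin
        return alias

    def rewrite(self, page):
        """Replace every internal URL in the page with a URL on the proxy."""
        def replace(match):
            alias = self._alias_for(match.group(1))
            path = match.group(2) or "/"
            return f"{PROXY_BASE}/{alias}{path}"
        return INTERNAL_URL.sub(replace, page)

    def resolve(self, proxy_path):
        """Map a path received by the proxy back to the internal URL."""
        alias, _, rest = proxy_path.lstrip("/").partition("/")
        return f"{self.origins[alias]}/{rest}"
```

In this sketch the client only ever sees URLs under the proxy address, and the proxy uses resolve to find the internal server when a follow-up request arrives. A real rewriter would also need to handle relative links, URLs embedded in scripts and style sheets, and persistence of the learned mappings.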