Commercial websites and applications often provide recommendations to their users. Such recommendations can include content related to the current webpage or application accessed by the user (e.g., a related news story), a product related to a product in a user's shopping cart (e.g., a recommendation for socks if the user is buying shoes), or a promotion/advertisement related to the current webpage accessed by the user and/or a product in a user's shopping cart. Product and offer recommendations can also be injected into email communications sent to users. Recommendations that are personalized, relevant, and appropriate can help increase user traffic, sales, and/or revenue and, therefore, they are important components of commercial websites and applications.
To generate a relevant recommendation, a recommendation engine takes into account one or more factors or data regarding the user and/or the content of the current webpage accessed by the user. Generally, the recommendation engine uses real-time information as well as historical information accumulated over long periods of time to generate the recommendation. Such a recommendation engine requires memory and processing power, the amounts of which may vary depending on the volume of user traffic, the number of products or offers in the merchant's catalog, and the amount of historical data available. The recommendation engine also requires network bandwidth to serve the recommendation without undesired latency.
FIG. 1 is a block diagram of a system 10 for providing recommendations to a client 100 according to the prior art. As a user is viewing a webpage on the client 100 that is downloaded from a host 110, the host 110 transmits a request for a recommendation to a recommendation backend 120. The recommendation backend 120 includes a routing server 130 and a plurality of recommendation servers 140a, 140b, 140c, 140n. The recommendation servers 140a, 140b, 140c, 140n each include a logic processor 150a, 150b, 150c, 150n, and a memory 160a, 160b, 160c, 160n, respectively. Although the processors may vary between recommendation servers, each memory is a mirror image of the other memories. Each memory contains all the information that the respective server needs to respond to a recommendation request.
The routing server 130 receives the recommendation request and determines which of the recommendation servers 140a, 140b, 140c, 140n should receive it. The routing server 130 can take various factors into consideration to determine the appropriate recommendation server 140a, 140b, 140c, 140n to handle the recommendation request, such as the available capacity and the geographic location of each recommendation server. The routing server 130 then transmits the recommendation request to the appropriate recommendation server, for example recommendation server 140a, to process the request. Upon receiving the recommendation request, the processor 150a queries the memory 160a for data relevant to the request. Such data can include personal information regarding the user, information regarding the webpage or website accessed by the user, and/or a list of products related to a product in the user's shopping cart. The processor 150a then applies logic (e.g., a recommendation algorithm) to the data and returns a recommendation to the routing server 130, which then transmits the recommendation to the client 100 via the host 110.
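The server-selection step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each recommendation server reports a capacity and a count of in-flight requests, and that the routing server prefers a server in the requester's region before comparing available capacity. The class name, field names, region labels, and the selection rule itself are hypothetical and are not taken from the system of FIG. 1.

```python
from dataclasses import dataclass


@dataclass
class RecommendationServer:
    """Hypothetical view of one recommendation server as seen by the router."""
    name: str
    region: str         # assumed geographic label, e.g. "us-east"
    capacity: int       # assumed maximum number of concurrent requests
    active: int = 0     # requests currently being processed

    @property
    def available(self) -> int:
        return self.capacity - self.active


def route(servers, client_region):
    """Pick a server with spare capacity, preferring the client's region.

    Among servers that can still accept work, a same-region server wins;
    ties on region are broken by the larger available capacity.
    """
    candidates = [s for s in servers if s.available > 0]
    if not candidates:
        raise RuntimeError("no recommendation server has spare capacity")
    return max(candidates, key=lambda s: (s.region == client_region, s.available))


servers = [
    RecommendationServer("140a", "us-east", capacity=100, active=40),
    RecommendationServer("140b", "us-east", capacity=100, active=90),
    RecommendationServer("140c", "eu-west", capacity=100, active=10),
]

print(route(servers, "us-east").name)  # 140a
```

Under these assumed numbers, server 140c has the most spare capacity but is passed over for a request from "us-east" because 140a is in the same region and still has capacity available.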
In response to changes in the volume of recommendation requests (e.g., due to increased or decreased website traffic), the routing server 130 can adjust the number of recommendation servers 140a, 140b, 140c, 140n upwards or downwards. If the routing server 130 needs to deploy a new recommendation server 140n in response to an increased volume of recommendation requests, the routing server 130 must first cause an existing recommendation server 140c to copy its memory 160c to the memory 160n of the new server 140n. Because the memory 160c is very large, it may take several hours or more to copy memory 160c to memory 160n. Thus, it may take several hours or more to deploy the new recommendation server 140n. In addition, some of the bandwidth of server 140c is diverted to bring new server 140n online. Because the backend 120 is over capacity (or nearly over capacity) until new server 140n is brought online, recommendation requests take longer to process, which results in undesired latency. For this reason, new recommendation servers are often deployed before their capacity is truly needed, to allow adequate time to copy the data to the new server. As a result, recommendation backends 120 generally operate with excess capacity, which results in undesired costs and inefficiencies.
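The deployment delay described above follows from simple arithmetic: copy time is the size of the memory divided by the bandwidth available for the copy. The sketch below works through one case. The 2 TB memory size, the 1 Gbit/s link, and the 50% bandwidth share are assumed figures chosen only to illustrate the calculation; the function name is hypothetical and none of these values are stated for the system of FIG. 1.

```python
def copy_hours(memory_bytes: float, link_bits_per_second: float, share: float = 1.0) -> float:
    """Estimate hours needed to copy a memory image over a network link.

    `share` is the fraction of the link the source server can devote to the
    copy while it continues to serve recommendation requests.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in the interval (0, 1]")
    seconds = memory_bytes * 8 / (link_bits_per_second * share)
    return seconds / 3600


TB = 10**12    # bytes per terabyte (decimal)
GBPS = 10**9   # bits per second in 1 Gbit/s

# Assumed figures: a 2 TB memory image copied over a 1 Gbit/s link.
print(round(copy_hours(2 * TB, 1 * GBPS), 2))             # 4.44
print(round(copy_hours(2 * TB, 1 * GBPS, share=0.5), 2))  # 8.89
```

With these assumed figures the copy alone takes more than four hours, and the time doubles if the source server can spare only half of its link, which is consistent with the multi-hour deployment delay described above.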