Large-scale, multicomputer datacenters host large quantities of data. In response to a user query to manipulate that data, the datacenter may distribute a “compute request” to one of a number of compute resources. A compute request is a communication from the datacenter directing a particular compute resource to process data as stipulated in the user query. Multicomputer datacenters rely on load balancers to route queries and distribute load across available compute resources. Generic load balancers lack domain knowledge about queries and cannot effectively interpret them to identify similarity, so they do not take full advantage of caching functionality.
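The caching limitation described above can be illustrated with a minimal sketch. It is not drawn from any particular load balancer: the names `Resource`, `normalize`, `route_round_robin`, `route_by_query`, and `total_hits` are hypothetical, and the normalization (collapsing whitespace and case) is a deliberately simplistic stand-in for real query interpretation. The sketch assumes each compute resource keeps its own cache of queries it has already processed.

```python
import hashlib
import itertools
import re


def normalize(query: str) -> str:
    """Toy similarity key: collapse whitespace and ignore case (an assumption)."""
    return re.sub(r"\s+", " ", query.strip()).lower()


class Resource:
    """Hypothetical compute resource with a local cache of processed queries."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache = set()
        self.hits = 0

    def run(self, query: str) -> None:
        key = normalize(query)
        if key in self.cache:
            self.hits += 1       # served from this resource's cache
        else:
            self.cache.add(key)  # processed from scratch, then cached


def route_round_robin(queries, resources) -> None:
    """Generic routing: spreads load evenly but ignores query content."""
    cycle = itertools.cycle(resources)
    for q in queries:
        next(cycle).run(q)


def route_by_query(queries, resources) -> None:
    """Query-aware routing: similar queries hash to the same resource."""
    for q in queries:
        digest = hashlib.sha256(normalize(q).encode()).digest()
        idx = int.from_bytes(digest[:8], "big") % len(resources)
        resources[idx].run(q)


def total_hits(router, queries, n: int = 3) -> int:
    """Run a router over fresh resources and count cache hits."""
    resources = [Resource(f"r{i}") for i in range(n)]
    router(queries, resources)
    return sum(r.hits for r in resources)


# Three textual variants of what is, under the toy normalization, one query.
QUERIES = ["SELECT a FROM t", "select a  from t", "Select A From T"]

if __name__ == "__main__":
    print("round robin hits:", total_hits(route_round_robin, QUERIES))
    print("query-aware hits:", total_hits(route_by_query, QUERIES))
```

With three resources, the round-robin router sends each variant to a different resource, so no cache is reused, whereas the query-aware router sends all three to the same resource and the second and third requests hit its cache. A production system would need a far more careful notion of query similarity than this normalization.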
Depending on the complexity of a query, or the size of the one or more data sets involved, the amount of processing required can vary significantly. Also, certain compute requests may contain user-defined code, which may introduce risks when the compute request is performed. In addition, historical information associated with prior executions of compute requests may not be collected or considered when routing compute requests. Accordingly, back-end computational environments can be inefficient, requiring more resources than necessary to perform queries on large data sets, and may not effectively mitigate known risks.