Serverless computing is rapidly gaining popularity in the field of cloud computing. The name is something of a misnomer, as servers are still required. Rather, the name arose because server management and capacity planning decisions can be entirely hidden or otherwise abstracted from end users and consumers of compute power. Advantageously, end users can simply provide User Defined Functions (UDFs) or other compute jobs to a serverless computing environment, at which point the necessary computational resources are dynamically allocated to execute the UDFs or compute jobs. At no point is the user required to manage, or even be aware of, any underlying hardware or software resources.
In serverless computing environments, UDFs are often received from multiple end users. The order in which UDFs are executed can depend on numerous factors, ranging from aspects of the UDF itself (computational requirements, efficiency requirements, etc.) to service aspects offered by the operator of the serverless computing environment (maximum wait time guarantees, pay-for-priority, etc.). Regardless of how the UDFs are ordered, they are typically held in one or more task queues prior to being executed. A main disadvantage of serverless computing lies in the manner in which tasks or UDFs are provisioned.
A UDF typically requires some amount of data in order to execute. For example, a UDF might require a dataset to analyze or operate upon, or a UDF might require several libraries in order to run to completion. In a serverless computing environment, the UDF is very often not scheduled for execution on the node where this requisite data resides. Instead, one or more nodes are selected to execute the UDF, and the requisite data must first be moved to these nodes before the UDF can begin to execute, which can introduce a significant amount of latency into the overall UDF computation process. Accordingly, improvements are needed.
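By way of non-limiting illustration, the latency described above can be approximated with a short Python sketch. The names `Node` and `startup_delay_seconds`, the data sizes, and the 1 Gbit/s link speed are all hypothetical values chosen for illustration; the sketch assumes a single link of constant bandwidth and ignores queuing time and protocol overhead.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical compute node and the data items it already holds."""
    name: str
    local_data: set = field(default_factory=set)


def startup_delay_seconds(node, required, bandwidth_bytes_per_s):
    """Estimate the time spent moving missing data to `node` before a UDF can start.

    `required` maps each data item (dataset, library, ...) to its size in bytes.
    Items already present on the node cost nothing; the rest are transferred
    over a link of the given bandwidth (an illustrative assumption).
    """
    missing_sizes = [
        size for item, size in required.items() if item not in node.local_data
    ]
    return sum((size / bandwidth_bytes_per_s for size in missing_sizes), 0.0)


# Illustrative figures only: a 2 GB dataset and 0.5 GB of libraries,
# moved over a 1 Gbit/s link (125,000,000 bytes per second).
required = {"dataset": 2_000_000_000, "libraries": 500_000_000}
bandwidth = 125_000_000

node_with_data = Node("node-a", {"dataset", "libraries"})
node_with_libraries = Node("node-b", {"libraries"})
node_without_data = Node("node-c")

delay_local = startup_delay_seconds(node_with_data, required, bandwidth)
delay_partial = startup_delay_seconds(node_with_libraries, required, bandwidth)
delay_remote = startup_delay_seconds(node_without_data, required, bandwidth)
```

Under these assumed figures, the node that already holds the requisite data can begin execution immediately, whereas the node holding none of it waits on the order of tens of seconds for the transfer to complete before the UDF begins to execute.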