Content delivery networks (CDNs) allow resources to be delivered to client devices more quickly and at lower cost than traditional client/server technology. A CDN includes multiple servers, typically geographically distributed, that can store and serve files of an origin server. A CDN server may be physically closer to the requesting client than the origin server. This has several advantages, including reducing the time for the client to receive the resource, reducing the bandwidth consumed at the origin server, and reducing the use of other processing resources of the origin server.
CDNs typically operate as either a “push” CDN or a “pull” CDN. In a “push” CDN, the resources are “pushed” to the CDN servers before they are requested. When a resource is requested, the CDN server can respond to the request from its storage instead of querying the origin server. In a “pull” CDN, the resources are retrieved by the CDN server dynamically when requested. For instance, when a first client makes a request for a resource that is not available at the CDN server, that CDN server typically “pulls” the resource from the origin server (it typically sends a request for the resource to the origin server and receives the resource from the origin server in response). Subsequent requests for the resource can be served by the CDN server from its cache instead of querying the origin server.
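The difference between the two modes can be illustrated with a minimal sketch. The class name `EdgeCache`, its methods, and the `fetch_from_origin` callback are hypothetical names chosen for illustration, and the in-memory dictionary stands in for whatever storage a real CDN server would use.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy model of one CDN server's cache (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin  # assumed origin accessor
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how many times the origin was queried

    def push(self, url: str, body: bytes) -> None:
        """Push mode: the resource is stored before any client asks for it."""
        self._store[url] = body

    def get(self, url: str) -> bytes:
        """Pull mode: on a miss, fetch from the origin once, then cache."""
        if url not in self._store:
            self.origin_requests += 1
            self._store[url] = self._fetch_from_origin(url)
        return self._store[url]


# Usage: the second request for the same URL is served from the cache.
origin_files = {"/video.mp4": b"origin bytes"}
edge = EdgeCache(origin_files.__getitem__)
edge.get("/video.mp4")
edge.get("/video.mp4")
print(edge.origin_requests)
```

In this sketch only the first `get` for a URL reaches the origin, and a pushed resource never does.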
Many CDN servers use a multi-process architecture on a machine with many CPUs. These processes (sometimes referred to as “worker” processes) coordinate with each other only minimally, so that they can run largely independently and take advantage of the many CPUs of the server. However, when a resource is not cached, the first request for the resource is handled by a first worker process, and subsequent requests for the same resource may be handled by different worker processes. Different worker processes conventionally do not share data buffers, so data that one worker process is receiving from the origin server is not directly available to the others.
The CDN server can stream the response received from the origin server to the initial requesting client instead of waiting for the entire resource to be received. However, if another request for the same resource is received before the entire resource has been received, the CDN server would conventionally either transmit an additional request to the origin server or set a lock in the cache and wait until the resource is fully received or the lock times out (on a timeout, the CDN server would make an additional request to the origin server). If the requested resource is large (e.g., greater than 500 MB), the cache lock timeout is easily reached, causing more requests to be sent to the origin server. These additional requests consume further origin bandwidth, which exacerbates the problem and also increases latency for the requesting client.
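The effect of the cache lock timeout can be sketched with a simplified timing model. The function name, its parameters, and the numbers in the usage lines are illustrative assumptions. The model also assumes that every origin fetch takes the same time and that a request arriving after the first fetch completes is served from the cache.

```python
from typing import Iterable


def count_origin_fetches(
    arrivals: Iterable[float],
    fetch_seconds: float,
    lock_timeout: float,
) -> int:
    """Count origin requests under a cache-lock-with-timeout policy.

    arrivals: request arrival times in seconds, all for the same resource.
    fetch_seconds: assumed time to receive the whole resource from the origin.
    lock_timeout: how long a request waits on the cache lock before giving up.
    """
    times = sorted(arrivals)
    if not times:
        return 0
    # The first request misses the cache and starts the origin fetch.
    cached_at = times[0] + fetch_seconds
    origin_fetches = 1
    for t in times[1:]:
        if t >= cached_at:
            continue  # resource already fully cached
        if t + lock_timeout >= cached_at:
            continue  # waited on the lock; resource arrived in time
        origin_fetches += 1  # lock timed out; extra request to the origin
    return origin_fetches


# Small resource: waiting requests are released before the lock times out.
print(count_origin_fetches([0, 1, 2], fetch_seconds=3, lock_timeout=5))
# Large resource: every waiting request times out and goes to the origin.
print(count_origin_fetches([0, 1, 2], fetch_seconds=40, lock_timeout=5))
```

Under these assumptions, the number of origin requests grows with the number of clients that arrive while a long fetch is still in progress, which is the behavior described above.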