Information about the availability of resources in a communication device is currently used to optimize exchange of digital resources between communication devices. It is referred to below as cache information.
This is for example the case for caching web proxies that work cooperatively. Each caching web proxy caches a set of digital resources in cache memories and shares its cache information with the other caching web proxies. Then, when a caching web proxy receives a request for a digital resource not stored in its cache memories, it can use the information about the availability of resources in the other caching web proxies to select one of them to handle the request, which avoids querying each of the other caching web proxies in turn.
Cache information about the availability of resources in a communication device is generally implemented through a Bloom filter.
A Bloom filter is a compact data structure for a probabilistic representation of a set of elements. In the example above, the elements are the digital resources hosted by the communication device.
Bloom filter theory is for example disclosed in “Less hashing, same performance: Building a better bloom filter” (A. Kirsch and M. Mitzenmacher, Random Structures & Algorithms 33, no. 2 (2008), pp. 187-218).
A Bloom filter representing a set makes it possible to check whether an element is a member of the set. For an element that is a member of the set, the Bloom filter will always return the correct result, i.e. that the element is a member of the set. That means that false negatives are not possible. However, for an element that is not a member of the set, the Bloom filter may wrongly return, with low probability, that the element is a member of the set. That means that false positives are possible. The probability for a Bloom filter to return a false positive is its error rate.
A Bloom filter of size m is composed of k hash functions having values in the range [0,m−1] and of an array of m Boolean values. m should be greater than k, generally m>>k. For example m=18 and k=3.
To store an element, such as a digital resource, in the Bloom filter, the k hash functions are applied to it to obtain k hash values v1, . . . , vk. Then, for each value vi, the corresponding Boolean value in the array is set to ‘true’, i.e. the Boolean value having the index vi in the array: in other words, the Boolean value at index v1 in the array is set to ‘true’; the Boolean value at index v2 in the array is set to ‘true’; and so on.
Another element may be added to the Bloom filter by again setting the Boolean values at index vi (calculated for that specific element) to ‘true’. Of course, some of these Boolean values may already be set to ‘true’ due to the previous addition of other elements.
Based on a Bloom filter so constructed, the presence of an element in the set represented by the Bloom filter can be tested. To do so, the k hash functions are applied to the element to test, to obtain k hash values v1, . . . , vk. Then, for each value vi, the corresponding Boolean value in the array (Boolean value at index vi) is retrieved.
If all Boolean values at index vi (i=1 . . . k) have the value ‘true’, then the Bloom filter returns that the element to test belongs to the set it represents. However, there is a probability that this result is false: a Bloom filter can return a false positive. This is because the Boolean values at index vi (i=1 . . . k) for the tested element may each have been set to ‘true’ when other elements of the set were added. In general, with a correctly configured Bloom filter, the probability of a false positive is quite low.
Otherwise (if any one of these Boolean values is not ‘true’), the Bloom filter returns that the element to test does not belong to the set it represents. In such a case, the result is always correct: a Bloom filter never returns a false negative. This is because if at least one of the Boolean values is not ‘true’, then the tested element has not been added to the Bloom filter.
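By way of illustration only, the storing and testing operations described above can be sketched in Python as follows. The class name BloomFilter, its method names, and the derivation of the k hash values from a single SHA-256 digest by double hashing are assumptions made for this sketch; any k hash functions having values in the range [0,m−1] could be used instead.

```python
import hashlib


class BloomFilter:
    """Minimal sketch: an array of m Boolean values and k hash functions."""

    def __init__(self, m, k):
        if not 0 < k < m:
            raise ValueError("expected 0 < k < m")
        self.m = m
        self.k = k
        self.bits = [False] * m

    def _indexes(self, element):
        # Assumption: the k hash values v1..vk are derived from one SHA-256
        # digest by double hashing, v_i = (h1 + i * h2) mod m.
        digest = hashlib.sha256(element.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, never zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, element):
        # Store an element: set the Boolean value at each index vi to True.
        for v in self._indexes(element):
            self.bits[v] = True

    def __contains__(self, element):
        # Test an element: positive only if every Boolean value at vi is True.
        return all(self.bits[v] for v in self._indexes(element))


# Hypothetical usage with the sizes given in the text (m=18, k=3).
bf = BloomFilter(m=18, k=3)
bf.add("http://example.com/logo.png")
stored_is_found = "http://example.com/logo.png" in bf  # always True
```

An element that has been added always tests positive, since its k Boolean values are never reset. An element that has not been added usually tests negative, but may test positive if its k indexes happen to have been set by other elements.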
One issue is the probability of a false positive, i.e. the probability that an element is reported to be in the set whereas it is not in the set.
A mathematical study of Bloom filters shows that the probability of false positive for a Bloom filter storing n elements is roughly equal to: $p \approx \left(1 - e^{-kn/m}\right)^{k}$
This formula enables computation of the optimal number k of hash functions to use for minimizing this value when n and m are given:
$k = \frac{m}{n} \ln 2$
Using this value, the probability of false positive can be estimated as: $p \approx 2^{-k}$
For example, when using 10 bits per element (n elements represented by n*10 bits or Boolean values), the number of hash functions should be chosen close to: $k = 10 \ln 2 \approx 6.93$
Using k=7 hash functions leads to a false-positive rate of: $p \approx 2^{-7} \approx 0.008$
This means that with 10 bits per element, the false-positive error rate is below 1%.
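The arithmetic of this example can be checked with a short Python sketch. The function names and the choice of n=1000 elements are assumptions made for illustration; only the ratio m/n matters in the formulas above.

```python
import math


def false_positive_rate(n, m, k):
    """Approximate false-positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_k(m, n):
    """Number of hash functions minimizing the rate: (m/n) * ln 2."""
    return (m / n) * math.log(2)


n = 1000       # hypothetical number of stored elements
m = 10 * n     # 10 bits (Boolean values) per element
k_opt = optimal_k(m, n)              # close to 6.93
k = round(k_opt)                     # nearest integer, 7
p = false_positive_rate(n, m, k)     # below 1%
```

Evaluating the first formula directly with k=7 gives a value slightly above 2^−7, because 7 is only the integer nearest to the optimum, but the result remains below 1%.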
As introduced above, Bloom filters are used by cooperative caching web proxies to share their caching content. The publication “Cache digests” (A. Rousskov and D. Wessels, Computer Networks and ISDN Systems 30, no. 22-23 (1998), pp. 2155-2168) gives an overview of this sharing of caching content.
Each caching web proxy creates a cache digest, i.e. a Bloom filter built from the hash values of the resources it caches, which represents the content of its cache memories.
The caching web proxies share their cache digest with the other cooperative caching web proxies.
On receiving a request for a resource not in its cache memories, a caching web proxy uses the cache digest of the other caching web proxies to check whether any of them has that requested resource in its cache memories. If so, the request is sent to the caching web proxy caching the requested resource.
In this way, the number of requests between proxies is greatly reduced, saving network bandwidth.
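The use of cache digests by cooperating proxies can be sketched as follows. This is a simplified illustration and not the format defined in the “Cache digests” publication: the digest size M, the number K of hash functions, the proxy names, the URLs and all function names are hypothetical.

```python
import hashlib

M, K = 1024, 7  # hypothetical digest size (in bits) and number of hash functions


def indexes(url, m=M, k=K):
    # Assumption: k hash values derived from one SHA-256 digest (double hashing).
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


def build_digest(cached_urls):
    """Bloom filter array representing the content of one proxy's cache."""
    bits = [False] * M
    for url in cached_urls:
        for v in indexes(url):
            bits[v] = True
    return bits


def may_contain(digest, url):
    return all(digest[v] for v in indexes(url))


def select_peers(peer_digests, url):
    """Names of the peers whose digest reports the URL as cached.

    The list may include false positives, but never omits a peer that
    actually advertised the URL in its digest.
    """
    return [name for name, digest in peer_digests.items()
            if may_contain(digest, url)]


# Digests received from two hypothetical cooperating proxies.
peer_digests = {
    "proxy-b": build_digest(["http://example.com/a.css",
                             "http://example.com/b.js"]),
    "proxy-c": build_digest(["http://example.com/logo.png"]),
}
candidates = select_peers(peer_digests, "http://example.com/logo.png")
```

Only the proxies listed in candidates need to be contacted, instead of every cooperating proxy.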
A false positive happens with a probability of p. A false positive means that a caching web proxy sends a request to one of the other caching web proxies for a resource that the latter does not store in its cache memories.
Even if false positives consume bandwidth, the overall result is still a large decrease of bandwidth usage.
Publication U.S. Pat. No. 7,937,428 discloses a system and a method for generating and using a dynamic Bloom filter. In detail, several cascaded Bloom filters are used to represent a set of elements, wherein each time a new element is added to the set, it is also added to a current Bloom filter of the several Bloom filters. As the false-positive error rate of a Bloom filter grows with the number of elements it represents, a Bloom filter is considered full when the number of elements it represents reaches a predefined limit, which corresponds to its error rate reaching a corresponding limit. When the current Bloom filter is full, an additional Bloom filter is created and becomes the current Bloom filter for new elements to add to the set. Given that several Bloom filters then coexist, any request for an element involves checking the current Bloom filter and the previous Bloom filters.
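The cascading principle summarized above may be sketched as follows. This is not the implementation of the cited publication, only an illustration of the general idea as described here; the class name, the capacity parameter and the hashing scheme are assumptions.

```python
import hashlib


def indexes(element, m, k):
    # Assumption: k hash values derived from one SHA-256 digest (double hashing).
    digest = hashlib.sha256(element.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


class DynamicBloomFilter:
    """Cascade of fixed-size Bloom filters; the last one is the current one."""

    def __init__(self, m, k, capacity):
        self.m = m
        self.k = k
        self.capacity = capacity      # predefined limit of elements per filter
        self.filters = [[False] * m]
        self.count = 0                # additions made to the current filter

    def add(self, element):
        # Simplification: every call counts as a new element.
        if self.count >= self.capacity:
            # The current filter is full: create a new current filter.
            self.filters.append([False] * self.m)
            self.count = 0
        current = self.filters[-1]
        for v in indexes(element, self.m, self.k):
            current[v] = True
        self.count += 1

    def __contains__(self, element):
        # A lookup checks the current filter and all previous filters.
        idx = indexes(element, self.m, self.k)
        return any(all(bits[v] for v in idx) for bits in self.filters)


# Hypothetical usage: at most 2 elements per filter, 5 elements added.
dbf = DynamicBloomFilter(m=64, k=3, capacity=2)
names = ["r1", "r2", "r3", "r4", "r5"]
for name in names:
    dbf.add(name)
```

With a capacity of 2, adding five elements leaves three filters in the cascade, and each lookup has to test all three.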
The inventors wished to apply the above Bloom-filter-based sharing of cache information to the “Push” model of communication, in particular to the SPDY protocol.
On the Web, the usual paradigm is the “Pull” model whereby a client device, such as a web browser, requests a main digital resource, such as an HTML web page, from a web server device and receives the requested main resource in response.
The client device then parses the received HTML web page to discover which secondary resources referenced therein (e.g. images, scripts, etc.) are required for fully rendering the web page. It then requests them from the server device and upon receiving the requested secondary resources, it displays the entire web page.
However, new technologies, such as SPDY (standing for “SPeeDY”) improving the well-known HTTP protocol (standing for “Hypertext Transfer Protocol”) for sending web pages over the Internet, have emerged which also provide the above-mentioned “Push” model.
SPDY makes it possible for the server device to push resources to the client device, on its own initiative, over the same network connection as initiated by the original client request. This makes it possible for the server device to push the secondary resources referenced in the main resource requested by the client device, before the latter discovers they are needed for the rendering of the main resource.
Thanks to the SPDY push of resources, web pages can be loaded and rendered faster.
Sharing the cache information of the requesting client device also helps to avoid wasting bandwidth, by enabling the server device to reduce the set of resources to push: the server device will only push resources not yet in the client cache memories.
It is therefore desirable to provide information about the client device's cache memories, using for example a Bloom-filter-based representation, to the server device.
For example, when requesting a main resource from a given web server device, the client device can search its cache memories for all the resources already received, in particular those received from that given web server.
The client device then creates a Bloom filter for representing those resources and sends the created Bloom filter array within the request to the web server device.
From the received Bloom filter array, the web server device is then aware of which resources are already available in the client device and which ones need to be sent to it, with a degree of certainty limited by the probability of false positive.
In the above example of a main requested resource and secondary resources referenced therein, the server device can determine which secondary resources are not yet available in the client device using the Bloom filter array and decide to send those not yet available resources to the client device.
For completeness, a false positive here means that the server device will not push a resource (such as a secondary resource) that the client device does not have in its cache memories.
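The exchange described in this example can be sketched as follows, assuming that the client device and the server device agree on the same hash functions and on the number k of hash functions. The function names, the resource paths and the sizes M and K are hypothetical, and the way the Bloom filter array would be carried in a SPDY request is not shown.

```python
import hashlib


def indexes(url, m, k):
    # Assumption: client and server share this hashing scheme (double hashing
    # over one SHA-256 digest).
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


def build_filter(cached_urls, m, k):
    """Client side: Bloom filter array representing the cached resources."""
    bits = [False] * m
    for url in cached_urls:
        for v in indexes(url, m, k):
            bits[v] = True
    return bits


def resources_to_push(client_bits, k, secondary_urls):
    """Server side: secondary resources the filter reports as not cached."""
    m = len(client_bits)
    return [url for url in secondary_urls
            if not all(client_bits[v] for v in indexes(url, m, k))]


M, K = 1024, 7  # hypothetical filter size (in bits) and number of hash functions

# Client: represents its cache and sends the array within the request.
client_bits = build_filter(["/style.css"], M, K)

# Server: secondary resources referenced in the requested main resource.
secondary = ["/style.css", "/app.js", "/logo.png"]
pushed = resources_to_push(client_bits, K, secondary)
# "/style.css" is never in `pushed`: a cached resource always tests positive.
```

A resource absent from the client cache is normally pushed, except in the case of a false positive, where the filter wrongly reports it as cached and the server device skips it.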
This transposition of the Bloom-filter-based sharing of cache information to SPDY to optimize push of secondary resources raises difficulties.
First, the cache memories of the client device may be quite large and may store thousands of digital resources.
In this context, providing an efficient Bloom filter (i.e. with an acceptable false positive rate) to the server device for these thousands of resources would make the message including such Bloom filter too large for network capacities. Unacceptable delay could therefore be introduced.
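To give an order of magnitude, the size of such a Bloom filter can be estimated from the figure of 10 bits per element used earlier. The function name and the count of 5000 cached resources are assumptions chosen for illustration.

```python
def bloom_filter_bytes(n_resources, bits_per_element=10):
    """Size in bytes of a Bloom filter array, rounded up to a whole byte."""
    bits = n_resources * bits_per_element
    return (bits + 7) // 8


# Hypothetical cache of 5000 resources at 10 bits per element
# (false-positive rate below 1% with 7 hash functions).
size = bloom_filter_bytes(5000)  # 50,000 bits, i.e. 6250 bytes
```

Under these assumptions, every request carrying the full filter would grow by several kilobytes.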
This is all the more prejudicial since such cache information could advantageously be inserted in an existing message, such as a request for a main resource, to reduce overhead.
Also, these resources may have different impacts on the rendering of the main resources on which they depend. The supplying of cache information to the server device should therefore be conducted with a view to providing an efficient push of secondary resources from the server device to the client device, so as to improve the client responsiveness. This covers both the apparent responsiveness, obtained through a quick first rendering of a requested main resource and some of its secondary resources, and the true responsiveness, obtained through an exact full rendering once the main resource and all its secondary resources are fully loaded.
The present invention has been devised to address at least one of the foregoing concerns, in particular to efficiently notify a remote communication device of cache information of a local communication device, so as to ensure acceptable rendering of a main resource by the local communication device where the rendering indirectly depends on the cache information. This is for example the case with the push of secondary resources as introduced above.
The present invention may apply to the Push model of SPDY but also to any case where cache information representing the availability of resources in cache is generated.