Methods and systems disclosed herein relate generally to caching data. More specifically, the methods and systems disclosed herein relate to a method by which high-density, spatially and temporally interactive streaming visual data, such as, for example, but not limited to, video data, may be effectively cached in order to mitigate strain on network and I/O bandwidth.
Current methods for exploitation of spatially and temporally interactive streaming visual data typically involve three main components: the originating data set, which houses the partial or complete collection of the data to be accessed; the client application, which allows a user to view and navigate the available data via interactive query; and the retrieval algorithm, which processes the user's query in order to retrieve data from the originating data set. An interactive query is one which is constructed through a user's interaction with the client's spatial and temporal interface via actions such as continuous playing, seeking in time, panning, and zooming through some defined range of space and time. Each query is thus composed of a bounded spatial range at a single point in time.
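By way of illustration only, a query of this kind may be sketched as a small immutable record, with each user action producing a new query. The names `Query`, `pan`, `zoom`, and `seek`, the rectangular coordinate fields, and the zoom convention (a factor greater than one narrows the range) are assumptions made for this sketch and are not taken from any particular client application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A bounded spatial range at a single point in time (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    t: float


def pan(q: Query, dx: float, dy: float) -> Query:
    """Shift the spatial range without changing its size or its time."""
    return Query(q.x_min + dx, q.y_min + dy, q.x_max + dx, q.y_max + dy, q.t)


def zoom(q: Query, factor: float) -> Query:
    """Scale the spatial range about its center; factor > 1 zooms in."""
    cx = (q.x_min + q.x_max) / 2
    cy = (q.y_min + q.y_max) / 2
    half_w = (q.x_max - q.x_min) / (2 * factor)
    half_h = (q.y_max - q.y_min) / (2 * factor)
    return Query(cx - half_w, cy - half_h, cx + half_w, cy + half_h, q.t)


def seek(q: Query, t: float) -> Query:
    """Keep the spatial range and jump to a different point in time."""
    return Query(q.x_min, q.y_min, q.x_max, q.y_max, t)
```

Because the record is immutable and hashable, the same query issued twice compares equal, which allows it to serve directly as a lookup key for previously retrieved data.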
As data density increases, so does the bandwidth required to fulfill each query. Moreover, as the frequency of requests to the originating data set increases, so does the aggregate latency with which the user receives the data. In the common case of the originating data set being housed remotely from the client, and in situations where multiple clients are viewing the same data, these bandwidth and latency requirements can quickly exacerbate network traffic and lag, making the interactive streaming data unreasonably difficult to view.
Client applications typically implement naïve caches that keep recently retrieved data in memory or on disk to exploit temporal locality (the phenomenon that if a datum has been referenced, it is likely that it will again be referenced in the near future). In instances where the same query is made multiple times within a short time, the retrieval algorithm will bypass the originating data set in favor of the local cache in order to fulfill the query. These caches may implement a Least Recently Used (LRU) policy in order to evict data when the cache becomes full. Slightly more effective caches may exploit spatial locality (the phenomenon that if a datum has been referenced recently, it is likely that nearby data may be referenced) to some degree for eviction policies.
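A naïve cache of the kind described above may be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the implementation of any particular client: the names `LRUCache`, `retrieve`, and `fetch_from_origin` are hypothetical, capacity is counted in entries rather than bytes, and a query is assumed to be representable as a hashable key.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry


def retrieve(key, cache, fetch_from_origin):
    """Fulfill a query from the cache, falling back to the originating data set.

    fetch_from_origin is a hypothetical callable standing in for the
    retrieval from the originating data set.
    """
    data = cache.get(key)
    if data is None:
        data = fetch_from_origin(key)
        cache.put(key, data)
    return data
```

In this sketch a repeated query is served locally only while its entry survives eviction; once the cache is full, each new query displaces the entry that has gone unused the longest, regardless of whether that entry is spatially close to what the user is currently viewing.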
Most retrieval algorithms will retrieve corresponding data to satisfy the user's query each time one is made, only occasionally having the opportunity of bypassing the originating data set with references to the simple cache described above. A more effective retrieval algorithm may prefetch data into the client's cache, guessing at future queries in order to minimize the aggregate latency. In the current state of the art, prefetching may be done using a Region-of-Interest (ROI) detector. However, implementations of these detectors are either crowdsourced, requiring many users to examine a relatively small range of the data, or employ a significant amount of preprocessing overhead to detect ROIs within the interactive streaming data's context. Though these detectors work well in certain situations, they are not considered a general-purpose solution due to their dependence on a smaller search space and customized detection algorithms.
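The general idea of prefetching, as distinct from ROI detection, may be illustrated with a deliberately simple heuristic that guesses future queries by repeating the user's most recent motion. This sketch is not an ROI detector and is not drawn from any cited system. The function names, the `(tile_x, tile_y, frame)` key layout, the use of a plain dictionary as the cache, and the `lookahead` parameter are all assumptions made for illustration.

```python
def predict_next_queries(previous, current, lookahead=2):
    """Guess upcoming queries by linearly extrapolating the last step.

    Keys are hypothetical (tile_x, tile_y, frame) tuples.
    """
    px, py, pt = previous
    cx, cy, ct = current
    dx, dy, dt = cx - px, cy - py, ct - pt
    return [(cx + k * dx, cy + k * dy, ct + k * dt)
            for k in range(1, lookahead + 1)]


def prefetch(cache, previous, current, fetch_from_origin, lookahead=2):
    """Load predicted queries into the cache before the user requests them.

    cache is any dict-like mapping; fetch_from_origin is a hypothetical
    callable standing in for retrieval from the originating data set.
    """
    for key in predict_next_queries(previous, current, lookahead):
        if key not in cache:  # skip data that is already held locally
            cache[key] = fetch_from_origin(key)
```

During continuous playing, for example, the last step changes only the frame, so the heuristic requests the same spatial range at the next frames. A guess that turns out to be wrong costs bandwidth without reducing latency, which is why the quality of the prediction matters.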
What is needed is a method for effectively caching large amounts of data to mitigate the strain on network and I/O bandwidth.