1. Field of the Invention
The present application relates generally to data storage and retrieval, and more particularly, to systems and methods for network acceleration and efficient indexing for caching file systems.
2. Description of Related Art
While low-cost laptops may soon improve computer access for the developing world, their widespread deployment will increase the demands on local networking infrastructure. Locally caching static Web content can alleviate some of this demand, but this approach has limits on its effectiveness, especially in smaller environments.
One option for augmenting Web caches is to use wide area network (WAN) accelerators, devices that compress redundant traffic passing between them, using custom protocols. These devices are application-independent, and can improve the latency and effective bandwidth seen by clients using slow network links. In first-world environments, these devices are commonly used to accelerate communications between a central office and branch offices connected via low-speed WAN links.
WAN accelerators are deployed near edge routers, and work by transparently intercepting and modifying traffic bound for destinations that are also served by WAN accelerators. Traffic to destinations without WAN accelerators is passed through the device unmodified, preserving transparency. For intercepted traffic, the accelerators typically break the data stream into smaller chunks, store these chunks at each accelerator, and then replace future instances of this data with references to the cached chunks. By passing references to the chunks rather than the full data, the accelerator compresses the data stream.
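The chunk-and-reference scheme described above can be illustrated with a minimal sketch. It is not taken from any particular product: the class name `Accelerator`, the fixed 64-byte `CHUNK_SIZE`, the use of SHA-1 digests as chunk references, and the `("raw", ...)` / `("ref", ...)` message format are all assumptions made for illustration. Deployed accelerators use their own custom protocols and may split the stream on content-defined boundaries rather than at fixed offsets.

```python
import hashlib

CHUNK_SIZE = 64  # hypothetical fixed chunk size in bytes (assumption)


class Accelerator:
    """Illustrative accelerator endpoint holding a chunk cache."""

    def __init__(self):
        self.store = {}  # maps SHA-1 digest -> chunk bytes

    def encode(self, data: bytes):
        """Split data into chunks; emit a reference for any chunk already cached."""
        messages = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha1(chunk).digest()
            if digest in self.store:
                # Chunk seen before: send only the 20-byte reference.
                messages.append(("ref", digest))
            else:
                # New chunk: cache it and send the full bytes.
                self.store[digest] = chunk
                messages.append(("raw", chunk))
        return messages

    def decode(self, messages):
        """Rebuild the original stream, caching raw chunks as they arrive."""
        parts = []
        for kind, payload in messages:
            if kind == "raw":
                self.store[hashlib.sha1(payload).digest()] = payload
                parts.append(payload)
            else:
                parts.append(self.store[payload])
        return b"".join(parts)
```

In this sketch, a sender and a receiver each hold their own `Accelerator` instance. The first transfer of a payload carries mostly raw chunks; a repeated transfer of the same payload carries only references, which is where the bandwidth saving comes from.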
Another option for augmenting network caches is to improve the cache storage engine. Large enterprises and ISPs particularly benefit from network caches because they can amortize their cost and management over larger user populations. Cache storage system design has been shaped by this class of users, leading to design decisions that favor first-world usage scenarios. However, because disk capacities have been growing faster than RAM capacities, it is now much cheaper to buy terabytes of disk than a machine capable of indexing that much storage, since most low-end servers have lower memory limits. This disk/RAM linkage makes existing cache storage systems problematic for developing world use, where it may be very desirable to have terabytes of cheap storage (available for less than US $100/TB) attached to cheap, low-power machines. If indexing a terabyte of storage requires 10 GB of RAM (typical for current proxy caches), then these deployments will require server-class machines, with their associated costs and infrastructure. Worse, this memory is dedicated to a single service, making it difficult to deploy consolidated multi-purpose servers. This situation is especially unfortunate, since bandwidth in developing regions is often more expensive, in both relative and absolute terms, than it is in the US and Europe.
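The disk/RAM linkage can be made concrete with a short back-of-the-envelope calculation. The 10 GB per terabyte ratio comes from the text above; the average object size (8,000 bytes) and the per-object index entry size (80 bytes) are hypothetical values chosen only because they reproduce that ratio, and the function name `index_ram_bytes` is likewise illustrative.

```python
TB = 10**12  # decimal terabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes


def index_ram_bytes(disk_bytes, avg_object_bytes, index_bytes_per_object):
    """Estimate RAM needed when every cached object has an in-memory index entry."""
    objects = disk_bytes // avg_object_bytes
    return objects * index_bytes_per_object


# Hypothetical parameters (assumptions): 8,000-byte average object,
# 80-byte in-memory index entry per object.
ram = index_ram_bytes(1 * TB, 8000, 80)
print(ram / GB)  # 10.0
```

Under these assumptions the index memory grows linearly with disk capacity, so attaching several terabytes of inexpensive disk would call for tens of gigabytes of RAM for the index alone.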