The present invention relates to data compression and, more particularly, to a system and method for partial data compression and data transfer.
As connectivity to the World Wide Web grows, Internet traffic and transaction volume continue to rise. According to various sources (such as the Computer Industry Almanac), active Internet users are expected to exceed 850 million worldwide by the end of 2005. Companies, portals, and hosting providers must continuously face the challenge of expanding infrastructure to manage the increase in demand for content and services, and to maintain quality of service. Businesses are dedicating more of their IT budgets to Internet-related services (bandwidth and infrastructure). According to other sources (such as the Cahners In-Stat Group), Internet spending will grow to over 24% of Information Technology budgets in the U.S., or over $200 billion, in 2004.
The challenge of managing quality of service, which is driven primarily by consumer demand and required by competition, cannot be met with infrastructure and content alone. The problem is that as the number of Internet users increases and more consumers (especially broadband users) look for rich content, more megabits of data must be delivered to end users. This forces companies to rely on new services and technologies that optimize current infrastructure investments. Such services and technologies include media compression, network caching, and innovative pricing models for hardware and connectivity. Media compression and pricing have been “squeezed” to give maximum return on investment, but more recently these solutions have not proven successful in sustaining profitability or cost savings per megabit delivered.
As companies turned to network caching solutions, they found that over time the cost per megabit delivered actually increased, with little or no return on investment. Most solutions currently available focus on the end user and are not designed to reduce operating costs for providers and hosting companies. These companies have been forced to optimize pricing models and use the latest media compression algorithms. In addition, web site designers use fewer media and more text when implementing web pages. Eventually, the same problems will occur with text-dominated web sites as the number of connected users increases over time and technology infrastructure becomes more difficult to manage due to size, distribution, and operating costs.
Current solutions for capacity and performance issues fall into two main categories: content caching and compression. These solutions focus on the end user by solving or masking “last mile” issues, either by reducing bandwidth consumption or by distributing the load on web servers across the network to reduce latency. There are serious pitfalls to these two approaches: neither reduces costs or increases revenues for most companies, and neither offers tangible benefits to the end user. In fact, the overall cost of operations usually increases with little or no demonstrable return on investment.
Network caching has proven itself to be effective in managing flash crowding and latency for content providers that do not have rapidly changing, or dynamic, web sites. However, caching requires external hardware and bandwidth that is marked up and resold to the content provider, much the same way data centers operate. The only relief content providers get is not having to manage larger data centers. In effect, a portion of the hosting is outsourced, leading to higher long-term costs. In an outsourced model, data centers are widely distributed across the Internet backbone. Dynamic sites do not benefit because remote servers require continual updating. The only major benefit is to the end user, who can download pages from the edge of the network a little faster than going back to the original source.
Dynamic “on-the-fly” compression reduces throughput requirements and decreases download times for end users. However, the content provider incurs additional cost with this approach, especially with high-volume sites. A problem with on-the-fly compression is that web servers consume additional CPU and memory resources to compress the content “on-the-fly,” leaving fewer resources available to manage connections, transactions, and data transfer. If fewer server resources are available, more servers must be installed to maintain original capacity. This drives operating costs higher, offsetting any savings in bandwidth. Typically, companies that manage high-volume web sites will disable this feature due to the tremendous strain on server hardware and the cost of offsetting the strain with additional hardware.
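The trade-off described above can be illustrated with a minimal sketch, which is not part of the described system. It assumes Python's standard `zlib` module as a stand-in for whatever compressor a web server would use; the function name `compress_on_the_fly` and the sample page are hypothetical. It compresses a single response body and reports both the bytes saved and the CPU time spent, which is the per-request cost a high-volume server would pay on every response.

```python
import time
import zlib
from typing import Tuple


def compress_on_the_fly(body: bytes, level: int = 6) -> Tuple[bytes, float]:
    """Compress one response body; return (compressed bytes, CPU seconds used)."""
    start = time.process_time()
    compressed = zlib.compress(body, level)
    cpu_seconds = time.process_time() - start
    return compressed, cpu_seconds


# Hypothetical page body: repetitive markup, as generated HTML often is.
page = b"<tr><td class='item'>example row</td></tr>\n" * 5000

compressed, cpu_seconds = compress_on_the_fly(page)
saved = 1 - len(compressed) / len(page)

print(f"original: {len(page)} bytes, compressed: {len(compressed)} bytes")
print(f"bandwidth saved: {saved:.1%}")
print(f"CPU time spent on this one response: {cpu_seconds:.6f} s")
```

The bandwidth saving is paid for with server CPU time on each request, so the total cost grows with request volume rather than with the amount of distinct content.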
Other methods include pre-compression of HTML and XML files and partial file transfers. Pre-compressing web page files before hosting them on a web server is not practical, because most sites are database driven and must create web pages dynamically. Partial file transfer is a recent technology developed to deliver only the changes in a web page. This is made possible by the HTTP/1.1 standard, which supports resumable downloads. This may sound ideal; however, it also consumes additional server resources and dramatically decreases infrastructure capacity.
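As a rough illustration of the HTTP/1.1 mechanism behind resumable downloads, the sketch below handles a single byte-range request. It is a simplified example under stated assumptions, not the partial file transfer technology referred to above: the function name `serve_range` is hypothetical, only one range per request is supported, and computing which parts of a page have changed is outside its scope.

```python
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def serve_range(body: bytes, range_header: Optional[str]) -> Response:
    """Answer a request for `body`, honoring a single-range `Range` header."""
    total = len(body)
    full: Response = (200, {"Content-Length": str(total)}, body)

    if not range_header or not range_header.startswith("bytes="):
        return full
    spec = range_header[len("bytes="):]
    if "," in spec:
        return full  # multiple ranges are not handled in this sketch

    first, _, last = spec.partition("-")
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the body.
            start, end = max(total - int(last), 0), total - 1
        else:
            # "bytes=A-B" or the open-ended "bytes=A-".
            start = int(first)
            end = int(last) if last else total - 1
    except ValueError:
        return full  # unparseable header: ignore it and send everything

    end = min(end, total - 1)
    if start > end:
        return 416, {"Content-Range": f"bytes */{total}"}, b""

    payload = body[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(payload)),
    }
    return 206, headers, payload


# A client that already holds the first 1000 bytes asks for the rest.
body = bytes(range(256)) * 10
status, headers, payload = serve_range(body, "bytes=1000-")
print(status, headers["Content-Range"], len(payload))
```

Even in this reduced form, the server must parse and validate the header and slice the content for every request, which hints at the additional per-request work noted above.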
Several attempts at solving some of these problems have been made, but the business models supporting these services have yet to prove themselves successful or profitable. The source of the failure is that these companies target end-user issues rather than enterprise issues. A new solution is needed to reduce costs and increase the quality of service for these companies; as a consequence, the end user will also benefit.
Therefore, it is desirable for the present invention to overcome the conventional problems and limitations associated with content caching and compression.