Technical Field
This application relates generally to methods and systems for modifying web pages to enhance their performance.
Brief Description of the Related Art
Web pages are complicated entities, made up of HyperText Markup Language (HTML), as well as other technologies, such as Cascading Style Sheets (CSS), JavaScript, Flash, and many more. Web pages can be thought of as programs executed by a browser or client, which is capable of executing software code in the abovementioned languages and technologies. Though it is generally transparent to end-users, web pages are often generated upon request, created by running dedicated software on a server when a user request is received. Such dedicated software is called a web application, and it typically uses technologies such as J2EE, PHP, ASP.NET and others.
A web page can be thought of as the software code, such as HTML, XHTML or different versions thereof, that is provided or served in response to a request for a particular and unique URI (uniform resource identifier) or web address, or a pointer thereto. This software code is used by a web client to render or display a page for viewing.
One implication of the complexity of web pages is that there are many ways to achieve the same goal. Two web pages can look the same and function the same way (or at least similarly) for a given client, but their actual underlying content may be very different.
Even when different implementations result in the same or similar interface presented to a user, they may differ greatly in other respects. For example, one page may render much faster than the other; one page may expose a security flaw while the other does not; one page may load successfully in multiple different browsers, while the other may work in only one kind of browser.
As is known in the art, performance-enhancing changes (often referred to as performance optimizations) to web pages are sometimes performed by manipulating the web page after it is generated, using a proxy. A proxy may be realized as a software application able to modify incoming and outgoing communication with the web server. A proxy may be implemented in various ways, including the provision of a separate server machine that traffic to a web server goes through, or of a software proxy deployed as a web-server add-on through which internet traffic is passed. A content delivery network (CDN) may employ a distributed set of proxy servers operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties who designate their content to be delivered to end-users via the CDN. Typically, content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service, which directs end user client machines to one of the CDN's proxy servers to obtain the content provider's content more reliably and efficiently.
Making modifications in a proxy is an alternative to modifying the web application that generates the web page, and can provide several benefits, including lower cost and more flexibility.
In the last few years, there have been examples of proxy-based systems that not only perform the transformation, but also attempt to analyze the page and transform it based on that analysis, in order to enhance the performance of that page.
One known performance enhancement technique is sometimes referred to as resource consolidation. Resource consolidation generally involves combining multiple resources in a given web page into one consolidated resource.
For example, the proxy might combine several cascading style sheet (CSS) files referenced in a given HTML file into one CSS file. If the HTML referenced five external CSS files (e.g., with five separate URIs), combining them into one reference would eliminate four requests when loading the page, and the combined CSS file, when encoded using gzip or other compression, would likely compress more efficiently than compressing the files separately. Hence, a proxy solution may attempt to identify the CSS files in a given page, create a combined file, and modify the HTML to reference that combined CSS file instead. Other kinds of files, such as JavaScript files, can also be consolidated with this technique.
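By way of illustration only, the consolidation steps just described (identify the CSS files, create a combined file, and modify the HTML to reference it) might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular proxy: the function name consolidate_css, the fetch callback, and the /combined.css URI are hypothetical, and the regular expressions handle only simple, quoted link tags. A production system would likely use a full HTML parser and would also need to account for media attributes and relative url() references inside the style sheets.

```python
import re

# Simplified patterns (assumption: attributes are quoted and tags are well formed).
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)
REL_STYLESHEET = re.compile(r"""\brel\s*=\s*["']stylesheet["']""", re.IGNORECASE)
HREF = re.compile(r"""\bhref\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def consolidate_css(html, fetch, combined_uri="/combined.css"):
    """Return (rewritten_html, combined_css).

    `fetch` is a hypothetical callback mapping a CSS URI to its text.
    """
    uris = []

    def replace(match):
        tag = match.group(0)
        href = HREF.search(tag)
        if not REL_STYLESHEET.search(tag) or href is None:
            return tag  # not an external style sheet; leave it untouched
        uris.append(href.group(1))
        # Keep a single reference, at the position of the first style sheet.
        if len(uris) == 1:
            return f'<link rel="stylesheet" href="{combined_uri}">'
        return ""  # drop the remaining, now redundant, references

    rewritten = LINK_TAG.sub(replace, html)
    # Concatenate in document order so the CSS cascade is preserved.
    combined = "\n".join(fetch(uri) for uri in uris)
    return rewritten, combined
```

A caller would supply the page HTML and a way to retrieve each style sheet, then serve the returned combined text at the combined URI. Concatenating in document order is a deliberate choice here, since later CSS rules can override earlier ones.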
However, when a client browser downloads a consolidated resource, none of that resource is evaluated and processed by the browser until the entire resource has arrived. If the consolidated resource is relatively large, it may take a while until the first portion of it (e.g., the portion corresponding to a first JavaScript file that was consolidated into a larger consolidated file) actually gets processed by the browser. This can make consolidation actually degrade performance, for if the resources had not been consolidated but rather retrieved separately, the browser would have started processing them as they arrived, which in some cases would result in better performance.
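The delay described above can be illustrated with a deliberately simplified model. The sketch below assumes that resources are downloaded one after another over a single connection at constant bandwidth, and it ignores latency and per-request overhead; the function name and the example sizes are hypothetical and chosen only to show the arithmetic.

```python
def time_to_first_processing(sizes_bytes, bandwidth_bytes_per_s, consolidated):
    """Seconds until the browser can begin processing the first resource.

    Simplified model (assumption): sequential download at constant
    bandwidth, with latency and request overhead ignored.
    """
    if consolidated:
        # The whole consolidated file must arrive before any part is processed.
        return sum(sizes_bytes) / bandwidth_bytes_per_s
    # Served separately, the first file can be processed once it alone arrives.
    return sizes_bytes[0] / bandwidth_bytes_per_s


# Three hypothetical JavaScript files over a 250,000 bytes-per-second link.
sizes = [50_000, 150_000, 300_000]
separate_s = time_to_first_processing(sizes, 250_000, consolidated=False)
consolidated_s = time_to_first_processing(sizes, 250_000, consolidated=True)
```

Under these assumptions the first file becomes available for processing after 0.2 seconds when served separately, but only after 2.0 seconds when consolidated. The model omits the request savings that consolidation provides, so it shows only the cost side of the trade-off.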
Hence, there is a need for improved techniques for consolidating web page resources. The teachings herein address this need and offer other advantages and functionality that will become clear in view of this disclosure.