The present invention relates to the field of data communications and, more particularly, to a system and method for accelerating data communication associated with, or resulting from, browsing web pages or otherwise accessing content through the World Wide Web, the Internet or another global or local network.
The Internet is an exceedingly popular medium for data communication between computers. The Internet is a hierarchy of many computer networks, all of which are interconnected by various types of server computers. Some of the server computers interconnected through the Internet also store a plurality of web pages or other web content. These web pages may be retrieved by users, also referred to as surfers, operating computers that are connected to the Internet and running browser applications or other similar applications that request, receive and render markup language content or other Internet accessible content. Popular browser applications that are available on the market include, but are not limited to, browsers from Openwave Systems Inc., Opera Mobile Browser (a trademark of Opera Software ASA), Microsoft Internet Explorer (a trademark of Microsoft) and Firefox Web Browser.
Many current web pages are defined by markup language (ML) files, including but not limited to HTML, XML, WML, SGML and HDML. HTML is an acronym for HyperText Markup Language, XML is an acronym for Extensible Markup Language and WML is an acronym for Wireless Markup Language. SGML is an acronym for Standard Generalized Markup Language. HDML is an acronym for Handheld Device Markup Language. It should be noted that the terms “HTML”, “XML”, “SGML”, “HDML” and “WML” may be used interchangeably herein. Henceforth, the description of different embodiments of the present invention may use the terms ‘ML’ and/or ‘HTML’ as a representative term for any of the various forms of markup languages unless specifically limited to a particular markup language.
An ML file contains various commands, instructions, data and references (links) that together define the overall appearance of a web page. Once the ML file is fetched and rendered using a browser or other similar application, the intended web page can be displayed on a display device. Common HTML files may comprise information that is relevant to the web page and is needed for rendering the requested web page. This information may include, but is not limited or required to include, a style sheet, text, images, JavaScripts, links to a style sheet, links to JavaScripts, links to additional objects, links to other web pages, etc. A web page can be composed of a plurality of objects or segments that together make up the web page. The objects or various parts of a web page can be distributed over a plurality of servers.
Oftentimes, an HTML file comprises links to the above-described objects rather than the objects themselves. This technique is widely utilized; thus, most HTML files include basic text and links to style sheets, JavaScripts, images and other objects, and not the style sheets or other objects themselves. Objects that are used by the browser itself are referred to as browser's objects. The links to the browser's objects are fetched automatically by the browser application during parsing of the page; these links are referred to as browser links. Links such as, but not limited to, links to JavaScripts, style sheets and images can be referred to as browser links. Other links are displayed as a portion of the rendered page and may appear as highlighted, underlined, colored or otherwise differentiated text or graphic elements. These links are available to be selected by the surfer and, as such, are referred to as surfer links or user links. As an example, a surfer link, when selected, may result in navigating to another web page.
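The distinction between browser links and surfer links can be illustrated with a minimal sketch. It assumes, for illustration only, that browser links are the `src` attributes of `script` and `img` tags and the `href` attributes of style sheet `link` tags, and that surfer links are the `href` attributes of `a` tags; the names `LinkClassifier`, `classify_links` and `PAGE` are hypothetical and do not come from any referenced method.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Split the links of an HTML file into browser links and surfer links."""

    def __init__(self):
        super().__init__()
        self.browser_links = []  # fetched automatically while the page is parsed
        self.surfer_links = []   # followed only when the surfer selects them

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag in ("script", "img") and attrs.get("src"):
            self.browser_links.append(attrs["src"])
        elif tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.browser_links.append(attrs["href"])
        elif tag == "a" and attrs.get("href"):
            self.surfer_links.append(attrs["href"])


def classify_links(html):
    """Return (browser_links, surfer_links) in document order."""
    parser = LinkClassifier()
    parser.feed(html)
    parser.close()
    return parser.browser_links, parser.surfer_links


# Hypothetical page used only for illustration.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body>
<img src="/logo.png">
<a href="/next.html">Next page</a>
</body></html>"""

browser_links, surfer_links = classify_links(PAGE)
print(browser_links)  # ['/style.css', '/app.js', '/logo.png']
print(surfer_links)   # ['/next.html']
```

A real browser recognizes more kinds of browser links (frames, fonts, objects referenced from within style sheets, etc.); the sketch covers only the three kinds named above.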
While surfing the World Wide Web (WWW), a surfer (a user), utilizing a browser-equipped computer or endpoint device, may send an HTTP (Hypertext Transfer Protocol) request to a web server. In response, the web server may send an HTML file that defines the requested web page. In the disclosure of different embodiments of the present invention, the term computer or endpoint represents any type of data communication device including, but not limited to, a laptop computer, a PDA, a cellular unit, a notebook computer, a personal computer, etc. Upon receiving the HTML file, the browser begins parsing the HTML file. When a browser link is found, the browser may stop or pause the parsing process and search its cache for an object that was previously fetched from the same link. If no such object exists, the browser establishes a new connection according to the browser link or uses an existing connection, waits to get the object, parses the object and may then continue parsing the HTML file. In the description below, the term transaction is used for sending a request for an object and getting a response to the request. In some cases, for example when a browser link points to a style sheet, presentation of the text can be delayed until the style sheet is received. Fetching the browser links during the rendering of the web page increases the time needed for presenting the web page and has an impact on the experience of the surfer.
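The cache lookup and transaction flow described above can be sketched as follows. This is a simplified model, not a browser implementation: the cache is assumed to be a plain dictionary keyed by link, and `SERVER`, `fetch` and `load_page` are hypothetical names standing in for the content servers and the browser's fetching logic.

```python
def load_page(browser_links, cache, fetch):
    """Resolve each browser link from the cache or through one transaction.

    Returns the number of transactions (request plus response) performed.
    """
    transactions = 0
    for link in browser_links:
        if link not in cache:          # no previously fetched object for this link
            cache[link] = fetch(link)  # one transaction; parsing waits meanwhile
            transactions += 1
    return transactions


# Hypothetical content servers: link -> object body.
SERVER = {"/style.css": b"body{}", "/app.js": b"init()", "/logo.png": b"PNG"}


def fetch(link):
    return SERVER[link]


cache = {"/logo.png": b"PNG"}                   # one object is already cached
first = load_page(list(SERVER), cache, fetch)   # two objects must be fetched
second = load_page(list(SERVER), cache, fetch)  # everything is now cached
print(first, second)  # 2 0
```

Each transaction in the first pass adds a request and response round trip to the time needed for presenting the page, which is the cost discussed in this background.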
Furthermore, fetching the objects by the browser increases the load on the Internet and increases the time for fetching the page, due to the overhead of setting up new transactions with the plurality of servers at which the objects are stored as well as the time that it takes to send each request and get the response. This problem is exacerbated when the connection of the surfer's computer is provided over a narrow bandwidth link, such as a cellular link, or when the web servers are overloaded.
Some techniques have been set forth to reduce the impact resulting from the fetching of a plurality of objects. Some of the methods convert an HTML file into another file format, such as a bitmap, to be transmitted to a surfer. Another prior art method, which is disclosed in U.S. patent application Ser. No. 11/576,820, uses an intermediate device operating between a surfer who requested a web page and a content server that delivers the web page. The intermediate device intercepts a markup language file that is received from the content server and that is directed to a requesting client device. The ML file is parsed by the intermediate device to determine whether it contains at least one browser link. If a browser link is identified, the browser object that is associated with the identified browser link is fetched and placed into the ML file instead of the browser link, creating a modified markup language file that includes the objects instead of the browser links. The modified markup language file is transferred to the requesting client device. This method can reduce the amount of time that is required to transmit the content to the client device and thus improve the user experience.
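The general idea of placing browser objects into the ML file instead of the browser links can be sketched as follows. This is an illustrative simplification and not the method of the referenced application: it assumes that only style sheet and script links written in one fixed attribute order are handled, and `OBJECTS`, `inline_browser_objects` and the sample markup are hypothetical.

```python
import re

# Hypothetical stand-in for the content servers: link -> object body.
OBJECTS = {
    "/style.css": "body { color: black; }",
    "/app.js": "console.log('ready');",
}

# Simplified patterns; real markup allows other attribute orders and quoting.
STYLE_LINK = re.compile(r'<link\s+rel="stylesheet"\s+href="([^"]+)"\s*/?>', re.IGNORECASE)
SCRIPT_LINK = re.compile(r'<script\s+src="([^"]+)"\s*>\s*</script>', re.IGNORECASE)


def inline_browser_objects(ml_file, fetch):
    """Return a modified ML file with fetched objects in place of browser links.

    A link whose object cannot be fetched (fetch returns None) is left untouched.
    """
    def replace_with(tag):
        def replace(match):
            body = fetch(match.group(1))
            if body is None:
                return match.group(0)
            return "<%s>%s</%s>" % (tag, body, tag)
        return replace

    ml_file = STYLE_LINK.sub(replace_with("style"), ml_file)
    return SCRIPT_LINK.sub(replace_with("script"), ml_file)


ORIGINAL = (
    '<head><link rel="stylesheet" href="/style.css">'
    '<script src="/app.js"></script>'
    '<script src="/missing.js"></script></head>'
)
modified = inline_browser_objects(ORIGINAL, OBJECTS.get)
print(modified)
```

In this sketch the client receives the style sheet and the script inside the modified file and needs no further transaction for them, while the link that could not be resolved still triggers an ordinary fetch by the browser.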
Other techniques aimed at improving the performance of the browser may use an intermediate device for fetching the browser objects, aggregating the fetched data, combining the browser objects into one or more compound objects, such as a multipart object, and transmitting the compound objects to the browser. An exemplary compound object can be a multipart object, an XML file, an archive object such as a zip file, etc. The terms multipart object and compound object can be used interchangeably, and the term multipart can be used as a representative name for any of the above group of compound objects. Because the intermediate device fetches the additional files referenced within the HTML file, the client device does not have to make additional data requests to retrieve these additional files. Exemplary prior art methods that implement compound objects are disclosed in U.S. patent application Ser. Nos. 11/686,495 and 11/462,355, the contents of which are incorporated herein by reference. Other methods have been published in PCT international publications such as WO 2007/009255 and WO 2007/008291, or in US publications US 2006/0,271,642 and US 2006/0,224,700.
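Combining several browser objects into one compound object can be sketched with a MIME multipart container. The sketch is an assumption made for illustration and does not reproduce the format of any referenced application: each object is carried as a base64-encoded part labeled with a `Content-Location` header, and `build_compound_object`, `unpack_compound_object` and `OBJECTS` are hypothetical names.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_compound_object(objects):
    """Pack a mapping of link -> object bytes into one multipart object."""
    compound = MIMEMultipart("related")
    for link, payload in objects.items():
        part = MIMEApplication(payload)             # base64-encoded octet-stream part
        part.add_header("Content-Location", link)   # records which link the part answers
        compound.attach(part)
    return compound.as_bytes()


def unpack_compound_object(data):
    """Recover the mapping of link -> object bytes from a multipart object."""
    message = message_from_bytes(data)
    return {
        part["Content-Location"]: part.get_payload(decode=True)
        for part in message.get_payload()
    }


# Hypothetical browser objects fetched by the intermediate device.
OBJECTS = {
    "/style.css": b"body { color: black; }",
    "/logo.png": b"\x89PNG\r\n",
}

wire = build_compound_object(OBJECTS)      # a single object to transmit
received = unpack_compound_object(wire)    # what the receiving side extracts
print(received == OBJECTS)  # True
```

Transmitting `wire` replaces one transaction per object with a single transfer; the base64 encoding used here enlarges the payload, so an actual implementation would likely choose a more compact binary encoding.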
Although the above-described techniques do provide some level of improvement in the user's experience in surfing the web, they have some limitations and fall short of what is needed in the art. For example, some older methods, which convert the web page into another format, eliminate the benefits of using a common web page: a user cannot copy a section of the converted web page and use it in his own document. Other techniques are not aware of the content of the cache in the user's device, or may only estimate its content. Such techniques may send browser objects that already exist in the browser's cache. Some of the techniques may require a modification of the user's device to include a module that can communicate with the intermediate device. The module can, for example, inform the intermediate device about the content of the browser's cache. Some of the methods need an additional connection and a unique protocol in order to report the content of the cache.
Therefore, there is a need in the art for a system and a method for reducing the number of requests a browser sends during the process of rendering a web page. Furthermore, there is a need for a system that is aware of the content of the cache of a requester's device and can operate to send only objects that are not in the cache or that have been changed. In addition, there is a need for such a system not to require any modification of the user or client devices. Due to the huge number of users, each modification can involve drastic technical and marketing problems. Such a system may reduce the download time of a web page, improve the time to text of the page (which is defined as the time to download and display a requested web page) and reduce the load over the net.