§ 1.1 Field of the Invention
The present invention concerns reformatting content. In particular, the present invention concerns reformatting content intended to be rendered on a larger display screen (such as typical 15 inch to 21 inch monitors used with personal computers for example), for rendering on a smaller screen. The present invention also concerns identifying navigation bars in general. Such general navigation bar identification may be useful in other applications, such as for purposes of determining whether or not to index text, or whether or not to show text in a snippet search result, for example.
§ 1.2 Related Art
The description of art in this section is not, and should not be interpreted to be, an admission that such art is prior art to the present invention. The Internet and traditional modes of Internet access are introduced in § 1.2.1 below. The growth of information access via portable (e.g., wireless) devices having smaller display screens, as well as challenges related to such access and rendering, are then introduced in § 1.2.2 below.
§ 1.2.1 The Internet and Traditional Modes of Internet Access
In recent decades, and starting in the 1990s in particular, computers have become interconnected by networks to an ever-increasing extent; initially, via local area networks (or “LANs”), and more recently via LANs, private wide area networks (or “WANs”) and the Internet. The proliferation of networks, in conjunction with the increased availability of inexpensive data storage means, has afforded computer users unprecedented access to a wealth of content. Such content may be presented to a user (or “rendered”) in the form of text, images, audio, video, etc. Such content is referred to as “documents” or “pages” without loss of generality.
Since the Internet permits many different computers, platforms, and applications to access information, standard “markup languages” have been adopted so that documents retain formatting, indexing, and linking information, without regard to the type of computer, platform, or application supported on the device accessing and rendering the documents. The most common markup languages are briefly introduced below.
Initially, the standard generalized markup language (or “SGML”) was adopted by the International Organization for Standardization (“the ISO”) in 1986 as a means for providing platform and application independent documents that retain formatting, indexing, and linking information. SGML does so by providing a grammar-like mechanism for users to define the structure of their documents, as well as tags used to denote the structure in individual documents.
The hypertext markup language (or “HTML”) is an application of SGML used for documents on the World Wide Web. HTML uses tags to mark elements (such as text and graphics for example) in a document to indicate how Web browsers should display such elements, as well as to indicate how Web browsers should respond to user actions, such as the activation of a link (such as by a mouse click for example).
Finally, the extensible markup language (or “XML”) is a condensed form of SGML which lets Web developers and designers create customized tags that offer greater flexibility in organizing and presenting information than is possible with HTML.
§ 1.2.2 The Growth of Content Access Via, and Content Rendering on, Devices with Smaller Display Screens
The markup languages introduced above permit Web document authors and designers to format their documents to effectively communicate information to their intended audience. Web document authors and designers often consider many design factors, such as ease of navigation, logical organization, consistency, efficient downloading, etc. The display device on which the intended audience is expected to render such Web documents is a major consideration in the design and authoring of Web sites and Web documents. Traditionally, most access to documents on the World Wide Web has been by means of personal computers, most of which are equipped with 15 inch to 21 inch monitors, and many of which support resolutions of 640-by-480 pixels, and beyond. Accordingly, most documents on the World Wide Web are designed for display on desktop computers with color monitors having at least 640-by-480 resolution.
Internet access and the rendering of Web documents by devices with smaller displays, such as personal electronic devices, cell phones, mobile phones, and other portable and/or untethered information access and communication devices, are rapidly increasing. The wireless markup language (or “WML”) has been introduced to facilitate authoring documents for rendering on devices with limited display area, limited memory and processing resources, and lower bandwidth connections.
§ 1.2.2.1 Challenges Related to Reformatting Content for Rendering on Smaller Display Screens
The intended audience for many Web sites and Web documents will often include users accessing and rendering such Web sites and Web documents with traditional desktop computers having larger displays, as well as users accessing and rendering such Web sites with devices having smaller displays (as well as users who will use both types of devices). Such a heterogeneous audience using various types of devices presents a dilemma for Web authors and Web designers: how can different intended audiences, using different access and/or rendering devices, be satisfied?
The article, T. Bickmore et al., “Digestor: Device-independent Access to the World Wide Web,” Proc. of the Sixth World Wide Web Conference, downloaded from http://decweb.ethz.ch/WWW6/Technical/Paper177/Paper177.html (hereafter referred to as “the Digestor article”, which is expressly incorporated herein by reference), notes that there are four general approaches to displaying Web documents on devices with small display screens: namely, device-specific authoring, multiple-device authoring, client-side navigation, and automated re-authoring.
As their names imply, device-specific authoring and multiple-device authoring involve authoring separate Web documents for specific devices and groups of devices, respectively. Obviously, such approaches require more labor for authoring multiple versions of the same underlying content, require more storage to store such multiple versions of the same underlying content, and require special signaling protocols to determine which device is requesting the Web document.
In the client-side navigation approach, the user can interactively navigate a single Web page by altering the portion of it that is displayed at any given time (e.g., by scrolling, zooming, panning, expanding, collapsing, etc.). However, it is believed that such an approach is not well suited for devices with limited memory and processing resources, as well as limited bandwidth access. Further, it is believed that such required user interaction would become annoying to many users.
Automated re-authoring involves transforming an arbitrary Web document, such as a Web document designed for a desktop computer with a typical display device, to a document that can be appropriately displayed on a target display device. This transformation may take place on the client device, on a server, and/or on an intermediate proxy server. Many factors should be considered in designing a satisfactory automated re-authoring tool. For example, it may be desirable for the re-authoring process to treat different individual components of a Web document differently.
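By way of illustration only, and not as a description of the present invention or of the Digestor article's method, the following Python sketch shows what treating different components of a Web document differently might look like in a simple re-authoring pass. It assumes one hypothetical rule: image elements are replaced by their alternate text (or by the placeholder "image" when none is given), while all other markup and text pass through unchanged. The names `SmallScreenReauthor` and `reauthor` are hypothetical, and the sketch does not give special handling to script or style content.

```python
from html import escape
from html.parser import HTMLParser


class SmallScreenReauthor(HTMLParser):
    """Hypothetical pass: images become bracketed alt text; the rest is kept."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _emit_start(self, tag, attrs):
        if tag == "img":
            # Component-specific treatment (an assumed rule, for illustration).
            alt = dict(attrs).get("alt") or "image"
            self.out.append("[" + escape(alt, quote=False) + "]")
        else:
            # Pass every other start tag through exactly as written.
            self.out.append(self.get_starttag_text())

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, with no end tag.
        self._emit_start(tag, attrs)

    def handle_endtag(self, tag):
        if tag != "img":
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        # Text arrives unescaped, so re-escape it before writing it out.
        self.out.append(escape(data, quote=False))


def reauthor(html_text: str) -> str:
    """Return a re-authored copy of html_text under the assumed image rule."""
    parser = SmallScreenReauthor()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(reauthor('<p>Hi <img src="a.png" alt="Logo"> there</p>'))
```

A fuller re-authoring tool would presumably apply many such component-specific rules, and could run on the client device, on a server, or on an intermediate proxy server, as noted above.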
It is a goal of the present invention to provide a utility that could be used as a part of a re-authoring process. In this regard, it is a goal of the present invention to detect special components, such as navigation bars and/or objectionable navigation bars for example, of Web documents, so that such components may receive special treatment by a re-authoring process.