The present invention relates generally to computer files and, more particularly, to the display of computer files.
The Internet is a worldwide decentralized network of computers having the ability to communicate with each other. The Internet has gained broad recognition as a viable medium for communicating and interacting across multiple networks. The World Wide Web (Web) was created in the early 1990s, and comprises server-hosting computers (web servers) connected to the Internet that have hypertext documents or web pages stored therewithin. Web pages are accessible by client programs (i.e., web browsers) utilizing the Hypertext Transfer Protocol (HTTP) via a Transmission Control Protocol/Internet Protocol (TCP/IP) connection between a client-hosting device and a server-hosting device.
An intranet is a private computer network conventionally contained within an enterprise and that conventionally includes one or more servers in communication with multiple user computers. An intranet may be comprised of interlinked local area networks and may also use leased lines in a wide-area network. An intranet may or may not include connections to the outside Internet. Intranets conventionally utilize various Internet protocols and, in general, often look like private versions of the Internet. An intranet user conventionally accesses an intranet server via a web browser running locally on his/her computer.
Exemplary web browsers for both Internet and intranet use include Netscape Navigator® (Netscape Communications Corporation, Mountain View, Calif.) and Internet Explorer® (Microsoft Corporation, Redmond, Wash.). Web browsers typically provide a graphical user interface for retrieving and viewing information, applications and other resources hosted by Internet/intranet servers (hereinafter collectively referred to as “web servers” or “web sites”).
Web content including, but not limited to, information, applications, applets and other video and audio resources (collectively referred to herein as “files”) is conventionally delivered from a web server to a web browser on a user's computer in the form of web pages. As is known to those skilled in this art, a web page is conventionally formatted via a standard page description language such as HyperText Markup Language (HTML), and typically displays text and graphics, and can play sound, animation, and video data. HTML provides basic document formatting and allows a web content provider to specify hypertext links (typically manifested as highlighted text) to other servers and files. When a user selects a particular hypertext link, a web browser reads and interprets the address, called a Uniform Resource Locator (URL), associated with the link, connects the web browser with the web server at that address, and makes an HTTP request for the file identified in the link. The web server then sends the requested file to the client in HTML format, which the browser interprets and displays to the user.
With the increasing mobility of today's society, the demand for mobile computing capabilities has also increased. Many workers and professionals are downsizing their laptop computers to smaller palm-top or hand-held devices, such as personal digital assistants (PDAs). In addition, many people now utilize cellular telephones to access the Internet and to perform various other computing functions. Hand-held computing devices including, but not limited to, PDAs, cellular telephones, and computing devices utilized within appliances and automobiles, are often collectively referred to as “pervasive” computing devices. Many hand-held computing devices utilize the Microsoft® Windows CE and 3Com Palm Computing® platforms.
Unfortunately, hand-held computing devices may have displays that are small in size compared with desktop computer displays. As a result, images and text otherwise displayable on a desktop computer display may not be displayable on a hand-held computing device display. For example, a desktop computer display having an array of 1024 pixels by 800 pixels may be able to display a large (e.g., 2 megabit), 32-bit-per-pixel color image. A hand-held computing device with a display having an array of 120 pixels by 120 pixels, and with the ability to display only about 3 bits per pixel, may ignore much of the image data. As a result, the image may not be displayed properly, if at all, via the hand-held computing device display. Furthermore, text within a file may have a particular font or size that can hinder the display thereof within a hand-held computing device display.
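To make the gap between the two displays concrete, the following short sketch compares how many bits of image data each display in the example above can present at once. The function name `framebuffer_bits` is hypothetical and is used only for illustration; the dimensions and bit depths are the ones given in the example.

```python
def framebuffer_bits(width, height, bits_per_pixel):
    """Bits needed to describe one full-screen image at the given depth."""
    return width * height * bits_per_pixel

# Figures taken from the example above.
desktop_bits = framebuffer_bits(1024, 800, 32)
handheld_bits = framebuffer_bits(120, 120, 3)

# How many times more image data the desktop display can present.
ratio = desktop_bits / handheld_bits

print(desktop_bits, handheld_bits, round(ratio))
```

Under these figures the desktop display can present several hundred times more image data than the hand-held display, which is consistent with the observation that much of the image data may be ignored.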
Files that may not be displayable via a hand-held computing device display can typically be transformed into a format that is displayable within a hand-held computing device display. For example, large, high resolution, color images can be transformed into small, black and white images that can be displayed within small, low resolution displays. Furthermore, because some web servers can recognize the type of client device requesting a file, files in the proper format for display via the requesting client device can be provided.
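One possible form of the image transformation mentioned above is sketched below. It is a minimal illustration, not a prescribed method: it assumes an image held in memory as rows of (red, green, blue) tuples, shrinks it by nearest-neighbor sampling, and reduces each sampled pixel to black or white using a brightness threshold. The function name `to_small_monochrome`, the threshold value, and the luminance weights are assumptions introduced here.

```python
def to_small_monochrome(pixels, out_w, out_h, threshold=128):
    """Shrink an RGB image and reduce it to 1 bit per pixel.

    pixels: list of rows, each row a list of (r, g, b) tuples, 0-255.
    Returns out_h rows of out_w values, 1 for white and 0 for black.
    """
    src_h = len(pixels)
    src_w = len(pixels[0])
    result = []
    for y in range(out_h):
        src_y = y * src_h // out_h          # nearest-neighbor source row
        row = []
        for x in range(out_w):
            src_x = x * src_w // out_w      # nearest-neighbor source column
            r, g, b = pixels[src_y][src_x]
            # Integer approximation of perceived brightness (assumed weights).
            luma = (299 * r + 587 * g + 114 * b) // 1000
            row.append(1 if luma >= threshold else 0)
        result.append(row)
    return result

# A 4x4 test image: left half white, right half black.
image = [[(255, 255, 255)] * 2 + [(0, 0, 0)] * 2 for _ in range(4)]
small = to_small_monochrome(image, 2, 2)
print(small)
```

Nearest-neighbor sampling is chosen here only for brevity; a production transformation would more likely average neighboring pixels or apply dithering.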
Unfortunately, an enormous number of files can reside within a web site, both on the Internet and on intranets. Furthermore, an enormous number of files are added to web sites every day. As a result, the task of identifying files within a web site that have characteristics that can hinder the display thereof via a hand-held computing device may be difficult.
In view of the above discussion, it is an object of the present invention to facilitate the identification of web site files that may be difficult to display via hand-held computing devices.
It is another object of the present invention to facilitate the identification of web site files having one or more characteristics that do not comply with other files within a web site.
These and other objects of the present invention are provided by systems, methods and computer program products for identifying files, such as web pages, from among a plurality of hierarchically-related files within a web site, wherein each of the identified files has one or more characteristics that can hinder or prohibit display thereof via a hand-held computing device in communication with the web site. Operations include selecting a file from among a plurality of hierarchically-related files within a web site. The selected file and files hierarchically-related to the selected file are then analyzed via a web crawler configured to identify characteristics that can hinder display of a respective file within a display of a hand-held computing device. For example, the size, font, style and language of text within a file can be analyzed. Also, the format and size of image files can be analyzed.
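The operations described above can be sketched as a breadth-first traversal over linked files. The sketch below is illustrative only: it replaces real network fetching and parsing with a hypothetical in-memory table `SITE`, and the limits in `LIMITS`, the field names, and the functions `find_issues` and `crawl` are all assumptions introduced here rather than details taken from the invention.

```python
from collections import deque

# Hypothetical limits for a target hand-held device.
LIMITS = {"max_width": 120, "max_height": 120,
          "max_bits_per_pixel": 3, "max_font_pt": 12}

# Hypothetical in-memory stand-in for a web site: path -> file metadata.
SITE = {
    "/index.html": {"kind": "page", "font_pt": 10,
                    "links": ["/about.html", "/logo.png"]},
    "/about.html": {"kind": "page", "font_pt": 24,
                    "links": ["/photo.jpg", "/index.html"]},
    "/logo.png": {"kind": "image", "width": 100, "height": 40,
                  "bits_per_pixel": 1, "links": []},
    "/photo.jpg": {"kind": "image", "width": 1024, "height": 800,
                   "bits_per_pixel": 32, "links": []},
}

def find_issues(info, limits):
    """Return the characteristics of one file that may hinder display."""
    issues = []
    if info["kind"] == "page" and info["font_pt"] > limits["max_font_pt"]:
        issues.append("font too large")
    if info["kind"] == "image":
        if (info["width"] > limits["max_width"]
                or info["height"] > limits["max_height"]):
            issues.append("image too large")
        if info["bits_per_pixel"] > limits["max_bits_per_pixel"]:
            issues.append("color depth too high")
    return issues

def crawl(site, start, limits):
    """Visit the selected file and every file reachable from it."""
    flagged = {}
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        issues = find_issues(site[path], limits)
        if issues:
            flagged[path] = issues
        for target in site[path]["links"]:
            if target in site and target not in seen:  # avoid revisiting
                seen.add(target)
                queue.append(target)
    return flagged

flagged = crawl(SITE, "/index.html", LIMITS)
print(flagged)
```

The `seen` set keeps the traversal from looping when pages link back to one another, as `/about.html` does to `/index.html` in this sample data.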
A directed graph representation of the plurality of hierarchically-related files within a web site is then displayed. Each file having at least one characteristic that can hinder or prohibit display thereof via a hand-held computing device is identified within the directed graph representation. Suggestions as to how to transform an identified file so as to be displayable via a hand-held computing device may also be provided.
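A simple way to picture such a display is a text outline of the link graph in which identified files carry a marker and a suggestion. The sketch below assumes a hypothetical link table `LINKS` and a hypothetical table `FLAGGED` mapping identified files to suggested transformations; an actual embodiment would presumably use a graphical rendering rather than indented text.

```python
# Hypothetical directed link structure: file -> files it links to.
LINKS = {
    "/index.html": ["/about.html", "/logo.png"],
    "/about.html": ["/photo.jpg", "/index.html"],
    "/logo.png": [],
    "/photo.jpg": [],
}

# Hypothetical identified files and the suggested transformation for each.
FLAGGED = {"/photo.jpg": "reduce to a small black and white image"}

def render(links, flagged, root):
    """Return an indented outline of the graph with identified files marked."""
    lines = []

    def visit(node, depth, path):
        mark = "[!] " if node in flagged else "    "
        line = mark + "  " * depth + node
        if node in flagged:
            line += "  (suggestion: " + flagged[node] + ")"
        lines.append(line)
        for child in links.get(node, []):
            if child not in path:  # skip links back to an ancestor
                visit(child, depth + 1, path | {child})

    visit(root, 0, {root})
    return lines

outline = render(LINKS, FLAGGED, "/index.html")
for line in outline:
    print(line)
```

Links that point back to a file already on the current path are skipped so that the outline terminates even when the site contains cycles.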
According to another embodiment, the present invention may be utilized to determine whether files within a web site comply with a style and/or format. For example, using the present invention, a determination can be easily made whether all web pages within a web site contain text in English.
The present invention can be advantageous because web content providers can quickly and easily identify files having one or more characteristics that may render the display thereof difficult or impossible via a hand-held computing device. Furthermore, the present invention can provide web content providers with suggested formats into which files can be transformed so as to be displayable via a hand-held computing device.