Finding relevant, accurate business information on the Web in an efficient manner can still be a challenge. While a variety of national, regional and local Web sites and portals provide basic information for consumers, those sites have many shortcomings. For example, a consumer can find basic information about restaurants such as restaurant type, street address, phone number and hours of operation with just a few keystrokes or mouse clicks. Additional information, which can be critical to selecting a restaurant, is more difficult or even impossible to determine without an extensive search of the establishment's Web site. For example, a consumer may wish to know about payment options, availability of a kids menu, approximate price range, dress code or the daily specials. A phone call to the restaurant may be necessary to answer all such questions clearly, which becomes impractical when even a quick search identifies several possible restaurants.
Search engines such as Google™ and Yahoo!® search for information on the Web. The information may be Web pages, images and other types of files. The search function generally includes three steps: visiting and caching Web pages, Web indexing and presenting search results to the user.
Search engines may employ a Web crawler, an automated program that systematically browses the Web and returns the most recent revision of each page it encounters to be copied or cached. The Web is dynamic: pages are constantly being added, changed or deleted. Once pages are cached, they can be processed with one or more algorithms to rank or index them. That process, referred to as Web indexing, serves to optimize speed and performance when a specific search request is made. Web pages are typically crawled at some fixed frequency, such as daily, weekly or monthly, which can lead to outdated or incorrect information being returned by a search.
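The crawl-and-cache cycle described above can be sketched as follows. This is a minimal illustration, not a production crawler: the page store is an in-memory dictionary standing in for live Web fetches, and the page names, fields and link structure are all invented for the example.

```python
# Minimal sketch of the crawl-and-cache cycle: visit each page once,
# cache its content, and follow its links breadth-first.
from collections import deque

def crawl(seed_urls, fetch, extract_links, max_pages=100):
    """Breadth-first crawl: visit each page once and cache its content."""
    cache = {}
    frontier = deque(seed_urls)
    while frontier and len(cache) < max_pages:
        url = frontier.popleft()
        if url in cache:
            continue  # already cached on an earlier visit
        page = fetch(url)
        cache[url] = page
        for link in extract_links(page):
            if link not in cache:
                frontier.append(link)
    return cache

# Stand-in "Web": three hypothetical pages linking to one another.
PAGES = {
    "a": {"text": "restaurant menu", "links": ["b"]},
    "b": {"text": "hours and address", "links": ["c", "a"]},
    "c": {"text": "daily specials", "links": []},
}

cache = crawl(["a"], fetch=PAGES.get, extract_links=lambda p: p["links"])
print(sorted(cache))  # ['a', 'b', 'c']
```

Because the cache is a snapshot taken at crawl time, any page edited after its visit is stale until the next crawl, which is the source of the outdated results noted above.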
Web pages are designed to be viewed or read by people. During indexing, however, words, context and intended meaning may be inadvertently changed or lost as a result of the indexing process itself. Natural language processing is the study of the automated generation and understanding of natural human language. Indexing algorithms must accurately capture the intended meaning of the pages they encounter; otherwise the index will be inaccurate, ultimately leading to inaccurate search results being presented to the user.
Metadata is defined as “data about data,” of any sort, in any media. Metadata may describe an individual data item in a database (DB), such as an individual customer name or account number, or a collection of data, such as an entire customer record, as determined by its context and how it may be used. Metadata can be used to speed up a search and improve its quality by sparing users from performing more complex query filter operations manually.
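The benefit of metadata for search can be illustrated with a small sketch: when structured fields accompany each record, a query can filter on those fields directly rather than scanning raw content. The record layout, field names and restaurant entries below are illustrative assumptions, not part of any particular system.

```python
# Sketch: metadata attached to records lets a search filter on
# structured fields instead of scanning raw content.
records = [
    {"data": "Luigi's Trattoria", "meta": {"type": "Italian", "price": "$$"}},
    {"data": "Sushi Row", "meta": {"type": "Japanese", "price": "$$$"}},
    {"data": "Taco Shack", "meta": {"type": "Mexican", "price": "$"}},
]

def filter_by_meta(records, **criteria):
    """Return records whose metadata matches every given criterion."""
    return [r for r in records
            if all(r["meta"].get(k) == v for k, v in criteria.items())]

cheap = filter_by_meta(records, price="$")
print([r["data"] for r in cheap])  # ['Taco Shack']
```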
The use of metadata can improve Web indexing by providing context or otherwise improving the understanding of the data. Metadata, however, is frequently used by Web page designers to provide unseen keywords or phrases to search algorithms. Such data is not really “data about data” at all. Many search engines now have algorithms to screen out such extraneous metadata. Because the screening algorithms are not infallible, inappropriate results that have escaped the best efforts of the Web indexing algorithms may still be presented.
A tag is metadata in the form of a user-selected word or term associated with or assigned to a piece of information. The tag describes that information. This is in contrast to hierarchical systems that use traditional “tree” structures of folders and sub-folders. Tagging allows users to quickly and easily attach multiple tags, and to change or delete tags. Both data and metadata can be tagged. For example, computer files, audio files, video files or playlists, Web sites, Web pages, Internet bookmarks of favorite Web sites and many other data types may be tagged. A Web page hosted on a Web server or blog server that supports tagging might have the tags “Baseball,” “Yankees,” “Tickets,” “Away Games,” and “Discounts.” A human reader may be able to tell the purpose of the page by quickly scanning the list of tags, in this case discounted Yankees baseball tickets for an away game.
Specially designed server software may be used for tracking, updating and facilitating searching with tags, and utilizing suitable algorithms to improve the efficiency and effectiveness of multiple-tag searches. In this example, the server software may display the tags in a list on a page, with each tag displayed as a Web link leading to an index page listing all Web pages that use that tag. This could allow a reader to quickly locate pages that have been associated with a tag or group of tags. If the author of the Web page would like to change the way the page is found through a search, the list of tags can be changed.
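The tag-to-index-page mapping described above can be sketched as an inverted index: each tag maps to the set of pages carrying it, so a multiple-tag search reduces to a set intersection. The page names and tag assignments below are illustrative, and a real server would of course persist the index rather than rebuild it per request.

```python
# Sketch of a tag index: each tag maps to the set of pages that use
# it, so a multiple-tag search is a set intersection.
from collections import defaultdict

pages = {
    "page1": {"Baseball", "Yankees", "Tickets"},
    "page2": {"Baseball", "Tickets", "Discounts"},
    "page3": {"Yankees", "Away Games"},
}

def build_tag_index(pages):
    """Invert the page-to-tags mapping into a tag-to-pages index."""
    index = defaultdict(set)
    for page, tags in pages.items():
        for tag in tags:
            index[tag].add(page)
    return index

def search(index, *tags):
    """Return the pages carrying all of the given tags."""
    results = [index.get(t, set()) for t in tags]
    return set.intersection(*results) if results else set()

index = build_tag_index(pages)
print(sorted(search(index, "Baseball", "Tickets")))  # ['page1', 'page2']
```

Changing a page's tag list, as the author in the example above might, simply moves the page between the index's sets, which is why tag edits immediately change how the page is found.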
While using tags in such an organizational system is flexible and easy, tagging is not without its drawbacks. Typically there is no information about the meaning or semantics of a tag. For example, the tag “Lab” could refer to a Labrador retriever, a national lab such as Los Alamos National Lab or any company or person with “lab” in their name. This lack of semantic distinction in tags can lead to inappropriate connections between items, yielding inappropriate search results. Additionally, selection of tag terms is largely non-standardized and may be user specific. Users may use drastically different terms to describe the same concept. Users of tagging systems must judge, based on the number of connections and the choice of tag terms, whether possible connections between items are valid for their particular use or interests.
Tag classification and the concept of connecting sets of tags between Web/blog servers have led to the rise of “folksonomy” classification, the concept of social bookmarking, and other forms of online communities and social networking software. Folksonomy is defined as the method of collaboratively creating and managing tags to annotate and categorize links and/or content. Larger-scale folksonomies tend to address some of the problems of tagging, as astute users of tagging systems will monitor or search the current use of tag terms within these systems, and tend to use existing tags in order to easily form connections to related items. In this way, evolving folksonomies define a set of tagging conventions through eventual group consensus.
Although tagging is often promoted as an alternative to organization by a hierarchy of categories, more and more online resources use a hybrid or mixed system, where items are organized into broad categories, with finer classification distinctions being made by the use of tags.
vCard is a file format for electronic business cards. vCards are exchanged electronically, and are often attached to email messages or copied from the Web. The format has provisions for name and address information, phone numbers, URLs, logos, photographs, audio clips and more.
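A minimal vCard illustrates the format's line-oriented, property-name-and-value structure. The sketch below uses a naive parse that ignores features real vCards support, such as line folding, property parameters and grouped names, and the contact details shown are fictional.

```python
# A minimal vCard 3.0 and a naive line-oriented parse. Real vCards
# also support line folding and property parameters, which this
# sketch ignores. The contact details are fictional.
VCARD = """BEGIN:VCARD
VERSION:3.0
FN:Luigi's Trattoria
TEL:+1-555-0123
URL:http://restaurant.example
END:VCARD"""

def parse_vcard(text):
    """Split each PROPERTY:value line into a dict of fields."""
    fields = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        if name not in ("BEGIN", "END"):
            fields[name] = value
    return fields

card = parse_vcard(VCARD)
print(card["FN"])  # Luigi's Trattoria
```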
It would therefore be desirable to provide a technique for searching distributed Internet networks for a particular information type, while minimizing or eliminating unwanted or incorrect search results. There is furthermore a need for more focused searching techniques that yield results that are difficult to obtain using a natural language or traditional tag search. To the inventors' knowledge, no such system or method currently exists.