1. Field of the Invention
The present invention relates to online virtual communities and more particularly to a system and method of providing virtual communities based upon users' actions and behaviors and of providing a community oriented search mechanism.
2. Description of Related Art
Virtual communities based on resources that users are currently accessing connect people around the world while they are visiting the same resource, whether a web site, a web page, software or any other resource. A system and method of providing resource-based virtual communities are disclosed in my co-pending patent application Ser. No. 10/710,964. In the disclosed system, the user uses a regular web browser to browse the Internet, and a browser plug-in or helper object connects to the virtual community based on the resource the user is visiting. The user can connect to all other people in the world who are also visiting the same resource and can communicate or collaborate with them by real-time chatting, sharing information, asking for help or exchanging ideas.
Patent application Ser. No. 10/710,964 discloses a new type of online community, the resource-based virtual community, which allows people anywhere in the world to connect to anyone else who is visiting the same resource by dynamically participating in resource-based virtual communities. By automatically joining the corresponding virtual community while accessing a resource, such as when browsing the web, a user can work more collaboratively with others who are using the same resource or doing the same thing as the user.
Systems and methods of the prior art suffer from the problem of not being operable to group users performing the same or a similar action on a network into a community that allows the users to collaborate and interact with each other.
A publication by Yuwono et al. entitled “Search and Ranking Algorithms for Locating Resources on the World Wide Web”, IEEE 1996, pp. 164-171 presents four keyword-based search and ranking algorithms for locating relevant WWW pages with respect to user queries. The first algorithm, Boolean Spreading Activation, extends the notion of word occurrence in the Boolean retrieval model by propagating the occurrence of a query word in a page to other pages linked to it. The second algorithm, most-cited, uses the number of citing hyperlinks between potentially relevant WWW pages to increase the relevance scores of the referenced pages over the referencing pages. The third algorithm, TFxIDF vector space model, is based on word distribution statistics. The last algorithm, Vector Spreading Activation, combines TFxIDF with the spreading activation model. The authors conducted an experiment to evaluate the retrieval effectiveness of these algorithms. The publication concerns the nature of the WWW environment with respect to document ranking strategies. However, it does not relate to any interaction between users and the backend servers.
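By way of illustration only, the TFxIDF idea mentioned above can be sketched as follows. The exact weighting used by Yuwono et al. is not reproduced here; the sketch assumes a common textbook variant (term frequency normalized by document length, multiplied by the logarithm of the inverse document frequency), and the function name `tfidf_scores` is hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(query_terms, documents):
    """Score each document against the query with a plain TFxIDF sum.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            term = term.lower()
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            score += tf * idf
        scores.append(score)
    return scores


docs = [
    "virtual community search",
    "search engine ranking search",
    "cooking recipes",
]
print(tfidf_scores(["ranking"], docs))
```

Under this variant, a term that occurs in every document receives an inverse document frequency of zero and therefore contributes nothing to any score.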
A publication entitled “An Interactive WWW Search Engine for User-Defined Collections” (http://www.ils.unc.edu/iris, 1998) by Robert G. Sumner, Jr. et al. at School of Information and Library Science in University of North Carolina at Chapel Hill discloses the IRISWeb system. Given the dynamic nature and the quantity of information on the WWW, many individual users and organizations compile and use focused WWW resource lists related to a particular topic or subject domain. The IRISWeb system extends this concept such that any user-defined set of WWW pages (a virtual collection) can be retrieved, indexed, and searched using a powerful full-text search engine with a relevance-feedback interface. This capability adds full-text searching to highly customized subsets of the WWW.
The IRISWeb system allows users to enter a seed URL, and the system collects and indexes all resources linked from the seed page. The indexing process follows links from the seed page to a depth specified by the user. A virtual collection is searchable using a WWW-based interface that relies on relevance feedback for interactive refinement of search queries. Although creating virtual collections from user-defined lists of WWW resources holds promise in a number of application domains, it is similar to traditional web crawling methods, and user interaction is limited to refining searches.
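By way of illustration only, following links from a seed page to a user-specified depth can be sketched as a breadth-first traversal. This is not the IRISWeb implementation; the function name `collect_pages` and the in-memory `links` mapping (page to list of linked pages) are assumptions standing in for real page fetching.

```python
from collections import deque


def collect_pages(seed, links, max_depth):
    """Return pages reachable from seed within max_depth link hops, in visit order."""
    seen = {seed}
    order = [seed]
    queue = deque([(seed, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the requested depth
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical link structure used only for this example.
links = {"a": ["b", "c"], "b": ["d"], "c": ["a"], "d": ["e"]}
print(collect_pages("a", links, 2))
```

The `seen` set keeps a page from being collected twice when links form a cycle, as between pages "a" and "c" in the example data.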
Pratyusa Manadhata and Priyank Porwal published a paper “PtoP—A Peer-to-Peer Search Engine” (April 2001). Traditional search engines are very useful tools for searching for specific information in the World Wide Web (WWW). But they lack the ability to index, and hence search, the dynamic content of the web, which is growing at a much faster rate than the static content. The information stored in the searchable databases of deep websites, which is around hundreds of times more than the static content quantitatively and 3.4 times better qualitatively, can only be searched by direct query to the database. But the process of “one at a time” direct query to different deep websites is time consuming and laborious. The paper discloses a peer-to-peer search engine that automates the process of sending queries to these deep websites using peer-to-peer technology and presents the search results from all the sites to the user. This method aims to search the deep web by sending direct queries to the server's underlying searchable database using a large number of peer sites to propagate user queries.
Martin P. Courtois and Michael W. Berry in their paper “Results Ranking in Web Search Engines” (May 1999, ONLINE, at http://www.onlineinc.com/online/OL1999/courtois5.html) discussed a study testing how five major search engines retrieve and rank documents in answer to sample search queries. The basic premise of relevancy searching is that results are sorted, or ranked, according to certain criteria. Criteria can include the number of terms matched, proximity of terms, location of terms within the document, frequency of terms (both within the document and within the entire database), document length, and other factors. The exact “formula” for how these criteria are applied is the “ranking algorithm” and varies among search engines. According to the study, the proprietary nature of ranking algorithms makes them difficult to explore. The algorithms are under constant adjustment, both to increase their effectiveness and to prevent reverse engineering by WWW optimization firms. Still, both search engine producers and end-users would benefit from increased attention by information professionals to this important element of Web searching.
Some search engines do provide interactive features, such as toolbars or the ability to search multiple sites simultaneously. Ask Jeeves (http://www.ask.com), which bought Interactive Search Holdings, is one example. However, most of Interactive Search Holdings' properties are multi-search engines, offering a choice of results from five major search engines, including Google, AlltheWeb™, AltaVista™, Ask Jeeves™ or LookSmart™. Google™ is the default provider of search results for all of Interactive Search Holdings' destination sites other than Excite™, which is powered by metasearch provider InfoSpace™. So, in terms of the searching software, there is essentially not much difference from other traditional search engines.
Weifeng Zhang et al. from the Department of Computer Science and Engineering, Southeast University, China, in the paper “Development of a Self-adaptive Web Search Engine” (November 2001, WSE2001 Florence, Italy, p 86-93) describe a self-adaptive system provided with feedback ability. Because current web search engines produce search results related only to the search terms and the actual information collected by them, users' selections among the search results cannot affect future results, so the engines cannot cover most people's interests. In this paper, feedback signals produced from the users' access lists can influence the search results, and thus the search engine can provide self-adaptability. The paper proposes a self-adaptive search engine (ASE) that is made up of the feedback information collecting and producing agent, the feedback repository, the search results adjuster (the agent to adjust the search results) and the pre-search engine.
The feedback information collecting and producing agent records the information about the users' choices among the search results and the processed results are stored in the feedback information repository. According to the users' query requests, the pre-search engine in the ASE can search the corresponding information from the index database and then send to the search result adjuster in some format, which mainly takes charge of the integration of the last search results. Although it considers user feedback as an input, it does not return all the user's choices back to the querying users to help new users to decide their choices.
Yoshinori Hijikata of Osaka University, Osaka, Japan presented a paper “Implicit user profiling for on demand relevance feedback” (2004, International Conference on Intelligent User Interfaces, Funchal, Madeira, Portugal, p 198-205) about relevance feedback which searches similar documents based on the documents browsed by the user. If the user wants to conduct relevance feedback on demand, which means the user wants to see similar documents while reading a document, the existing user profiling techniques cannot acquire keywords in high precision that the user is interested in at such a short time. This paper proposes a method for extracting text parts which the user might be interested in from the whole text of the Web page based on the user's mouse operation in the Web browser.
The objectives of this research are to (1) find what kinds of mouse operations represent users' interests, (2) assess the effectiveness of the identified mouse operations in selecting keywords, and (3) compare the proposed method with tf-idf, which is the most fundamental method used in many user profiling systems. In the user experiment, the keyword-selection precision of the proposed method is about 1.4 times that of tf-idf. This paper is mainly about methods to detect user interests based on mouse movement and does not cover how users on the web can collectively collaborate to share search results.
The shopping search of InStore™ (www.in-store.com) and the later suggestion method of Google™ (http://www.google.com/webhp?complete=1) are another interactive method, utilizing XMLHttpRequest in the browser to send user feedback instantly about matching numbers while users are still typing keywords in the input box. This method solely provides interactivity between one user and the servers and does not include collaboration efforts among other users.
Many database systems support query or search capabilities and some support interactive searches. One example is the Domino Notes™ application (http://www-128.ibm.com/developerworks/lotus/library/appstrat-search/2/2004) from IBM™. On one hand, the Notes application allows users to enter more keywords or add more options step by step. On the other hand, the interactive search support allows users to define keywords, questions and answers as they are adding documents. To enable keyword searching, the document author adds one or more relevant words in the Keywords field. The Keywords field allows multiple values and allows values not in the list, so document authors can add new values as needed. As more and more documents are given keywords, the richness of searching by keyword increases.
U.S. Pat. No. 6,799,176 (Sep. 28, 2004; 707/5) to Page et al. entitled “Method for scoring documents in a linked database” discloses a method for scoring documents stored in a network. The method includes identifying links from linking documents to linked documents in the network and determining an importance of the identified links. The method further includes weighting the identified links based on the determined importance and scoring the linked documents based on the weighted links. This disclosure is focused on the static analysis of the links among documents.
U.S. Pat. No. 6,278,992 to Curtis et al. entitled “Search engine using indexing method for storing and retrieving data” discloses a Search Engine utilizing a method and system for efficient storage and retrieval of data. The system comprises a record file, an index file, a duplicate segment file and access to a network of computers. The index files contain locations of data items, pointers to other index files, or an empty designation. The index files are arrays that contain locations corresponding to a predetermined range of characters with which the data items may be formed. Data items are stored according to the character strings of each data item. The first portion of a data object is indexed according to the indexing method of the present invention while a second portion of the data object is indexed according to another known database technology, such as B-tree. This disclosure is about how to organize, store and retrieve data files.
U.S. Pat. No. 6,845,374 to Oliver et al. entitled “System and method for adaptive text recommendation” discloses a network system that provides a real-time adaptive recommendation set of documents with a high statistical measure of relevancy to the requestor device. The recommendation set is optimized based on analyzing text of documents of the interest set, categorizing these documents into clusters, extracting keywords representing the themes or concepts of documents in the clusters, and filtering a population of eligible documents accessible to the system utilizing site and/or Internet-wide search engines. The system is either automatically or manually invoked and it develops and presents the recommendation set in real-time. The recommendation set may be presented as a greeting, notification, alert, HTML fragment, fax, voicemail, or automatic classification or routing of customer e-mail, personal e-mail, job postings, and offers for sale or exchange. This disclosure discusses the adaptive recommendation set, but does not cover its usage in search engines.
U.S. Pat. No. 6,766,316 to Caudill et al. (Jul. 20, 2004; 707/3) entitled “Method and system of ranking and clustering for document indexing and retrieval” discusses a relevancy ranking and clustering method and system that determines the relevance of a document relative to a user's query using a similarity comparison process. Input queries are parsed into one or more query predicate structures using an ontological parser. The ontological parser parses a set of known documents to generate one or more document predicate structures. A comparison of each query predicate structure with each document predicate structure is performed to determine a matching degree, represented by a real number. A multilevel modifier strategy is implemented to assign different relevance values to the different parts of each predicate structure match to calculate the predicate structure's matching degree.
In this disclosure, the relevance of a document to a user's query is determined by calculating a similarity coefficient, based on the structures of each pair of query predicates and document predicates. Documents are autonomously clustered using a self-organizing neural network that provides a coordinate system that makes judgments in a non-subjective fashion. This system and method is still an addition to the traditional ranking method.
U.S. Pat. No. 6,842,748 to Warner et al. entitled “Usage based strength between related information in an information retrieval system” discloses an information retrieval system that maintains a database that defines a relational association between a plurality of informational items in the system. The relational association is based on historical navigational behavior of users of the information retrieval system, and includes a relationship type, which is based on the characteristic similarities between the informational items, and relationship strength, which is based on the historical frequency of any related informational items being selected by a user within the same information retrieval session. When navigation from one informational item to another information item is detected, the relationship type and the relationship strength of the two informational items are determined and stored in the database. During a subsequent selection of an informational item, any related informational items related to the selected informational item are sorted based on the respective relationship types and relationship strengths, and are provided in a sorted list from which the user can select. This disclosure is mainly about the relational association but not the method of utilizing that kind of information.
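By way of illustration only, the usage-based relationship strength described above can be sketched as a per-pair navigation counter. This is a simplified sketch and not the patented method: it assumes strength is a simple count of observed navigations, it omits the relationship type entirely, and the class and method names (`UsageGraph`, `record_navigation`, `related`) are hypothetical.

```python
from collections import defaultdict


class UsageGraph:
    """Tracks how often users navigate from one item to another."""

    def __init__(self):
        # strength[source][target] = number of observed navigations
        self.strength = defaultdict(lambda: defaultdict(int))

    def record_navigation(self, source, target):
        """Strengthen the relationship each time a navigation is detected."""
        self.strength[source][target] += 1

    def related(self, item):
        """Return related items, strongest first (ties broken by name)."""
        neighbours = self.strength.get(item, {})
        return sorted(neighbours, key=lambda t: (-neighbours[t], t))


graph = UsageGraph()
graph.record_navigation("faq", "install")
graph.record_navigation("faq", "pricing")
graph.record_navigation("faq", "pricing")
print(graph.related("faq"))
```

A fuller treatment would also store a relationship type for each pair and sort on both fields, as the disclosure describes.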
U.S. Pat. No. 6,704,729 to Klein et al. entitled “Retrieval of relevant information categories” relates to large stores of information that are often organized in a hierarchical taxonomy to aid a search and retrieval of the information. The hierarchical taxonomy generally consists of related categories of information, called “nodes,” that each may contain information relevant to the search. Each node is addressable according to its path in the hierarchical taxonomy. In information stores where the number of nodes having relevant information is extremely large, such as the Internet, providing a cohesive, intelligent, and organized display of the search results becomes extremely important to the success of a user traversing the store to find relevant information. The invention provides such search results by ranking each node of the taxonomy to determine which nodes are most likely to be relevant to the search request. The invention then creates a conceptually-related “cluster” of nodes by selecting a relevant “seed” node and relevant nodes related to the “seed” node.
U.S. Pat. No. 6,718,324 to Edlund et al. entitled “Metadata search results ranking system” utilizes a combination of popularity and/or relevancy to determine a search ranking for a given search result association. Given the exponential growth rate currently being experienced in the Internet community, the present invention provides one of the few methods by which searches of this vast distributed database can produce useful results ranked and sorted by usefulness to the searching web surfer. The present invention permits embodiments incorporating a Ranking System/Method (0100) further comprising a Session Manager (0101), Query Manager (0102), Popularity Sorter (0103), and Association (0104) functions. These components may be augmented in some preferred embodiments via the use of a Query Entry means (0155), Search Engine (0156); Data Repository (0157), Query Database (0158), and/or a Resource List (0159). That invention examines the user's behavior by monitoring all the hyperlinks the user clicks on and stores the association information in a database system. It requires quite complex external components and also does not mention real time interactivity among users.
U.S. Pat. No. 6,675,159 to Lin et al. (Jan. 6, 2004; 707/3) entitled “Concept-based search and retrieval system” describes a concept-based indexing and search system that indexes collections of documents with ontology-based predicate structures through automated and/or human-assisted methods. The system extracts the concepts behind user queries to return only those documents that match those concepts. The concept-based search and retrieval system comprehends the intent behind a query from a user, and returns results matching that intent. The system can perform off-line searches for unanswered user queries and notify the user when a match is found. This patent mainly discusses the way to extract concepts behind user queries.
U.S. Pat. No. 6,665,655 to Warner et al. entitled “Implicit rating of retrieved information in an information search system” discloses an information retrieval system that allows a user to search a database of informational items for a desired informational item, and presents the search result in the form of matching index entries in the order of relevance. The information retrieval system in accordance with the principles of the present invention assigns a relevance rating to each of the index entries without requiring an explicit input from the user with respect to the usefulness or the relevance of the retrieved information corresponding to the respective index entries. When the user selects and retrieves an informational item through a list of index entries presented by the retrieval system, as a result of a search, the relevance rating of the selected informational item is increased by a predetermined amount. The relevance rating of the selected informational item is further adjusted based on any actions the user takes subsequent to the initial selection of the informational item if the subsequent act indicates that the relevance of the selected informational item may be less than what is reflected by the rating increase by the predetermined amount.
Ratings of the informational items in the database are determined from implicit suggestions from the usage of the retrieval system and the database by the user rather than from an explicit user input. In another aspect of that invention, the ratings are allowed to decay over time to minimize the tendency toward ratings biased by historical usage, and to provide more temporally accurate ratings. The most recently accessed time of each of the informational items in the database is compared to a predetermined stale access time threshold, and if the most recently accessed time is older than the threshold, then the rating of the corresponding informational item is decreased to reflect the dated nature of the information contained within the item.
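By way of illustration only, the implicit rating and decay behavior described above can be sketched as follows. The constants `SELECT_BONUS`, `STALE_AFTER` and `STALE_PENALTY`, the `Item` class and the function names are all assumptions made for this example; the disclosure specifies only a predetermined increase and a predetermined stale access time threshold, not their values. The adjustment based on the user's subsequent actions is omitted.

```python
from dataclasses import dataclass

SELECT_BONUS = 1.0    # assumed rating increase per selection
STALE_AFTER = 30.0    # assumed stale threshold, in arbitrary time units
STALE_PENALTY = 0.5   # assumed rating decrease for a stale item


@dataclass
class Item:
    rating: float = 0.0
    last_access: float = 0.0


def record_selection(item, now):
    """Raise the rating when a user selects the item from a result list."""
    item.rating += SELECT_BONUS
    item.last_access = now


def apply_decay(items, now):
    """Lower the rating of every item not accessed within the threshold."""
    for item in items.values():
        if now - item.last_access > STALE_AFTER:
            item.rating = max(0.0, item.rating - STALE_PENALTY)


items = {"a": Item(), "b": Item()}
record_selection(items["a"], now=0.0)
record_selection(items["b"], now=40.0)
apply_decay(items, now=50.0)
print(items["a"].rating, items["b"].rating)
```

No explicit input is requested from the user at any point; the ratings change only as a side effect of selections and the passage of time.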
In 1999, Bamshad Mobasher et al. of DePaul University published an article entitled “Creating adaptive Web sites through usage-based clustering of URLs” in Knowledge and Data Engineering Exchange, 1999 (KDEX '99). In this paper they describe an approach to usage-based Web personalization taking into account both the offline tasks related to the mining of usage data, and the online process of automatic Web page customization based on the mined knowledge. Specifically, they propose an effective technique for capturing common user profiles based on association-rule discovery and usage-based clustering. They also propose techniques for combining this knowledge with the current status of an ongoing Web activity to perform real-time personalization. Finally, they provide an experimental evaluation of the proposed techniques using real Web usage data. This paper also describes a usage-based approach, and does not specify how the approach is applied to the searching process.
On Jan. 29, 1998, Page et al. from the Stanford Digital Library Technologies Project described PageRank in the paper “The PageRank citation Ranking: Bringing Order to the Web” (Online, pp. 1-17). This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. The authors compare PageRank to an idealized random Web surfer. They also show how to efficiently compute PageRank. This paper focuses on the objective properties of web pages and does not consider user interactions with web pages.
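By way of illustration only, a link-based rank of the kind described above can be computed by repeated redistribution of rank along hyperlinks. The sketch below is a commonly taught power-iteration form and is not taken from the cited paper; the damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the function name `pagerank` are assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate a PageRank-style score for each page.

    links maps a page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumed rule: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: "a" and "b" both link to "c", and "c" links to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this example the page with the most incoming links ranks highest and the page with none ranks lowest, which mirrors the random-surfer intuition: the score depends only on the link structure and not on any user interaction.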
U.S. Pat. No. 6,584,471 to Maclin et al. entitled “System and method for the adaptive, hierarchical receipt, ranking, organization and display of information based upon democratic criteria and resultant dynamic profiling” discloses a system and method for receiving, organizing and displaying information received from a plurality of users, having a hierarchical database with at least one expandable level, at least one of the at least one expandable levels having at least two expandable sublevels; an interactive interface for placing each user at a level and sublevel in the database, receiving user-supplied information for modification and addition to the content and structure of the levels and sublevels of the database, receiving user-supplied commands for navigating through and extracting content from the database, presenting content from the database in accordance with the information and commands supplied; a counting routine which increments a level and sublevel specific counter each time a user is placed at a level and sublevel within the at least two expandable levels and sublevels; and a database sorting function for gathering the plurality of user-supplied information and commands virtually simultaneously, and updating the hierarchical structure of the database in accordance with the magnitude of the number of the specific counter.
The database levels and sublevels are categories, subcategories and query items. The commands are search, add, select and interact. Additionally shown is a data base search engine for receiving database search commands from at least one user, searching the database for matches, and presenting the results of the search. The system dynamically profiles members of an on-line community in that it allows a member to see those who have answered query items in a subcategory and category in a manner suggestive of some desired similarity in character. This invention focuses on the profiling aspect of an online community.
U.S. Pat. No. 6,546,388 to Edlund et al. entitled “Metadata search results ranking system” is about a method for presenting to an end-user the intermediate matching search results of a keyword search in an indexed list of information. The method comprises the steps of: coupling to a search engine a graphical user interface for accepting keyword search terms for searching the indexed list of information with the search engine; receiving one or more keyword search terms with one or more separation characters separating therebetween; performing a keyword search with the one or more keyword search terms received when a separation character is received; and presenting the number of documents matching the keyword search terms to the end-user by presenting a graphical menu item on a display. This patent discloses a system and method of metadata search ranking and utilizes a combination of popularity and/or relevancy to determine a search ranking for a given search result association. Its focus is metadata search, and it provides no interactivity among users or with the search engine.
U.S. Pat. No. 6,523,037 to Monahan et al. entitled “Method and system for communicating selected search results between first and second entities over a network” discloses a system and method in which a search result set is communicated to a first user as hypertext descriptions of data items constituting the search result set. A check box is displayed adjacent to each hypertext description, utilizing which the first user may select a subset of the search result set. This subset is then communicated to a second user, again as hypertext descriptions of the subset. Check boxes are furthermore displayed to the second user so as to enable the second user to select a further and narrower subset of the search result set. Selected items of a result set of an Internet-based search may in this way conveniently be communicated between first and second users. Each of a number of interfaces provides the hypertext descriptions of the data items of the search result set. This disclosure does not cover interactions among groups of users.
U.S. Pat. No. 6,397,212 to Biffar entitled “Self-learning and self-personalizing knowledge search engine that delivers holistic results” discloses a search engine that provides intelligent multi-dimensional searches, in which the search engine always presents a complete, holistic result, and in which the search engine presents knowledge (i.e. linked facts) and not just information (i.e. facts). The search engine is adaptive, such that the search results improve over time as the system learns about the user and develops a user profile. Thus, the search engine is self personalizing, i.e. it collects and analyzes the user history, and/or it has the user react to solutions and learns from such user reactions.
The search engine generates profiles, e.g. it learns from all searches of all users and combines the user profiles and patterns of similar users. The search engine accepts direct user feedback to improve the next search iteration. One feature of the invention is locking/unlocking, where a user may select specific attributes that are to remain locked while the search engine matches these locked attributes to all unlocked attributes. This patent also does not mention real time characteristics.
U.S. Pat. No. 6,370,526 by Agrawal et al. entitled “Self-adaptive method and system for providing a user-preferred ranking order of object sets” discloses a system and method in which objects are ranked according to user preferences by first observing the access order of a related group of objects in relation to a predetermined access hypothesis. A user preference model is then adapted to correspond to any deviations between the access order and the access hypothesis for the related group of objects. Next, object preferences are calculated for each of the objects to be ranked according to the preference model. The group of objects is then presented to the user in an order corresponding to the calculated object preferences. In this patent, the preference model is adaptively updated, unbeknownst to the user, in the normal course of accessing the presented objects. This patent does not cover interactivity among users.
U.S. Pat. No. 6,314,420 to Lang et al. entitled “Collaborative/adaptive search engine” discloses a portal site on the internet. The search engine system employs a regular search engine to make one-shot or demand searches for information entities which provide at least threshold matches to user queries. The search engine system also employs a collaborative/content-based filter to make continuing searches for information entities which match existing wire queries and are ranked and stored over time in user-accessible, system wires corresponding to the respective queries. In this patent, a user feedback system provides collaborative feedback data for integration with content profile data in the operation of the collaborative/content-based filter. A query processor determines whether a demand search or a wire search is made for an input query. This patent does not cover group or real time communications.
U.S. Pat. No. 6,282,534 to Vora entitled “Reverse content indexing” discloses a method for generating knowledge base entries by using a programmed computer with access to said knowledge base is disclosed. In one embodiment, after information is collected from content providers, the present invention attempts to generate questions which relate to the collected information. Such questions are presented to a non-expert user in view of received information. If the information does not answer the generated questions, the user modifies the questions so that the information becomes the solutions to them. Then these questions and solutions pairs are entered into the knowledge base. This patent does not cover collaborative relationship between users and search engines.
U.S. Pat. No. 6,266,668 to Vanderveldt et al. entitled “System and method for dynamic data-mining and on-line communication of customized information” discloses a method and system for dynamically searching databases in response to a query is provided by the present invention. More specifically, a system and method for dynamic data-mining and on-line communication of customized information. This method includes the steps of first creating a search-specific profile. This search-specific profile is then input into a data-mining search engine. The data-mining search engine will mine the search-specific profile to determine topic of interests. These topics of interest are output to at least one search tool. These search tools match the topics of interest to at least one destination data site wherein the destination data sites are evaluated to determine if relevant information is present in the destination data site. Relevant information is filtered and presented to the user making the inquiry.
US Patent Application Publication No. 20050071328 to Lawrence, Stephen R. entitled “Personalization of web search” discloses a system and method for creating a user profile and for using the user profile to order search results returned by a search engine. The user profile is based on search queries submitted by a user, the user's specific interaction with the documents identified by the search engine and personal information provided by the user. Generic scores associated with the search results are modulated by the user profile to measure their relevance to the user's preferences and interests. The search results are re-ordered accordingly so that the most relevant results appear at the top of the list. User profiles can be created and/or stored on the client side or server side of a client-server network environment. This patent application uses a user profile to influence the search results but does not address user interactions.
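For illustration only, the profile-based re-ordering summarized above can be pictured with the following minimal sketch. It is not the method of the cited publication: the function name `rerank`, the representation of a user profile as term weights, and the multiplicative modulation of the generic score are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical result record: (document id, generic score, terms describing the document)
SearchResult = Tuple[str, float, List[str]]


def rerank(results: List[SearchResult], profile: Dict[str, float]) -> List[str]:
    """Return document ids ordered by a profile-modulated score.

    Assumption: the profile maps terms to interest weights, and the generic
    score is scaled by (1 + sum of the weights of the document's terms).
    """
    def personalized(item: SearchResult) -> float:
        _, generic_score, terms = item
        boost = sum(profile.get(term, 0.0) for term in terms)
        return generic_score * (1.0 + boost)

    ordered = sorted(results, key=personalized, reverse=True)
    return [doc_id for doc_id, _, _ in ordered]
```

Under these assumptions, a document with a lower generic score can move ahead of a higher-scored one when its terms match the interests recorded in the profile, while an empty profile leaves the generic ordering unchanged.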
United States Patent Application No. 20040260688 to Gross entitled “Method for implementing search engine” discloses a system and method for implementing/influencing a search engine which provides search results to users based on characteristics of certain trendsetter web pages identified on the Internet. The trendsetter web pages are determined by studying historical adoption behavior of a group within the universe of websites, or by reference to known indicia. This application does not cover real-time interactions either.
United States Patent Application Publication No. 20030158839 to Faybishenko et al. entitled “System and method for determining relevancy of query responses in a distributed network search mechanism” discloses a system and method for selecting or ordering search results received from members of a distributed search network in response to a search request. Network nodes operating as consumer or requesting nodes generate the search requests. Nodes operating as hubs are configured to route the search requests in the network. Individual nodes operating as provider nodes receive the search request and in response may generate results according to their own procedures and return them.
Communication between nodes in the network may use a common query protocol. Hub nodes may resolve the search requests to a subset of the provider nodes in the network, for example by matching search requests with registration information from nodes. Search results may be selected, ordered, and/or consolidated for use by the requesting nodes by nodes receiving a plurality of the search results. This is closer to a peer-to-peer approach to search engines.
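For illustration only, the hub routing and result consolidation summarized above can be pictured with the following minimal sketch. It is not the query protocol of the cited publication: the `Hub` class, the keyword-set form of the registration information, and the consolidation by provider-assigned score are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Set, Tuple

Result = Tuple[str, float]                # hypothetical (item, provider-assigned score)
Provider = Callable[[str], List[Result]]  # a provider node answers a query its own way


class Hub:
    """Hypothetical hub node that routes queries to registered provider nodes."""

    def __init__(self) -> None:
        self._registry: Dict[str, Tuple[Set[str], Provider]] = {}

    def register(self, name: str, keywords: Set[str], provider: Provider) -> None:
        # Assumption: registration information is a set of lowercase keywords.
        self._registry[name] = (keywords, provider)

    def resolve(self, query: str) -> List[str]:
        # Resolve the request to the subset of providers whose keywords match it.
        words = set(query.lower().split())
        return sorted(name for name, (kw, _) in self._registry.items() if kw & words)

    def search(self, query: str) -> List[Result]:
        # Forward the query to each matching provider, then consolidate by score.
        merged: List[Result] = []
        for name in self.resolve(query):
            merged.extend(self._registry[name][1](query))
        return sorted(merged, key=lambda result: result[1], reverse=True)
```

In this sketch a requesting node would call `search` on a hub; providers whose registered keywords do not overlap the query are never contacted.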
US Patent Application Publication No. 20010047290 to Petras et al. entitled “System for creating and maintaining a database of information utilizing user opinions” discloses a system for automatically creating and maintaining a database of information utilizing user opinions about subjects, particularly exceptional experiences. Described is an Internet system assisting/motivating a population of users interested in information about certain categories of subjects to automatically maintain the database content and to improve the usefulness and quality of the database information without any substantial management by the website owner-manager. The user opinions are primarily in the form of both comments and ratings about which natural-language terms best describe a particular subject, enabling user searches of the subject database to be by way of preferred such descriptive natural-language terms, which terms are further preferred to be evaluative and approving. The disclosed system also utilizes user opinions for database information, but the interactivity between users and their communities is unclear.