1. Field of the Invention
The present disclosure relates to a method and system for locating data or resources in a distributed environment, such as a network of terminals. More particularly, the present disclosure relates to a method and system for collaboratively searching and locating data and/or resources in such a network.
2. Description of the Prior Art
The development and ever-increasing interfacing of distributed environments, for instance networks of data processing terminals of varying scales, have greatly facilitated the potential for users to satisfy any information need. As an ever-increasing number of interfaced distributed environments eventually contributes to defining a potentially limitless repository of data and/or resources, within which such data and/or resources are not inventoried in any way, locating specific data and/or resources for satisfying a specific information need becomes comparatively ever more difficult.
Within this context, synchronous collaborative information retrieval (‘SCIR’) is the study of systems to support two or more people searching together in order to satisfy a shared information need.
Remote SCIR
Remote synchronous collaborative information retrieval systems enable a plurality of users, remote from one another, to collaboratively search and browse distributed environments, for instance the Internet. Early examples of synchronous collaborative information retrieval tools were built using a distributed architecture, wherein software enabled communication across groups of remote users. These systems often required users to log into a particular service or to use particular applications in order to facilitate collaboration. Examples of early collaborative browsing environments have included:
the GroupWeb system (Greenberg and Roseman, 1996), which was built upon the GroupKit groupware toolkit and wherein several users could log onto a collaborative browsing session in which the web browser was used as a group “presentation tool”;
the W4 browser (Gianoutsos and Grundy, 1996), which extended the GroupWeb system to allow users to browse the web independently, whilst viewing all pages viewed by other users, dialoguing electronically with each other and sharing documents deemed relevant;
the CSCW3 application (Gross, 1999), which used a chatroom metaphor wherein users in the same room could dialogue electronically and couple their browsers in order to support synchronized browsing; and
the MUSE system (Krishnappa, 2005), which employed a similar approach, whereby two users could explore the web, share results and chat using separate windows.
Laurillau and Nigay (2000) identified four types of navigational support in a collaborative browsing system:
(1) “Guided tour”: the guide navigates the web, and the other members of the group follow synchronously.
(2) “Relaxed navigation”: an open group without a leader, wherein each member explores independently.
(3) “Coordinated navigation”: no leader, but each member is given a subset of the information space to explore.
(4) “Cooperative navigation”: the leader decides about partitioning the information space, group members work independently and, at the end of the session, the group leader coordinates the results.
The systems described above can be classified into the first two of these navigational types. Laurillau and Nigay (2000) developed the Co-Vitesse system to support all four types of navigation, and a chat facility was also included to support communication. In order for a collaborative system of type 3 or 4 to be effective, an appropriate division of labor is required. Some proposed approaches to a division of labor were discussed by Diamadis and Polyzos (2004), Foley et al. (2006) and Morris and Horvitz (2007); these included splitting the search task corpus and dividing it amongst the users at the start, or dynamically dividing search results amongst users in real time.
SearchTogether (Morris and Horvitz, 2007) is a recently developed prototype system, which incorporates many synchronous and asynchronous tools to enable a small group of remote users to work together for satisfying a shared information need. SearchTogether was built to support awareness of others, division of labor and persistence of the search process. Awareness of others was achieved by representing each group member with a screen name and photo: each time a team member performed a new search, the query terms were displayed in an interactive list underneath their photo. By clicking on a search query, a user could see the results returned for this query, and this reduced the duplication of effort across users. When visiting a page, users could also see which users had previously visited this page, and this information was also displayed in the results of a search, thereby enabling users to skip a page viewed by others. Users could also provide ratings for pages using a “thumbs-up” or “thumbs-down” metaphor. Support for division of labor was achieved through an embedded text chat facility, a recommendation mechanism and a split search and multi-search facility. Using split search, a user could divide the results of their search with a collaborating searcher and, using multi-search, a search query could be submitted to different search engines, each associated with different users. Persistence was achieved by allowing all parts of the system to be saved and re-used at a later date, including search queries and results, recommendations and chats.
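By way of illustration only, the split search facility described above may be sketched as follows. The sketch assumes that results are dealt out to the collaborating searchers in round-robin order of rank; the function name `split_results` and the user names are hypothetical and are not taken from SearchTogether itself.

```python
from typing import Dict, List, Sequence


def split_results(results: Sequence[str], users: Sequence[str]) -> Dict[str, List[str]]:
    """Deal a ranked result list out to users in round-robin order (assumed policy)."""
    if not users:
        raise ValueError("at least one user is required")
    shares: Dict[str, List[str]] = {user: [] for user in users}
    for rank, doc in enumerate(results):
        # Rank 0 goes to the first user, rank 1 to the second, and so on.
        shares[users[rank % len(users)]].append(doc)
    return shares


ranked = ["d1", "d2", "d3", "d4", "d5"]
shares = split_results(ranked, ["alice", "bob"])
```

Under this assumed policy each searcher receives an interleaved share of the ranked list, so that no searcher is left with only low-ranked results.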
The Adaptive Web Search (AWS) system proposed by Dalal (2007) represented a combination of personalized, social and collaborative search. The system was a type of meta-search system in which users could search for data or resources using multiple search engines, and maintain a preference vector for a particular engine based on their long-term and short-term search contexts, user goals and geographic location. Users could perform social searching by having their preference vector influenced by others depending on a level of trust.
An example of a commercial application of synchronous collaborative information retrieval is available in the popular ‘Windows Live™ Messenger’ (2007), an instant messaging service. During a chat session, users can search together by having the results from a search displayed to each user involved in a chat. In a somewhat similar manner, Netscape Conferencer™ (2007) allows multiple users to browse the web together, using WYSIWIS views wherein one user controls the navigation and chat facilities, and wherein whiteboards facilitate communication.
Recent advances in ubiquitous computing devices such as mobile phones and personal digital assistants (‘PDAs’) have led researchers to begin exploring techniques for spontaneous collaborative search. The motivation behind these systems is that, as information and communications technology becomes ever more ubiquitous, meetings between users can be enriched whenever users need to search for data or resources during the course of a meeting (for example, information in relation to a topic of conversation), since such a search can be performed collaboratively. Maekawa et al. (2006) developed a system for collaborative web browsing on mobile phones and PDAs, and WebSplitter (Han et al., 2000) was a similar system for providing partial views of web pages across a number of users and, potentially, across a number of devices available to a user (e.g. laptop, PDA).
From the above, it can be observed that most of the development in synchronous collaborative information retrieval systems has concentrated upon improving group effectiveness through providing awareness of other searchers' activities: this feature enables collaborating searchers to coordinate their activities, in order to support a division of labor and sharing of search knowledge amongst collaborators. Division of labor in these systems is generally achieved by showing the pages either visited or bookmarked by other users. The sharing of knowledge in these systems is generally supported by providing facilities for communication, such as chat systems and shared whiteboards for brainstorming.
A common feature of the above systems is that they all require users to explicitly log onto a service to support collaborative searching. However, systems have also been developed for making users who are browsing the web aware of others with similar information goals. The motivation behind the development of these systems is that, due to the ever-increasing number of people perusing the Internet, there is a high probability, when searching the web for information, that another user is searching for the same information at the same time. Providing users with an awareness of others searching for the same information therefore enables a spontaneous collaborative searching session, which can benefit both users.
Co-Located SCIR
Recently, the development of new computing devices has facilitated the development of co-located collaborative information retrieval tools. Particularly, advances in single display groupware (SDG) technology (Stewart et al., 1999) have enabled the development of collaborative search systems for the co-located environment. The main advantage of such systems is that they improve the awareness of collaborating searchers, by bringing them together in a face-to-face environment. Increased awareness can enable both a more effective division of labor and a greater sharing of knowledge. Single display groupware systems are gaining in popularity and, recently, Microsoft® developed a tabletop system labeled “Surface” (Microsoft Surface™, 2007), which is expected to promote further exploration into this novel research area.
Let's Browse (Lieberman et al., 1999) was one of the earliest amongst such developments, and was a co-located web-browsing agent, which enabled multiple users standing in front of a screen (a display projected onto a wall) to browse the web together, based on their user profiles: a user profile in the system consisted of a set of weighted keywords (tf-idf weighting) representing the user's interests and was built automatically by extracting keywords from both the user's homepage and the pages around it, using a breadth-first search. Users wore electronic badges so that they could be identified as they approached the screen. A collaborating group of users using Let's Browse was shown a set of recommended links to follow from the current page, ordered by their similarity to the aggregated users' profiles.
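By way of illustration only, the ordering of recommended links by similarity to aggregated user profiles may be sketched as follows. The sketch assumes that profiles are aggregated by summing keyword weights and that similarity is measured by cosine similarity; neither choice is stated for Let's Browse, and all names and weights below are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List

Profile = Dict[str, float]  # keyword -> weight (e.g. a tf-idf score)


def aggregate_profiles(profiles: List[Profile]) -> Profile:
    """Sum keyword weights across users (an assumed aggregation rule)."""
    group: Dict[str, float] = defaultdict(float)
    for profile in profiles:
        for term, weight in profile.items():
            group[term] += weight
    return dict(group)


def cosine(a: Profile, b: Profile) -> float:
    """Cosine similarity between two sparse keyword vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_links(links: Dict[str, Profile], group: Profile) -> List[str]:
    """Order candidate links by similarity to the group profile, best first."""
    return sorted(links, key=lambda url: cosine(links[url], group), reverse=True)


alice = {"jazz": 0.8, "guitar": 0.5}
bob = {"jazz": 0.4, "hiking": 0.9}
group_profile = aggregate_profiles([alice, bob])
candidate_links = {
    "page_cooking": {"cooking": 1.0},
    "page_hiking": {"hiking": 1.0},
    "page_jazz": {"jazz": 1.0},
}
ranked_links = rank_links(candidate_links, group_profile)
```

In this example the link about jazz is ranked first because both hypothetical users share that interest, which illustrates how an aggregated profile favors pages of common interest to the group.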
The tangible interface system developed by Blackwell et al. (2004) allowed a group of users to perform “Query-By-Argument”, whereby a series of physical tokens with RFID transmitters could be arranged on a table to develop a team's query. A team received a list of documents in response to a query and each member chose documents related to their interests. Users could highlight parts of the documents that were relevant, and this relevance feedback could be used to modify term weights for query expansion using Robertson's offer weight. In this way, the process of information retrieval became a by-product of interactions amongst users.
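By way of illustration only, term selection for query expansion using an offer weight may be sketched as follows. The sketch assumes the commonly cited form in which the offer weight of a term is the number of known relevant documents containing the term multiplied by the Robertson/Sparck Jones relevance weight; how Blackwell et al. applied the weight is not specified here, and the term statistics below are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def relevance_weight(r: int, n: int, R: int, N: int) -> float:
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing.

    r: relevant documents containing the term
    n: documents in the collection containing the term
    R: known relevant documents
    N: documents in the collection
    """
    numerator = (r + 0.5) * (N - n - R + r + 0.5)
    denominator = (n - r + 0.5) * (R - r + 0.5)
    return math.log(numerator / denominator)


def offer_weight(r: int, n: int, R: int, N: int) -> float:
    """Offer weight, assumed here to be r times the relevance weight."""
    return r * relevance_weight(r, n, R, N)


def expansion_terms(stats: Dict[str, Tuple[int, int]], R: int, N: int, k: int) -> List[str]:
    """Return the k terms with the highest offer weight; stats maps term -> (r, n)."""
    ranked = sorted(stats, key=lambda t: offer_weight(stats[t][0], stats[t][1], R, N), reverse=True)
    return ranked[:k]


# Hypothetical statistics: 4 documents judged relevant in a collection of 1000.
term_stats = {"rfid": (3, 20), "table": (3, 500), "query": (1, 50)}
top_terms = expansion_terms(term_stats, R=4, N=1000, k=2)
```

A term that is frequent in the relevant documents but rare in the collection as a whole receives the highest weight under this formulation.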
The TeamSearch system developed by Morris et al. (2006) enabled a group of users collaborating around an electronic tabletop to sift through a stack of pictures, using collaborative Boolean query formulation. The system enabled users to locate relevant pictures from a stack, by placing query tokens on special widgets, which corresponded to predefined metadata categories for the images. The TeamSearch system used, as its input device, a DiamondTouch electronic tabletop system developed by Mitsubishi Electric Research Labs (‘MERL’) (Dietz and Leigh, 2001), which is a multi-user touch-sensitive tabletop interface device enabling multiple users to sit around the device and use their fingers to interact with objects projected onto the table from an overhead projector.
DiamondSpin (Shen et al., 2004) is an interface toolkit, which enables development of applications on the DiamondTouch (or another tabletop device) and allows for objects on the screen to be moved, resized and rotated. Fischlár-DiamondTouch was a multi-user video search application developed at the Centre For Digital Video Processing at Dublin City University (Smeaton et al., 2006), which allowed two users to collaborate in a face-to-face manner in order to interact with a state-of-the-art video retrieval application, Fischlár (Smeaton et al., 2001). Collaboration in Fischlár-DiamondTouch was mediated at the interface level through various awareness widgets; however, the system still communicated with a standard single-user search engine. In an effort to improve collaborative search effectiveness, this system was further developed as “Cerchiamo” by the 2007 FXPAL TRECVid team (Adcock et al., 2007), wherein the two users would work together under respective, predefined roles of “prospector” and “miner”, for finding relevant shots of videos. The role of the prospector was to locate avenues for further exploration, while the role of the miner was to explore these avenues.
SCIR Parameter Sharing
Early SCIR systems provided various awareness cues: by providing these cues, these systems enabled the collaborating searchers themselves to coordinate their activities in order to achieve a certain division of labor and sharing of knowledge. However, coordinating activities amongst users can be troublesome, because it demands too much of a user's cognitive capacity (Adcock et al., 2007). Recent systems support a more system-mediated division of labor through dividing the results of a search query amongst searchers (Morris and Horvitz, 2007) or defining searcher roles (Adcock et al., 2007). Sharing of knowledge in these systems is generally realized in the form of awareness cues to others, such as bookmark lists that allow users, as they find relevant material, to store it for later consolidation and discussion amongst users. Such information from other users' previous relevance judgments is frequently used in asynchronous collaborative information retrieval to improve a new user's search, either through collaborative filtering or through community re-ranking. Synchronous collaborative information retrieval systems, rather than re-using this explicit relevance information in the search process, simply use it as a bookmark.
An example will make this current limitation clearer: suppose two users are searching together to satisfy a shared information need using a SCIR system described above. As user A finds documents which he believes are relevant to the task, he saves them to a “bookmarked” area so that user B can see these documents. What user A is doing is providing explicit relevance judgments to the search engine. Relevance judgments garnered from users can provide performance improvements by reformulating the query in order to reflect this extra relevance information. At present, SCIR systems do not use this new relevance information directly within the search process for re-formulating a user's query: instead, this new relevance information is used simply as a bookmark, i.e. a placeholder where users can save their results during a search. No attempt is made to utilize this relevance information during the course of a SCIR search to improve the quality of ranked lists returned to each collaborating searcher. As a consequence the collaborating group does not see the benefit of this explicit relevance information in their ranked lists.
Asynchronous systems rely on the building of large user-item matrices in order to generate predictions related to a long-standing information need. A critical mass of ratings is required for these systems to be effective, and ratings are made on items through a “voice of the masses” approach. Although asynchronous collaborative information retrieval systems support user collaboration, their focus is still very much in line with traditional information retrieval systems: they are motivated by improving a single user's search.
On the contrary, in a synchronous domain, collaboration is spontaneous and takes place over a shorter period of time: therefore, a much smaller number of relevance judgments is available in order to improve a search, and this relevance information needs to be re-used quickly in order to benefit each searcher before the search session ends. Such synchronous systems are more focused: users are explicitly searching together to satisfy a common information need, and there is therefore less need for a nearest-neighbor approach, since all collaborating searchers can be considered nearest-neighbors.
Synchronous collaborative information retrieval systems therefore represent a significant shift in motivation from these traditional IR systems. The focus of these systems is not on supporting a single user in a search task, but on actively supporting a group of users in a search task. This motivational shift requires a rethink of the techniques used. For example, whereas a collaborative filtering system attempts to recommend items to a user based on the fact that previous users have found the item relevant, a synchronous collaborative information retrieval system may, on the other hand, decide to remove this item from other searchers' retrieved results, in order to reduce redundancy and improve group effectiveness.
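By way of illustration only, the removal of an item from other searchers' retrieved results may be sketched as follows. The sketch assumes that each searcher's relevance judgments are held as a set of document identifiers; the function name `remove_found_by_others` and the data shown are hypothetical.

```python
from typing import Dict, List, Set


def remove_found_by_others(
    ranked: List[str], user: str, judged_relevant: Dict[str, Set[str]]
) -> List[str]:
    """Drop documents already judged relevant by any other searcher in the group."""
    found_by_others: Set[str] = set()
    for other, docs in judged_relevant.items():
        if other != user:
            found_by_others |= docs
    # Preserve the original ranking order of the remaining documents.
    return [doc for doc in ranked if doc not in found_by_others]


judgments = {"alice": {"d2"}, "bob": {"d4"}}
ranked_list = ["d1", "d2", "d3", "d4"]
bob_view = remove_found_by_others(ranked_list, "bob", judgments)
alice_view = remove_found_by_others(ranked_list, "alice", judgments)
```

Each searcher keeps their own finds in view, while documents already found by a collaborator are filtered out so that effort is not duplicated.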
In the Let's Browse (Lieberman et al., 1999) system, multiple user profiles were aggregated in a browsing session. This information was, however, simply used as a means to select which pages to browse to next. Moreover, each user profile was constructed off-line based on terms extracted from the user's homepage and surrounding pages, and this approach is therefore not applicable to a real-time search session, wherein user profiles can change depending on the current search.
The recently proposed approach by Adcock et al. (2007) divides the searching task for two co-located users into two specialized and complementary roles. Feedback from each user is used to influence the results passed to them by the other. However, in a distributed environment like the web, this specialization may be difficult. Furthermore, the relevance assessments are not used directly in the search process, but are instead used as a means to order results for presentation and for suggesting possible query terms.
A recently proposed approach by Dalal (2007) outlines how a user's personalized profile can be combined with others in order to support social searching. Dalal describes how a trust scalar can be employed to modify the influence of different users or groups on a user's preference vector. However, this preference vector is currently used simply as a means to select a particular meta-search engine (at present the system uses country-specific searches), and no details are given as to how a user's profile is constructed, beyond that it consists of “short-term and long-term contexts”.
Improving the effectiveness of SCIR systems is known to be achievable through both the optimal division of a search task amongst collaborating users, wherein each user of a group performs a subset of the overall search task, and the optimal feedback to each user of the group, whereby group members may benefit from any relevant material found by others within the search process. Early SCIR systems were therefore focused on improving the awareness of each user in the group of the progress achieved by the other users.
A problem to be solved in SCIR systems is therefore how to allow two (or more) users to search effectively together, by having the search system continually make use of relevance judgments provided by each user in a single synchronized collaborative search session, so as to improve the quality of the respective search results for each of the two (or more) users in real time. Methods and systems which can solve this problem are therefore highly desirable, in particular as collaborative working environments are emerging wherein shared working activities are physically encouraged and allowed, for instance Microsoft's Surface, which supports two or more people jointly interacting with a single terminal for the sharing of computing tasks, for instance the locating and retrieval of data or resources.
There is a need to develop effective techniques to exploit the relevance information provided by searchers during a synchronous collaborative information retrieval session.