1. Technical Field
The disclosed technology relates to the field of sensemaking.
2. Background Art
Knowledge workers such as scientists, attorneys, intelligence analysts, private and public investigators/detectives and financial analysts all perform tasks that require reading and synthesizing information from many documents. In such tasks, there is more information than a worker can hold in mind, so an essential element of the task is to record some of what has been learned in written or electronic form.
To keep track of this information, the knowledge worker generally uses an evidence file or notebook, recording relevant information by storing entities and hand-typed notes about that information. The captured information generally includes important relationships between entities, between entities and other relationships, and between relationships.
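As an illustration only, the kind of captured information described above (relationships between entities, between entities and other relationships, and between relationships) can be sketched as a small recursive data structure. The class names `Entity` and `Relationship`, their fields, and the helper `nesting_depth` are hypothetical and are chosen here for exposition; they are not taken from any particular system.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Entity:
    """A named thing recorded in the evidence file (hypothetical model)."""
    name: str
    kind: str = "unknown"


@dataclass
class Relationship:
    """A labeled link whose members may be entities or other relationships."""
    label: str
    members: List[Union[Entity, "Relationship"]] = field(default_factory=list)


def nesting_depth(item: Union[Entity, Relationship]) -> int:
    """Return 0 for an entity, 1 for a relationship among entities,
    2 for a relationship that includes another relationship, and so on."""
    if isinstance(item, Entity):
        return 0
    return 1 + max((nesting_depth(m) for m in item.members), default=0)


# Example: a relationship between an entity and another relationship.
alice = Entity("Alice", "person")
acme = Entity("Acme Corp", "organization")
informant = Entity("Informant 7", "person")

employment = Relationship("employed by", [alice, acme])
report = Relationship("reported by", [employment, informant])
```

In this sketch, `report` relates the entity `informant` to the relationship `employment`, which is the "between entities and other relationships" case mentioned above.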
A computer can be used to add value to the notes. For example, the knowledge worker can use full-text search to locate a note (provided that he or she remembers words used in the note). In addition, if the notes include hypertext links, the worker can use the links to re-find documents that have been previously read. However, the available computer assistance is limited because the computer does not have access to information about the relationships described in the documents, about the relationships between those relationships, or about which of the relationships are of greater or lesser interest to the knowledge worker. In addition, while a computer can search for text strings entered by the knowledge worker, it is unable to distinguish between text snippets that are of interest to the knowledge worker and those that are not. Furthermore, the detailed note-taking process is extremely time-consuming, and often the evidence file does not include enough information to allow computerized assistance.
The disclosed technology builds on work related to recording evidence, spatial hypertext, automatic highlighting, automating inference, reading recommendations, and reading through multiple documents.
The disclosed technology differs from the Sandbox component of Oculus nSpace (Wright et al., Advances in nSpace—the sandbox for analysis. Poster at the 2005 International Conference on Intelligence Analysis) in that the technology disclosed herein allows the knowledge worker to identify and record specific entities and relationships from documents as well as human-readable entities, and also allows the knowledge worker to associate a degree-of-interest value with each entity.
Single-mode snap-together operations have been used in the Niagara system (see: Good, L. E., Zoomable User Interfaces for the Authoring and Delivery of Slide Presentations. PhD dissertation, Department of Computer Science, University of Maryland, Oct. 27, 2003). In Niagara, the knowledge worker can group text snippets by moving them close together. The technology disclosed herein extends this approach by supporting two different kinds of grouping that result, respectively, from moving objects close together in vertical or horizontal directions, and by building a representation of all the entities and their relationships in the workspace.
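The distinction between grouping by vertical proximity and grouping by horizontal proximity can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how Niagara or the disclosed technology is implemented: the `Box` type, the `snap_kind` function, and the 10-unit `threshold` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    """Axis-aligned bounding box of a workspace object (hypothetical)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def _overlap(a_start: float, a_len: float, b_start: float, b_len: float) -> float:
    """Signed overlap of two 1-D intervals; a negative value is the gap size."""
    return min(a_start + a_len, b_start + b_len) - max(a_start, b_start)


def snap_kind(a: Box, b: Box, threshold: float = 10.0) -> Optional[str]:
    """Classify how two objects would be grouped, if at all.

    "vertical":   the boxes share horizontal extent and are stacked with a
                  vertical gap of at most `threshold`.
    "horizontal": the boxes share vertical extent and sit side by side with a
                  horizontal gap of at most `threshold`.
    None:         the boxes are too far apart (or overlap in both directions).
    """
    h_overlap = _overlap(a.x, a.w, b.x, b.w)
    v_overlap = _overlap(a.y, a.h, b.y, b.h)
    if h_overlap > 0 and 0 <= -v_overlap <= threshold:
        return "vertical"
    if v_overlap > 0 and 0 <= -h_overlap <= threshold:
        return "horizontal"
    return None


note = Box(0, 0, 100, 20)
below = Box(0, 25, 100, 20)      # 5 units below `note`
beside = Box(105, 0, 100, 20)    # 5 units to the right of `note`
far_away = Box(300, 300, 10, 10)
```

A workspace could call a function like `snap_kind` when an object is dropped and record a different relationship type depending on which of the two results is returned.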
Systems exist that employ automatic highlighting of text to aid reading and skimming. For example, the Scent Highlights component of the 3Book system automatically highlights words related to a query, and the sentences containing them, to direct the reader's attention during skimming. Likewise, the Reader's Helper (see: Graham, J. The Reader's Helper: a personalized document reading environment. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '99), 1999, pages 481-488) highlights phrases judged to be similar to a reader's topic of interest. The technology disclosed herein extends this approach by highlighting both automatically extracted entities and phrases to which the knowledge worker has given a high degree-of-interest rating.
Systems exist that automate the process of making inferences for intelligence analysis by using subgraph isomorphism to find suspicious patterns in a graph of entities and relationships (see: Coffman et al., Graph-based technologies for intelligence analysis. Communications of the ACM, Volume 47, Number 3 (March 2004), 45-47). SRI's Link Analysis Workbench (see: Wolverton et al., LAW: A workbench for approximate pattern matching in relational data. In The Fifteenth Innovative Applications of Artificial Intelligence Conference (IAAI-03), 2003) searches for entities in a graph that match a pattern of suspicious behavior either exactly or approximately. In contrast to these automated approaches, the disclosed technology provides interface tools that directly aid the knowledge worker in making inferences, based on whatever information the knowledge worker is viewing at any given moment.
Systems exist to assist a reader in selecting which document of a document collection is to be analyzed next. For example, Woodruff et al. (Enhancing a Digital Book with a Reading Recommender, CHI 2000) described a Reading Recommender that analyzes the relationships, based on textual similarity and co-citation, between a set of documents and a list of documents read so far, and recommends new documents to examine. Bier (A document corpus browser for in-depth reading. Proceedings of the Joint Conference on Digital Libraries (JCDL), 2004, 87-96) discloses a visualization showing at a glance the most highly rated unread documents, which acts as an implicit recommendation. The disclosed technology builds on these approaches in at least two ways. First, because the knowledge worker assigns degree-of-interest values to individual entities, recommendations are based on a relatively rich model of the knowledge worker's interests. Second, the disclosed technology recommends both documents to read and specific relationships/entities to learn more about.
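One simple way to base recommendations on per-entity degree-of-interest values is to score each unread document by the interest values of the entities it mentions. The sketch below is an assumption made for illustration; the function `recommend_documents`, its parameters, and the additive scoring rule are hypothetical and are not asserted to be the method used by the disclosed technology or by the cited systems.

```python
from typing import Dict, List, Set


def recommend_documents(
    doc_entities: Dict[str, Set[str]],
    interest: Dict[str, float],
    already_read: Set[str],
    top_n: int = 3,
) -> List[str]:
    """Rank unread documents by the summed degree-of-interest of their entities.

    doc_entities: maps a document id to the entity names found in it.
    interest:     maps an entity name to the worker's degree-of-interest value;
                  entities with no assigned value count as 0.
    already_read: document ids to exclude from the recommendations.
    """
    scores = {
        doc: sum(interest.get(entity, 0.0) for entity in entities)
        for doc, entities in doc_entities.items()
        if doc not in already_read
    }
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))[:top_n]


corpus = {
    "memo-a": {"Alice", "Acme Corp"},
    "memo-b": {"Acme Corp"},
    "memo-c": {"Weather"},
}
degree_of_interest = {"Alice": 0.9, "Acme Corp": 0.5}
suggested = recommend_documents(corpus, degree_of_interest, already_read={"memo-b"})
```

With these inputs, `memo-a` ranks ahead of `memo-c` because it mentions two entities of interest, and `memo-b` is omitted because it has already been read.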
Systems exist for reading through a “trail” of documents (see: Bush, V., “As We May Think.” The Atlantic Monthly, July 1945. Reprinted in Interactions, 3(2), 1996, pages 35-67). The technology disclosed herein provides a visualization of a set of trails, each of which corresponds to a query about an entity or set of entities.
The Oculus TRIST system (see: Jonker et al., Information triage with TRIST. 2005 International Conference on Intelligence Analysis), like the disclosed technology, shows an icon per document and uses graphical presentation to distinguish read and unread documents. The technology disclosed herein differs from TRIST in that its trails are automatically created responsive to the knowledge worker's manipulations within the workspace window.
It would be advantageous to enable the knowledge worker to quickly identify particular phrases within a passage that correspond to important people, things, actions, world events, and the like, and to assign a degree-of-interest value to these phrases. It would also be advantageous to suggest which electronic documents in a document collection to analyze, based on the knowledge worker's apparent interest as determined from entities and their relationships, and to assist the knowledge worker when making inferences.