Data collection and analysis are rapidly changing the way scientific, national security, and business communities operate. Scientists, investigators, and analysts have an increasing need to discover the complex relationships among disparate data sources.
Most database systems are based on the relational data model. Relational databases store data in tables and process queries as a set of select and join operations on those tables. These systems are ineffective at discovering complex relationships in heterogeneous data because they are not designed to support graph operations such as subgraph isomorphism, typed path traversal, and community detection. Adding a new relationship (e.g., another column) is difficult and usually requires significant restructuring of internal data structures. If not every record (row of the table) has an entry for the new column, space is wasted. Outer join operations can generate large numbers of intermediate values that are later discarded, wasting both time and space.
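To make the contrast concrete, the following is a minimal sketch of typed path traversal over data stored as (source, edge type, target) triples rather than as table columns. It is an illustration only, not the system described in this paper: the toy data and the names `EDGES`, `build_index`, and `typed_paths` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: each relationship is a (source, edge_type, target) triple.
# A new kind of relationship is just another triple; no column is added and
# no existing record has to carry an empty slot for it.
EDGES = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "located_in", "seattle"),
    ("alice", "knows", "bob"),
    ("bob", "lives_in", "portland"),
]


def build_index(edges):
    """Index targets by (source, edge_type) so each traversal hop is one lookup."""
    index = defaultdict(list)
    for src, etype, dst in edges:
        index[(src, etype)].append(dst)
    return index


def typed_paths(index, start, edge_types):
    """Return every path from `start` whose edges match `edge_types` in order."""
    paths = [[start]]
    for etype in edge_types:
        # Extend each partial path along edges of the required type only;
        # paths with no matching edge are dropped at this hop.
        paths = [
            path + [nxt]
            for path in paths
            for nxt in index.get((path[-1], etype), [])
        ]
    return paths


INDEX = build_index(EDGES)
print(typed_paths(INDEX, "alice", ["works_at", "located_in"]))
# [['alice', 'acme', 'seattle']]
```

In a relational schema, the same two-hop query would typically be expressed as a join between an employment table and a location table, and each additional hop would add another join. In the sketch, each hop is an index lookup that extends only the paths still alive, and supporting a new relationship type requires no change to the stored layout.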