Large data graphs store data and rules that describe knowledge about the data in a form that supports deductive reasoning. For example, in a data graph, entities, such as people, places, things, and concepts, may be stored as nodes, and the edges between nodes may indicate the relationships between the nodes. In such a data graph, the nodes “Maryland” and “United States” may be linked by the edges “in country” and/or “has state.” The basic unit of such a data graph can be a tuple that includes two entities, a subject entity and an object entity, and a relationship between the entities. A tuple may represent a real-world fact, such as “Maryland is a state in the United States.” The tuple may also include other information, such as context information, statistical information, audit information, or metadata about the edges. However, the graph may lack information about some entities. These entities may be described in document sources, such as web pages, but adding the information for the entities manually is slow and does not scale. Such missing entities and their relationships to other entities reduce the usefulness of querying the data graph.
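As an illustration only, the tuple structure described above might be sketched in Python as follows. The names `GraphTuple`, `DataGraph`, `objects`, and `has_entity`, as well as the `metadata` contents, are hypothetical and chosen for this example; they are not drawn from any particular data graph implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass(frozen=True)
class GraphTuple:
    """One fact: a subject entity, a relationship (edge), and an object entity."""
    subject: str
    relationship: str
    obj: str
    # Optional extra information (context, statistics, audit data, and so on).
    # Excluded from equality and hashing so two tuples stating the same fact compare equal.
    metadata: Dict[str, Any] = field(default_factory=dict, compare=False)


class DataGraph:
    """Minimal in-memory graph: entities are nodes, tuples are labeled edges."""

    def __init__(self) -> None:
        self._by_subject: Dict[str, List[GraphTuple]] = defaultdict(list)
        self._nodes: Set[str] = set()

    def add(self, fact: GraphTuple) -> None:
        # Register both entities as nodes and index the edge by its subject.
        self._nodes.update((fact.subject, fact.obj))
        self._by_subject[fact.subject].append(fact)

    def objects(self, subject: str, relationship: str) -> List[str]:
        # Follow every edge with the given label that leaves the subject node.
        return [
            fact.obj
            for fact in self._by_subject.get(subject, [])
            if fact.relationship == relationship
        ]

    def has_entity(self, name: str) -> bool:
        return name in self._nodes


graph = DataGraph()
graph.add(GraphTuple("Maryland", "in country", "United States",
                     metadata={"source": "hypothetical example"}))
graph.add(GraphTuple("United States", "has state", "Maryland"))

print(graph.objects("Maryland", "in country"))  # entities reachable via "in country"
print(graph.has_entity("Delaware"))             # an entity the graph knows nothing about
```

In this sketch, a query that touches an entity absent from the graph simply returns nothing, which mirrors how missing entities limit the usefulness of querying the data graph.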