Entity-centric models are traditionally built with strong reference to structured content: a database of people's personal details or of geographical information, for example. Representations of these entities are then simply references back to the structured content that was used to generate them, and comparisons between entities reduce to comparisons of the corresponding attributes in the database.
However, a large amount of other interesting information relating to an entity exists in unstructured content where that entity is mentioned (free-text data such as a news story, a blog, or a press release). Further, information regarding the entities most closely related to a given entity can also be seen as an interesting property of that given entity.
As an example, consider Egypt, a country whose landmass, GDP, head of state, principal imports (in this case, wheat) and other such attributes are well known and available in structured data sets. By perusing free-text documents, either those published by Egyptian authorities or those that mention Egypt, further attributes may be discovered, such as Egypt's recent connection with civil unrest in Arab states. Further, consider two companies that both depend on the price of wheat, but that are not directly related to one another: their common connection to Egypt creates a dependency between the two companies that can be inferred only by understanding their respective connections.
Accordingly, there exists a need for systems and techniques that represent entities by the unstructured content surrounding them and by information regarding the entities to which they are connected, and a corollary need to perform meaningful comparisons between entities that may have no direct connection.