Repository systems can perform various functions, including managing information about resources or objects (e.g., an application, a process, a service, or an endpoint) in a computing environment. Repository systems may correlate stored information to determine relationships between resources or objects. Some repository systems determine such relationships to identify duplicate data, which can then be reduced or eliminated to lower storage use and/or to make retrieval of information more efficient.
The correlation methodologies implemented by some repository systems identify duplicate data by comparing each pair of resources or objects (i.e., pair-wise comparison). In a repository system storing information about many objects and/or resources, such techniques may demand substantial computing resources, because the number of pairs to compare grows quadratically with the number of objects: n objects yield n(n-1)/2 pairs. Further, these correlation strategies cannot identify meaningful relationships between groups of resources or objects.
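
The cost of pair-wise comparison can be illustrated with a minimal sketch. It assumes, purely for illustration, that each resource or object is a dictionary of attributes and that two records count as duplicates when all attributes match; the function name find_duplicate_pairs and the sample records are hypothetical and do not describe any particular repository system.

```python
from itertools import combinations


def find_duplicate_pairs(records):
    """Compare every pair of records once.

    Returns the index pairs that are exact duplicates and the number
    of comparisons performed. Exact equality is an assumed duplicate
    test chosen for illustration only.
    """
    duplicates = []
    comparisons = 0
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        comparisons += 1
        if a == b:
            duplicates.append((i, j))
    return duplicates, comparisons


# Hypothetical sample data: four records, two of which are identical.
records = [
    {"type": "service", "host": "10.0.0.1", "port": 80},
    {"type": "endpoint", "host": "10.0.0.2", "port": 443},
    {"type": "service", "host": "10.0.0.1", "port": 80},
    {"type": "process", "host": "10.0.0.3", "port": 22},
]

pairs, count = find_duplicate_pairs(records)
print(pairs, count)
```

With four records this sketch performs 6 comparisons; by the same n(n-1)/2 count, 10,000 records would require 49,995,000. Note also that the result is only a list of matching pairs, which by itself says nothing about relationships among larger groups of records.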