Modern computing often uses a large number of data sets, whether text files, images, or other formats (such as portable document format (PDF), Microsoft Word® format, Microsoft Excel® format, or the like). However, it is difficult and costly to maintain and store these data sets in a meaningful fashion. Indeed, conventionally, many data sets are lost on company-wide systems due to an inability to effectively find and use them, especially when the data sets are dumped into a data lake rather than indexed and stored.
Moreover, traditional mechanisms for indexing data sets generally focus on the data sets themselves. This focus, however, may limit the types of dimensions used to index, compare, and search the data sets. Embodiments of the present disclosure may solve these technical problems.