The present invention relates generally to the field of database data analytics, and more particularly to determining statistically similar related data in a structured dataset pool.
Data analytics focuses on searching data files to discover business insights. When target data (core data) and specified target field(s) are selected for data mining and analysis, there may be additional relevant information in other related datasets beyond the dataset that contains the target data.
Analyzing datasets for relevant data of interest can be a manual effort of investigating each possibly related dataset for relevancy by comparing each dataset with the core data and joining the relevant datasets with the core data into a single source for the subsequent data analysis. With massive datasets, data is stored across various sources, and discovering data in other datasets that is relevant to the target data becomes more complex as the quantity of data sources and/or datasets increases.