The present invention generally relates to data analysis, and more particularly to determining an optimal set of analytics for a given data set.
Data analytics (DA) is the science of examining raw data with the purpose of drawing conclusions about that information. Data analytics is used in many industries to allow companies and organizations to make better business decisions and in the sciences to verify or disprove existing models or theories. Data analytics is commonly used to aid in data mining, depending on the scope, purpose and focus of the analysis. Data miners sort through huge data sets using sophisticated software to identify undiscovered patterns and establish hidden relationships. The data obtained may originate in an analogue paper system. Systems utilizing Optical Character Recognition, or OCR, can convert different types of documents, such as scanned paper files, into editable and searchable data.
As the volume of data undergoing analysis increases, more resources need to be allocated. A subject matter expert, or SME, may interact directly with an analytics system, possibly through a simplified interface, or may codify domain knowledge for use by knowledge engineers or ontologists. An SME is also involved in validating the system results as desirable or optimal. Typically, an SME may manually apply analytics to a data set and may attempt to determine the efficiency of the analysis, a process which may be time-consuming and cumbersome, leading to inefficiencies and extra cost.