The present disclosure relates to database frequency summarization, and more specifically, to techniques for summarizing the frequency of particular data among computers.
Data mining, a field at the intersection of computer science and statistics, is the process that attempts to discover patterns in large data sets. It utilizes methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data preprocessing, model and inference considerations, metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining), etc. This usually involves using database techniques such as spatial indexes. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis, or for example, in machine learning and predictive analytics.