Individuals and organizations often store data within data warehouses that act as central repositories for data from a variety of data sources. Due to their scale, data warehouses may facilitate the analysis and identification of trends over time, a technique known as "data mining." To facilitate data mining, data warehouses may store data within a hierarchy of specialized tables, such as fact tables and dimension tables. Fact tables may include information about explicit measurements, such as the price of a sale. Dimension tables, in contrast, may depend from fact tables (or other dimension tables) and include attribute information about entities related to the fact tables.
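For illustration only, the relationship between a fact table and a dimension table may be sketched as follows. The table names (fact_sales, dim_product), the column names, the sample rows, and the function revenue_by_category_and_month are all hypothetical and are not taken from any particular data warehouse; the sketch merely assumes a fact table holding sale prices and a dimension table holding product attributes, and uses Python's built-in sqlite3 module as a stand-in for a warehouse.

```python
import sqlite3

# Hypothetical schema: one fact table (measurements) and one dimension
# table (attributes of the entities the measurements refer to).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT,
        category   TEXT
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        sale_date  TEXT,   -- ISO 8601 date, e.g. '2023-01-15'
        price      REAL    -- the explicit measurement
    );
""")

# Illustrative sample rows (invented for this sketch).
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2023-01-15", 10.0),
        (2, 2, "2023-01-20", 25.0),
        (3, 3, "2023-02-03", 5.0),
        (4, 1, "2023-02-10", 12.0),
    ],
)


def revenue_by_category_and_month(connection):
    """Join facts to dimension attributes and total the price per month."""
    return connection.execute("""
        SELECT d.category,
               substr(f.sale_date, 1, 7) AS sale_month,
               SUM(f.price)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        GROUP BY d.category, substr(f.sale_date, 1, 7)
        ORDER BY d.category, sale_month
    """).fetchall()


for row in revenue_by_category_and_month(conn):
    print(row)
```

In this sketch, the measurement (price) resides only in the fact table, while the attribute used to group the trend (category) resides only in the dimension table, so the trend over time can be obtained only by joining the two.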
Unfortunately, hackers, criminals, or fraudsters often generate fake, simulated, or fraudulent data. As such, administrators may wish to evaluate the quality of new or incoming data to prevent low-quality or simulated data from being stored within data warehouses. However, conventional methods for detecting low-quality or simulated data may require manual intervention and/or suffer from excessive delay or poor accuracy. These conventional methods may also fail to leverage or account for attribute information in dimension tables and/or fail to select the most appropriate method for evaluating incoming data when multiple methods are available.
Accordingly, the instant disclosure identifies and addresses a need for additional and improved systems and methods for mining data in a data warehouse.