Data warehouses are databases specially designed to support enterprise management information and analysis. Enterprises are building increasingly large data warehouses to enable analytics as a way to unlock the power of information and improve business performance. For instance, many call centers have deployed data warehousing and analytics solutions to identify hot issues and new self-service opportunities. Often, call center tickets are continuously collected, ingested, and processed into the data warehouse to enable analytics. Throughout this data flow, some tickets may be updated based on status changes, e.g., from “open” to “closed”, while others are simply inserted as newly created tickets. However, current data warehousing technologies have significant drawbacks in supporting such data flows effectively.
For example, existing data warehousing technologies may not have native support for large-volume updates that are performed on the data warehouse on a regular basis. Typically, data warehouse users must manually update changed records via low-level relational database management system (RDBMS) operations. Manual updates may work adequately if they are only occasional and involve small amounts of data. For frequent updates of large amounts of data, manual updates may not be practical. A current solution is to refresh the entire data warehouse whenever a large number of updates needs to be ingested. Clearly, this is not efficient when the update stream is large and incrementally growing, because all of the data must be reloaded from the very beginning on every refresh.
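The two existing approaches described above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not a description of any particular warehouse product: the `tickets` table, its columns, and the function names `make_warehouse`, `manual_update`, and `full_refresh` are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3


def make_warehouse():
    """Create a hypothetical in-memory 'warehouse' with one tickets table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT)")
    return conn


def manual_update(conn, changes):
    """Apply changed records one by one with low-level RDBMS statements.

    Each change is an (id, status) pair: an existing ticket is updated,
    a new ticket is inserted. The cost grows with the number of changes.
    """
    for ticket_id, status in changes:
        cur = conn.execute(
            "UPDATE tickets SET status = ? WHERE id = ?", (status, ticket_id)
        )
        if cur.rowcount == 0:  # no existing row, so this is a new ticket
            conn.execute(
                "INSERT INTO tickets (id, status) VALUES (?, ?)",
                (ticket_id, status),
            )
    conn.commit()
    return len(changes)  # number of records processed


def full_refresh(conn, all_tickets):
    """Discard the table contents and reload every ticket from scratch.

    The cost grows with the total data size, not with the number of changes.
    """
    conn.execute("DELETE FROM tickets")
    conn.executemany(
        "INSERT INTO tickets (id, status) VALUES (?, ?)", all_tickets
    )
    conn.commit()
    return len(all_tickets)  # number of records reloaded
```

In this sketch, closing one ticket out of a million requires `full_refresh` to reload all one million rows, while `manual_update` touches a single row but must be scripted by the user for every batch of changes.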
Accordingly, there is a need for a system and method for efficiently performing large volume updates of data warehouses on a regular basis. There is also a need for a system and method of updating data warehouses that does not require the data warehouse to be rebuilt and refreshed from scratch.