Contact centers, such as Automatic Call Distribution or ACD systems, are employed by many enterprises to service customer contacts. A typical contact center includes a switch and/or server to receive and route incoming packet-switched and/or circuit-switched contacts and one or more resources, such as human agents and automated resources (e.g., Interactive Voice Response (IVR) units), to service the incoming contacts. Contact centers distribute contacts, whether inbound or outbound, for servicing to any suitable resource according to predefined criteria. In many existing systems, the criteria for servicing the contact from the moment that the contact center becomes aware of the contact until the contact is connected to an agent are customer-specifiable (i.e., programmable by the operator of the contact center), via a capability called vectoring. Normally, in present-day ACDs, when the ACD system's controller detects that an agent has become available to handle a contact, the controller identifies all predefined contact-handling skills of the agent (usually in some order of priority) and delivers to the agent the highest-priority, oldest contact that matches the agent's highest-priority skill. Generally, the only condition that results in a contact not being delivered to an available agent is that there are no contacts waiting to be handled.
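As a rough illustration of the default delivery rule described above, the following Python sketch selects a contact for a newly available agent. It is one possible reading of that rule, not the logic of any particular ACD product. The names `Contact` and `select_contact`, the convention that a lower number means higher priority, and the choice to fall through to the agent's next skill when no contact is waiting for the first are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contact:
    contact_id: str
    skill: str           # skill required to service this contact
    priority: int        # assumption: lower number = higher priority
    arrival_time: float  # seconds since some reference point


def select_contact(agent_skills: List[str],
                   waiting: List[Contact]) -> Optional[Contact]:
    """Pick the contact to deliver to an agent who just became available.

    agent_skills is ordered from the agent's highest-priority skill to the
    lowest. The first skill that has any waiting contact wins; within that
    skill the highest-priority contact is chosen, oldest first on ties.
    """
    for skill in agent_skills:
        candidates = [c for c in waiting if c.skill == skill]
        if candidates:
            return min(candidates, key=lambda c: (c.priority, c.arrival_time))
    # No waiting contact matches any of the agent's skills.
    return None
```

With this sketch, an agent whose skills are `["billing", "sales"]` receives a waiting billing contact even if an older sales contact is queued, because the agent's skill order is consulted before contact age.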
The primary objective of contact center management, including call-distribution algorithms, is to ultimately maximize contact center performance and profitability. An ongoing challenge in contact center administration is monitoring of agent behaviors to optimize the use of contact center resources and maximize agent performance and profitability. Current products for monitoring and reporting on contact center performance, such as Call Management System or CMS™ by Avaya, Inc., are configured as data warehouses that extract data from multiple sources, transform the data into a normalized form, and load the data into the data warehouse database, typically on a batch schedule. Additional calculations and reporting are performed after the batch load.
A common type of data warehouse is based on dimensional modeling. Dimensional modeling is a data model that divides the world into measurements and context. Measurements are usually numeric and taken repeatedly. Numeric measurements are facts. Facts are surrounded by textual context in existence when the fact is recorded. Context is often subdivided into dimensions. Fact tables are used in dimensional modeling to logically model measurements with multiple foreign keys referring to the contextual entities. The contextual entities each have an associated primary key. A “key” is a data element (e.g., attribute or column) that identifies an instance of an entity or record in a collection of data, such as a table. A “primary key” is a column or combination of columns whose values uniquely identify a row in a table or is the attribute or group of attributes selected from the candidate keys as the most suitable to uniquely identify each instance of an entity. A “foreign key” refers to a column or combination of columns whose values are required to match a primary key in another table or is a primary key of a parent entity that contributes to a child entity across a relationship. Types of primary keys include a natural key, or a key having a meaning to users, and a surrogate key, or a key that is artificially or synthetically established, meaningless to users, and used as a substitute for a natural key.
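The relationship among fact tables, dimensions, and the key types defined above can be sketched with a small in-memory SQLite database driven from Python. This is a minimal illustration only. The table names `dim_agent` and `fact_call`, their columns, and the helper functions are hypothetical and are not taken from any product mentioned here.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create one dimension table and one fact table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the foreign key below
    conn.executescript("""
        CREATE TABLE dim_agent (
            agent_key   INTEGER PRIMARY KEY,   -- surrogate key (primary key)
            agent_login TEXT NOT NULL UNIQUE,  -- natural key, meaningful to users
            agent_name  TEXT NOT NULL          -- textual context
        );
        CREATE TABLE fact_call (
            call_id      INTEGER PRIMARY KEY,
            agent_key    INTEGER NOT NULL
                         REFERENCES dim_agent(agent_key),  -- foreign key
            talk_seconds INTEGER NOT NULL      -- numeric measurement (fact)
        );
    """)
    return conn


def add_agent(conn: sqlite3.Connection, login: str, name: str) -> int:
    """Insert a dimension row and return its generated surrogate key."""
    cur = conn.execute(
        "INSERT INTO dim_agent (agent_login, agent_name) VALUES (?, ?)",
        (login, name),
    )
    return cur.lastrowid


def total_talk_seconds(conn: sqlite3.Connection, login: str) -> int:
    """Sum the facts for one agent, looked up by natural key."""
    row = conn.execute(
        """SELECT COALESCE(SUM(f.talk_seconds), 0)
           FROM fact_call AS f
           JOIN dim_agent AS d ON d.agent_key = f.agent_key
           WHERE d.agent_login = ?""",
        (login,),
    ).fetchone()
    return row[0]
```

Fact rows carry only the surrogate key. A report that filters by the natural key joins through the dimension table, and a fact row that refers to a nonexistent `agent_key` is rejected by the foreign key constraint.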
If the same entity (e.g., agent) is represented on multiple data sources (e.g., inbound call system and outbound call system) by different natural keys, a traditional data warehouse generates and assigns a surrogate key to identify the entity. The surrogate key is an internal identifier managed by the data warehouse. For example, in a contact center an agent may handle inbound calls from one system and outbound calls from another system, with different identities on each system. Data warehouses commonly process each data source independently, performing data correlation across sources at a later time. This approach is normally unworkable when events for the same entity from multiple sources must be processed simultaneously in real time, such as in the blended inbound/outbound call center. For this reason, existing contact center data warehouse products that combine data from multiple sources appear to merely process each source independently, with little or no correlation when the same entity is represented on multiple sources.
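The assignment of one surrogate key to an entity that is known by different natural keys on different sources can be sketched as follows. The class `SurrogateKeyRegistry`, its methods, and the source names used in the usage note are hypothetical. How a warehouse learns that two natural keys denote the same agent is outside the scope of the sketch, so the caller is assumed to supply that grouping.

```python
import itertools
from typing import Dict, Iterable, Tuple

# (source system name, natural key on that system)
Identity = Tuple[str, str]


class SurrogateKeyRegistry:
    """Maps natural keys from several sources to one internal surrogate key."""

    def __init__(self) -> None:
        self._next_key = itertools.count(1)
        self._by_identity: Dict[Identity, int] = {}

    def register(self, identities: Iterable[Identity]) -> int:
        """Declare that all given identities denote the same entity.

        Reuses the existing surrogate key if any identity is already known;
        otherwise allocates a new one.
        """
        identities = list(identities)
        if not identities:
            raise ValueError("at least one identity is required")
        known = {self._by_identity[i] for i in identities
                 if i in self._by_identity}
        if len(known) > 1:
            raise ValueError("identities already belong to different entities")
        key = known.pop() if known else next(self._next_key)
        for identity in identities:
            self._by_identity[identity] = key
        return key

    def resolve(self, source: str, natural_key: str) -> int:
        """Return the surrogate key for one source-specific natural key."""
        return self._by_identity[(source, natural_key)]
```

For example, after `register([("inbound", "4001"), ("outbound", "jdoe")])`, events arriving from either system resolve to the same surrogate key, so they can be attributed to one agent as they arrive rather than correlated later.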
Some data models specify a behavior known as a type 2 slowly changing dimension. A type 2 dimension tracks the history of changes to an entity over time. When an attribute of an entity is changed, such as when a contact center agent changes their skill set or group membership, a new surrogate key for that entity is generated, and a new row is inserted into the database. Fact data associated with the entity can now be tracked separately for activities that occurred before versus after the change by referencing the appropriate surrogate key.
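A type 2 update can be sketched in a few lines of Python. The sketch assumes a single natural key (`agent_id`) and a single tracked attribute (`skills`) for brevity. The names `AgentRow` and `AgentDimension` and the validity-interval columns are illustrative choices, not part of any specific data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class AgentRow:
    surrogate_key: int
    agent_id: str                     # natural key (assumed single-source)
    skills: Tuple[str, ...]           # tracked attribute
    valid_from: float
    valid_to: Optional[float] = None  # None marks the current row


class AgentDimension:
    """In-memory agent dimension with type 2 change tracking."""

    def __init__(self) -> None:
        self.rows: List[AgentRow] = []

    def current(self, agent_id: str) -> Optional[AgentRow]:
        for row in reversed(self.rows):
            if row.agent_id == agent_id and row.valid_to is None:
                return row
        return None

    def upsert(self, agent_id: str, skills: Iterable[str], at: float) -> int:
        """Record the agent's skills as of time `at`.

        If the skills changed, close the current row and insert a new row
        with a new surrogate key; the old row is kept as history.
        """
        skills = tuple(skills)
        cur = self.current(agent_id)
        if cur is not None:
            if cur.skills == skills:
                return cur.surrogate_key  # nothing changed, no new row
            cur.valid_to = at
        row = AgentRow(len(self.rows) + 1, agent_id, skills, at)
        self.rows.append(row)
        return row.surrogate_key

    def key_at(self, agent_id: str, ts: float) -> Optional[int]:
        """Return the surrogate key that was in effect at time `ts`."""
        for row in self.rows:
            if (row.agent_id == agent_id and row.valid_from <= ts
                    and (row.valid_to is None or ts < row.valid_to)):
                return row.surrogate_key
        return None
```

Facts recorded before the change reference the first surrogate key and facts recorded afterward reference the second, which is what allows the two periods to be reported separately.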
The traditional technique for handling a type 2 dimension update is to associate the change with a specific point in time. If that point in time occurs in the middle of a logical transaction, there is a potential for performing incorrect data correlation. This problem is exacerbated for a real-time data warehouse application that must handle entities with multiple natural keys that also have multiple surrogate keys due to type 2 dimensions. Because an entity has multiple natural keys, a surrogate key must be used to track fact data for the entity. If the application performs calculations in real time that span a type 2 dimension change that caused a new surrogate key to be created, the result of the calculation may be indeterminate or incorrect. For example, a contact center application may track the amount of time an agent places callers on hold. If a type 2 dimension change occurs to the agent dimension while the agent has a caller on hold, a new surrogate key is generated. The application then cannot easily calculate the hold time because the start and end times are associated with different fact records with different surrogate keys. The problem expands when considering data consistency across individual calls, multiple related calls, or agent login sessions that may span hours.
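The hold-time example can be made concrete with a short Python sketch. It only demonstrates the failure described above and does not attempt a remedy. The event tuple layout, the event names `hold_start` and `hold_end`, and the function `hold_seconds_by_key` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (timestamp in seconds, agent surrogate key, event kind)
Event = Tuple[int, int, str]


def hold_seconds_by_key(
    events: List[Event],
) -> Tuple[Dict[int, int], List[Event], Dict[int, int]]:
    """Naively pair hold_start and hold_end events by surrogate key.

    Returns (total hold seconds per key, hold_end events with no matching
    start, holds still open keyed by surrogate key).
    """
    open_holds: Dict[int, int] = {}
    totals: Dict[int, int] = {}
    unmatched: List[Event] = []
    for ts, key, kind in sorted(events):
        if kind == "hold_start":
            open_holds[key] = ts
        elif kind == "hold_end":
            start = open_holds.pop(key, None)
            if start is None:
                unmatched.append((ts, key, kind))
            else:
                totals[key] = totals.get(key, 0) + (ts - start)
    return totals, unmatched, open_holds


# The same agent is key 1 until a type 2 change at t=230, then key 2.
events = [
    (100, 1, "hold_start"),
    (130, 1, "hold_end"),    # 30-second hold, entirely under key 1
    (200, 1, "hold_start"),
    (260, 2, "hold_end"),    # 60-second hold that spans the change
]
totals, unmatched, still_open = hold_seconds_by_key(events)
```

Here only the first hold is counted. The second hold leaves a start record open under key 1 and an end record unmatched under key 2, so its 60 seconds are missing from the result even though both records belong to the same agent.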