Organizations that manage human resources, benefits, financial services, or other services that maintain user data or accounts often provide multiple portals through which users may access, modify, or otherwise interact with their data. For example, a lender that manages student loan accounts may provide a website through which users can view the balances or make payments on their loans. The lender may further maintain a call center to allow users to ask customer service representatives specific questions or to make similar balance inquiries or payment transactions. In addition, the lender may provide a chat portal through which users can chat with customer service representatives by instant messaging, an email address to which inquiries may be sent, a traditional mailing address for receiving paper letters, and other portals. Given recent advances in mobile technologies, the lender may also provide a mobile telephone or tablet application that users may use to perform these and similar tasks.
Often, there is an inverse relationship between a service provider's preference and the user's preference for whether a given portal is used. For example, whereas users generally prefer to have their questions answered by a customer service representative over the phone, service providers generally prefer to address user questions using portals that incur lower operational costs, such as static frequently asked questions (FAQ) webpages or email responses.
One way to minimize the use of more expensive portals without directly restricting users' options is to preemptively “push” information to users, or take other preemptive action, before users might otherwise call a call center or initiate a chat session to ask for such information. Pushing information to a user might involve sending a letter by mail, sending an email, or directing the user's browser to a webpage in response to a transaction.
However, to determine what kinds of information should be pushed to users and when such information should be pushed, service providers need to be able to discern common user patterns or “episodes.” For example, it may be valuable to know that 90% of users who have read a particular FAQ on a website do not place a phone call to the call center to inquire about a particular topic. Or it may be valuable to know that 75% of users who pay off a loan send the lender a request (e.g., by email, mail, chat, etc.) within 10 days for a formal letter acknowledging payment in full of the loan. By learning such information, the lender can adjust its procedures to, for example, always send a payoff letter within 2 business days of a loan being paid off, and thus avoid the need to handle another user inquiry. Service providers may be interested in determining common event patterns for many additional reasons, such as determining whether a particular program or service has been effective, directing user inquiries to appropriate personnel, and providing useful statistics about user behavior.
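As an illustration only, the kind of statistic described above (the share of users who follow one event with another within a time window) could be computed from a merged event list roughly as follows. This is a minimal sketch: the function name `follow_up_rate`, the event labels, and the sample records are all hypothetical and are not drawn from any actual portal log.

```python
from datetime import datetime, timedelta

def follow_up_rate(events, trigger, follow_up, window):
    """Fraction of users with a `trigger` event who log a `follow_up`
    event within `window` of their first `trigger` event."""
    # Group (timestamp, kind) pairs by user.
    by_user = {}
    for user, kind, when in events:
        by_user.setdefault(user, []).append((when, kind))

    triggered = 0
    followed = 0
    for user_events in by_user.values():
        trigger_times = [t for t, k in user_events if k == trigger]
        if not trigger_times:
            continue  # this user never performed the trigger event
        triggered += 1
        start = min(trigger_times)
        if any(k == follow_up and start <= t <= start + window
               for t, k in user_events):
            followed += 1
    return followed / triggered if triggered else 0.0

# Hypothetical events gathered from several portals: (user, event, timestamp).
events = [
    ("u1", "loan_paid_off", datetime(2024, 1, 1)),
    ("u1", "payoff_letter_request", datetime(2024, 1, 5)),
    ("u2", "loan_paid_off", datetime(2024, 1, 2)),
    ("u2", "payoff_letter_request", datetime(2024, 1, 20)),  # outside the window
    ("u3", "loan_paid_off", datetime(2024, 1, 3)),
    ("u3", "payoff_letter_request", datetime(2024, 1, 4)),
    ("u4", "faq_view", datetime(2024, 1, 3)),  # never paid off a loan
]

rate = follow_up_rate(events, "loan_paid_off", "payoff_letter_request",
                      timedelta(days=10))
```

In this made-up sample, two of the three users who paid off a loan requested a letter within 10 days, so `rate` is 2/3.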
Typically, service providers attempt to piece this kind of information together by maintaining logs for one or more of their portals. For example, a service provider may maintain a webpage request log that records hypertext transfer protocol (HTTP) requests for particular webpages by particular users. Similarly, a call center log may store records reflecting specific interactions that a user has with a call center, such as initiation of a call, various interactive voice response (IVR) options that the user selects, questions or comments that the user makes to a customer service representative, termination of the call, etc. Similar logs may exist for a service provider's chat portal, email portal, mobile application portal, etc.
However, traditional approaches to discovering event patterns in portal logs tend to be very computationally expensive. For example, apriori methods typically operate by iteratively scanning all event sequences and joining each frequent episode with all other episodes to build (k+1)-length episode candidates. Such methods therefore have complexity O(n²), where n is the number of k-length frequent episodes. Thus, apriori and other scanning methods become infeasible when the size of a given log file becomes very large.
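The quadratic cost of the join step can be seen in a simplified sketch of apriori-style candidate generation for serial episodes. This is a generic illustration rather than any particular published implementation; the function name `join_candidates` and the event labels are hypothetical, and the support-counting scan that would follow each join is omitted.

```python
from itertools import product

def join_candidates(frequent):
    """Pair every k-length frequent episode with every other one and
    build the (k+1)-length candidates; also count the pairs examined."""
    candidates = set()
    comparisons = 0
    for a, b in product(frequent, repeat=2):
        comparisons += 1
        # Two serial episodes join when the last k-1 events of `a`
        # equal the first k-1 events of `b`.
        if a[1:] == b[:-1]:
            candidates.add(a + b[-1:])
    return candidates, comparisons

# Hypothetical 2-length frequent episodes.
frequent_2 = [
    ("login", "view_balance"),
    ("view_balance", "pay"),
    ("pay", "call"),
]

candidates, comparisons = join_candidates(frequent_2)
```

With n = 3 frequent episodes the loop examines n² = 9 pairs to produce two 3-length candidates, and every candidate would still need to be checked against the full set of event sequences.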
Traditional scanning approaches are also of limited utility in that they are domain-specific. Because they operate on the log of a single portal, they fail to detect frequent episodes that span multiple domains (e.g., a particular webpage being frequently accessed after a particular transaction is performed on a mobile application), which seriously restricts the range of patterns they can detect.
Accordingly, there is a need for methods and systems for detecting frequent episodes across multiple domains in a manner that is both computationally and memory efficient.