1. Field
The present invention relates generally to predictive information retrieval systems and, more specifically, to scalable complex event processing with probabilistic machine learning models to predict subsequent geolocations of individuals.
2. Description of the Related Art
Predictive information retrieval systems provide information to users without the users requesting it. These systems attempt to predict which content a user would like to access, thereby relieving the user of the burden of specifying the content and helping users discover new content. For example, some existing systems attempt to predict news stories in which a user would be interested in advance of the user expressing such an interest. Using these predictions, some information retrieval systems predictively cache and display content on a user's computing device. As a result, the user may view, for instance, a list of notifications containing the content, and select among the content with relatively little effort.
Often, a user's current and recent geolocation history provides a relatively strong signal regarding the type of content likely to be of interest to the user. Leveraging this information to provide relevant content predictively is often challenging with many traditional computer systems, as the number of dimensions to be considered can be relatively large over large geographic areas, like an entire city, state, country, or continent, and the relevance of results often decays relatively rapidly as users move to other locations. Naïve, brute-force techniques for processing geolocation events with general-purpose computers are often slow and unreliable. Some systems attempt to expedite results by reducing the number of dimensions to be evaluated with hand-coded rules or patterns, but such systems can be exceedingly difficult and expensive to configure, as manual rule construction often requires excessive generalization and can yield rules that quickly become out of date. Further, individual general-purpose computers are often incapable of processing rich pattern sets with sufficiently low latency in commercial applications, where the number of patterns may extend into the hundreds of thousands, the number of users may be in the millions, and the number of reported geolocations warranting processing against the rules may exceed several thousand per hour. Thus, many existing predictive information retrieval systems are not well-suited for certain types of large-scale, real-time local search problems.