This relates to deriving information from traffic data of a wireless network, where the data includes events that fall outside the everyday network load. This information assists in anticipating future events and their geographical impact, and thereby assists in provisioning network capacity.
With the fast development of mobility network technology, network traffic has increased significantly. To improve mobility network performance, service providers have invested significant resources to improve coverage, enhance quality, and increase capacity. To illustrate, AT&T invested more than $1.5 billion from 2007 to 2009 in California alone, and a Verizon-led investment group is committing to invest $1.3 billion in wireless Long Term Evolution (LTE) development.
Unlike a wireline network, a wireless network has relatively dynamic quality. It is affected by the nature of the network's use (e.g., time spent by users downloading data from the Internet), by retransmit rates that depend on signal-to-noise ratios, and by the nature of cell phone use. Easy-to-carry mobile phones are much more engaged with human activities than wired phones, and hence the network traffic is heavily influenced by what people do. A very significant component in the variability of the wireless network's load, and in the perceived quality of service, is social events. At large social events many cell phone users gather in a small area, such as a sports or concert venue, and, unless some provision is made, the resulting demand exceeds the network's capacity. To illustrate how significant an effect an event can have, it is noted that Super Bowl XLIV, for example, which was held at Sun Life Stadium (Miami Gardens, Fla.), attracted about 75,000 people, where the normal population of Miami Gardens is a bit over 100,000. The increase in call traffic is probably much higher than the roughly 75% increase in population (to about 175% of the normal level), and such an increase is not something that the wireless network is typically designed (or should be expected) to handle.
Clearly, it is important to anticipate events. Events such as the Super Bowl are easy to anticipate because they are scheduled months in advance, but there are many lesser events that cannot be easily anticipated because they are not scheduled well in advance. One way to anticipate events is to be aware of past events, and to predict future events based on the past events. Although many events can be accounted for from data other than actual network traffic data, a much more complete picture can be had by detecting events from the network data itself.
From a statistical point of view, there are in general three types of event detection methods: outlier/change-point-based methods, pattern-based methods, and model-based methods. For the model-based event detection approach (which underlies the approach of this invention), different models have been constructed based on the characteristics of the measured data. For example, Dynamic Bayesian Networks (DBNs) have been applied to detect abnormal events in underground coal mines, Markov random fields (MRFs) have been used to model spatial relationships among neighboring sensor nodes, and the Hidden Markov Model has been used on fMRI (functional Magnetic Resonance Imaging) data to detect activation areas. Ihler et al., in “Adaptive Event Detection With Time-Varying Poisson Processes,” in KDD '06, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, New York, N.Y., ACM, 2006, pp. 207-216, utilize a Markov Model Modulated Nonhomogeneous Poisson Process to detect events from highway traffic data collected by one sensor at a specific location.
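The general idea of model-based detection on count data can be illustrated with a minimal sketch. It assumes a plain time-varying Poisson baseline whose rate for each time slot is estimated as the median of historical counts, and it flags slots whose observed count is improbably high under that baseline. It is not the Markov-modulated model of Ihler et al., nor the method of this invention; the names `poisson_tail`, `detect_events`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math
from statistics import median


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with terms computed in log space."""
    if k <= 0:
        return 1.0
    cdf = sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(k)
    )
    return max(0.0, 1.0 - cdf)


def detect_events(history, current, alpha=1e-3):
    """Flag time slots whose count is unusually high for that slot.

    history: list of past periods, each a list of counts per time slot.
    current: list of counts per time slot for the period under test.
    alpha:   assumed tail-probability threshold (illustrative value).
    """
    flagged = []
    for slot, count in enumerate(current):
        # Baseline rate for this slot; the median is robust to past events.
        lam = max(median([period[slot] for period in history]), 1e-9)
        if poisson_tail(count, lam) < alpha:
            flagged.append(slot)
    return flagged
```

For instance, with three historical periods whose second slot carries about 50 calls, a current count of 95 in that slot would be flagged, while counts close to the historical medians would not.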
In recent years, scan statistics have been a hot topic in spatial analysis, and they now appear to be the most effective method for detecting “hotspots.” Scan statistics are used to test whether a point process is purely random or whether clusters of events are present. There are numerous variations of spatial scan statistics, but they share three basic properties: the geometry of the scanned area, the probability distribution generating events under the null hypothesis, and the shapes and sizes of the scanning window. A spatial scan statistic measures the log-likelihood ratio for a particular region to test spatial randomness. The region with the largest spatial scan statistic is the one most likely to have been generated by a different distribution. By extending the scan window from circular to cylindrical, the scan statistic extends from the spatial domain to the spatiotemporal domain. Scan statistics assume that the distribution under the null hypothesis is known or can be estimated through Monte Carlo simulation; but this assumption is often invalid and, moreover, the Monte Carlo simulation is computationally expensive.
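The log-likelihood ratio computation described above can be sketched as follows. This is a minimal illustration that assumes a Poisson model of the kind commonly used for spatial scan statistics, circular windows centered on each cell, and strictly positive baseline values; the function names and the `(x, y, observed, baseline)` cell layout are hypothetical. The Monte Carlo significance test is omitted.

```python
import math


def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for a region with observed count c and
    expected count e, where total is the overall observed count."""
    if c <= e:
        return 0.0  # only regions with an excess of events are of interest
    rest = total - c
    inside = c * math.log(c / e)
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return inside + outside


def scan_circles(cells, radii):
    """Return (best_score, center, radius) over all circular windows.

    cells: list of (x, y, observed, baseline) tuples, baseline > 0.
    radii: candidate window radii, in the same units as x and y.
    """
    total_obs = sum(c[2] for c in cells)
    total_base = sum(c[3] for c in cells)
    best = (0.0, None, None)
    for cx, cy, _, _ in cells:
        for r in radii:
            inside = [c for c in cells
                      if math.hypot(c[0] - cx, c[1] - cy) <= r]
            obs = sum(c[2] for c in inside)
            # Expected count: the window's share of the baseline, scaled
            # so that expected counts sum to the total observed count.
            exp = total_obs * sum(c[3] for c in inside) / total_base
            score = poisson_llr(obs, exp, total_obs)
            if score > best[0]:
                best = (score, (cx, cy), r)
    return best
```

In this sketch the window with the highest score is the candidate hotspot. Judging whether that score is significant would still require a reference distribution, which is where the Monte Carlo cost noted above arises.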