There is an ongoing movement in the digital era towards user-generated online content. Whereas, in the recent past, information was generated by a content provider associated with a network provider and disseminated to users via a network (such as the Internet), modern devices and the applications thereon enable users to generate and share content in far greater volumes than content providers. For example, real-time social networks enable users to share their experiences related to a wide variety of topics, such as entertainment (music and film reviews), transit (road conditions, traffic incidents), weather, health, municipal issues (flooding, power loss), etc. In other words, everyday end-users participate directly in generating useful content that can be used to inform remedial measures, such as repairs, traffic management, deployment of service vehicles, etc.
Various systems and platforms are available for users to generate and publish content. Smart devices with location and broadband Internet capabilities are used to take pictures, add locations, type text, and upload picture/text messages. Social media platforms/websites (such as Facebook, Twitter, etc.), online blogs, and web forums, among others, provide hosting capabilities for user-generated content. The user-generated content may be shared across communities defined by social connection or location, or even worldwide. Users can therefore utilize such platforms to voice their concerns related to various topics and to discuss desired remedies. Moreover, such concerns are of immense value to stakeholders who provide services to the users, such as state and local governments, municipal agencies, police and emergency departments, and other entities. Therefore, a proper and efficient analysis of such user-generated online content is of great importance in order to identify and resolve users' grievances in a timely fashion.
However, owing to the enormity and the dynamism of user-generated content drawn from various sources, gaining knowledge about the pressing issues related to the specific topics identified above (and others) is not a trivial task. This is particularly true because prior information associated with the content (i.e., context) is almost always absent, and there is usually no method to ascertain or verify the accuracy of the determined topics. Moreover, many issues discussed online are multi-dimensional, in the sense that they are contextualized with respect to multiple topics or sub-topics. Further, identifying issues or problems from user-generated content is quite different from the problem of event detection, in which burstiness is usually the single most important characteristic of the data and the time window for determination is typically short. In contrast, the topics identified in user-generated content occur continuously and are more evenly distributed across time. In other words, event detection presumes that events generally do not occur, whereas the issues present in user-generated content are more likely to be recurring or persistent.
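The contrast between bursty events and persistent issues can be illustrated with a short sketch. The following is offered for illustration only and does not describe any particular system; the function name `burstiness`, the peak-to-mean measure, and the sample counts are all assumptions chosen for this example.

```python
from statistics import mean


def burstiness(counts):
    """Peak-to-mean ratio of per-window mention counts (hypothetical measure).

    Values near 1 indicate mentions spread evenly across time windows;
    large values indicate mentions concentrated in a few windows.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    avg = mean(counts)
    if avg == 0:
        return 0.0  # no mentions at all: treat as not bursty
    return max(counts) / avg


# Hypothetical hourly mention counts for two topics (made-up data).
event_counts = [0, 0, 1, 40, 35, 2, 0, 0]     # a sudden incident
issue_counts = [9, 11, 10, 12, 8, 10, 11, 9]  # a recurring complaint

event_score = burstiness(event_counts)  # well above 1: concentrated
issue_score = burstiness(issue_counts)  # close to 1: evenly spread
```

Under these assumed counts, a detector keyed to high burstiness would flag the first series and overlook the second, even though both carry a comparable overall volume of mentions.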
In addition, certain issues associated with a topic and having relevance to a specific community or location may not be relevant to a different community or location. For example, a first location may be prone to traffic congestion owing to increased vehicular traffic, whereas a second location may be a low-lying region that suffers from water logging issues. A topic that includes an issue or a problem that can be ... Moreover, these relationships vary over time, which increases the difficulty of analyzing this information. Difficulties in ascertaining, in real time, the relevant topics and the issues arising therefrom that are associated with specific locations and communities can result in additional difficulties in solving these issues. For example, the rate of deployment of service vehicles is hampered by an inability of a municipal agency to identify the specific service provider that deploys the service vehicles. Although agencies and providers are increasingly connected via networks, determining the appropriate network (or subnetwork) associated with a specific service provider remains challenging, at least due to the difficulty in determining a topic based on the user-generated content and identifying issues therefrom.