The ability to monitor traffic is important for many business and government reasons. One especially important reason is to acquire the underlying data needed to understand how people behave when driving, how drivers interact with one another, and how they interact with their environment. The ultimate aim of data collection is to allow analysis, through replays or other approaches, that enables subsequent improvements: smoothing the flow of goods and services, expediting emergency vehicles and, in general, improving the quality of human life. The ability to monitor for, detect and report accurate information about events is essential for policymakers, traffic managers and drivers alike. Entire systems have been created to respond to this monitoring challenge. But these systems are severely constrained by the massive amounts of monitoring data needed for a complete picture of traffic activity. Today's systems are also hobbled by energy limitations, high costs, an inability to search video effectively, an inability to correlate multiple data inputs and the constraints of multi-hop communications in sensor network deployments. Finally, the biggest problem of all is the inability of these systems to accurately replay traffic events or to provide insight into future behavior as environmental factors change, such as the weather, the timing of stop lights, restrictions on vehicle sizes or the widening of a road.
The ongoing monitoring problem is that the appetite for information is always growing, which is driving technology to evolve from simple scalar traffic counting to more complex traffic classification. Smart traffic guidance and management systems for smart cities are being flooded with monitoring data as part of the expanding role of the Internet of Things (IoT) in everyday life. Counting vehicles, people and things (products on assembly lines, pallets of goods, people in vehicles, types of vehicles, speeds, etc.) is emerging as a key class of “killer applications” that will greatly assist in spreading IoT deployments that monitor and gather data for future traffic applications. The evolution of technology is also driving the desire to move beyond counting, first to automated monitoring for programmed events and eventually to proactive use that answers questions such as what a road's capacity is, what the best speed limit is and how many lanes are needed.
Today, a single camera can do a remarkable amount when paired with sophisticated traffic analysis software driven by Artificial Intelligence (AI) algorithms. While AI algorithms have been around for a while, they continue to evolve and are getting better at event detection and reporting. But applying AI at the edge of the network in a distributed, IoT sensor-based environment is new and presents substantial challenges. Due to Moore's Law, processing has become very inexpensive compared to the cost of massive video transmission infrastructures supporting a large and growing number of video feeds. The key problem is to mine the information embedded in the video data where it is captured, and to monitor for and detect events that are programmed into the system, often remotely. An event detection approach is much more efficient and compact than needlessly moving raw video data around an energy- and capacity-constrained network. Because of these economic trends, it is possible to attach programmable processors to the data collection points, reducing transmitted data volumes and managing the energy needed to power the network sensors/nodes so as to provide greater monitoring and event detection coverage.
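The event detection approach described above can be illustrated with a minimal sketch. It assumes that some on-node detector (not shown, and not specified in this text) has already reduced each video frame to a small summary; the names FrameSummary, Rule, Event, detect_events and encode_event, as well as the congestion thresholds in the usage note, are hypothetical and chosen only for illustration. The sketch shows the node transmitting a few bytes per detected event rather than the raw video.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class FrameSummary:
    """Hypothetical per-frame output of an on-node video detector."""
    timestamp_s: float
    vehicle_count: int
    mean_speed_kmh: float


@dataclass
class Event:
    """Compact record sent over the network instead of raw video."""
    name: str
    start_s: float
    end_s: float


@dataclass
class Rule:
    """A programmed event: a condition that must hold for a minimum time."""
    name: str
    predicate: Callable[[FrameSummary], bool]
    min_duration_s: float


def detect_events(frames: Iterable[FrameSummary], rule: Rule) -> Iterator[Event]:
    """Yield one Event per run of consecutive frames satisfying the rule."""
    start: Optional[float] = None
    last: Optional[float] = None
    for frame in frames:
        if rule.predicate(frame):
            if start is None:
                start = frame.timestamp_s  # a new run begins here
            last = frame.timestamp_s
        else:
            # The run (if any) has ended; report it only if it lasted long enough.
            if start is not None and last - start >= rule.min_duration_s:
                yield Event(rule.name, start, last)
            start = None
    # Report a run that is still open when the frames run out.
    if start is not None and last - start >= rule.min_duration_s:
        yield Event(rule.name, start, last)


def encode_event(event: Event) -> bytes:
    """Serialize an event to a small JSON payload for transmission."""
    return json.dumps(asdict(event), separators=(",", ":")).encode("utf-8")
```

As a usage example, a hypothetical congestion rule could be written as Rule("congestion", lambda f: f.vehicle_count >= 10 and f.mean_speed_kmh < 20.0, 3.0). Because the rule is plain data plus a predicate, it could in principle be replaced remotely without changing the detection loop, which is the design choice this sketch is meant to highlight.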
Data collected by today's rudimentary sensor networks takes simple scalar forms that carry little information, which makes processing simple (typically basic calculations such as addition, subtraction, division, sums and averages). But it is difficult to form a comprehensive understanding of an environment from simple scalar information. Videos and images collected by video sensor networks, on the other hand, are rich in information. They are non-lossy and can be analyzed over and over, but they have complicated forms. Video can be searched and analyzed in a refined, stepwise manner, looking for and drilling down into events. Today, videos are sometimes compressed and sent to back-end servers for this processing (formatting, integration, analysis, etc.) to meet diverse application requirements. But simple compression is not enough for the coming flood of data.
Increasingly, new applications cannot be supported by typical scalar sensor networks because they require vast amounts of information that can only be obtained from images or video. Scalar data is insufficient for many important applications, such as geographic information systems (GIS), video surveillance, traffic monitoring and the combination of GIS and traffic monitoring information.
Clearly, the scalar approach of counting events, such as the passage of individual cars, is not enough. This approach misses important data such as the relationships between cars, vehicle sizes, closing speeds, acceleration, weather conditions and road conditions. In addition, simple scalar data collection will not provide enough information to accurately replay traffic activity, with all of its environmental nuances, for an end user.
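To make the contrast with scalar counting concrete, the following minimal sketch derives a closing speed and a gap between two vehicles from timestamped positions. It assumes that video analysis has already produced per-vehicle tracks as distances along a lane; the names TrackPoint, speed_mps, closing_speed_mps and gap_m are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackPoint:
    """One timestamped observation of a vehicle (hypothetical track format)."""
    timestamp_s: float
    position_m: float  # distance travelled along the lane, in metres


def speed_mps(track: List[TrackPoint]) -> float:
    """Average speed over a track, from its first and last observations."""
    if len(track) < 2:
        raise ValueError("a track needs at least two points")
    first, last = track[0], track[-1]
    elapsed = last.timestamp_s - first.timestamp_s
    if elapsed <= 0:
        raise ValueError("track timestamps must increase")
    return (last.position_m - first.position_m) / elapsed


def closing_speed_mps(follower: List[TrackPoint], leader: List[TrackPoint]) -> float:
    """Positive when the follower is gaining on the leader."""
    return speed_mps(follower) - speed_mps(leader)


def gap_m(follower: List[TrackPoint], leader: List[TrackPoint]) -> float:
    """Distance between the two vehicles at their latest observations."""
    return leader[-1].position_m - follower[-1].position_m
```

For two vehicles observed over the same two seconds, a scalar counter would report only that two vehicles passed. With tracks, a follower moving from 0 m to 50 m and a leader moving from 40 m to 80 m give a closing speed of 5 m/s and a remaining gap of 30 m.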
Extracting the proper metadata from video feeds is important for future applications that can combine virtual reality and artificial intelligence to replay traffic activities over maps and pictures. Such replays should not only be suitable for human consumption but should also provide the ability to change the parameters that are critical to proactive traffic management.
The ability to accurately replay traffic events is very important for gauging the efficiency of our transportation infrastructure. But even more important is the ability to provide insight into future performance as behavior patterns and environmental factors change, such as the weather, increased traffic from a future stadium, the timing of stop lights, restrictions on vehicle sizes or the widening of a road. The amount of money needed for infrastructure changes is significant, and proactively making sure that the money to be spent achieves its intended purpose is paramount.
Because of inflexible sensors, energy requirements, and limitations in network connectivity and video platforms, there has been little progress on controlling a distributed system of IoT video sensors. Future systems need the flexibility to adjust camera aim, coordinate data from multiple cameras (from 2D to 3D), manage many camera angles and correlate the video data into a low-bit-rate data stream that is information-rich enough to be useful with GIS applications that allow replay and “what-if” simulation scenarios. The ability to be proactive is hugely important.