Many new computing applications involve the generation and transmission of data from a group of sensor devices to a remote sink node, for example a pervasive device, where the data is aggregated and analyzed. Such applications are becoming common in a variety of remote monitoring scenarios, such as healthcare, where wearable sensors record and transmit various biometric measures of an individual; vehicular telematics, where on-board sensors measure various vehicular parameters and transmit them back to a central diagnostic server; and intelligent transportation systems, where highway sensors periodically record traffic conditions.
Such data gathering systems have two important goals or concerns. First, as many of these sensor devices are resource-constrained, typically operating on batteries, the system should minimize the communication and/or data collection overhead, helping to reduce the energy expenditure of such devices. Second, many of these devices are not just reporting nodes, but also possess a fair degree of processing power and local intelligence. Architecturally, such data collection systems comprise a set of client sensor devices that are connected, often using a wireless communications infrastructure, to the pervasive device, which is itself part of an existing information technology infrastructure.
A pervasive device and a backend server may have an intermittent connection and multiple interfaces, for example 802.11, GPRS and Bluetooth, each with a different energy consumption profile and cost. Each interface also exhibits a different temporal variation in connectivity due to the mobility of the user. This variation may involve, for example, the chosen provider, the link rate, and the transmission power needed. Thus, finding the most effective method of transmitting the data between a pervasive device and a backend server involves answering three questions: how much data to transmit, what data to transmit, and when to transmit it.
One simple way to improve the efficiency of the data gathering system is to compress the sensor data prior to transmission. Another important technique for improving the efficiency of the data transmission is data filtering. Data filtering refers to the idea that much of the data may be eliminated or reduced if it is not necessary to the end goals of the infrastructure. As for compression, there is a wide variety of schemes available, such as Huffman coding, Vector Quantization (VQ), Lempel-Ziv (LZ) and run-length coding. In general, different compression algorithms are applicable to specific types of data sources. Different data sources possess different “statistical parameters”, and different algorithms work better for different families of statistics. In addition, the choice of a compression algorithm is also determined by the application's requirements on the quality of the compression.
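To illustrate the point that a compression scheme suits some data statistics better than others, the following is a minimal sketch of run-length coding, one of the schemes named above. The function names and the sample readings are hypothetical and chosen only for illustration; they are not taken from any particular system.

```python
from itertools import groupby


def rle_encode(samples):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(samples)]


def rle_decode(pairs):
    """Invert rle_encode: expand (value, run_length) pairs into a list."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical sensor readings (illustrative values only).
steady = [72, 72, 72, 72, 72, 73, 73, 73]  # slowly varying source
noisy = [72, 75, 71, 74, 70, 76, 73, 72]   # rapidly varying source

steady_pairs = rle_encode(steady)  # two runs: short encoding
noisy_pairs = rle_encode(noisy)    # one run per sample: no gain
```

For the slowly varying stream the eight samples collapse into two pairs, whereas the rapidly varying stream yields eight pairs, which is larger than the raw input. A source of the second kind would be better served by a different scheme.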
Another known technique is to simply use the best connected interface at any given instant, or to use data striping to concurrently transmit subsets of data across multiple physical or logical interfaces. However, the major drawback of this solution is that it fails to account for the predictable behavior in network connectivity often exhibited by individuals. More specifically, the pervasive device selects an interface based on current conditions only, without considering connection qualities at future time instants. One additional possibility is to have the device consider the probabilities associated with the connection quality on its different interfaces at future time instants. Thus, if the pervasive device knew that it would almost surely be in contact with a “free” 802.11 WLAN in 1 hour, it might choose to cache its data locally for the upcoming hour, rather than immediately relaying it via the currently available GPRS interface, which may have associated “data charges”.
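The cache-or-transmit decision in the WLAN example above can be sketched as a comparison of expected costs. This is only an illustrative model under stated assumptions: the function names, the single paid fallback interface, and the `delay_penalty` term (a hypothetical charge for the added latency) are not part of the original description.

```python
def expected_cost_of_waiting(p_free_link, paid_cost, delay_penalty):
    """Expected cost if the device caches its data and sends it later.

    Assumed model: with probability p_free_link a free interface becomes
    available and the transfer costs nothing; otherwise the device falls
    back to the paid interface. delay_penalty is a hypothetical cost
    assigned to the extra latency of waiting.
    """
    if not 0.0 <= p_free_link <= 1.0:
        raise ValueError("p_free_link must be a probability in [0, 1]")
    return (1.0 - p_free_link) * paid_cost + delay_penalty


def should_cache(p_free_link, paid_cost, delay_penalty):
    """Return True if waiting is expected to be cheaper than sending now."""
    waiting = expected_cost_of_waiting(p_free_link, paid_cost, delay_penalty)
    return waiting < paid_cost


# Free WLAN almost certain within the hour: caching is preferred.
likely = should_cache(p_free_link=0.95, paid_cost=1.0, delay_penalty=0.1)

# Free WLAN unlikely: sending now over the paid interface is preferred.
unlikely = should_cache(p_free_link=0.05, paid_cost=1.0, delay_penalty=0.1)
```

A device that always picks the best currently connected interface corresponds to ignoring `p_free_link` entirely, which is the drawback described above.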
One area closely related to transmission scheduling is “Delay Tolerant Networks” (DTN), where individual devices have significantly long periods of network disconnection, and where the data is typically relayed to the backend server in a multi-hop fashion using a variety of probabilistic relaying techniques. DTN protocols do use the likelihood of future connectivity in determining whether a packet is to be stored locally or forwarded to another pervasive device, which has its own likelihood of being connected to the backend infrastructure. However, these protocols are typically designed for groups of nodes, rather than for the relaying of data from multiple correlated streams by a single node. In particular, these protocols do not factor in the likelihood of data generation by the associated sensors, principally because DTN environments typically have no predictability or knowledge of the data generation patterns.
It would be desirable for the system to give the sensor devices, or any relaying device, the capability to dynamically modify the compression technique and the transmission strategy and schedule over one or more raw or filtered data streams, especially when multiple interfaces are present.