1. Field of the Invention
The present invention relates generally to computer and communications networks, and more specifically to the emerging field of “sensor networks”, typically consisting of a large number of individual “sensor” devices, which send back samples of some environmental state to certain designated nodes.
2. Description of the Related Art
Sensor networks are an emerging and very promising new category of computer networks, characterized by the development of very low-cost sensor devices with combined sensing and communication (often wireless) capabilities. Most applications of sensor networks rely on combining information from multiple sensor devices to establish or infer some composite state or event of the sensed environment (often called the “sensing field”). Examples include the desire to “compute an average of readings taken by 10 different temperature sensors,” or to “compute the minimum out of 30 distinct humidity sensor readings.” Scalability is a major technical challenge in such future networks, which are projected to contain thousands, and possibly hundreds of thousands, of tiny sensor nodes deployed fairly densely over the sensing field. A key characteristic of many such operating environments is that the networks often exhibit significant redundancy, in that most applications do not normally require the use of sensor data from all of the available sensor nodes. Indeed, the physical density of sensor networks may often vary greatly over the sensing field due to choice (e.g., network designers may deploy more nodes in an area where finer location accuracy is required), lack of precise control (e.g., when nodes over a remote terrain are deployed by being dropped from an airplane), or failures (e.g., when sets of sensor nodes turn out to be defective or die due to exhaustion of battery resources). Clearly, if the requirements of the application can be met by activating only a subset of the available nodes, significant energy savings in the operation of the network can be obtained. In sensor network environments, where the energy resource of a node is often battery-based, judicious activation of the appropriate fraction of available nodes can significantly extend the overall operational “lifetime” of the network.
The notion of redundant deployment can manifest itself in forms other than simply the number of nodes of a designated type. For example, one can imagine a remote monitoring scenario involving an array of video cameras with zoomable lenses, placed one foot apart. In normal mode, it may be adequate to operate only one out of 50 cameras, with each activated camera operating at low zoom and covering a wide 3-D angular cone. Subsequently, if the application detects some potential motion via any one camera (using sophisticated image analysis and motion detection algorithms), the application may then request every alternate camera in the corresponding region to be activated, such that each camera operates at relatively high zoom and provides a very high resolution image for further analysis (e.g., to detect whether the motion is due to a wild animal or a malicious intruder). The important conclusion from these examples is that redundancy may occur not just in terms of the raw number of sensor nodes activated, but also in terms of various operational attributes of the activated nodes (e.g., the zoom factor of the video sensors). Significant benefits, such as increased operational lifetime and decreased traffic load, accrue even in cases where all nodes are “active”, but operating only at the “level” required to meet the application's “quality of information” (QoI) requirement. For example, another alternative to the monitoring scenario mentioned above would be to have all cameras activated, with each camera only transmitting, say 1/10th of its composed image (corresponding to a small portion of the total area captured by its lens), to a central processing facility. In this manner, by saving on the cost of transferring high-bandwidth images over the wireless channel, individual sensor nodes may realize significant energy savings.
The main challenge in supporting such energy-efficient operation is that, while applications may be able to express the amount of resources they require (such as the number or resolution of the sensor nodes), they typically have no knowledge of the actual number or layout of the sensors deployed. Accordingly, an application may not be able to decide on the appropriate settings (e.g., on or off, high or low zoom) for each individual sensor. The current practice for solving similar parameter configuration problems in conventional networks (wired or wireless) is to assume that there is a central database that contains up-to-date information, including the network identifier as well as relevant attributes (such as location, degree of zoom, etc.), for each deployed sensor node. Assuming that such a central repository is available, there are a variety of state-of-the-art algorithms (based on techniques such as maximum set cover heuristics, etc.) for computing exactly which resources should be activated, as well as the appropriate parameter settings for these resources. After the settings are computed, they are communicated to each targeted network node by using the explicit network identifier (e.g., IP address or DNS host name) of that node.
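As an illustration of the centralized approach described above, the following is a minimal sketch of a greedy set-cover heuristic, which is one common member of the family of cover heuristics; the source does not state which algorithm is used in practice. The database contents, the region labels, and the function name `greedy_cover` are hypothetical.

```python
def greedy_cover(coverage, targets):
    """Pick sensors until every target region is observed by some chosen sensor.

    coverage: dict mapping a sensor identifier to the set of regions it observes.
    Returns (chosen sensor identifiers, regions that no sensor can cover).
    """
    uncovered = set(targets)
    chosen = []
    while uncovered and coverage:
        # Choose the sensor that covers the most still-uncovered regions.
        best = max(coverage, key=lambda s: len(coverage[s] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # the remaining regions are outside every sensor's reach
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical central database: explicit network identifier -> observed regions.
database = {
    "10.0.0.1": {"A", "B"},
    "10.0.0.2": {"B", "C", "D"},
    "10.0.0.3": {"D"},
    "10.0.0.4": {"A"},
}
active, missed = greedy_cover(database, {"A", "B", "C", "D"})
```

Note that the computation depends entirely on `database` being complete and current, and that the result is a list of explicit identifiers to which settings must then be sent individually.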
However, such a mechanism is not useful in future sensor network environments for two distinct reasons. First, the approach requires a central repository (or an intermediate middleware layer) to be aware of the precise topology of the network and the identifiers of each of the sensor nodes. More importantly, to ensure correct configuration of individual nodes, the repository must be continually updated with dynamic changes to the network topology or node properties. For example, the sensor network substrate can be modified due to the occasional addition of new nodes, the death or removal of existing nodes, or other unforeseen events (such as catastrophic node failures). This imposes a substantial reporting overhead and cannot scale to large sensor network environments, since every such change in the network topology must be propagated to the database. If a static configuration scheme is employed instead, it may quickly become inappropriate for the given operating environment. For example, suppose that 100 temperature nodes are specifically activated to meet the application's QoI requirement. If 50 of those nodes subsequently suffer a catastrophic failure (e.g., due to a natural disaster), the application will be left with data from only the remaining 50 nodes. In other words, the QoI obtained has become closely coupled to the dynamics of the underlying sensor network.
Second, this approach assumes that each sensor node is individually addressable, so that appropriate configuration parameters may be transmitted to it. In reality, many of the sensor nodes will not have the software stack typically associated with more powerful networked devices (such as cell phones or laptops). In many cases, for example, the nodes may not have IP addresses and may not be individually addressable. Indeed, there is a very active body of research in the sensor network academic community centered on forms of content-based addressing. In this approach, commands or queries are issued to groups of sensors possessing appropriate attribute values (e.g., a query for all sensors whose type=“temperature” and location=“cityA”). The query is typically propagated over a region of interest, and processed by all sensors in that region that have attributes satisfying the predicates of the query.
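The content-based addressing idea can be sketched as follows. This is a minimal illustration assuming that a query is a set of attribute=value predicates and that each node evaluates the query locally against its own attributes; the function name `matches` and the sample attribute values are hypothetical.

```python
def matches(attributes, query):
    """Return True if every predicate in the query holds for this node's attributes."""
    return all(attributes.get(key) == value for key, value in query.items())


# Hypothetical nodes in the region of interest; none carries a network identifier.
sensors = [
    {"type": "temperature", "location": "cityA"},
    {"type": "humidity", "location": "cityA"},
    {"type": "temperature", "location": "cityB"},
]

# The query from the example above: type="temperature" and location="cityA".
query = {"type": "temperature", "location": "cityA"}
responders = [s for s in sensors if matches(s, query)]
```

The issuer of the query never names a node; the set of responders is determined entirely by which nodes satisfy the predicates.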
Recent work in the area of sensor networks (e.g., C. Intanagonwiwat, R. Govindan, and D. Estrin, “Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks”, Proceedings of ACM MOBICOM, 2000 (hereinafter “Directed Diffusion”); W. Heinzelman, J. Kulik, and H. Balakrishnan, “Adaptive Protocols for Information Dissemination in Wireless Sensor Networks”, Proceedings of 5th ACM/IEEE MOBICOM Conference, Seattle, Wash., August, 1999; N. Jiang, C. Schmidt, V. Matossian, and M. Parashar, “Enabling Applications in Sensor-based Pervasive Environments”, Proceedings of 1st Workshop on Broadband Sensor Networks (BaseNets), Oct. 29, 2004 (hereinafter “Associative Rendezvous”); etc.) has focused on ways to efficiently propagate the query over the appropriately defined sub-region of the sensing field, rather than the simple-minded approach of broadcasting it over the entire sensing field. Although all of these techniques define “how” some queries or requests are propagated over the sensing field, they do not define “what” is propagated or how individual nodes respond to the “what” they receive. Moreover, they do not define or provide an adaptive method by which the “what” (i.e., the content) may be used to continually meet the QoI requirements of the application, even though the topology and other physical properties of the sensor network change.
As previously mentioned, the idea of using broadcasts or “directed broadcasts” to communicate implicitly with a group of computer nodes has been presented in research literature in several forms. Directed Diffusion (DD) presents one example of directed broadcast, where the query or application request for data is initially broadcast over the entire network. Subsequently, some of the paths are reinforced and others pruned to establish a reasonably optimal delivery path from the source of the data to the sink (the gateway node that issued the query). In directed diffusion, the query is issued without being explicitly aware of the identity of sensor nodes that would respond as data sources for the query. However, DD does not provide a method by which the application's QoI requirements can be met by iteratively issuing directed broadcasts, with modifications to the parameters contained in the broadcast.
Associative Rendezvous (AR) is another technique for matching application queries with corresponding sensor data. The AR technique is based on the publish-subscribe paradigm, with gateway nodes publishing requests for data, and sensor nodes advertising availability of the appropriate type of data. As both the requests and the advertisements conform to well-known schemas, they can be matched at intermediate “rendezvous” nodes, which can then route the data from the sensor sources to the sink (the controller). However, like DD, the AR technique does not propose the use of an iterative mechanism that matches the amount and quality of sensor data to an application's QoI requirements and thereby avoids redundant operation of sensor nodes.
There are some examples in the sensor network literature of using probabilistic mechanisms to control the behavior of individual sensor nodes. The key idea in all these approaches is to define a certain probability with which each individual sensor node performs a certain task or assumes a certain role. Once this probability is defined, each sensor node switches to this task or role with the appropriate probability. However, all of the proposals and prior art in this domain deal with mechanisms by which nodes get to know of this probability, and not with methods by which such probabilities can be used to iteratively control the collective behavior of a set of sensor nodes. The problem of broadcasting single messages to all nodes in a sensor or ad-hoc network using probabilistic mechanisms has been studied in S. Y. Ni et al., “The Broadcast Storm Problem in a Mobile Ad Hoc Network”, Proceedings of MOBICOM 1999 (hereinafter “Ni”) and W. Peng and X. Lu, “On the Reduction of Broadcast Redundancy in Mobile Ad Hoc Networks”, Proceedings of ACM MOBIHOC, 2000 (hereinafter “Peng”). The fundamental idea is that each node that receives the packet re-broadcasts it with a certain probability. Ni and Peng include mechanisms for computing this probability off-line as a function of the network node density, so as to ensure that all nodes most certainly receive the message while avoiding redundant retransmissions. However, these approaches do not utilize a control loop for adjusting the probability to the “right level” desired by the controller, and do not discuss using the probabilities or other parametric values to adjust other behavioral parameters (such as data reporting frequency or camera zoom level) of a subset of the sensor nodes.
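The probabilistic re-broadcast idea can be illustrated with a small simulation. This is a simplified sketch, not the specific schemes of Ni or Peng: it assumes a fixed probability p, lossless links, and that a node decides only once, on the first copy it receives. The function name `probabilistic_flood` and the four-node topology are hypothetical.

```python
import random
from collections import deque


def probabilistic_flood(neighbors, source, p, rng):
    """Flood a message from source; every other node forwards it with probability p.

    Returns (set of nodes that received the message, number of transmissions).
    """
    received = {source}
    transmissions = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # The source always transmits; any other node flips a biased coin.
        if node != source and rng.random() >= p:
            continue
        transmissions += 1
        for peer in neighbors[node]:
            if peer not in received:
                received.add(peer)
                queue.append(peer)
    return received, transmissions


# Hypothetical topology: a four-node ring, given as adjacency lists.
topology = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
reached, sent = probabilistic_flood(topology, 0, 0.7, random.Random(42))
```

With p = 1 the sketch degenerates to plain flooding, and with p = 0 only the source's direct neighbors are reached. In both cases p is fixed in advance and nothing feeds the observed outcome back into the choice of p.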
LEACH is another approach that uses probabilistic activation of nodes in a sensor network, with the goal of keeping a certain pre-determined fraction of the nodes awake (to aid in packet routing) as described in W. Heinzelman, A. Chandrakasan and H. Balakrishnan, “An Application-Specific Protocol Architecture for Wireless Micro-sensor Networks”, IEEE Trans. on Wireless Communications, Vol. 1, 2002. In LEACH, each node independently chooses to become active or to sleep with a pre-designated probability value “p”. Moreover, nodes that have been active in the recent past modify their activation probability to be lower than p. In this way, the protocol ensures that over a suitably long timeframe, the job of remaining active for routing is distributed across all the nodes in the network and that no particular node faces energy depletion much faster than other nodes in the network. However, in LEACH, there is no notion of a control loop being used by a gateway node to dynamically tune the value p so that the number of active nodes meets a target value N. Moreover, LEACH does not utilize broadcasts from a gateway node that are targeted to a specific sub-set of the sensor nodes (identified for example by the type or location of the sensor nodes) to activate varying numbers of sensors with different attributes (e.g., type or location).
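The activation rule as characterized above can be sketched as follows. This follows only the description in this section; the published protocol specifies its own rule for lowering the probability, and the `penalty` factor used here is a hypothetical stand-in, as is the function name `choose_active`.

```python
import random


def choose_active(nodes, p, recently_active, rng, penalty=0.5):
    """Each node independently decides whether to stay awake for this round.

    Nodes that were active recently use a reduced probability p * penalty.
    penalty is an illustrative assumption, not a constant from the protocol.
    """
    active = set()
    for node in nodes:
        prob = p * penalty if node in recently_active else p
        if rng.random() < prob:
            active.add(node)
    return active


rng = random.Random(0)
nodes = range(10)
first_round = choose_active(nodes, 0.3, set(), rng)
# Nodes active in the first round are less likely to be chosen again.
second_round = choose_active(nodes, 0.3, first_round, rng)
```

The value of p is fixed when the network is configured; no component observes how many nodes actually became active and adjusts p in response.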
Techniques for estimating the correct activation probability needed to ensure a certain level of spatial coverage, given a specific node density, are described in Y. Gao, K. Wu and F. Li, “Analysis on the Redundancy of Wireless Sensor Networks”, Proceedings of ACM WSNA, 2003. These approaches assume that the node density is uniform and known a priori, and analyze how different activation probabilities impact the “degree of coverage”. They do not provide a method of dynamically controlling the activation probabilities without knowing the node density (or for non-uniform node densities), nor do they discuss a broadcast-based technique to activate the right number of sensors without addressing them individually.
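The dependence of such estimates on a known density can be seen in a simple worked relation. This is a basic probability sketch, not the analysis of the cited paper: it assumes that exactly k nodes lie within sensing range of a point (the known, uniform density) and that each is activated independently with probability p. The function names are hypothetical.

```python
def coverage_probability(p, k):
    """Probability that at least one of k in-range nodes is active."""
    return 1.0 - (1.0 - p) ** k


def required_activation_probability(target, k):
    """Smallest p for which a point with k in-range nodes is covered with probability target."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - target) ** (1.0 / k)


# With two nodes in range and p = 0.5, the point is covered 75% of the time.
half = coverage_probability(0.5, 2)
# Activation probability needed for 99% coverage when ten nodes are in range.
p_needed = required_activation_probability(0.99, 10)
```

The required p can be computed only if k is known; when the density is unknown or varies across the field, this off-line calculation is not available.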
Finally, an alternative approach, called Broadcast Based Query (BBQ), is based on statistical learning to avoid the redundant communication of data value samples from a set of sensor nodes to a sink node, as described in A. Deshpande, C. Guestrin, S. Madden, J. Hellerstein, and W. Hong, “Model-Driven Data Acquisition in Sensor Networks”, Proceedings of VLDB, 2004. This approach assumes that applications specify their QoI requirements on the basis of statistical parameters, such as confidence intervals, on the data. A middleware component is then defined that initially collects samples from all the sensors and builds up a model of the correlation among the data from the various sensor sources. After the learning phase is over, the middleware can then compute an efficient sequence in which the sensor data sources should be sampled, potentially altering the sequence or even terminating the sampling process if the current samples provide enough statistical confidence on the values that the other (unsampled) sensors might have. While this approach can prevent the retrieval of redundant sensor data, it works only when the set of sensor nodes, and their associated attributes, remain fixed for all time. The BBQ approach does not employ a closed-loop model that can adapt to dynamic changes in the underlying network; nor does it propose that the behavior of individual sensor nodes be modified (in response to such dynamic changes) to ensure collective conformance to an application's QoI requirements.
Thus, it would be advantageous to have a data gathering framework that allows for adaptively modifying the observed collective behavior of individual sensor nodes based on broadcasting of parameters. It would further be advantageous to have a mechanism for adapting sensor node behavior to the extent needed to satisfy an application's QoI requirements, without requiring explicit knowledge of the identity, number, or properties of individual nodes, even though the underlying sensor network may dynamically change over time.
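To make the notion of broadcast-driven adaptation concrete, the following is an illustrative sketch only. It is not taken from the source and should not be read as the mechanism of the invention. It assumes a gateway that broadcasts a single activation probability p, counts how many nodes respond, and rescales p multiplicatively toward a target count. The names `tune_probability` and `make_field`, the update rule, and all numeric values are hypothetical.

```python
import random


def tune_probability(broadcast, target, p=0.05, tolerance=0, max_rounds=20):
    """Rebroadcast p, rescaling it until roughly `target` nodes respond.

    broadcast(p) must return the number of nodes that responded to p.
    The controller never learns how many nodes exist or which ones answered.
    """
    for _ in range(max_rounds):
        count = broadcast(p)
        if abs(count - target) <= tolerance:
            break
        # Multiplicative correction; treat zero responders as one to avoid dividing by zero.
        p = min(1.0, p * target / max(count, 1))
    return p


def make_field(num_nodes, rng):
    """Simulated sensing field; num_nodes is hidden from the controller."""
    def broadcast(p):
        # Each node independently decides to respond with probability p.
        return sum(1 for _ in range(num_nodes) if rng.random() < p)
    return broadcast


# The controller asks for about 100 active nodes without knowing there are 1000.
p_final = tune_probability(make_field(1000, random.Random(7)), target=100, tolerance=10)
```

If nodes fail or are added between rounds, the same loop re-tunes p from the observed counts, without any per-node database or individual addressing.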