The present invention relates to handling memory faults, and more particularly to a system and method usable in sensor networks for handling memory faults.
Sensor networks typically comprise multiple wireless sensor devices. Each wireless sensor device acts as a node in the sensor network. Data sensed at each sensor node is transmitted to a base station via the sensor network. Recent developments in sensor network technology have enabled distributed applications to run on sensor networks. Sensor networks are increasingly being used in industrial, commercial, and medical applications. The growing popularity and importance of sensor networks in these fields have led to increased demand for sensor networks. For example, CodeBlue, a prototypical medical sensor network platform used in emergency care and disaster response scenarios, requires uninterrupted, reliable, long-lived, and secure operation. Other examples include sensor networks implemented for use in industrial processes and security systems. Any unexpected failure in a sensor network system can be detrimental, with consequences ranging from financial losses to life-threatening situations. Software faults are a common reason for failure of sensor nodes. In particular, corruption of the application and kernel state due to a lack of protection from other applications can lead to a crash or freeze of the node or corruption of sensed data.
Sensor nodes typically have a simple architecture. Accordingly, the architecture of sensor nodes usually does not include features such as a memory management unit, privileged mode execution, etc., which are used in desktop/server class systems to isolate or protect the data and code of one program from other programs. Further, the micro-controllers used in sensor nodes typically have separate memories for program and data storage, and the entire data memory of the sensor node is accessible to all program modules running on the sensor node via a single address space.
Although the sensor node architecture is relatively simple, the software running on a sensor node can be complex. The software complexity arises out of a need to support diverse types of sensors, multiple distributed middleware services, dynamic code updates, and concurrent applications in a resource-constrained environment. In order to implement the software components, programmers have to deal with several resource constraints, concurrency, and race condition issues. Furthermore, limited debugging support on sensor node hardware makes programming errors common. These errors can lead to memory faults in which applications corrupt the memory used by other applications. In addition to corruption by other applications, memory faults can also be caused by hardware failures, transient errors, etc. The impact of memory faults can be quite severe, ranging from node freeze (fail-stop) to silent corruption, in which bad data generated by the afflicted sensor node propagates through the sensor network and disrupts the operations of other sensor nodes in the network.
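By way of illustration only, the following C sketch models the kind of memory fault described above. It assumes a hypothetical node whose data memory is a single flat array shared by two modules; the names data_memory, app_a_store, app_b_init, and app_b_state_intact, as well as the sizes and offsets, are invented for this example and do not describe any particular sensor platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a node's data memory: one flat array shared by
   every module, with no hardware-enforced boundaries between them. */
#define DATA_MEM_SIZE 32u
#define APP_A_BASE     0u   /* module A: 8-byte sample buffer          */
#define APP_A_LEN      8u
#define APP_B_BASE     8u   /* module B: state placed right after A's */

uint8_t data_memory[DATA_MEM_SIZE];

/* Module B stores a value that it relies on later. */
void app_b_init(void)
{
    data_memory[APP_B_BASE] = 0x5A;
}

/* Module A copies `count` samples into its buffer. The programming
   error: `count` is never checked against APP_A_LEN, so an oversized
   count spills into module B's state. */
void app_a_store(const uint8_t *samples, size_t count)
{
    /* Keeps the model inside the array so the sketch itself stays
       well defined; it does not protect module B. */
    assert(count <= DATA_MEM_SIZE - APP_A_BASE);
    memcpy(&data_memory[APP_A_BASE], samples, count);
}

/* Returns 1 if module B's state still holds the value it wrote. */
int app_b_state_intact(void)
{
    return data_memory[APP_B_BASE] == 0x5A;
}
```

In this model, storing eight samples leaves module B's state untouched, whereas storing nine silently overwrites it. Nothing in the assumed architecture stops or reports the stray write, which is why such a fault may surface only later as a freeze or as corrupted data.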
In high-end desktop/server class systems, approaches for protecting against memory corruption typically fall into one of two categories: static program analysis tools or run-time memory access checkers. Static program analysis tools rely either on language restrictions of type-safe languages like Java and ControlC, or on programmer annotations/language extensions for flagging illegal memory accesses. These tools impose considerable restrictions on the language, place an additional burden on programmers to guarantee safety, or introduce significant resource overhead that cannot be accommodated in sensor network systems. Run-time checks to stop illegal memory accesses in desktop/server class systems have been pursued at the expense of added overhead. For example, software-based fault isolation (SFI) relies on a large virtual address space that is statically partitioned in order to enforce safe sharing of the virtual address space by multiple cooperative modules. Such static partitioning cannot be used in the severely limited address spaces specific to sensor nodes. Several hardware-assisted protection techniques have also been proposed for high-end desktop/server class systems. However, the hardware solutions involve complex and expensive hardware extensions, which are not viable for the simple hardware architecture of sensor nodes.
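The static partitioning on which SFI relies can be sketched as follows. This is a minimal illustration of one common form of the idea, address masking, and not a description of any specific SFI implementation. It assumes a hypothetical 32-bit address space divided into fixed 16 MiB segments, one per module; the names SEGMENT_BITS, OFFSET_MASK, segment_base, and sandbox_address are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a 32-bit address space split into fixed 16 MiB
   segments, one per module. The upper 8 bits select the segment and
   the lower 24 bits are the offset within it. */
#define SEGMENT_BITS 24u
#define OFFSET_MASK  ((UINT32_C(1) << SEGMENT_BITS) - 1u)

/* Base address of the segment statically assigned to a module. */
uint32_t segment_base(uint32_t module_id)
{
    assert(module_id < (UINT32_C(1) << (32u - SEGMENT_BITS)));
    return module_id << SEGMENT_BITS;
}

/* Force an address into the module's own segment: keep only the offset
   bits and overwrite the upper bits with the module's segment number.
   An in-segment address is unchanged; an out-of-segment address is
   redirected to a location inside the module's own segment. */
uint32_t sandbox_address(uint32_t module_id, uint32_t addr)
{
    return segment_base(module_id) | (addr & OFFSET_MASK);
}
```

Under these assumptions, a call such as sandbox_address(2, 0x05001234) yields 0x02001234, so a stray access by module 2 cannot reach the segment of module 5. The scheme depends on every module owning a large, fixed, power-of-two-sized region, which is the property that a severely limited address space cannot provide.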
In the area of wireless sensor networks, most research has focused on developing network-level protocols either to diagnose/localize problems, or to overcome unreliability using such concepts as voting and trust. However, research into node-level support for protecting against memory faults is very limited, and rebooting the entire node remains the most common approach adopted on a sensor node.