Sensor Resource Management (SRM) is defined as a control problem for allocating available sensor resources to obtain the best awareness of a situation. SRM is important because of the benefits it provides over non-coordinated sensor operation. By automating the process, it reduces operator workload. In an SRM system, an operator defines sensor tasking criteria instead of controlling multiple sensors individually by specifying each operation to be performed by each sensor. In an automated SRM system, the operator concentrates on an overall objective while the system controls the details of the sensor operations. Additionally, the feedback within the SRM system allows for faster adaptation to the changing environment. Traditional problems that SRM deals with include insufficient sensor resources, a highly dynamic environment, varied sensor capabilities/performance, failures, etc. Among other desired characteristics, a good sensor manager should be goal oriented, adaptive, and anticipatory; account for sensor dissimilarities; perform in near real time; and handle adaptive-length planning horizons.
In all sensor management systems, there will be a notion of system goal and system action. Both of these can be modeled and addressed using many methods, most of which fall into one of the following categories: control theory, optimization, and decision theory.
Control Theory:
In control theory, one specifies the desired level of performance (or the reference trajectory) defining the management goal that the closed-loop system tries to achieve (see literature reference no. 38 in the List of Cited Literature References, as provided below). The error index is then used as the basis for selecting actions that reduce any observable discrepancy or keep it as small as possible.
Optimization:
If sensor management is modeled as an optimization problem, rather than specifying a desired performance level, the user defines a cost function that, once optimized, leads to the most desirable outcome. This optimization would lead to the best trade-off between the sensing action payoff and the associated costs. Optimization-based algorithms are among the techniques that have been applied the most to the sensor management problem. For example, Nash (see literature reference no. 39) uses linear programming to determine sensor-to-target assignment by using the trace of the Kalman filter error covariance matrices as the objective function. Malhotra (see literature reference no. 40) uses dynamic programming to solve a Markov process by determining minimum costs based on the final state and then working backwards. Washburn, et al. (see literature reference no. 41) present a sensor management approach based on dynamic programming to predict the effects of future sensor management decisions.
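As a minimal sketch of the sensor-to-target assignment idea, the following Python example picks the pairing that minimizes the summed posterior error variance. It is an illustration under stated assumptions, not the method of the cited reference: the state is scalar (so the trace of the covariance is just the variance), exhaustive enumeration stands in for linear programming, and the function names and numbers are hypothetical.

```python
from itertools import permutations

def posterior_variance(prior_var, meas_var):
    """Scalar Kalman update: variance after fusing one measurement."""
    return prior_var * meas_var / (prior_var + meas_var)

def best_assignment(prior_vars, sensor_noise):
    """Return (assignment, cost) minimizing the summed posterior variance.

    assignment[s] is the index of the target observed by sensor s.
    Assumes one target per sensor and one sensor per target.
    """
    n = len(prior_vars)
    best, best_cost = None, float("inf")
    # Brute force over all one-to-one pairings (stand-in for an LP solver).
    for perm in permutations(range(n)):
        cost = sum(posterior_variance(prior_vars[t], sensor_noise[s][t])
                   for s, t in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

# Hypothetical numbers: track variances and per-sensor measurement noise.
prior_vars = [4.0, 1.0]
sensor_noise = [[1.0, 4.0],   # sensor 0 noise when observing target 0, 1
                [4.0, 1.0]]   # sensor 1 noise when observing target 0, 1
assignment, cost = best_assignment(prior_vars, sensor_noise)
```

Enumeration grows factorially with the number of sensors, which is one reason a linear programming or assignment solver would be preferred for anything beyond a toy problem.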
Decision Theory:
When a decision formulation is used, there is no clear notion of level of performance. As in the case of the optimization formulation, the objective is to choose the action that maximizes some expected utility function. Therefore, what is specified is the utility of executing a given action in a given situation. The best solution is the one that offers the highest utility (i.e., the best achievable performance). Solving such a decision problem can be done efficiently using graphical methods such as decision trees or influence diagrams. Performance objectives can be injected indirectly into decision trees as membership or belief functions of leaf nodes. Fung, et al., (see literature reference no. 42) use a decision theoretic sensor management architecture based on Bayesian probability theory and influence diagrams. Alternatively, Manyika and Durrant-Whyte (see literature reference no. 43) use a decision theoretic approach to sensor management in decentralized data fusion while Gaskell and Probert (see literature reference no. 44) develop a sensor management framework for mobile robots. Molina López, et al. (see literature reference no. 45) present a sensor management scheme based on knowledge-based reasoning and fuzzy decision theory.
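The expected-utility selection described above can be sketched in a few lines of Python. This is only an illustration: the situations, actions, belief values, and utilities are hypothetical, and a flat table stands in for a decision tree or influence diagram.

```python
def expected_utility(action, belief, utility):
    """Utility of an action averaged over the belief about the situation."""
    return sum(p * utility[action][state] for state, p in belief.items())

def choose_action(belief, utility):
    """Pick the action with the highest expected utility."""
    return max(utility, key=lambda a: expected_utility(a, belief, utility))

# Hypothetical belief about a target and utility of each sensing action.
belief = {"hostile": 0.3, "friendly": 0.7}
utility = {
    "track":    {"hostile": 10.0, "friendly": 1.0},
    "identify": {"hostile": 6.0,  "friendly": 5.0},
    "search":   {"hostile": 0.0,  "friendly": 2.0},
}
best_action = choose_action(belief, utility)
```

Note that no target performance level appears anywhere; only the utility of each action in each situation is specified.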
Use of information-theoretic measures (such as entropy) for sensor management has been around for many years now. Most of the literature is in the area of managing sensors to maximize kinematic information gain only (see literature reference nos. 1-2). Some references exist regarding managing sensors for maximizing identification and search capabilities (see literature reference nos. 3-4). This is done without any consideration for the system's current situation and performance. One can use information-theoretic criteria such as entropy, discrimination information, mutual information, etc. In these cases, the goal is to determine tasks that maximize the information gain at a system-wide level and priorities are assigned based on this goal. Thus, it is an open-loop approach to managing sensors. Recent work on finite set statistics (FISST) and joint multi-target probabilities (JMP, a subset of FISST) can also be applied in this context. The advantage of JMP is that there is a global notion of system task priority with all task priorities in a common space.
Examples of such applications are as follows. Hintz and McVey (see literature reference no. 46) used entropy for search, track and identification tasks. Literature reference nos. 47-54 describe using information measures such as entropy and discrimination gain for goals (such as determining resolution level of a sensor, determining priority of search and track tasks, etc.). Hintz et al. (see literature reference nos. 9 and 10) describe the use of Shannon entropy (as does the present invention), while Schmaedeke and Kastella (see literature reference no. 11) have chosen to use Kullback-Leibler (KL) divergence as a measure of information gain. Mahler (see literature reference nos. 56-58) proposed finite set statistics (FISST), which reformulates the problem of multi-sensor, multi-target data fusion as if it were a single-sensor, single-target tracking problem. It is then posed as a problem in statistical optimal nonlinear control theory. Musick, et al. (see literature reference no. 59) applied joint multi-target probabilities (JMP), a subset of FISST, to the problem of target detection and tracking.
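The two information measures named above can be compared with a short Python sketch. It assumes a discrete distribution over four hypothetical target classes; the probabilities are illustrative and not taken from any cited reference.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

# Hypothetical class beliefs before and after a sensing action.
prior = [0.25, 0.25, 0.25, 0.25]
posterior = [0.5, 0.5, 0.0, 0.0]

entropy_reduction = shannon_entropy(prior) - shannon_entropy(posterior)
kl_gain = kl_divergence(posterior, prior)
```

With a uniform prior the two measures coincide (here, one bit each); for a non-uniform prior they generally differ, so the choice of measure can change which task is ranked highest.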
Fusion and management for distributed systems must also be considered. When dealing with substantially different information that is received from multiple sources at different times, an efficient and accurate fusion method is required. To satisfy such a requirement, Zhang et al. (see literature reference no. 27) and Wang et al. (see literature reference no. 17) propose the use of hierarchical decision fusion. As decisions are passed up a tree of intermediate levels, certain nodes are given higher weights than other nodes depending on the quality of the fused information. Alternatively, Stroupe et al. (see literature reference no. 16) apply distributed fusion to a network of independent robots. Stroupe et al. continue by explaining how to properly fuse measurements received from different frames of reference. Thomopoulos et al. (see literature reference no. 18) offer yet another approach. Thomopoulos et al. provide a general proof that the optimal solution to distributed fusion amounts to a Neyman-Pearson test at the fusion center and a likelihood-ratio test at the individual sensors. A method proposed by Xiao et al. (see literature reference no. 19) involves a different approach, where each node updates its own data with a weighted average of its neighboring nodes. Qi et al. (see literature reference no. 20) detail a system of mobile agents which travel between local data sites by transmitting themselves from one node to the next. Such a system has the advantage of only having to transmit the agent itself, rather than the data, which could be several times larger in size. Finally, a discussion of the pros and cons of data and decision fusion is provided by Brooks et al., D'Costa and Sayeed, and Clouqueur et al. (see literature reference nos. 21-23, respectively).
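The neighbor-averaging idea attributed above to Xiao et al. can be illustrated with a generic consensus iteration. The sketch below is not taken from the cited reference: it assumes a fixed, connected, undirected network, equal weights on all links, and a step size smaller than the reciprocal of the largest node degree; the topology and values are hypothetical.

```python
def consensus_step(values, neighbors, step):
    """One synchronous update: each node moves toward its neighbors' values."""
    return [
        x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]

def run_consensus(values, neighbors, step, iterations):
    """Repeat the local update; no node ever sees the whole network."""
    for _ in range(iterations):
        values = consensus_step(values, neighbors, step)
    return values

# Hypothetical three-node line network: 0 -- 1 -- 2.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
local_estimates = [0.0, 3.0, 6.0]
fused = run_consensus(local_estimates, neighbors, step=0.3, iterations=200)
```

Because the link weights are symmetric, the sum of the node values is preserved at every step, so all nodes converge to the network-wide average (3.0 in this example) using only neighbor-to-neighbor exchanges.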
A variety of distributed management methods have been explored as well. For example, Durrant-Whyte et al. have primarily used information-theoretic approaches to address the problem (see literature reference nos. 24-26 and 43). Durrant-Whyte et al. describe using entropy measures to determine the relative values of possible actions. Xiong et al. (see literature reference no. 29) build upon Durrant-Whyte's work and discuss a number of different approaches and issues when using information-centric methods in a distributed framework. An alternative is described by Ögren (see literature reference no. 28). Ögren describes using gradient methods to determine the best placement for a system of distributed sensors. Ögren's method involves setting up artificial potentials and letting these virtual force models converge to a global solution. Alternatively, Malhotra (see literature reference no. 8) addresses the impact of receiving measurements at different times, and how to properly model temporal effects.
Because of temporal effects, a sensor manager often needs to plan future actions. When planning future actions, one can choose how far ahead (in time) to search when trying to pick the optimal action. Short-term (or myopic) approaches only look ahead one step; the goal of the planner is to pick the action that will yield the best result after the action is executed. Long-term methods explore many actions into the future. Their goal is to choose an action that will yield the best result after a certain number of actions have been performed. The differences between these two options are illustrated in FIG. 1.
FIG. 1 illustrates a state and decision space 100, showing a comparison of short-term and long-term planning approaches. As shown, short-term planning 102 evaluates the reward (improvement in system state) versus the cost of an action for a single-step look ahead. The short-term planning uses the evaluation to determine the optimal action. However, although such short-term planning may be computationally cheap, the best immediate action may not be the optimal action. Alternatively, long-term planning 104 evaluates the reward (improvement in system state) versus the cost of action for multiple look ahead steps to determine the optimal action to take. Thus, long-term planning 104 compares the action sequences for multiple steps in the future. Although long-term planning 104 is computationally expensive, it is non-myopic and yields a result closer to optimal.
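The difference between the two planning horizons can be seen in a toy example. The Python sketch below assumes a small deterministic model in which each action has a known net reward (reward minus cost) and a known next state; the state names, action names, and numbers are hypothetical and are not taken from FIG. 1.

```python
def best_plan(model, state, horizon):
    """Return (total_net_reward, first_action) of the best action sequence.

    model[state][action] = (net_reward, next_state). A horizon of 1 is the
    short-term (myopic) planner; larger horizons are long-term planners.
    """
    if horizon == 0 or not model.get(state):
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for action, (reward, next_state) in model[state].items():
        future, _ = best_plan(model, next_state, horizon - 1)
        if reward + future > best_value:
            best_value, best_action = reward + future, action
    return best_value, best_action

# Hypothetical model: the larger immediate payoff leads to a dead end.
model = {
    "start":    {"scan_near": (2.0, "dead_end"), "scan_far": (1.0, "cued")},
    "dead_end": {"wait": (0.0, "dead_end")},
    "cued":     {"track": (5.0, "cued")},
}
myopic = best_plan(model, "start", horizon=1)      # (2.0, "scan_near")
long_term = best_plan(model, "start", horizon=2)   # (6.0, "scan_far")
```

The one-step planner prefers the action with the larger immediate payoff, while the two-step planner finds that the other action leads to a higher total. The recursion visits every action sequence, so its cost grows exponentially with the horizon.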
Long-term approaches have the potential to produce the best results, but the computation time required for even simple environments is enormous when compared to short-term approaches. Several researchers have come up with solutions that provide approximate answers to reduce the computation time.
A large number of SRM problems can be formulated as belonging to the class of Markov Decision Process (MDP) problems. In an MDP, future states are assumed to be the result of applying actions to the current state, ignoring the total history of the state space. In centralized management, nodes with complete knowledge of the state space (full-awareness nodes, or FANs) maintain the current state, and update the state with incoming measurements or the passage of time. This new state can then be used to determine future states. Similarly, in a decentralized system, each node's state is only dependent on that node's prior state, input from the environment, and process noise, just like any other Markovian process. Each node maintains its own individual picture of the state space, but since inter-node communication is imperfect, these state representations are often dramatically different from node to node.
For example, Krishnamurthy describes a technique that involves a multi-arm bandit method and utilizes hidden Markov models (see literature reference nos. 5 and 6). Krishnamurthy employs two approaches which limit the number of states accessible to each track to a finite number. While these approximations improve computation time, they are all centralized methods and are still computationally intensive. Bertsekas and Castanon (see literature reference no. 7) propose a method that uses heuristics to approximate a stochastic dynamic programming algorithm, while Malhotra (see literature reference no. 8) suggests using reinforcement learning.
Kearns, Mansour, and Ng have also previously proposed a sparse sampling algorithm (see literature reference no. 14) for Markov decision processes. The running time of the sparse sampling algorithm is exponential in the depth of the search tree, so the algorithm is not very applicable to practical problems. In addition, the reward function in the algorithm is based on information gain. A variety of distributed management methods and issues have been explored as well. However, these distributed methods are myopic and do not address long-term planning.
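The exponential cost of sparse sampling can be made concrete with a simplified sketch of the general idea: estimate the value of each action by drawing a fixed number of next states from a generative model and recursing to a fixed depth. The Python code below is an illustration under stated assumptions, not the published algorithm: the function names are hypothetical, and the toy model is deterministic so the result is reproducible, whereas a real generative model would return random samples.

```python
def sparse_sample_q(gen, actions, state, depth, width, gamma):
    """Estimate Q(state, a) for each action from sampled look-ahead trees.

    gen(state, action) -> (next_state, reward) is a generative model,
    width is the number of samples per action, gamma the discount factor.
    """
    if depth == 0:
        return {a: 0.0 for a in actions}
    q = {}
    for a in actions:
        total = 0.0
        for _ in range(width):
            next_state, reward = gen(state, a)
            next_q = sparse_sample_q(gen, actions, next_state,
                                     depth - 1, width, gamma)
            total += reward + gamma * max(next_q.values())
        q[a] = total / width
    return q

calls = [0]  # counts generative-model calls to expose the tree size

def toy_model(state, action):
    """Hypothetical deterministic stand-in for a stochastic simulator."""
    calls[0] += 1
    return state, (1.0 if action == "good" else 0.0)

q = sparse_sample_q(toy_model, ["good", "bad"], "s0",
                    depth=3, width=2, gamma=0.5)
```

With two actions and two samples per action, the number of model calls is 4, 20, and 84 for depths 1, 2, and 3, growing roughly as (actions x width) raised to the depth. The cost does not depend on the size of the state space, but it quickly becomes impractical as the depth increases.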
The techniques mentioned above are limited and incomplete because they cannot achieve multiple performance objectives in a quantifiable and systematic, closed-loop, control manner. Additionally, due to the several steps involved in arriving at the common metric and bringing the steps into the same framework, nothing heretofore devised provides a single system-level solution to sensor management.
Thus, a continuing need exists for a sensor resource allocation system that is configured to operate as a fully distributed, short-term/long-term planner.