1. Field of the Invention
The present invention relates to sensor scheduling methods and systems and, more specifically, to long-term (non-myopic) scheduling based on multisensor measurements taken from a plurality of objects or targets.
2. Description of Related Art
Use of information-theoretic measures such as entropy for sensor management has been known for many years. Hintz et al., references [10, 11], use the Shannon entropy, while Schmaedeke and Kastella, reference [12], have chosen to use Kullback-Leibler (KL) divergence as a measure of information gain. Most of the literature is directed to managing sensors using information-theoretic measures to maximize kinematic information gain only, references [1-2]. This is done without any consideration for the current situation and system performance, in that the goal is simply to gather as much information as possible. Thus, it is an open-loop approach to managing sensors, references [1-4]. Some prior art also exists in managing sensors to maximize identification (ID) and search performance, references [3-4]. In all of these approaches, the idea is to pick the sensing actions that maximize the instantaneous expected information gain. Thus, these approaches are myopic in the sense that they maximize the immediate reward without consideration for future actions and rewards.
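To illustrate the KL-divergence information-gain measure attributed to Schmaedeke and Kastella, reference [12], the following is a minimal sketch. The class labels and distribution values are hypothetical examples chosen for illustration; they are not taken from the cited work.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits, for discrete distributions p and q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical belief over three target classes before a measurement...
prior = [1 / 3, 1 / 3, 1 / 3]
# ...and after incorporating a (made-up) sensor measurement.
posterior = [0.7, 0.2, 0.1]

# The information gained by the measurement is the divergence of the
# updated belief from the prior belief.
gain = kl_divergence(posterior, prior)
```

A myopic scheduler of the kind described above would evaluate this gain for each candidate sensing action and select the action with the largest expected value, with no regard for subsequent decisions.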
There is also some recent prior art in managing sensors for closed-loop control, but only based on kinematic need, reference [5]. This need is calculated from the current kinematic track state and the desired kinematic accuracy. The sensor gains are calculated, and sensors are scheduled based on the kinematic need and gain. No direction is provided on how to extend this work to general system problems.
Non-Myopic, Long-Term Planning
By contrast, long-term approaches have the potential to produce the best results, but the computation time required for even simple environments is enormous when compared to near-term myopic approaches. Several researchers have proposed solutions that provide approximate answers, references [6-9]. While these approximations improve computation time, they are still very computationally intensive.
Another prior art approach for long-term planning has been proposed in reference [14]. This approach, called sparse sampling, considers a chain of actions up to a certain depth time when making a decision. The advantage over an exhaustive search is that this approach covers less and less of the action space as the algorithm looks farther ahead into the future. This makes sparse planning significantly faster than other long-term approaches that consider the action tree in its entirety. In an exhaustive search, the belief state grows exponentially with look-ahead depth: it grows as the number of classes raised to the number of decision points. For example, if there are three possible classes and five decisions to make before the depth time is reached, then the belief state will be 3^5 = 243 entries long at the bottom of the action tree. An example of this approach is the sparse sampling algorithm for Markov decision processes proposed by Kearns, Mansour, and Ng in the cited reference [14]. This algorithm is exponential in the depth of the search tree and is not very applicable to practical problems. Additionally, their reward function is based on information gain, in contrast to that used by the system and method of the present invention.
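The sparse sampling idea of Kearns, Mansour, and Ng, reference [14], can be sketched as follows: rather than expanding the full action tree, the planner draws a fixed number of sampled successors per action at each level, down to a fixed depth. The toy two-state, two-action model below, and all parameter values, are illustrative assumptions, not part of the cited work or the present disclosure.

```python
ACTIONS = [0, 1]
GAMMA = 0.9  # discount factor (assumed value)

def step(state, action):
    """Toy generative model: returns (next_state, reward)."""
    if action == 0:
        return state, 0.1          # stay put, small reward
    next_state = 1 - state         # switch state
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward

def sparse_sample_value(state, depth, width):
    """Estimate V(state) by sampling `width` successors per action down
    to `depth` levels, instead of expanding the full exponential tree."""
    if depth == 0:
        return 0.0
    best = float("-inf")
    for action in ACTIONS:
        total = 0.0
        for _ in range(width):
            nxt, reward = step(state, action)
            total += reward + GAMMA * sparse_sample_value(nxt, depth - 1, width)
        best = max(best, total / width)
    return best

def sparse_sample_action(state, depth=4, width=3):
    """Pick the action with the highest sampled Q-value at the root."""
    def q(action):
        total = 0.0
        for _ in range(width):
            nxt, reward = step(state, action)
            total += reward + GAMMA * sparse_sample_value(nxt, depth - 1, width)
        return total / width
    return max(ACTIONS, key=q)
```

The per-decision cost is governed by the sampling width and depth rather than by the full branching of the belief state, which is the source of the speed advantage described above; the cost nonetheless remains exponential in the depth, consistent with the limitation noted for this algorithm.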
Accordingly, it is a primary object of the present invention to provide a method and system which provides a more accurate long-term sparse sampling planner.