The demand of suitability of robots for everyday environments requires that the robots are capable to localize objects under challenging environmental conditions such as bad lighting, reflections, or occlusions, for example. Since object classification and localization goals often cannot be achieved from single measurements, active strategies are required to overcome these problems and enable an autonomous approach to scene recognition by robots.
The knowledge of objects available in a scene or in an environment of a robot is often an important precondition for the robot to be able to perform further tasks or actions, e.g., grasping of objects or further acting with regard to the objects.
Most of current active vision systems focus on information theoretic quality measures for next best view planning but do not follow probabilistic planning strategies. In literature many approaches to active perception exist. In [1], some works are listed which mainly focus on action selection based on uncertainty reduction. A more recent work [2] deals with active feature matching for 3D camera tracking. In [3] a greedy approach to visual search is presented. It focuses on gaze planning for object detection including context information based on prior knowledge.
A partially observable Markov decision process (POMDP) is a general model for planning under uncertainty. POMDPs statistically reason over costs for control actions for finding an optimal action policy. In various embodiments, continuous stochastic domains are considered. Therefore, when considering the state of art in following, it is focused mainly on known works which deal with continuous state spaces.
In [4], continuous domains are approximated by grids. For the resulting discrete problem, the computational complexity increases strongly in high dimensional state spaces. Here, a coarser sampling, however, would reduce the complexity at the costs of lower accuracy.
Some of known approaches consider continuous domains directly. In various known works, see [5], [6], for example, all possible situations are evaluated in an off-line process. Hence, the best action to execute is initially specified. In Porta [7], these ideas are extended by applying it on continuous action spaces and observations. However, all these known approaches or works are restricted to small domains in order to keep the extensive pre-processing required feasible. As in many applications, action values largely depend on the situation, they cannot be determined prior to execution. An overview on online planning algorithms or methods for POMDPs is given in [8]. Most of the known works follow real-time constraints by improving rough off-line policies with the outcome of online strategies. Therefore, methods such as branch and bound pruning, Monte-Carlo sampling or heuristic searches are required. However, such methods limit the number of reachable and relevant states (see [9], for example). In [10], a deterministic forward heuristic search algorithm or method is used for probabilistic planning. Most of the known look-ahead search methods reduce the complexity either by optimizing the action or observation space or by orienting the search towards the most relevant actions and observations.
Known approaches, which are closer related to active sensing, use POMDPs for evaluating sensing costs in order to find the most promising actions. In Guo [11], for example, a framework for optimal decision-making with respect to classification goals is proposed. Here, the ideas are applied to rock classification with an autonomous rover. The costs for acquiring additional information are balanced against penalties for misclassification. In [12], a POMDP is used for cost-sensitive feature acquisition and classification. The expected reward of a selected action for a robot is calculated from the classification costs with respect to the current belief state. Spaan [13], in turn, suggests extending the planning strategy by combining costs with payoffs coming from information theoretic uncertainty measurements.