In some surveillance, reconnaissance, and exploration tasks for real-world applications, one or more objects may need to be detected and tracked. Conventional tracking methods may be based on global positioning system (GPS) data or camera vision. However, conventional GPS-based or vision-based tracking methods may be inadequate for certain applications. For example, conventional GPS-based tracking methods may not be useful in places with poor GPS signal reception or when the tracked objects do not carry GPS receivers. Conventional vision-based tracking methods may lack the capability to precisely track a group of moving objects. An aerial vehicle carrying a payload (e.g., a camera) can be used to track objects. In some cases, one or more operators may have to manually select the moving objects to be tracked, and manually control the aerial vehicle and/or the camera to track the moving objects. This limited tracking ability may reduce the usefulness of aerial vehicles in certain applications.