A user may position a camera to capture images of an object from different viewpoints in order to generate a three-dimensional (3D) model of the object based on the captured images. However, in order for the 3D model to be generated, the images typically need to be captured from specific viewpoints rather than arbitrary viewpoints. It may therefore be difficult for the user to determine the specific viewpoints from which the images should be captured in order to generate the 3D model. If the user captures more images than necessary to ensure that there is a sufficient number of images from the specific viewpoints, then many redundant images may be captured. On the other hand, if the user does not capture enough images from the specific viewpoints, then the user may not be able to generate the 3D model.