To capture an image of a large document for translation, a user may be required to take several still images of the document. The user may capture a first image of the document from a first position. A mobile application may then direct the user to move the device to another part of the document to capture the next still image. Typically, the user stops moving the device once the next position is reached, because an image captured while the device is in motion may be blurred. For the captured images to be translated, the user may also have to align the text carefully so that the entire text can be captured in a single image.