There is a well-known technique in which a plurality of original images are input by reading a document in two passes, and an image in which the plurality of original images are combined is output. Two methods are available in this technique. One is a method of combining the plurality of original images by placing the images at positions determined by pattern matching, and the other is a method of combining the images by simply arranging them at predetermined positions. One of the two methods is selected, for example, by a user. When the pattern-matching method is executed according to the user's selection but ends in failure, the method of simply arranging the two images is executed instead.
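The selection and fallback behavior described above can be illustrated with a minimal sketch. It is not taken from any particular apparatus. The function names `match_overlap`, `simple_arrange`, and `combine`, the representation of an image as a list of pixel rows, and the vertical arrangement are all assumptions made for illustration. The exact row comparison is only a stand-in for whatever pattern matching an actual device would use.

```python
# Hypothetical sketch: images are lists of pixel rows, e.g. [[1, 1], [2, 2]].

def match_overlap(first, second, min_overlap=1):
    """Stand-in for pattern matching (assumption): return the number of
    trailing rows of `first` that exactly equal the leading rows of
    `second`, or None when no overlap of at least `min_overlap` rows exists."""
    longest = min(len(first), len(second))
    for rows in range(longest, min_overlap - 1, -1):
        if first[-rows:] == second[:rows]:
            return rows
    return None  # pattern matching ended in failure


def simple_arrange(first, second):
    """Combine by simply placing `second` directly below `first`
    (the vertical layout is an assumption of this sketch)."""
    return first + second


def combine(first, second, method="pattern"):
    """Combine two images with the selected method.

    Returns (combined_image, method_actually_used). When "pattern" is
    selected but matching fails, the simple arrangement is used instead.
    """
    if method == "pattern":
        rows = match_overlap(first, second)
        if rows is not None:
            # Drop the duplicated rows so the overlap appears only once.
            return first + second[rows:], "pattern"
    return simple_arrange(first, second), "simple"


if __name__ == "__main__":
    upper = [[1, 1], [2, 2], [3, 3]]
    lower = [[3, 3], [4, 4]]      # shares one row with `upper`
    unrelated = [[9, 9], [8, 8]]  # shares nothing with `upper`
    print(combine(upper, lower, "pattern"))
    print(combine(upper, unrelated, "pattern"))
```

In this sketch the second return value reports which method was actually used, so a caller can tell when the fallback to simple arrangement occurred.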