Images contain a wealth of information. When an image is captured with an embedded text string partially covered, for instance by a person standing in front of the text or by other objects or images within the image, it becomes difficult to summarize the text string or to translate it into a different language (e.g., English to Chinese). As a result, the translations are not accurate.