In general, one or more articles include, without limitation, documents such as newspapers, magazines, Portable Document Format (PDF) files, printed documents, brochures, images, scanned documents, books, etc. The one or more articles may contain data which can include, without limitation, texts, characters, words, images, symbols, letters, etc. A user may wish to convert the one or more articles into an Optical Character Recognition (OCR) format so that the data contained in the one or more articles can be easily recognized and extracted. One or more digital cameras are used as scanners, which obtain an image of the one or more articles and then scan that image. By scanning, the data contained in the one or more articles is converted into digital format to obtain the OCR format of the one or more articles. Typically, to scan the one or more articles, the one or more digital cameras use a predefined resolution focus to recognize and extract the data from the one or more articles.
In a conventional approach, the one or more digital cameras perform scanning of the data in the one or more articles. Such scanning may not recognize and extract all of the data in the one or more articles. The failure may be due to one or more factors which include, without limitation, low resolution focus, small sized data, skewed regions of data, fast OCR scan settings, etc. Thus, the conventional approach fails to extract small sized data and/or data from skewed regions of the one or more articles, and the data extraction fails because the scanning fails. Further, in the conventional approach, the extraction of texts or characters from an image is challenging because the scanning is performed only once for the entire article. Such one-time scanning fails to extract all of the data from the one or more articles, and consequently the conversion of the one or more articles into the OCR format fails. Furthermore, the conventional approach comprises only a single scanning process, which is applied to both textual regions and image regions. In such a case, extraction of the data contained in the image regions in particular is a tedious process.