1. Field of Art
The present invention generally relates to the field of image and video generation, and more specifically, to image and video generation using ground truth data.
2. Description of the Related Art
Various models have been used to generate images of documents that account for image degradations. The appropriate model for document image degradations has been the subject of numerous papers (Y. Li, D. Lopresti, G. Nagy, and A. Tomkins, “Validation of Image Defect Models for Optical Character Recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 18, 2 (Feb. 1996), pp. 99-108). Pavlidis proposed a model including horizontal and vertical scaling, rotation, sampling rate, and quantization threshold (T. Pavlidis, “Effects of Distortions on the Recognition Rate of a Structural OCR System,” In Proc. Conf. on Comp. Vision and Pattern Recog., pp. 303-309, Washington, D.C., 1983). Baird suggested a model whose variable parameters include font size, spatial sampling rate, rotation, horizontal and vertical scaling, horizontal and vertical translation, pixel displacement, Gaussian point-spread function, pixel sensor sensitivity, and quantization threshold (H. Baird, “Document Image Defect Models,” In Proc. of IAPR Workshop on Syntactic and Structural Pattern Recognition, pp. 38-46, Murray Hill, N.J., June 1990; and H. Baird, “The State of the Art of Document Image Degradation Modeling,” In Proc. of 4th IAPR International Workshop on Document Analysis Systems, Rio de Janeiro, Brazil, pp. 1-16, 2000). Smith experimented with a model that varies the width of the point-spread function and the binarization threshold (E. H. Barney Smith and T. Andersen, “Text Degradations and OCR Training,” International Conference on Document Analysis and Recognition 2005, Seoul, Korea, August 2005). Khoubyari and Hull simulated character defects by thickening character strokes and then randomly flipping some black pixels to white (S. Khoubyari and J. J. Hull, “Keyword Location in Noisy Document Images,” Second Annual Symposium on Document Analysis and Information Retrieval, Las Vegas, Nev., pp. 217-231, April 1993). Kanungo et al. modeled the curl distortions that result from scanning bound documents (T. Kanungo, R. M. Haralick, and I. Phillips, “Global and Local Document Degradation Models,” Proceedings of the Second International Conference on Document Analysis and Recognition (ICDAR-93), 20-22 Oct. 1993, pp. 730-734). Zi also considered the effects of bleed-through from text and images underneath or on the reverse side of the document of interest (G. Zi, “Groundtruth Generation and Document Image Degradation,” University of Maryland Language and Media Processing Laboratory Technical Report (LAMP-TR-121), 2005).
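By way of illustration only, the stroke-thickening and pixel-flipping style of degradation described above can be sketched as follows. This is a minimal sketch and not the method of any cited reference: the 3x3 thickening neighborhood, the `flip_prob` parameter, the `seed` parameter, and all function names are assumptions introduced here for illustration. A bi-level image is represented as a list of rows in which 1 denotes a black pixel and 0 denotes a white pixel.

```python
import random


def thicken(img):
    """Thicken strokes: every black pixel (1) also blackens its 3x3 neighborhood.

    The 3x3 neighborhood is an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out


def flip_black_to_white(img, flip_prob, rng):
    """Turn each black pixel white independently with probability flip_prob."""
    return [
        [0 if (p and rng.random() < flip_prob) else p for p in row]
        for row in img
    ]


def degrade(img, flip_prob=0.1, seed=0):
    """Thicken strokes, then randomly flip some black pixels to white.

    flip_prob and seed are hypothetical parameters, not taken from the
    cited work.
    """
    rng = random.Random(seed)
    return flip_black_to_white(thicken(img), flip_prob, rng)


def count_black(img):
    """Number of black pixels in a bi-level image."""
    return sum(sum(row) for row in img)
```

For example, calling `degrade` on a 5x5 image containing a single black pixel at the center first grows that pixel into a 3x3 block and then whitens a random subset of the block, so the result never contains more than nine black pixels.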
These models address images produced by scanners or bi-level images, and do not provide imaging of non-planar forms. Further, these models use large sets of data for imaging video. What is needed is a more generalized imaging system model.