Analysis of handwritten documents to identify the writer is of extreme importance in the criminal justice system. Numerous cases over the years have dealt with evidence provided by handwritten documents such as wills and ransom notes. Handwriting has long been considered individualistic, as evidenced by the importance of signatures in documents. However, the individuality of writing in handwritten notes and documents has not been established with scientific rigor, and therefore its admissibility as forensic evidence can be questioned.
Writer individuality rests on the hypothesis that each individual has consistent handwriting that is distinct from the handwriting of other individuals. However, this hypothesis has not been subjected to rigorous scrutiny with accompanying experimentation, testing, and peer review. One objective of this invention is to contribute to this scientific validation.
The problem addressed by the invention is to establish a methodology for validating the hypothesis that everybody writes differently. The invention builds upon recent advances in machine learning algorithms for recognizing handwriting from scanned paper documents. Software for recognizing handwritten documents has many applications, such as sorting mail with handwritten addresses. The task of handwriting recognition focuses on interpreting the message conveyed (for example, determining the town in a postal address), which is done by averaging out the variation in the handwriting of different individuals. The task of establishing individuality, on the other hand, focuses on determining those very differences. What the two tasks have in common is that both involve processing images of handwriting and extracting features.
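The contrast between the two tasks can be illustrated with a minimal sketch. The feature values, the writer labels, and the function names below are hypothetical and chosen only for illustration; they are not the features or procedures of the invention. The sketch assumes each handwriting sample of one character has already been reduced to a small numeric feature vector.

```python
from itertools import combinations
from math import dist
from statistics import mean

# Hypothetical 2-D feature vectors for one character, three samples per writer.
# The numbers are invented for illustration only.
SAMPLES = {
    "writer_A": [(0.10, 0.50), (0.12, 0.52), (0.11, 0.49)],
    "writer_B": [(0.40, 0.80), (0.42, 0.78), (0.41, 0.81)],
}


def character_prototype(samples):
    """Recognition view: average over all writers, discarding who wrote what."""
    points = [p for pts in samples.values() for p in pts]
    return tuple(mean(coords) for coords in zip(*points))


def mean_within_writer_distance(samples):
    """Average distance between pairs of samples from the same writer."""
    pairs = [dist(p, q)
             for pts in samples.values()
             for p, q in combinations(pts, 2)]
    return mean(pairs)


def mean_between_writer_distance(samples):
    """Individuality view: average distance between samples of different writers."""
    pairs = [dist(p, q)
             for a, b in combinations(samples.values(), 2)
             for p in a
             for q in b]
    return mean(pairs)


if __name__ == "__main__":
    print("prototype:", character_prototype(SAMPLES))
    print("within-writer distance:", mean_within_writer_distance(SAMPLES))
    print("between-writer distance:", mean_between_writer_distance(SAMPLES))
```

In this toy data the between-writer distance is much larger than the within-writer distance, which is the kind of separation the individuality hypothesis predicts. The recognition prototype deliberately averages that separation away.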
Pertinent references useful in understanding the present invention include the following:
1) Huber R A, Headrick A M. Handwriting identification: facts and fundamentals. Boca Raton: CRC Press, 1999.
2) Osborn A S. Questioned documents. 2nd ed. Albany, NY: Boyd Printing, 1929.
3) Lohr S L. Sampling: design and analysis. Pacific Grove, CA: Duxbury Press, 1999.
4) Srihari S N, Cha S-H, Arora H, Lee S. Handwriting identification: research to study validity of individuality of handwriting & develop computer-assisted procedures for comparing handwriting. Buffalo (NY): University at Buffalo, State University of New York; 2001 TR No.: CEDAR-TR-01-1.
5) Gilbert A N, Wysocki C J. Hand preference and age in the United States. Neuropsychologia 1992; 30:601-608.
6) Duda R O, Hart P E. Pattern classification and scene analysis. NY: Wiley, 1973.
7) Srihari S N. Feature extraction for locating address blocks on mail pieces. In: Simon J C, ed. From pixels to features. Amsterdam: North Holland, 1989; 261-273.
8) Srihari S N. Recognition of handwritten and machine-printed text for postal address interpretation. Pattern Recognition Letters 1993; 14:291-303.
9) Govindaraju V, Shekhawat A, Srihari S N. Interpretation of handwritten addresses in US mail stream. In: Proceedings of the 2nd Int Conf on Document Analysis and Recognition; 1993 Oct. 20-22; Tsukuba Science City, Japan: International Association for Pattern Recognition, 1993.
10) Srikantan G, Lam S W, Srihari S N. Gradient-based contour encoding for character recognition. Pattern Recognition 1996; 29:1147-1160.
11) Srikantan G, Lee D S, Favata J T. Comparison of normalization methods for character recognition. In: Proceedings of the 3rd Int Conf on Document Analysis and Recognition; 1995 August 14-16; Montreal: International Association for Pattern Recognition, 1995.
12) Otsu N. A threshold selection method from gray-level histograms. IEEE Trans Systems, Man, and Cybernetics 1979; 9:62-66.
13) Freeman H. On the encoding of arbitrary geometric configurations. IRE Trans Electronic Computers 1961; 18:312-324.
14) Kim G, Govindaraju V. A lexicon-driven approach to handwritten word recognition for real-time applications. Trans on Pattern Analysis and Machine Intelligence 1997; 19:366-379.
15) Favata J T, Srikantan G, Srihari S N. Handprinted character/digit recognition using a multiple feature/resolution philosophy. In: Proceedings of the Fourth Int Workshop on the Frontiers of Handwriting Recognition; 1994 December 7-9; Taipei: NA, 1994.
16) Gonzalez R C, Woods R E. Digital image processing. 3rd ed. Reading, MA: Addison-Wesley, 1992.
17) Mirkin B. Mathematical classification and clustering. Dordrecht: Kluwer Academic Pub, 1996.
18) Mitchell T M. Machine learning. Boston: McGraw-Hill, 1997.
19) Lee D S, Srihari S N, Gaborski R. Bayesian and neural network pattern recognition: a theoretical connection and empirical results with handwritten characters. In: Sethi I K, Jain A K, ed. Artificial neural networks and statistical pattern recognition. Amsterdam: North Holland, 1991:89-108.
20) Srihari et al., U.S. Pat. No. 4,654,875, System to Achieve Automatic Recognition of Linguistic Strings.
21) Kuan et al., U.S. Pat. No. 5,058,182, Method and Apparatus for Handwritten Character Recognition.
22) Shin et al., U.S. Pat. No. 5,524,070, Local Adaptive Contrast Enhancement.
23) Shin et al., U.S. Pat. No. 5,257,220, Digital Data Memory Unit and Memory Unit Array.
24) Fenrich et al., U.S. Pat. No. 5,321,768, System for Recognizing Handwritten Character Strings Containing Overlapping And/Or Broken Characters.
25) Govindaraju et al., U.S. Pat. No. 5,515,455, System for Recognizing Handwritten Words of Cursive Script.