Digital ink processing systems, such as handwriting recognition systems, digital signature verification systems, document analysis systems, and digital ink searching systems, must deal with the huge variability in handwriting and drawing that arises from the differing styles of individual writers. As a result, most such systems perform a number of pre-processing steps to limit this variation.
One such pre-processing step is orientation normalization, which reduces the variance of the input by aligning the digital ink as if it were written using a standard orientation on the page (for example, written left-to-right on a horizontal line for Latin character based scripts). By aligning the digital ink in such a way, the ink processing system can ignore the effects of variation in orientation, and as such can be made simpler, more robust, and more accurate.
Orientation normalization is usually performed as one of the first steps in a digital ink processing system, and is used to minimize error in later stages (for example, line, word, and character segmentation, feature extraction, etc.). Generally, the angle of a segment of digital ink relative to a standard reference angle (e.g. horizontal) is estimated and used to re-orient the digital ink such that the angle of the digital ink matches the reference angle.
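The re-orientation step can be sketched as a rotation about the centroid of the ink. The following is a minimal illustration only: the function name `rotate_ink`, the representation of ink as lists of (x, y) tuples, a y axis pointing up the page, and an angle measured counter-clockwise from the horizontal are all assumptions made for the example, not details taken from any of the systems discussed here.

```python
import math


def rotate_ink(strokes, angle_rad):
    """Rotate all points by -angle_rad about the centroid of the ink.

    strokes   -- list of strokes, each a list of (x, y) tuples (assumed format)
    angle_rad -- estimated angle of the ink relative to the horizontal,
                 counter-clockwise, with the y axis pointing up (assumed)
    """
    points = [p for stroke in strokes for p in stroke]
    if not points:
        raise ValueError("no ink points to rotate")

    # Centroid of all points; rotating about it keeps the ink in place.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)

    # Rotating by the negative of the estimated angle brings the ink
    # back to the reference (horizontal) orientation.
    cos_a = math.cos(-angle_rad)
    sin_a = math.sin(-angle_rad)

    return [
        [
            (
                cx + (x - cx) * cos_a - (y - cy) * sin_a,
                cy + (x - cx) * sin_a + (y - cy) * cos_a,
            )
            for x, y in stroke
        ]
        for stroke in strokes
    ]
```

For example, a stroke lying along a 45 degree line, `[(0, 0), (1, 1), (2, 2)]`, rotated with `angle_rad = math.pi / 4` ends up with all of its points at the same height as the centroid.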
Orientation normalization for Latin character scripts is often performed using baseline correction, where the baseline of a line of text is defined as the imaginary natural line on which a user places characters that do not have descenders (e.g. “a”, “b”, “c”, “d”, “e”, “f”, “h”, etc.). This is done by estimating the baseline of a segment of digital ink and then rotating the ink so that the baseline is horizontal. Whilst most systems assume baselines are roughly linear, some systems attempt to model baseline drift using more sophisticated models such as splines [A. Hennig, N. Sherkat, and R. Whitrow, “Zone Estimation for Multiple Lines of Handwriting using Approximating Spline Functions”, Fifth International Workshop on Frontiers in Handwriting Recognition (IWFHR), pp. 325-328, September 1996].
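As a rough illustration of estimating a linear baseline, the sketch below fits a least-squares line through the local minima of each stroke and returns its angle. It is one simple approach under stated assumptions, not the method of any particular system: the names `stroke_minima` and `baseline_angle`, the (x, y) tuple representation, and a y axis pointing up the page are all hypothetical choices made for the example.

```python
import math


def stroke_minima(stroke):
    """Return the points of a stroke that are local minima in y.

    Assumes the y axis points up the page, so minima are the lowest
    points of the pen trajectory.
    """
    minima = []
    for i in range(1, len(stroke) - 1):
        y_prev, y_here, y_next = stroke[i - 1][1], stroke[i][1], stroke[i + 1][1]
        if y_here < y_prev and y_here <= y_next:
            minima.append(stroke[i])
    return minima


def baseline_angle(strokes):
    """Estimate the baseline angle (radians) by least squares through minima."""
    pts = [p for stroke in strokes for p in stroke_minima(stroke)]
    if len(pts) < 2:
        raise ValueError("need at least two stroke minima to fit a baseline")

    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n

    # Ordinary least squares: slope = Sxy / Sxx.
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    if sxx == 0:
        raise ValueError("minima share one x value; baseline is undefined")

    return math.atan2(sxy, sxx)
```

The returned angle is the quantity by which the ink would then be rotated to make the estimated baseline horizontal.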
A significant amount of research has been performed on orientation estimation and normalization for digital ink, with particular emphasis on techniques that are applicable to Optical Character Recognition systems. Early research systems relied on heuristics and empirical thresholds [W. Guerfali and R. Plamondon, “Normalization and restoring on-line handwriting”, Pattern Recognition, 26 (3), pp. 419-431, 1993; S. Madhvanath and V. Govindaraju, “Using holistic features in handwritten word recognition”, United States Postal Services (USPS), pp. 183-198, 1992], along with simple techniques such as linear regression through stroke minima [R. Bozinovic and S. Srihari, “Off-line cursive script word recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence 11, pp. 69-83, 1989]. Due to the brittle nature of these techniques, more sophisticated systems using projection profiles [A. Vinciarelli and J. Luettin, “A New Normalization Technique for Cursive Handwritten Words”, Pattern Recognition Letters 22, pp. 1043-1050, 2001; M. Brown and S. Ganapathy, “Preprocessing Techniques for Cursive Script Word Recognition”, Pattern Recognition 16 (5), pp. 447-458, 1983] and generalized projections [G. Nicchiotti and C. Scagliola, “Generalised Projections: a Tool for Cursive Handwriting Normalisation”, Fifth International Conference on Document Analysis and Recognition (ICDAR), September 1999] were developed. Other techniques have since been developed, including: least squares and weighted least squares [M. Morita, S. Garnes, J. Facon, F. Bortolozzi, and R. Sabourin, “Mathematical Morphology and Weighted Least Squares to Correct Handwriting Baseline Skew”, Fifth International Conference on Document Analysis and Recognition (ICDAR), pp. 430-433, September 1999; T. Breuel, “Robust least square baseline finding using a branch and bound algorithm”, Proceedings of the SPIE, pp. 20-27, 2002], geometric modelling and pseudo-convex hull [M. Morita, F. Bortolozzi, J. Facon, and R. Sabourin, “Morphological approach of handwritten word skew correction”, SIBGRAPI'98, International Symposium on Computer Graphics, Image Processing and Vision, Rio de Janeiro, Brazil, pp. 456-461, October 1998], techniques based on the Hough transform [A. Rosenthal, J. Hu and M. Brown, “Size and orientation normalization of on-line handwriting using Hough transform”, ICASSP'97, Munich, Germany, April 1997], model based methods [Y. Bengio and Y. LeCun, “Word normalization for on-line handwritten word recognition”, Proceedings of the International Conference on Pattern Recognition, pp. 409-413, October 1994], skew detection using Principal Component Analysis [T. Steinherz, N. Intrator, and E. Rivlin, “Skew detection via principal components analysis”, Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 153-156, 1999], and baseline estimation using approximating spline functions [A. Hennig, N. Sherkat, and R. Whitrow, “Zone Estimation for Multiple Lines of Handwriting using Approximating Spline Functions”, Fifth International Workshop on Frontiers in Handwriting Recognition (IWFHR), pp. 325-328, September 1996].
Some orientation normalization techniques have been disclosed in prior art patent specifications, including the use of boundary projections combined with the Hough transform [T. Syeda-Mahmood, “Method of grouping handwritten word segments in handwritten document images”, U.S. Pat. No. 6,108,444]; a system for digit normalization of scanned images that works by finding the bounds of a parallelogram that completely encloses the character image [R. Vogt, “Handwritten digit normalization method”, U.S. Pat. No. 5,325,447]; methods that use linear projection and a clustering algorithm to detect elements in a histogram that correspond to ascender, descender, and base lines [W. Bruce, et al, “Estimation of baseline, line spacing and character height for handwriting recognition”, U.S. Pat. No. 5,396,566; J. Kim, “Baseline Drift Correction of Handwritten Text”, IBM Technical Disclosure Bulletin 25 (10), March 1983]; and, in an online signature verification system, a least squares calculation combined with rotation around a centroid for the normalization of signatures [F. Sinden and G. Wilfong, “Method of normalizing handwritten symbols”, U.S. Pat. No. 5,537,489].
Whilst the techniques described above are sometimes effective, they suffer from a number of significant limitations. For example, many assume that all lines of written text are oriented at the same angle on the page, and thus cannot handle pages of arbitrarily rotated text lines. In addition, some algorithms require significant processing resources and produce only quantized angle estimates (e.g. the Hough transform), do not work well for short segments of text (e.g. projection methods), are brittle due to empirically estimated thresholds (heuristic and rule-based techniques), or are sensitive to ascenders, descenders and outliers (e.g. least squares regression and projection techniques).
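The sensitivity of least squares regression to descenders can be seen in a small worked example. The data below are made up for illustration, and `fit_slope` is a hypothetical helper: six stroke minima lie on a perfectly horizontal baseline, and then the last one is replaced by the bottom of a descender three units below the line.

```python
def fit_slope(points):
    """Ordinary least-squares slope of y against x for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Illustrative data only: minima on a horizontal baseline at y = 0.
baseline_points = [(float(x), 0.0) for x in range(6)]

# Same minima, but the last one is the tip of a descender at y = -3.
with_descender = baseline_points[:-1] + [(5.0, -3.0)]

slope_clean = fit_slope(baseline_points)   # horizontal baseline
slope_skewed = fit_slope(with_descender)   # pulled down by one outlier
```

Here `slope_clean` is zero, while the single descender changes the fitted slope to -7.5 / 17.5, a tilt of about 23 degrees, even though the true baseline has not moved.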
The azimuth of a writing implement is defined in [R. Poyner, “Wintab Interface Specification 1.1: 16- and 32-bit API Reference”, LCS/Telegraphics] as the “clockwise rotation of the cursor about the z axis through a full circular range”. In other words, if x and y define the horizontal and vertical axes of a sheet of paper, and z defines the axis that is normal to the paper, the azimuth is the rotation of the pen about the z axis. Some pen-based computing systems are able to measure the azimuth of a writing implement during the generation of digital ink, including Wacom graphics tablets and Netpage pens [K. Silverbrook et al, “Sensing Device”, WO 02/42989].
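To make the geometry concrete, the sketch below computes a clockwise azimuth from the projection of the pen barrel onto the page. The function name `azimuth_degrees`, the choice of the positive y axis as the zero direction, a y axis pointing up the page, and degrees as the unit are all assumptions for illustration; the zero reference and units reported by any particular device or API may differ.

```python
import math


def azimuth_degrees(dx, dy):
    """Clockwise angle, in [0, 360), of the pen's projection onto the page.

    dx, dy -- components of the pen barrel direction projected onto the
              x-y plane of the page (y assumed to point up the page).
    Zero is taken along +y and the angle grows clockwise when the page
    is viewed from above; both conventions are assumptions.
    """
    if dx == 0 and dy == 0:
        raise ValueError("pen is perpendicular to the page; azimuth undefined")

    # atan2(dx, dy) measures from the +y axis towards the +x axis,
    # which is clockwise when viewed from above with y pointing up.
    angle = math.degrees(math.atan2(dx, dy)) % 360.0

    # Guard against floating-point wrap-around producing exactly 360.
    return 0.0 if angle == 360.0 else angle
```

Under these conventions a pen leaning towards the top of the page has an azimuth of 0, one leaning to the right has an azimuth of 90, and one leaning to the left has an azimuth of 270.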