Sketches and diagrams are an essential means of communicating information and structure in many different domains, and can be important parts of the early design process, where they help people explore rough ideas and solutions in an informal environment. Despite the ubiquity of sketches, there is still a large gap between how people naturally interact with diagrams and how computers understand them today.
Chemistry is one field where sketches and diagrams are especially widely used: a diagram encodes essential information about a molecule's identity, chemical properties, and potential reactions. When chemists need to describe the structure of a compound to a colleague, they typically do so by drawing a diagram. When they need to convey the same structure to a computer, however, they must re-create the diagram using programs like CHEMDRAW that still rely on a traditional point-click-and-drag style of interaction. While such programs offer many useful features and are very popular with chemists, these CAD-based systems do not provide the ease of use or speed of drawing on paper.
Current work in sketch recognition can, very broadly speaking, be separated into two groups. The first group focuses on relationships between geometric primitives (e.g., lines and arcs), either specifying them manually (Hammond 2006, Gross 1996, Alvarado 2004) or learning them from labeled data (Szummer 2005, Sezgin 2008). Full citations for these and other references are provided below. Recognition is then posed as a constraint satisfaction problem, as in (Hammond 2006, Gross 1996), or as an inference problem on a graphical model, as in (Szummer 2005, Sezgin 2008, Alvarado 2004). However, in real-world sketches, it is difficult to extract these primitives reliably. Circles may not be round or closed, line segments may not be straight, and stroke artifacts like pen drag, over-tracing, and stray ink may introduce false primitives that lead to poor recognition. Furthermore, in many systems, the recognizer discards potentially useful information in the original strokes after it has extracted the primitives.
The second group of related work focuses on the visual appearance of shapes and symbols. This group includes parts-based methods (Oltmans 2007, Shilman et al. 2004), which learn a set of discriminative parts or patches for each class, and template-based methods (Kara 2004, Ouyang and Davis 2009), which compare the input symbol to a library of labeled prototypes. The main advantage of vision-based approaches is their robustness to variations in drawing styles, including artifacts such as over-tracing (drawing over a previously drawn stroke) and pen drag (failing to lift the pen between strokes). However, these methods do not model the spatial relationships between neighboring shapes, relying instead on local appearance to classify a symbol.
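To make the template-based idea concrete, the following is a minimal sketch of nearest-prototype classification. It is an illustration under stated assumptions, not the method of any system cited above: symbols are assumed to be already rasterized to small binary grids of equal size, the Hamming distance is a stand-in for the image-based distances such systems use, and the names `pixel_distance`, `classify`, and the labels `"line"` and `"box"` are hypothetical.

```python
from typing import List, Tuple

# A symbol is assumed to be pre-rasterized to a small binary grid (1 = ink).
Grid = List[List[int]]


def pixel_distance(a: Grid, b: Grid) -> int:
    """Count cells where two same-sized grids disagree (Hamming distance).

    This is a simple stand-in distance chosen for illustration only.
    """
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def classify(symbol: Grid, prototypes: List[Tuple[str, Grid]]) -> str:
    """Return the label of the labeled prototype closest to the input symbol."""
    label, _ = min(prototypes, key=lambda p: pixel_distance(symbol, p[1]))
    return label


# Hypothetical prototype library with two classes.
PROTOTYPES: List[Tuple[str, Grid]] = [
    ("line", [[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
    ("box", [[1, 1, 1], [1, 0, 1], [1, 1, 1]]),
]

# A line with one cell of stray ink is still closer to the "line" prototype.
noisy_line: Grid = [[0, 0, 0], [1, 1, 1], [0, 1, 0]]
result = classify(noisy_line, PROTOTYPES)
```

Note that the classifier sees only the pixels of the symbol itself. Nothing about neighboring symbols enters the decision, which is the limitation of appearance-based methods described above.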
There have also been previous efforts to recognize chemical diagrams. Tenneson (2007) describes a sketch-based system that helps students visualize the three-dimensional structure of an organic molecule. That system avoided many of the challenges in sketched symbol detection by requiring that all symbols be drawn using a single stroke. Casey et al. (1993) developed a system for extracting chemical graphics from scanned documents, but their work focused on printed chemical diagrams rather than freehand drawings. Ouyang and Davis (2007) presented a simpler chemistry sketch recognition system that was limited to symbols drawn using consecutive strokes.