Embodiments of the present invention generally relate to pattern recognition, and more specifically, relate to a method and system for information integration-based pattern recognition.
Pattern recognition refers to automatic or semi-automatic processing and determination of patterns such as diagrams, voice, or characters using computer technology. Pattern recognition has been widely applied in various fields, including recognition of geometric diagrams. One type of geometric diagram is a two-dimensional (2D) graph-based diagram. In the context of the present application, the term “diagram” refers to a set of primitives and the mutual relationships between respective primitives. The primitives may be various 2D figures that have a corresponding shape, for example, a rectangle, an oval, a circle, a triangle, a parallelogram, etc. A primitive may or may not have associated text. The relationship between primitives is generally expressed by a line. The line may be a straight line, an arc, or any of various kinds of curves. Further, the line may be undirected or directional (one-way or two-way); the direction may be represented by an arrow, as an example. Examples of diagrams include, but are not limited to, a flow chart, a block diagram, a tree diagram, a net diagram, etc.
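By way of illustration only, the definition of a diagram given above may be sketched as a simple data structure. The following is a minimal, non-limiting sketch in Python; the names Direction, Primitive, Line, Diagram, and neighbors are hypothetical, chosen for this example, and do not appear in the present application.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Direction(Enum):
    """Hypothetical encoding of line direction."""
    NONE = "none"        # undirected line
    ONE_WAY = "one_way"  # arrow from source to target only
    TWO_WAY = "two_way"  # arrows at both ends


@dataclass
class Primitive:
    shape: str                  # e.g. "rectangle", "oval", "triangle"
    text: Optional[str] = None  # associated text, if any


@dataclass
class Line:
    source: int                 # index into Diagram.primitives
    target: int                 # index into Diagram.primitives
    style: str = "straight"     # e.g. "straight", "arc", "curve"
    direction: Direction = Direction.NONE


@dataclass
class Diagram:
    """A set of primitives plus the lines that relate them."""
    primitives: List[Primitive] = field(default_factory=list)
    lines: List[Line] = field(default_factory=list)

    def neighbors(self, index: int) -> List[int]:
        """Indices of primitives that can be reached from `index` along a line."""
        result = []
        for line in self.lines:
            if line.source == index:
                result.append(line.target)
            elif line.target == index and line.direction is not Direction.ONE_WAY:
                # Undirected and two-way lines can be followed in reverse.
                result.append(line.source)
        return result


# A three-primitive flow chart: Start -> Process, with Process joined to an
# untitled oval by an undirected line.
flow_chart = Diagram(
    primitives=[
        Primitive("oval", "Start"),
        Primitive("rectangle", "Process"),
        Primitive("oval"),
    ],
    lines=[
        Line(0, 1, direction=Direction.ONE_WAY),
        Line(1, 2),
    ],
)
```

In this sketch, primitives are referred to by their position in the list, so a line remains valid even when two primitives have the same shape and text. Other representations, such as an adjacency matrix, would serve equally well.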
Diagrams have been widely applied in various fields. For example, in many enterprises, research institutions, colleges and various other organizations, large amounts of information are presented through diagrams. A presenter may, for instance, express one or more themes with diagrams when making a presentation. As another example, a considerable number of diagrams exist in various books, newspapers, articles and organizations. It is desirable to digitize diagrams that exist on a non-electronic medium using a pattern recognition technology, so as to convert them into a digital format. A known approach is to convert a diagram into an image using an image acquisition apparatus such as a camera, and then to recognize the primitives and their connection relationships in the image.
However, for at least the following reasons, the conventional method based purely on image processing is error prone. First, many diagrams are hand-drawn by users, such that the primitives and/or connections in the diagrams are somewhat irregular. Second, in some circumstances, users can only use an image acquisition apparatus on a portable device such as a mobile phone, a tablet computer, a personal digital assistant (PDA) and the like to capture an image of a diagram in a short time. Due to limiting factors such as image shooting conditions, resolution and the like, the image quality might not be high, which adversely affects recognition accuracy. Third, many primitives in a diagram are highly similar to one another. For example, an ellipse is similar to a circle, a rectangle is similar to a square, and so on. This also poses a challenge to pattern recognition.