Recent years have seen a rapid proliferation in the use of computing devices in the area of digital typography with respect to creating and editing electronic documents. Indeed, it is now commonplace for individuals and businesses to use digital typography to create customized webpages, e-mails, magazines, marketing materials, and other electronic documents utilizing desktop and laptop computers, mobile devices, tablets, smartphones, or other computing devices.
Digital typography includes the use of digital fonts. Recent years have also seen an increase in the type and variety of digital fonts utilized in electronic documents. For example, an electronic document can use digital fonts selected from a collection of thousands of digital fonts. Further, individuals can effortlessly find, access, and install additional digital fonts on a computing device to be used for creating electronic documents.
A major challenge that has arisen with the increase in the number of digital fonts is the capability to correctly detect and recognize digital fonts. For example, an individual may see a font in a document or image and wish to use the same font in an electronic document. As such, the font in the document or image must be correctly identified before the user can use it as a digital font. In general, the ability to detect and recognize digital fonts can greatly enhance an individual's experience when creating and editing electronic documents.
While some recent font classification systems have been developed to recognize fonts using machine-learning algorithms, these recent font classification systems still struggle in the area of intra-class variances within a class of digital fonts (e.g., variations between glyphs of the same font). Although this problem exists with respect to glyphs (i.e., unique symbols that make up words) that use the Roman alphabet, the magnitude of the problem increases with other languages. To demonstrate, the Roman alphabet uses 26 different glyphs while Japanese writing includes over 50,000 glyphs. Other languages also include thousands of glyphs.
As the number of glyphs increases, such as in the case of Japanese fonts, the number of intra-class variances within the glyph content likewise increases. In many cases, due to the number of glyphs, recent font classification systems do not learn every glyph during training, which then leads to misclassification and inaccurate results. As another issue, particularly with Japanese fonts, the visual difference between different Japanese writing types (e.g., logographic kanji and syllabic kana) is significant, and the large difference between the two glyph styles further magnifies the intra-class variation issue in Japanese font recognition. Further, because of the visual difference between different Japanese writing types, recent font classification systems require significantly more training samples to correctly recognize and classify Japanese fonts. In sum, even recent font classification systems fail to provide the level of generalization and accuracy needed to correctly identify Japanese fonts.
Furthermore, recent font classification systems that employ machine-learning algorithms to classify fonts require large amounts of memory and computational resources. In particular, recent font classification systems require additional memory, processing resources, and time for a neural network to converge and identify accurate font feature vectors and font probability vectors. Also, due to the additional requirements, recent font classification systems are often unstable. Further, because of these requirements, client devices, particularly mobile ones, cannot execute these neural networks.
These and other problems exist with regard to detecting and classifying digital fonts, especially non-Roman fonts (e.g., Japanese fonts, Chinese fonts, etc.), using existing systems and methods.