In the field of Optical Character Recognition (OCR), which covers character detection and recognition, annotating individual characters (e.g., annotating the position of each character) consumes a great deal of manpower and material resources. Some current open-source data sets of real scenarios therefore tend to annotate only the bounding outline of a whole word or text bar, together with the text of that word or text bar. This annotation manner reduces annotation difficulty and cost to a certain degree, but it also has negative effects: for example, some typical character-level detection methods cannot be effectively trained and tuned on real-scenario data that is annotated only at the word or text-bar level. From a visual perspective, characters are the most fundamental units of words regardless of the language. Detection methods based on character units are therefore more likely to yield a character detection engine for general-purpose scenarios (including horizontal, multi-directional, distorted and perspective-transformed text). Hence, a problem to be solved is how to automatically generate character-level annotation information from the word-, text-bar- or line-level annotation information in existing annotated data sets.