Deep learning networks for image segmentation include fully convolutional networks (FCNs) such as U-Net and SegNet, which can achieve pixel-level semantic segmentation in some natural scenes. In existing deep-learning-based segmentation models, the final pixel classification is trained with the loss of a general classification model, namely cross entropy. A classification model generally requires the number of samples in each category to be as equal as possible. If one or more categories have far more samples than the others, that is, the categories are unbalanced, the learning effect is often unsatisfactory. For example, in a chart image the background pixels occupy the vast majority of the area, while the title text accounts for only a very small proportion, so the categories are highly unbalanced. If a standard multi-category cross entropy loss function is adopted in this case, the classification model tends to classify all pixels as background, resulting in inaccurate pixel segmentation.
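
The effect of category imbalance on the standard cross entropy loss can be illustrated with a minimal Python sketch. The pixel counts (9,800 background and 200 title-text pixels), the predicted probabilities, and all function and variable names are hypothetical values chosen for illustration; they are not taken from any particular model or dataset. The sketch compares a degenerate predictor that favours background for every pixel with a uniform guess, and shows that the degenerate predictor reaches a much lower loss and high pixel accuracy without identifying a single title pixel.

```python
import math

BACKGROUND, TITLE = 0, 1


def mean_cross_entropy(probs, labels):
    """Mean multi-category cross entropy over all pixels.

    probs[i][c] is the predicted probability of class c for pixel i,
    labels[i] is the true class index of pixel i.
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def argmax_class(p):
    """Index of the most probable class for one pixel."""
    return max(range(len(p)), key=lambda c: p[c])


# Hypothetical chart image: 98% background pixels, 2% title-text pixels.
labels = [BACKGROUND] * 9800 + [TITLE] * 200

# Degenerate predictor: the same background-heavy output for every pixel.
always_background = [[0.98, 0.02]] * len(labels)
# Baseline for comparison: an uninformed uniform guess.
uniform_guess = [[0.5, 0.5]] * len(labels)

loss = mean_cross_entropy(always_background, labels)
uniform_loss = mean_cross_entropy(uniform_guess, labels)

predictions = [argmax_class(p) for p in always_background]
accuracy = sum(pred == y for pred, y in zip(predictions, labels)) / len(labels)
title_hits = sum(pred == TITLE and y == TITLE for pred, y in zip(predictions, labels))
title_recall = title_hits / labels.count(TITLE)

print(f"loss (always background): {loss:.4f}")
print(f"loss (uniform guess):     {uniform_loss:.4f}")
print(f"pixel accuracy:           {accuracy:.2%}")
print(f"title-text recall:        {title_recall:.2%}")
```

Under these assumed numbers, the unweighted loss rewards the background-only solution because the rare category contributes so few pixels to the average, which is the failure mode described above.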