Over the years, remote sensing image classification has played an important role in many applications such as environmental damage assessment, crop growth regulation, land use monitoring, urban planning and reconnaissance. Compared with a single-band full-color image and a multispectral image, a Hyperspectral Image (HSI) can be used for detecting and distinguishing objects with higher accuracy because HSI has higher spectral resolution.
In a hyperspectral image, the spectral data of each pixel is a high-dimensional vector, and hundreds of data dimensions represent the spectral response of hundreds of bands. Hyperspectral image classification is mainly to classify each pixel based on spectral information. To achieve this goal, many pixel-level classifiers have been developed, including Support Vector Machine (SVM), support vector condition stochastic classifier, neural network, etc. Although these classifiers can make full use of the spectral information of HSI, they have not taken into account the spatial context, so noise often appears in the classification results. To solve this problem, and as a result that the pixels in a local area usually represent the same material and have similar spectral characteristics, many methods to obtain classification effect by integrating the information of near space have been developed. However, due to the lack of understanding of the near area, some risks exist in this kind of rough near area selection. Therefore, object-level classification appeared later. Although ground objects are presegmented in object-level classification, the classification effect is not ideal due to the existence of the problem of undersegmentation.
Inspired by the sparse coding mechanism of human visual system, Bruckstein first proposed the concept of sparse representation. In the field of hyperspectral image classification, the research of sparse representation is mainly focused on the acquisition of overcomplete dictionary and sparse solution:
In the field of sparse representation, the completeness of the sparse representation of the original signal is guaranteed by the overcomplete dictionary. The acquisition methods of the overcomplete dictionary are mainly divided into two types: methods based on mathematical models and methods based on training samples, wherein a dictionary acquisition method based on training samples inherently contains rich original signal characteristics, and direct splicing of the original signal is the most classical overcomplete dictionary based on training samples. However, due to the randomness of training samples and the rigidity of the dictionary, the completeness of the dictionary cannot be verified or improved. Therefore, dictionary learning methods have been proposed, among which, K-SVD dictionary learning method aims to minimize signal reconstruction errors, and dictionary update and sparse coding are carried out alternately through Orthogonal Matching Pursuit (OMP) and Singular Value Decomposition (SVD). Although the K-SVD method has strong universality and popularity, it has not emphasized the characteristics between different classes in the process of classification application.
For the researches on solving the sparse representation of the original signal, the most classic ones are Matching Pursuit (MP) and Orthogonal Matching Pursuit (OMP). In MP and OMP, the solution of sparse representation is based on a signal (pixel) and the influence of spatial context is not considered. Based on this, Joint Sparse Model (JSM) and Synchronous Orthogonal Matching Pursuit (SOMP) appeared subsequently. However, many problems in the selection of near space still exists in these two algorithms: on one hand, the shape of the near space is rectangular, the ground objects in the rectangular window are unknown, and the ground objects in the rectangular window are assumed to be unified in the algorithms, so this assumption becomes very dangerous when the scale is large; on the other hand, the scale of the near space area is single and needs to be set in advance, and different application environments have different optimal scales, so it is very difficult to configure this scale.
Therefore, how to provide a superpixel classification method based on semi-supervised K-SVD and multiscale sparse representation is an urgent problem to be solved by those skilled in the art.