With the digitization of media resources and the rapid development and adoption of the Internet in recent years, acquiring, exchanging, and transmitting digital images over networks has become easy and common, which has also facilitated information hiding based on digital images. Steganography embeds secret information into an ordinary carrier without changing the perceptual characteristics of the carrier, thereby realizing covert transmission of information. With the vigorous development of information hiding technologies, a large number of steganography methods have emerged, and people can conveniently obtain and use various steganography tools to communicate messages over the Internet. However, the misuse of steganography has caused increasingly prominent information security problems and poses potentially serious dangers to the country and society. Hence, there is an urgent need for digital image steganalysis technologies. The purpose of digital image steganalysis is to determine, by analyzing the image data, whether an image contains hidden secret information; it may further estimate the amount of embedded information, estimate the secret key, or recover the secret information. By means of image steganalysis, images containing hidden information can be found, so that the use of steganography can be monitored effectively and its illegal use can be prevented, which is significant for network information security.
Currently, there are mainly two types of steganalysis technologies: specialized methods, which target specific steganography tools or a certain type of embedding technology, and universal methods, which are not tied to any specific embedding method. A specialized method usually has a high detection rate, but it is not practicable, because it is impossible to cover all hiding algorithms in practical applications and new steganography algorithms continue to emerge. Hence, universal steganalysis has become more and more important, and research on methods of this type has been significantly strengthened in recent years. Universal steganalysis, which is also called blind detection, is usually treated as a binary classification problem of distinguishing cover images from stego images. Most existing steganalysis approaches follow a conventional paradigm based on machine learning, which consists of a feature extraction step and a classifier training step. The detection accuracy of present universal steganalysis methods mainly depends on handcrafted feature design. In the current field of image steganalysis, there are many methods for feature design; typical ones are described, for example, in [J. Fridrich and J. Kodovsky, “Rich Models for Steganalysis of Digital Images,” IEEE Trans. on Info. Forensics and Security, vol. 7(3), pp. 868-882, 2012] and [V. Holub and J. Fridrich, “Random projections of residuals for digital image steganalysis,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 12, pp. 1996-2006, 2013]. In these methods, the design and selection of features depend heavily on specific data sets, require a great deal of time and effort, and place high demands on the experience and knowledge of the designer. In practical applications, the complexity and diversity of real image data pose further challenges to feature design.
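For illustration only, the conventional two-step paradigm described above (handcrafted feature extraction followed by classifier training) may be sketched as follows. This is a minimal toy under stated assumptions and is not taken from the cited references: the feature is a hypothetical normalized histogram of truncated horizontal pixel differences, the classifier is a simple nearest-centroid rule, and the cover images and the ±1 embedding are synthetic. All function names (`residual_histogram`, `make_cover`, `embed_pm1`, `train`, `predict`) are assumptions introduced for this sketch.

```python
import random

def residual_histogram(image, T=2):
    """Handcrafted feature: normalized histogram of truncated horizontal residuals."""
    counts = [0] * (2 * T + 1)
    total = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            r = max(-T, min(T, right - left))  # truncate the residual to [-T, T]
            counts[r + T] += 1
            total += 1
    return [c / total for c in counts]

def make_cover(rng, size=16):
    """Synthetic smooth 'cover' image (assumption: a slow diagonal gradient)."""
    base = rng.randint(50, 150)
    return [[base + (i + j) // 4 for j in range(size)] for i in range(size)]

def embed_pm1(rng, image, rate=0.5):
    """Toy embedding: change each pixel by +1 or -1 with probability `rate`."""
    return [[(p + rng.choice((-1, 1))) if rng.random() < rate else p for p in row]
            for row in image]

def centroid(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(cover_feats, stego_feats):
    """Classifier training step: store one centroid per class."""
    return centroid(cover_feats), centroid(stego_feats)

def predict(model, feat):
    """Return 0 for 'cover' and 1 for 'stego' (nearest centroid)."""
    c0, c1 = model
    return 0 if sq_dist(feat, c0) <= sq_dist(feat, c1) else 1

rng = random.Random(0)
covers = [make_cover(rng) for _ in range(20)]
stegos = [embed_pm1(rng, c) for c in covers]
model = train([residual_histogram(c) for c in covers],
              [residual_histogram(s) for s in stegos])

test_cover = make_cover(rng)
test_stego = embed_pm1(rng, test_cover)
print(predict(model, residual_histogram(test_cover)),
      predict(model, residual_histogram(test_stego)))
```

In this sketch the feature is fixed by hand before any training takes place; only the classifier adapts to the data. That separation is the dependence on handcrafted feature design noted above.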
In recent years, with the development of deep learning, automatically learning features from image data has gained extensive attention and has been widely applied in areas such as recognition and classification. Deep learning is a class of machine learning methods that addresses the problem of what makes a good representation and how to learn it. Deep learning models have deep architectures that consist of multiple levels of non-linear processing and can be trained to learn complex representations hierarchically by combining information from lower layers. Moreover, a deep learning model unifies the feature extraction and classification modules under a single network architecture and jointly optimizes all the parameters of both modules. Typical deep learning methods are described, for example, in [Hinton G E, Salakhutdinov R R. “Reducing the dimensionality of data with neural networks,” Science, 2006, 313(5786): 504-507] and [Krizhevsky A, Sutskever I, Hinton G E. “Imagenet classification with deep convolutional neural networks,” Advances in neural information processing systems. 2012: 1097-1105].
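For illustration only, the joint optimization of feature extraction and classification mentioned above may be sketched with a deliberately tiny model. This is a minimal toy under stated assumptions and is not the architecture of the cited references: a learnable one-dimensional filter `w` plays the role of the feature extractor, its squared responses are average-pooled into a single feature, and a logistic unit with parameters `a` and `b` acts as the classifier. All names, the input signal, and the initial parameter values are hypothetical.

```python
import math

def forward(params, x):
    """Learned filter -> squared response -> average pooling -> logistic classifier."""
    w, a, b = params["w"], params["a"], params["b"]
    k = len(w)
    h = [sum(w[m] * x[i + m] for m in range(k)) for i in range(len(x) - k + 1)]
    f = sum(v * v for v in h) / len(h)         # pooled feature
    p = 1.0 / (1.0 + math.exp(-(a * f + b)))   # estimated probability of "stego"
    return p, f, h

def loss(params, x, y):
    """Binary cross-entropy for one sample with label y in {0, 1}."""
    p, _, _ = forward(params, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def gradients(params, x, y):
    """Backpropagate the loss through the classifier and the filter."""
    w, a = params["w"], params["a"]
    p, f, h = forward(params, x)
    dz = p - y     # derivative of the loss with respect to the logit
    df = dz * a    # derivative with respect to the pooled feature
    dw = [df * sum(2 * h[i] * x[i + m] for i in range(len(h))) / len(h)
          for m in range(len(w))]
    return {"w": dw, "a": dz * f, "b": dz}

def sgd_step(params, x, y, lr=0.01):
    """One gradient step that updates filter and classifier parameters together."""
    g = gradients(params, x, y)
    return {"w": [wi - lr * gi for wi, gi in zip(params["w"], g["w"])],
            "a": params["a"] - lr * g["a"],
            "b": params["b"] - lr * g["b"]}

params = {"w": [0.5, -1.0, 0.5], "a": 0.3, "b": -0.1}
x = [3.0, 4.0, 4.0, 5.0, 7.0, 6.0, 6.0, 8.0]   # hypothetical signal labelled "stego"
before = loss(params, x, 1)
after = loss(sgd_step(params, x, 1), x, 1)
print(before > after)
```

Unlike a handcrafted pipeline, the filter `w` here is not fixed in advance: the same loss gradient that updates the classifier parameters `a` and `b` also updates `w`, which is the sense in which both modules are optimized jointly.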