Like other biometric features of the human body (such as a fingerprint or an iris), a face corresponds strongly to the identity of a person. Because a face is difficult to duplicate, it provides the necessary basis for identity authentication. Compared with other biometric identification methods, facial feature identification has the advantages of being non-compulsory and contactless. Unlike iris, fingerprint, and other authentication methods, facial feature identification does not require the cooperation of the user and can acquire a facial image of the user from a long distance to perform identification; in addition, a single device may capture multiple faces at the same time and process them concurrently.
There are two main application scenarios for facial feature identification technologies: face identity verification and face identity identification. In face identity verification, two facial images are given, and it is determined whether the persons in the two facial images are the same person. In face identity identification, a database including multiple faces and their corresponding identities is given; for a specified face, it is determined whether the specified face has the same identity as a face in the database, and if so, the identity information of that face is returned. Generally, face identity verification is the basis of face identity identification: a task of face identity identification can be completed by verifying, one by one, whether a specified face belongs to the same person as each face in the database.
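The reduction of identification to repeated verification can be sketched as follows. This is a minimal illustration only: the names `identify`, `toy_verify`, and `gallery`, the use of plain lists as face features, and the rule of returning the best-scoring match are assumptions made for the example and do not come from the description above.

```python
from typing import Callable, Dict, Optional, Sequence

Feature = Sequence[float]


def identify(
    probe: Feature,
    gallery: Dict[str, Feature],
    verify: Callable[[Feature, Feature], float],
) -> Optional[str]:
    """Compare the probe against every enrolled face, one by one.

    `verify` returns a score that is positive when the two faces are judged
    to belong to the same person. The identity with the highest positive
    score is returned; None means no enrolled face matched.
    """
    best_id, best_score = None, 0.0
    for identity, enrolled in gallery.items():
        score = verify(probe, enrolled)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id


def toy_verify(a: Feature, b: Feature) -> float:
    # Hypothetical stand-in for a trained discriminant function:
    # positive when the squared distance between features is below 1.0.
    dist2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 - dist2


gallery = {"alice": [0.0, 0.0], "bob": [3.0, 3.0]}
print(identify([0.1, -0.2], gallery, toy_verify))
print(identify([10.0, 10.0], gallery, toy_verify))
```

The cost of this scheme grows linearly with the size of the database, since every enrolled face is verified against the probe.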
In a process of face identity verification, a facial image is converted into a sample x after certain preprocessing. Given a sample set x1, . . . , xK, a discriminant function f(•,•) is obtained through training by using a machine learning method. For two specified faces, two face samples y1, y2 are obtained, and whether the two samples belong to a same person or to different persons is determined according to the value of f(y1, y2). A common rule is that when f(y1, y2)>0, the samples represent a person of a same identity; otherwise, the samples represent persons of different identities.
At present, there are many mainstream face identity verification technologies, of which two are the most successful: (1) a Bayesian modeling method, and (2) a deep learning method.
The Bayesian modeling method includes two basic parts: a model learning phase and a model testing phase. The model learning phase includes the following steps. A training sample set is prepared by collecting a facial image set V={v1, . . . , vM}, where the M images come from N persons of different identities. Generally, M>>N, each person corresponds to multiple different images in the image set, and M and N are both positive integers.
A sample set X={x1, . . . , xM} is constructed by using the image set V, where xi=g(vi), and g is a functional transformation that transforms an image vi into a digital vector xi. Generally, g includes image preprocessing, for example, i) extracting a facial area from an image, ii) performing face alignment, and iii) extracting a particular feature from the aligned facial image area.
A face verification training sample set Δ={δ1, δ2, . . . , δL} is constructed by using the face sample set, where δj=xa−xb, and xa, xb∈X.
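The construction of the difference set Δ can be sketched as follows. The function name `build_difference_set`, the use of all unordered sample pairs, and the splitting of differences by identity label into intra-class and inter-class groups are assumptions made for illustration; the text above only states that each δj is a difference of two samples from X.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Vector = List[float]


def build_difference_set(
    samples: Sequence[Sequence[float]],
    identities: Sequence[int],
) -> Tuple[List[Vector], List[Vector]]:
    """Form delta = x_a - x_b for every unordered pair of samples.

    Differences between samples of the same identity are collected as
    intra-class examples; all others are collected as inter-class examples.
    """
    intra: List[Vector] = []
    inter: List[Vector] = []
    for a, b in combinations(range(len(samples)), 2):
        delta = [xa - xb for xa, xb in zip(samples[a], samples[b])]
        if identities[a] == identities[b]:
            intra.append(delta)
        else:
            inter.append(delta)
    return intra, inter


# Three toy feature vectors: the first two belong to person 0, the third to person 1.
X = [[1.0, 2.0], [1.5, 2.5], [4.0, 0.0]]
ids = [0, 0, 1]
intra, inter = build_difference_set(X, ids)
print(len(intra), len(inter))
```

With M samples there are M(M−1)/2 unordered pairs, so in practice a subset of pairs would typically be sampled rather than enumerating all of them.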
A random variable that corresponds to a difference between two facial features is denoted as δ=x−y. Probability models p(δ|ΩI), p(δ|ΩE) are obtained by learning from the sample set Δ. Herein, ΩI and ΩE respectively represent the hypotheses that δ is an intra-class change (a change of a person of a same identity shown in different images) or an inter-class change (a change between persons of different identities shown in different images). In a conventional Bayesian model, p(δ|ΩI) and p(δ|ΩE) are both preset as Gaussian distribution models, and the objective of model learning is to obtain the parameters of the two Gaussian distribution models.
After the Bayesian model is obtained, the steps of testing include the following. For two given images vα, vβ, digital vectors xα, xβ are obtained by applying the same functional transformation g as in the training phase. The difference δαβ=xα−xβ is formed, and the log-likelihood ratio
S(δαβ) = log( p(δαβ|ΩI) / p(δαβ|ΩE) )
is calculated by using the probability models p(δ|ΩI), p(δ|ΩE).
If S(δαβ)>0, the samples come from a person of a same identity; otherwise, the samples come from persons of different identities.
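The learning and testing phases described above can be sketched together as follows. This is a simplified illustration under stated assumptions: each Gaussian is given a diagonal covariance (independent dimensions) so that the example needs only the standard library, and the names `fit_diag_gaussian`, `log_density`, and `score`, as well as the toy difference vectors, are hypothetical. A full implementation would typically estimate full covariance matrices.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[List[float], List[float]]  # (per-dimension mean, per-dimension variance)


def fit_diag_gaussian(deltas: Sequence[Sequence[float]], eps: float = 1e-6) -> Model:
    """Estimate the mean and variance of each dimension of the difference vectors."""
    n, d = len(deltas), len(deltas[0])
    mean = [sum(v[k] for v in deltas) / n for k in range(d)]
    # eps keeps the variance strictly positive (an assumption for numerical safety).
    var = [sum((v[k] - mean[k]) ** 2 for v in deltas) / n + eps for k in range(d)]
    return mean, var


def log_density(delta: Sequence[float], model: Model) -> float:
    """Log of a diagonal-covariance Gaussian density evaluated at delta."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s) + (x - m) ** 2 / s)
        for x, m, s in zip(delta, mean, var)
    )


def score(x_a: Sequence[float], x_b: Sequence[float], intra: Model, inter: Model) -> float:
    """S(delta) = log p(delta | Omega_I) - log p(delta | Omega_E)."""
    delta = [a - b for a, b in zip(x_a, x_b)]
    return log_density(delta, intra) - log_density(delta, inter)


# Toy training differences: small for the same identity, large for different identities.
intra_deltas = [[0.1, -0.1], [-0.1, 0.1], [0.05, 0.0], [-0.05, 0.0]]
inter_deltas = [[2.0, -2.0], [-2.0, 2.0], [1.5, 1.0], [-1.5, -1.0]]
intra_model = fit_diag_gaussian(intra_deltas)
inter_model = fit_diag_gaussian(inter_deltas)

same = score([1.0, 1.0], [1.02, 0.98], intra_model, inter_model)
diff = score([1.0, 1.0], [3.0, -1.0], intra_model, inter_model)
print("same person" if same > 0 else "different persons")
print("same person" if diff > 0 else "different persons")
```

Computing the ratio as a difference of log densities avoids underflow when the densities themselves are very small.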
The foregoing describes the application steps of a classical Bayesian model. The classical Bayesian model has the following obvious defects. First, the model is based on the difference between the feature representations of two input faces, so some discrimination information is lost and the distinguishability of samples is reduced. Second, p(δ|ΩI) and p(δ|ΩE) are assumed to be Gaussian models, which is an excessive simplification in actual use. A Gaussian model cannot fully capture differences in pose, illumination, expression, age, occlusion, hairstyle, or the like between different facial images.
When the deep learning method is used for face identity verification, an effective digital feature expression of a face is learned by using a deep network. That is, the function g is approximated by the deep network, and an input of the neural network (an original facial image or an extracted feature) is transformed into a more effective digital feature x, so as to facilitate further identity verification. For two input images vα and vβ, digital feature expressions xα and xβ are obtained by using a deep neural network and are then used as inputs to a classifier. Multiple classification methods may be used to map the feature pair (xα, xβ) to one of two classes {a same person, different persons}. For example, the above-described Bayesian method may be used, or a relatively simple classification method, such as Soft-max or a support vector machine (SVM), may be used to classify the features.
The deep neural network model has the advantage of desirable discrimination performance, but also has very serious defects, which lie mainly in two aspects. First, the model is highly complex and has a large quantity of parameters, which are inconvenient to store. The quantity of model parameters involved is generally about 12 million. The calculation amount for testing is also large, making implementation on a terminal difficult. Second, an excessively large amount of training data is needed, which is a common problem for deep-learning-based technical frameworks. The quantity of labeled images involved in training reaches millions, and considerable manpower and material resources are needed to collect, label, and check the related image data.
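The storage burden mentioned above can be made concrete with a short calculation. The figure of 4 bytes per parameter (32-bit floating point) is an assumption for illustration and is not stated above; the helper name `model_size_mb` is likewise hypothetical.

```python
def model_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate storage for the model weights, in megabytes (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


# About 12 million parameters, stored as 32-bit floats (assumed).
print(model_size_mb(12_000_000))
```

Under this assumption the weights alone occupy roughly 48 MB, before counting any intermediate computation needed at test time.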