An exemplary embodiment relates to the field of Automatic Face Recognition (AFR) systems. More specifically, one exemplary embodiment relates at least to a method and a system capable of recognizing the face of a person using a device equipped with a camera of any kind and an associated computer, such as an embedded computer. The system is alternatively suitable to be implemented as an embedded system with minimal processing hardware capabilities, consuming very low power.
Automatic Face Recognition is an important part of understanding video content, and plays a significant role in many modern systems, including personal computers (PCs), stationary or portable digital entertainment systems, and mobile devices such as smartphones, tablets, etc.
There are many approaches for performing face recognition. Most of these approaches rely on a Personal Computer (PC) to carry out the required processing tasks. In such systems, a video digitizer samples the camera sensor, and the resulting image data is then processed by the face recognition software running on the PC.
Recognition accuracy is a key aspect when it comes to face recognition systems. The system needs to be very accurate in this task, identifying the right person among several registered (enrolled) users with a high success rate and, at the same time, rejecting any unenrolled person, also with a high success rate.
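By way of illustration only, the two requirements above (accepting the right enrolled user and rejecting unenrolled persons) can be sketched as a nearest-template search with a rejection threshold. This is a minimal sketch under stated assumptions, not the method of any embodiment: the feature vectors, the cosine-similarity measure, the threshold value of 0.8, and the names `cosine_similarity`, `identify`, and `enrolled` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def identify(probe, enrolled, threshold=0.8):
    """Return the name of the best-matching enrolled user, or None to reject.

    `enrolled` maps a user name to a stored feature vector (template).
    `threshold` is an assumed operating point: raising it rejects more
    impostors but also rejects more genuine users.
    """
    best_name, best_score = None, -1.0
    for name, template in enrolled.items():
        score = cosine_similarity(probe, template)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical three-dimensional templates, for illustration only.
enrolled = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.8, 0.3],
}

genuine_result = identify([0.88, 0.12, 0.18], enrolled)  # close to alice's template
impostor_result = identify([0.1, 0.1, 0.9], enrolled)    # far from every template
```

In such a sketch the threshold governs the balance between falsely accepting an unenrolled person and falsely rejecting an enrolled one.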
Particularly on mobile systems, where the face recognition functionality is intended to be used by a security module for device security (locking and unlocking the device with face recognition), recognition accuracy is of paramount importance. However, this particular use case poses additional challenges, since the acquired facial images suffer from pose and illumination variations. These variations complicate the recognition task and present technical problems for the face recognition system. Therefore, in these cases, a trade-off emerges between face recognition accuracy on one hand, and fast response time and low power consumption on the other.
Recently, a new class of face recognition systems has emerged, known as deep-learning systems (Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In Proc. CVPR, 2014, incorporated herein by reference in its entirety). These systems use a Convolutional Neural Network (CNN) approach in order to achieve high face recognition accuracy and quality. A CNN is a system that is able to “learn” to recognize a specific data pattern through an iterative process in which it processes annotated data and adapts its parameters so as to minimize a cost function. Its ability to learn robust feature representations has proved to be a very powerful technique in many modern machine learning problems, and especially in computer vision.
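The learning process described above (repeatedly processing annotated data and adapting parameters to reduce a cost function) can be illustrated with a deliberately simplified sketch. The example below is not a CNN and is not taken from the cited work: it trains a single logistic unit by gradient descent on a hypothetical four-sample annotated data set. The names `train`, `cost`, and `predict`, the learning rate, and the epoch count are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability that `features` belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def cost(weights, bias, samples):
    """Mean cross-entropy between predictions and annotated labels."""
    total = 0.0
    for features, label in samples:
        p = min(max(predict(weights, bias, features), 1e-12), 1.0 - 1e-12)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(samples)


def train(samples, epochs=200, learning_rate=0.5):
    """Adapt the parameters step by step in the direction that lowers the cost."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    history = []  # cost after each pass over the annotated data
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for features, label in samples:
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(samples)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
        history.append(cost(weights, bias, samples))
    return weights, bias, history


# Hypothetical annotated data: (feature vector, label).
samples = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([0.9, 1.0], 1), ([1.0, 0.8], 1)]
weights, bias, history = train(samples)
```

A CNN follows the same principle, but with many more parameters arranged in convolutional layers, which is what makes both its training and its inference computationally demanding.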
However, when a face recognition system is to be used for face recognition on a mobile device, apart from recognition accuracy, recognition speed and low power consumption are also very important features. The system should be able to respond quickly and consume little power in order to comply with the limited power budget of a modern mobile device. Processing speed and power consumption depend both on the algorithm complexity and on the processor computing capacity.
Nevertheless, despite the noteworthy technological developments in the field of processing hardware, the computing capacity of modern mobile processors cannot cope with the complexity of modern state-of-the-art face recognition algorithms, and in particular with the deep-learning-based systems referred to above.