Cosmetic products¹ are used to enhance or alter the appearance of a person's face or other body parts for the purposes of beautification. Altering the appearance of a face by applying makeup is a time-consuming task requiring the use of expensive cosmetic products, specialized applicators and expertise. The same applies to trialling other cosmetic products such as nail varnish. Consumers may wish to apply many different cosmetic products and view them in different combinations before selecting a desired look. They may also wish to view the look in a variety of facial expressions and poses to aid their evaluation, for example by changing their facial expression and pose whilst viewing their face in a mirror, or by moving their hands to view a nail polish in different lights. This immediate visual feedback is crucial in enabling the consumer to form opinions and make refinements to their makeup selection. The physical application of many different makeup looks is a time-consuming and costly process and, in addition, cosmetics need to be removed between applications, thereby further increasing effort and cost. Despite these negative factors, the value that physically trying on cosmetic products adds to a consumer's purchasing decision means that it remains an extremely popular method of purchasing these products.
¹ No distinction is made among the terms “makeup,” “cosmetics,” “makeup cosmetics,” and “cosmetic products,” all of which are used interchangeably herein.
While, for heuristic convenience, the present application may refer to makeup and its application to the face, it is to be understood that the teachings herein apply in equal measure to any cosmetic product applied to any body part. Similarly, references to ‘makeup’ subsume other cosmetic products such as nail varnish, skin tanning lotion, permanent makeup and tattoos.
Methods of synthesizing the appearance of a cosmetic on a subject's body have the potential to reduce the cost and time required to visualise different looks. This potential can only be realised if the synthesized appearance gives a true and realistic representation of the physical product when applied to the body of the subject and is achievable in a natural, accessible and immediate way. The potential of any makeup simulation system is reduced if the consumer faces technical or usability barriers, such as a lengthy and difficult setup process or a complex interface.
An ideal system of makeup simulation must have no difficult or time-consuming setup, must simulate accurate and realistic makeup appearance, and should be deployed on widely available and convenient consumer hardware. Moreover, it should also allow the subject to view the synthesized appearance in a variety of poses and expressions naturally, in real time and with little effort. To maximize the accessibility of such a system to consumers, the system must also be capable of performing in an arbitrary consumer environment, such as in the home or outdoors, and must therefore be able to operate with very few constraints on lighting conditions, image resolution or the subject's position in the image. This scenario is referred to herein as an “unconstrained consumer capture environment.”
Methods heretofore developed for synthesizing makeup looks have often been limited to single-image systems. The dependence on a single static image severely limits the subject's ability to visualize the appearance of the physical makeup in a natural way by removing their ability to experiment with expression and pose. For example, U.S. Pat. No. 6,937,755 to Orpaz et al., entitled “Make-up and fashion accessory display and marketing system and method”, describes a system which is based on a single template image. Furthermore, the system is at best semi-automatic and requires lengthy preparation of each template image before the subject can achieve a simulated result. A further example of this type of system is described in U.S. Pat. No. 8,265,351 to Aarabi, entitled “Method, system and computer program product for automatic and semi-automatic modification of digital images of faces”. Aarabi also describes a single-image-based approach and teaches another semi-automatic method based on detecting regions of interest on the image of the subject's face. Aarabi also teaches that user feedback is required to fine-tune the features and the resulting synthesised images, which again is a complex and time-consuming task for a consumer to perform. Any method that requires even a small amount of user refinement per image is not appropriate for a real-time, video-based simulation.
In addition to the prior-art references listed above, there are several systems provided on the Internet that allow the modification of single images of a subject using manual or semi-automatic analysis, for example EZface™, MODIFACE™ and others. These have been fairly widely deployed but have had limited impact on consumer behaviour due to their relatively complex setup and usage requirements. As previously stated, these systems are fundamentally limited to a single image of the subject, and so fail to provide a natural equivalent to the physical process of trying on makeup products, such as viewing applied makeup in a mirror. The systems are also somewhat limited in the images that they are able to process, requiring good lighting and high-resolution images as input, which is a further barrier to wide-scale adoption in the consumer space.
More recently, real-time video modification systems have become available, for example Google+ Hangouts, Logitech Video Effects and others. These systems apply overlays and effects to video streams in real time; however, none of them is aimed at, or achieves, a simulated makeup effect at the level of realism and accuracy required for a virtual try-on application. U.S. Pat. No. 8,107,672 to Goto, entitled “Makeup simulation system, makeup simulator, makeup simulation method, and makeup simulation program,” teaches a system for real-time simulation of makeup on a video sequence of a subject's face. This system is intended for deployment in a controlled capture environment, such as a makeup counter in a shop, and is implemented on custom hardware. The system is based on the automatic recognition of a set of pre-defined facial features; however, the description fails to teach the method by which this is achieved. In an unconstrained consumer capture environment, lighting, head pose, occlusion and other factors mean that a pre-determined set of features will not, in general, be visible in every frame of video, which severely restricts the system's utility outside of a controlled environment. In common with other prior art, the system also fails to teach methods of realistic simulation of makeup in unconstrained lighting conditions, which is required when deploying a system in an unconstrained consumer capture scenario. Because the system requires specific hardware, consumers would be required to travel to key locations in order to use it, which is a major disadvantage. A system that utilised standard consumer hardware would allow consumers to try on makeup at their convenience, wherever they are located.
In view of the foregoing, methods are needed that simulate realistic makeup appearance in a manner free of environmental constraints.