Lip recognition is commonly used in Internet scenarios such as identity authentication. To prevent unauthorized users from cheating an identity authentication system by using a static picture, a video in which a user talks often needs to be recorded, and processing such as lip movement recognition is then performed on the video to determine the identity of an authorized user. In the existing technology, one solution for performing lip movement recognition on a video is to calculate the area of the lip region in each frame of the video, and then to determine, according to the difference between the areas of the lip regions in different frames, whether a lip movement occurs. Another solution is to extract the open/closed state of the lip in each frame of the video, and to detect, according to the opening/closing magnitude, whether a lip movement occurs. Both solutions in the existing technology rely on the lip change magnitude. If the lip change magnitude is relatively small, neither the area change of the lip region nor the opening/closing magnitude of the lip is obvious. As a result, the accuracy of the lip movement recognition result is low and needs to be improved.
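The area-based solution described above, and the weakness it shares with the open/closed-state solution, can be illustrated with a minimal sketch. It assumes the per-frame lip areas have already been measured by some upstream detector; the function name `lip_moved_by_area`, the relative threshold of 0.15, and the sample area values are all hypothetical and chosen only for illustration, not taken from any existing system.

```python
from typing import Sequence


def lip_moved_by_area(areas: Sequence[float], rel_threshold: float = 0.15) -> bool:
    """Report a lip movement if the lip-region area changes by more than
    rel_threshold (as a fraction of the earlier frame's area) between
    any two consecutive frames.

    `areas` holds one lip-region area per video frame (hypothetical input,
    e.g. in square pixels). The threshold value is an assumption.
    """
    for prev, curr in zip(areas, areas[1:]):
        if prev <= 0:
            # Skip frames where no lip region was measured.
            continue
        if abs(curr - prev) / prev > rel_threshold:
            return True
    return False


# Illustrative (made-up) area sequences.
clear_speech = [400.0, 520.0, 380.0, 560.0]   # large frame-to-frame changes
subtle_speech = [400.0, 410.0, 395.0, 405.0]  # small frame-to-frame changes

print(lip_moved_by_area(clear_speech))   # large changes exceed the threshold
print(lip_moved_by_area(subtle_speech))  # small changes stay below it
```

In this sketch the second sequence represents a user who is speaking with only slight lip motion. Because every frame-to-frame change stays below the threshold, the movement is missed, which is the accuracy problem described above. Lowering the threshold would catch it, but would also make the check more sensitive to measurement noise in a static picture.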