Food recognition is a relatively new research field that uses visual pattern recognition techniques to identify the foods in a picture of a plate of food.
In food recognition, the main challenge is to segment and identify the various foods on the plate despite environmental factors such as changing lighting, the great variability of patterns even within a single class of food (e.g., green, red or black beans; beef that is roasted, cooked, or covered with different sauces), and possible mixing of foods. Even with known methods of image segmentation and pattern classification, it is still difficult to obtain a high recognition rate when these problems are present.
The document titled “Food Recognition Using Statistics of Pairwise Local Features,” by Yang et al., published in the IEEE Intl. Conference on Computer Vision and Pattern Recognition on Jun. 13, 2010, uses the data set from the document entitled “PFID: Pittsburgh Fast-Food Image Dataset,” by Chen et al., Intl. Conference on Image Processing, published on Nov. 10, 2009, to assess the quality of its food recognition system. It is important to note that the images in this data set are predominantly of fast-food snacks, which are easier to recognize because they belong to classes with more structured patterns, for example sandwiches, French fries, etc.
The patent document U.S. 2003/0076983, entitled “Personal Food Analyzer,” published on Apr. 24, 2003, proposes a system for identifying foods in an image. This patent document employs a system of lights mounted at different angles on the portable device that hosts the food recognition system. A central camera, with the lights at its ends, sequentially captures two images, each illuminated by one of the two light sources. From the two images, the foods are identified, their volumes are estimated, and their calories are obtained from a database of nutritional information for pre-registered foods. The two sequential images are used to segment the food and to estimate the volume of each food from the outline of its shadows. Then, only the first image is sent to the food identifier, which consists of a food reference tree. The reference tree is used to estimate the type of food from similar characteristics of color, shape and size. From the description of this patent document, it is understood that the food identification method is not robust to variations in food (or even in lighting), since it expects previously estimated food colors, shapes and sizes and is not adaptive. In addition, the fully automatic decision about a plate of food can lead to large errors, because the user cannot change the type of food if it is misidentified. Nothing is reported about the automatic method used to segment the image of the plate of food, and no details are given on the food identification method.
The patent document 2010/0173269, published on Jul. 8, 2010, makes use of voice and visual patterns to identify the foods in a picture. Given a picture of food taken by the user, the user speaks a description of each item on the plate; together with the visual patterns, the voice recognition and image recognition modules identify each food in the picture. The voice recognition step can help resolve problems in the visual identification. With regard to the method proposed in this document for visual pattern recognition, it employs color and texture features and classifies them with support vector machines. Food colors are represented in the CIE L*a*b* color space, and textures are described by histograms of oriented gradients. It is known that characterizing an object by only basic features, even when two or more of them are concatenated, is insufficient to capture the wide variation of food patterns in an image, so limitations in identification are expected. The integration with voice recognition, which identifies a list of characteristics of the plate of food in the photo, is therefore needed to compensate for a series of visual limitations of the proposed methods.
The patent document U.S. 2010/0111383, published on May 6, 2010, proposes a system to record calories through automatic recognition of pictures of food taken by a portable camera. A photo of the plate of food is taken before and after eating in order to calculate the amount of food eaten. The proposed application provides several ways to do this, ranging from manual entry based on the captured photo to automatic selection from lists of suggested foods for each recognized image segment. The automatic recognizer presented in this document runs on a computer, so the portable device must be connected via a network to this server application. Thus, after the picture is taken, it is sent to the server, where the pattern recognition system is run to segment each type of food. The requirement to be connected to a computer network limits the method proposed in document U.S. 2010/0111383 to locations with network access and to the availability of a networked computer application for recognition. In addition, the use of Gabor filters incurs a high computational cost in responding to input images, which can annoy most users of portable devices. Additionally, segmentation based on thresholding methods does not make the application robust to all types of lighting problems.
Therefore, to address these problems, the present invention provides a method for food recognition that extracts features based on colors and textures. For color, the CIE L*a*b*, HSV and RGB color spaces are used in order to represent the colors of the foods in a way that is as invariant as possible to different types of lighting environment. By concatenating these with robust texture features based on the difference of Gaussians and the coefficient of spatial variation, multiple hypothesis spaces are obtained, which are classified by robust classifiers based on support vector machines (SVM) with a radial basis function kernel. By means of these multiple classification hypotheses, a method capable of dealing with the problems mentioned above is obtained.
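The color and texture features named above can be illustrated with a minimal sketch. It is not the implementation of the invention: the function names, the choice of mean values as color descriptors, the sigma values and the border handling are all assumptions made for illustration, and the CIE L*a*b* conversion and the SVM classifier are omitted because the Python standard library does not provide them.

```python
import colorsys
import math
import statistics


def color_features(pixels):
    """Mean R, G, B plus mean saturation and value for (r, g, b) tuples in 0..255.

    Hue is left out because a plain arithmetic mean of a circular quantity
    is misleading; this is a simplification of this sketch.
    """
    rgb = [(r / 255.0, g / 255.0, b / 255.0) for r, g, b in pixels]
    hsv = [colorsys.rgb_to_hsv(*p) for p in rgb]
    n = len(rgb)
    mean_rgb = [sum(p[i] for p in rgb) / n for i in range(3)]
    mean_sv = [sum(p[i] for p in hsv) / n for i in (1, 2)]
    return mean_rgb + mean_sv


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    out = []
    for row in img:
        width = len(row)
        out.append([
            sum(weight * row[min(max(x + k - radius, 0), width - 1)]
                for k, weight in enumerate(kernel))
            for x in range(width)
        ])
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable blur: rows first, then columns via transposition."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))


def difference_of_gaussians(img, sigma_small=1.0, sigma_large=2.0):
    """Band-pass response: narrow blur minus wide blur (sigmas are assumed)."""
    narrow = gaussian_blur(img, sigma_small)
    wide = gaussian_blur(img, sigma_large)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(narrow, wide)]


def coefficient_of_variation(img):
    """Standard deviation divided by mean of the gray levels in a region."""
    values = [v for row in img for v in row]
    mean = statistics.fmean(values)
    return 0.0 if mean == 0 else statistics.pstdev(values) / mean


def segment_features(pixels, gray):
    """Concatenate color and texture descriptors into one feature vector."""
    dog = difference_of_gaussians(gray)
    dog_energy = statistics.fmean(abs(v) for row in dog for v in row)
    return color_features(pixels) + [coefficient_of_variation(gray), dog_energy]
```

In this sketch, a vector returned by `segment_features` would be the input to a classifier such as an SVM with a radial basis function kernel; a flat region yields a difference-of-Gaussians energy and a coefficient of variation close to zero, while a textured region yields larger values.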
Despite technological advances, portable devices still have relatively limited hardware resources for running algorithms that perform advanced calculations. To address these limitations, the present invention applies parallel execution techniques to reduce the computational cost, which reduces decode time and minimizes the use of the processor of the portable device. These techniques also increase the robustness of the food identifier, since stronger methods can be used without increasing the computational cost, while still allowing identification in a timely manner.
To prevent the image of the plate from being captured distorted or out of focus, the auto-focus of the device is triggered before capture.
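As a rough illustration of parallel execution, the sketch below scores several image segments concurrently with a worker pool. The text does not say which parallel technique is used, so the thread pool, the function names and the placeholder scoring function are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def score_segment(segment):
    """Hypothetical stand-in for per-segment feature extraction and scoring.

    Here it simply averages the gray levels of the segment.
    """
    return sum(segment) / len(segment)


def score_segments_parallel(segments, max_workers=4):
    """Score every segment on a worker pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_segment, segments))


if __name__ == "__main__":
    print(score_segments_parallel([[1, 3], [2, 2], [10]]))
```

In standard CPython builds, threads mainly help when the per-segment work releases the interpreter lock (for example inside native image-processing routines); for pure-Python work, `ProcessPoolExecutor` from the same module offers the same `map` interface.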
In order to increase the robustness of the identification of the object and of its corresponding segmentation in the image, the segmentation and identification methods are combined so that the two help each other. The segmentation method operates in three different color spaces that are robust under certain image conditions. Then, color and texture features are extracted for each segment hypothesis established by the segmentation method, and a list of foods with their probabilities is produced for each segment. A reclassifier based on previously listed contexts is used to reorder the list provided by the pattern classifier. The segment and list of foods with the best chance are finally presented to the end user, who can confirm the top entry of the list or simply choose the food that best identifies the image on the plate. In this way, variations of the same pattern in the image, caused by the different lighting environments in which the pictures are taken, are handled. The volume of each food on the plate is estimated by approximation from its area. Finally, the calories are obtained from the volume and type of food through a pre-registered table of values.
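The last steps of this pipeline (context-based reordering of the candidate list, then calorie lookup from an area-based volume estimate) can be sketched as follows. The multiplicative context weights, the fixed assumed height used to turn area into volume, and every value in the table are illustrative assumptions, not data or formulas from the invention.

```python
def rerank_with_context(scores, context_weights):
    """Reorder classifier candidates using context weights.

    scores: dict mapping food name to classifier probability.
    context_weights: dict mapping food name to a multiplicative weight
    (foods that are missing default to 1.0). Returns a list of
    (food, renormalized probability) pairs, best candidate first.
    """
    weighted = {food: p * context_weights.get(food, 1.0)
                for food, p in scores.items()}
    total = sum(weighted.values())
    if total == 0:
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranked = [(food, w / total) for food, w in weighted.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def estimate_calories(area_cm2, food, table):
    """Approximate volume as area times an assumed height, then look up kcal."""
    entry = table[food]
    volume_cm3 = area_cm2 * entry["assumed_height_cm"]
    return volume_cm3 * entry["kcal_per_cm3"]


if __name__ == "__main__":
    # Placeholder numbers only; not real nutritional data.
    table = {"rice": {"assumed_height_cm": 2.0, "kcal_per_cm3": 1.5}}
    candidates = rerank_with_context({"rice": 0.5, "beans": 0.5}, {"rice": 3.0})
    best_food = candidates[0][0]
    print(best_food, estimate_calories(10.0, best_food, table))
```

The ranked list, rather than only the top entry, is returned so that the user can still pick another candidate when the first one is wrong.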