Many image processing applications aim to detect objects of interest in an image or in an image stream acquired by a camera. These applications rely on procedures that fall into two families.
The first family of procedures relies on shape recognition. The principle consists in recognizing one or more very specific characteristics of the object sought, for example the contour of a head or the silhouette of a person. Searching for these characteristics over the whole scene is made difficult, on the one hand, by geometric deformations due to the optical distortions of the sensors and to the differences in viewpoint on the objects sought and, on the other hand, by occlusions between the objects sought. By way of example, the silhouette of a person viewed from the front is very different from that of a person viewed from above. The optical distortions depend on the type of camera used. They are particularly pronounced for omnidirectional cameras and so-called “fisheye” cameras. Moreover, shape recognition procedures require training on labeled databases. These databases provide examples of people as well as counter-examples for a particular viewpoint and a given type of camera. Consequently, configuring a system for locating objects of interest using a shape recognition procedure is a tricky task, requiring the production of a training database specific to the particular viewpoint of the camera.
The second family of procedures for detecting objects in an image is based on a three-dimensional (3D) space optimization criterion. The idea is to maximize, in the image, the overlap between a mask obtained by background subtraction and the projection of one or more 3D models of the object sought. An example of such a procedure is described in the document A. Alahi, L. Jacques, Y. Boursier and P. Vandergheynst, “Sparsity-driven People Localization Algorithm: Evaluation in Crowded Scenes Environments”, IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, Snowbird, Utah, 2009. This document considers a fixed grid of 3D positions on the ground as well as a geometric model of a person, in this instance an ellipsoid representing the upper part of the person and a cylinder representing the lower part. Hereinafter, an image in which each pixel takes either a first value, for example ‘0’, or a second value, for example ‘1’, as a function of a parameter of the pixel considered, is called a binary mask. According to the procedure of Alahi, a binary mask of the projection of the geometric model, called an atom, is computed for each position of the grid. Each atom takes the value ‘1’ in each pixel corresponding to the projection of the geometric model in the image, and ‘0’ elsewhere. Locating the people in the image then consists in minimizing the difference between the binary mask obtained by background subtraction and a linear combination of atoms, each atom being either present or absent. Stated otherwise, the procedure consists in searching for the set of grid positions which, by projection of a geometric model at each of these positions, gives the image that most closely resembles the image in which people are sought.
One of the main drawbacks of this procedure is its algorithmic complexity. The search for people is carried out in the image space, which involves solving a linear system whose dimension is equal to the number of pixels in the image multiplied by the number of positions in the grid. In practice, the procedure requires significant computational resources. Even with sub-sampling of the image, the procedure is not suitable for real-time processing. Furthermore, the procedure has the drawback of relying on a background subtraction binary mask.
However, such a mask is liable to merge disjoint groups of people, for example because of shadows, and to fragment groups that are normally connected, for example because of clothing whose colors are locally close to those of the background. Consequently, the effectiveness of the procedure is limited by that of the background subtraction step.
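The atom-based localization described above can be illustrated by the following minimal sketch. It is not the solver of the cited document: the image size, the rectangular atoms (standing in for the projection of the ellipsoid-and-cylinder model), the one-dimensional grid and the exhaustive search are all assumptions chosen to keep the example small. A binary mask is represented as the set of pixels whose value is ‘1’.

```python
from itertools import combinations

HEIGHT, WIDTH = 6, 12  # hypothetical image size, in pixels


def make_atom(col, width=2, top=1, bottom=5):
    """Toy atom: a filled rectangle standing in for the projection of the
    3D person model at one grid position (an assumption, not a real
    camera projection). Returned as the set of pixels with value 1."""
    return frozenset(
        (r, c)
        for r in range(top, bottom)
        for c in range(col, min(col + width, WIDTH))
    )


def residual(foreground, atoms, chosen):
    """Squared pixel-wise difference between the foreground mask and the
    sum of the chosen atoms (each atom has coefficient 0 or 1)."""
    counts = {}
    for i in chosen:
        for p in atoms[i]:
            counts[p] = counts.get(p, 0) + 1
    pixels = set(foreground) | set(counts)
    return sum(
        ((1 if p in foreground else 0) - counts.get(p, 0)) ** 2
        for p in pixels
    )


def locate(foreground, atoms):
    """Try every present/absent assignment of atoms and keep the best.
    The cost grows as 2**len(atoms), which hints at why a search of this
    kind is expensive on realistic grids."""
    best, best_cost = (), residual(foreground, atoms, ())
    for k in range(1, len(atoms) + 1):
        for chosen in combinations(range(len(atoms)), k):
            cost = residual(foreground, atoms, chosen)
            if cost < best_cost:
                best, best_cost = chosen, cost
    return best, best_cost


# One atom per grid position (here, one position per image column 0..10).
atoms = [make_atom(c) for c in range(WIDTH - 1)]

# Synthetic background subtraction mask: two people at grid positions
# 2 and 7, plus one isolated noise pixel that no atom can explain.
foreground = set(atoms[2]) | set(atoms[7]) | {(0, 0)}

positions, cost = locate(foreground, atoms)
print(positions, cost)  # (2, 7) 1
```

In this toy setting the two synthetic people are recovered at grid positions 2 and 7, and the remaining cost of 1 corresponds to the noise pixel. Any defect in the foreground mask, such as a shadow bridging two people, would feed directly into the residual, which is consistent with the dependence on background subtraction noted above.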