Users are increasingly utilizing electronic devices to obtain various types of information. For example, a user wanting to obtain information about an object can capture an image of the object and upload that image to an identification service for analysis. An identification service can analyze the image to obtain information associated with the object represented in the image. However, it remains a challenge to enable computing devices to identify various objects and/or features from a single category (e.g., shoes) of objects, as certain categories have a variety of forms, can be presented from any viewpoint, can be captured under a wide range of changing conditions (e.g., changes in orientation, image size, shape, etc.), and can suffer from distractors such as background clutter as well as occlusions. These challenges can be further exacerbated as the number of object categories that need to be identified increases. Thus, users can have difficulty locating the appropriate objects, or may at least have to navigate through many irrelevant results before locating the item of interest. Conventional approaches include using more images during the training of such algorithms, training multiple classifiers, and developing more advanced recognition methods. Such approaches usually increase the time required to train these classifiers and offer no means of correction once the classifiers are trained. In such scenarios, adding a suitable pre-processing and post-processing framework can increase the precision and recall of existing approaches with minimal computational overhead.
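The pre-processing and post-processing framework mentioned above can be sketched as a wrapper around an existing, already-trained classifier: input normalization reduces sensitivity to capture conditions, and confidence-based filtering of the output trades some recall for precision. This is a minimal illustrative sketch; all function names, the toy classifier, and the threshold value are assumptions, not details from the text.

```python
def preprocess(image):
    """Pre-processing step (illustrative): normalize pixel intensities
    to [0, 1] to reduce sensitivity to changing capture conditions
    such as lighting."""
    lo, hi = min(image), max(image)
    if hi == lo:
        return [0.0 for _ in image]
    return [(p - lo) / (hi - lo) for p in image]

def postprocess(scores, threshold=0.5):
    """Post-processing step (illustrative): discard low-confidence
    labels and rank the remainder, improving precision without
    retraining the underlying classifier."""
    kept = [(label, s) for label, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)

def classify_with_framework(classifier, image, threshold=0.5):
    """Wrap an existing trained classifier with pre- and
    post-processing; the classifier itself is left unchanged."""
    return postprocess(classifier(preprocess(image)), threshold)

# Stand-in for a trained classifier: maps an image (here, a flat list
# of pixel values) to per-label confidence scores.
def toy_classifier(image):
    mean = sum(image) / len(image)
    return {"shoe": mean, "boot": 1.0 - mean, "sandal": 0.2}

results = classify_with_framework(toy_classifier, [10, 40, 90, 200], threshold=0.4)
```

Because the wrapper only touches inputs and outputs, it adds a form of corrective recourse after training at negligible computational cost, which is the point made above.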