Distance metric learning (DML) is a major tool for a variety of problems in computer vision and other computing problems. As examples, DML has successfully been employed for image retrieval, near duplicate detection, clustering, and zero-shot learning.
A wide variety of formulations have been proposed. Traditionally, these formulations encode a notion of similar and dissimilar data points. One example is contrastive loss, which is defined for a pair of either similar or dissimilar data points. Another commonly used family of losses is triplet loss, which is defined by a triplet of data points: an anchor point, a similar data point, and one or more dissimilar data points. In some schemes, the goal in a triplet loss is to learn a distance in which the anchor point is closer to the similar point than to the dissimilar one.
The above losses, which depend on pairs or triplets of data points, empirically suffer from sampling issues: selecting informative pairs or triplets is important for successfully optimizing them and improving convergence rates but represents a difficult challenge.