The genealogical and historical document communities presently rely on manual labor to cluster historical images. Clustering (or classifying) historical images involves applying labels to images with similar content, such as applying a label of “census” to all images of census records, or a label of “gravestone” to all images depicting a gravestone. Labels may vary widely in specificity, from relatively generic labels (e.g., “photo”) to more specific labels (e.g., “photo of woman holding baby”). Because genealogical databases often contain very large numbers of historical documents (on the order of billions), manual labeling is impractical at that scale, and new approaches for clustering and classifying historical images are needed.