The search for correlations in many types of data, such as biological data, can be difficult if the data are not exchangeable or independent and identically distributed (IID). For example, a set of DNA or amino acid sequences is rarely exchangeable because the sequences are derived from a phylogeny (e.g., an evolutionary tree). In other words, some sequences are very similar to one another but not to others because of their positions in the evolutionary tree. This phylogenetic structure can confound the statistical identification of associations. For instance, although a number of candidate disease genes have been identified by genome-wide association (GWA) studies, the inability to reproduce these results in other studies is likely due in part to confounding by phylogeny. Other areas in which phylogeny may confound the statistical identification of associations include the identification of coevolving residues in proteins given a multiple sequence alignment and the identification of Human Leukocyte Antigen (HLA) alleles that mediate escape mutations of the Human Immunodeficiency Virus (HIV).