Anthropological and phylogeographical questions exist as to the origins of various species of beings. Each being carries ancestral material (for example, single nucleotide polymorphisms (SNPs), short tandem repeat (STR) numbers, inversions, etc.) marked by signatures that arise from imperfections in deoxyribonucleic acid (DNA) replication. Such material tells the story of only a small fraction of the populations that have inhabited the planet (sometimes referred to as the surviving lineages).
One exemplary challenge is the ancestor-derivative conundrum, wherein the ancestor normally cannot be distinguished from the derivative, and vice versa, because, by definition, the two look alike.
Given the collection of mutations (for example, SNPs) in recombining DNA (for example, autosomal chromosomes) of the population of a species, a problem exists in inferring the units of recombination (that is, the process by which a strand of DNA is broken and then joined to the end of a different DNA molecule) and the recombination history of each individual. Applied to a population, the problem is, in essence, that of inferring the ancestral recombination graph of the population. The problem is particularly challenging because recombination causes different segments of the same chromosome to trace back to different ancestors.
Existing approaches include combinatorial and statistical measures such as use counts, log lengths, and frequencies. These approaches, however, produce unclear signals and do not yield obvious, consistent rules for identifying false positives and false negatives.
Existing approaches also include a simple four-gamete rule. Such an approach, however, operates in the absence of recombinations and produces a significant number of false positives, with no apparent rules for eliminating them. Existing approaches further include linkage disequilibrium (LD) analysis. Such an approach, however, is inadequate for data that exhibits high linkage disequilibrium.
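For illustration only, the four-gamete rule mentioned above can be sketched in its common textbook form: two biallelic sites are flagged when all four allele combinations (00, 01, 10, 11) appear among the sampled haplotypes. The function name `four_gamete_violations`, the string encoding of haplotypes, and the sample data are assumptions made for this sketch and are not taken from any particular existing implementation.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return site pairs (i, j), i < j, at which all four gametes occur.

    Assumes each haplotype is an equal-length sequence of biallelic
    states (here the characters '0' and '1'). This is the textbook
    form of the rule, shown only as an illustration.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct allele pairs observed at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Hypothetical sample: five haplotypes over three sites.
haplotypes = ["000", "010", "100", "110", "001"]
print(four_gamete_violations(haplotypes))
```

In this hypothetical sample only sites 0 and 1 show all four combinations. The rule on its own flags a pair of sites and gives no further criterion for deciding whether a flagged pair is a false positive.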