Identification of a biological sample using DNA profiles is an important task in forensic science. For example, the terrorist attacks of Sep. 11, 2001, placed huge demands on forensic scientists to identify human remains from the collapsed World Trade Center buildings. In light of these demands, forensic scientists need more efficient and more accurate search methods to assist in identifying biological specimens from DNA profile data obtained with DNA typing technologies.
Ideally, a forensic scientist obtains a DNA profile from a sample taken from a personal effect of a missing person, such as a toothbrush, razor, or comb, and searches for a match in a database containing DNA profiles from unknown biological specimens, such as the remains of a missing person or victim. In theory, this approach can identify the missing person, but in practice it breaks down when the samples yield only partial profiles or when the origin of the reference personal effect cannot be obtained or verified. Incomplete DNA profiles are common in disaster areas, where harsh environmental conditions degrade the DNA. This forces forensic scientists to lower the match stringency of database search engines, which can yield numerous false positives. In addition, incorrectly labeled personal effects can lead to inaccurate identifications.
When direct searching fails, identification using kinship analysis is often necessary. Kinship analysis comprises optionally narrowing the scope of a search by using any available DNA or non-DNA information to exclude unrelated specimens, and then calculating genetic relatedness to at least one biological relative of a missing person. For example, the technology used for kinship analysis after the World Trade Center disaster of Sep. 11, 2001, relied on pair-wise comparison of a test DNA profile from an unknown biological specimen to a target DNA profile from a known biological relative, taking into account various familial relationships such as parent-child, sibling, or half-sibling, and calculating the value of a function that indicates the likelihood or probability that the relationship is true (e.g., Cash et al., genecodesforensics.com/news/CashHoyleSutton.pdf, 2003). A likelihood ratio is commonly used: it compares the likelihood of obtaining the DNA profiles of the two samples if the individuals are related with the likelihood of obtaining the same profiles if the individuals are unrelated. A measure of genetic similarity can also be used to indicate the likelihood that a relationship is true. Such a measure can, for example, account for shared DNA alleles, loss of genetic information through degradation of the DNA, or the possibility of mutation of an allele. For any of these functions, the specimens are then independently sorted according to the function's value. When a likelihood function, such as a probability, likelihood, or likelihood ratio, is used, the specimens are sorted according to the calculated value of that function, which indicates how likely it is that the unknown biological specimen comes from an individual related to the biological relative.
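The pair-wise likelihood-ratio ranking described above can be illustrated with a minimal sketch. It is not the method of the cited tools. It assumes Hardy-Weinberg genotype frequencies, independent loci, no mutation, and no allele drop-out, and it uses the standard identity-by-descent coefficients for each relationship. The locus names, allele labels, allele frequencies, and function names are all hypothetical.

```python
from math import prod

# Probability that two relatives share 0, 1, or 2 alleles identical by
# descent (IBD) at a locus, for each hypothesized relationship.
RELATIONSHIPS = {
    "parent-child": (0.0, 1.0, 0.0),
    "full-sibling": (0.25, 0.5, 0.25),
    "half-sibling": (0.5, 0.5, 0.0),
}

def genotype_freq(genotype, freqs):
    """Population frequency of an unordered genotype (Hardy-Weinberg assumed)."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]

def prob_given_one_shared(test, ref, freqs):
    """P(test genotype | exactly one allele is IBD with the reference)."""
    c, d = test
    total = 0.0
    for x in ref:  # each reference allele is the shared one with probability 1/2
        if c == d:
            total += 0.5 * (freqs[c] if x == c else 0.0)
        else:
            total += 0.5 * ((freqs[d] if x == c else 0.0)
                            + (freqs[c] if x == d else 0.0))
    return total

def locus_lr(test, ref, freqs, relationship):
    """Likelihood ratio at one locus: related versus unrelated."""
    k0, k1, k2 = RELATIONSHIPS[relationship]
    p_unrelated = genotype_freq(test, freqs)
    identical = sorted(test) == sorted(ref)
    p_related = (k0 * p_unrelated
                 + k1 * prob_given_one_shared(test, ref, freqs)
                 + k2 * (1.0 if identical else 0.0))
    return p_related / p_unrelated

def profile_lr(test_profile, ref_profile, freq_table, relationship):
    """Multiply per-locus ratios over loci typed in both profiles, so a
    partial profile simply contributes fewer loci (independence assumed)."""
    shared = test_profile.keys() & ref_profile.keys()
    return prod(locus_lr(test_profile[locus], ref_profile[locus],
                         freq_table[locus], relationship) for locus in shared)

def rank_specimens(specimens, ref_profile, freq_table, relationship):
    """Sort unknown specimens by likelihood ratio, highest first."""
    scored = [(name, profile_lr(profile, ref_profile, freq_table, relationship))
              for name, profile in specimens.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Made-up allele frequencies and profiles, for illustration only.
freq_table = {
    "L1": {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4},
    "L2": {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4},
}
reference = {"L1": ("A", "B"), "L2": ("C", "D")}  # known relative
specimens = {
    "S1": {"L1": ("A", "C"), "L2": ("C", "C")},
    "S2": {"L1": ("C", "D")},                      # partial profile
    "S3": {"L1": ("C", "C"), "L2": ("A", "D")},
}
ranked = rank_specimens(specimens, reference, freq_table, "parent-child")
print(ranked)
```

Each call ranks the specimens against a single relative and a single hypothesized relationship, so a case with many relatives produces many separate ranked lists.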
Unfortunately, this approach is cumbersome and imprecise for large cases, such as the World Trade Center disaster, because each search looks for a specimen that is related to a single family member. A pair-wise comparison to the DNA profile of a single known relative can produce a large collection of candidate profiles. Human analysts must then sort, correlate, and analyze the matches, possibly manually against the available metadata, which is a very labor-intensive and time-consuming process.
Software tools exist that allow the correlation of DNA match results from a single type of DNA profile, such as short tandem repeat (STR), single nucleotide polymorphism (SNP), mitochondrial DNA (mtDNA), or Y-chromosome STR (Y-STR) profiles, among others. Technologies are needed that can use all available DNA profile information involving a missing individual or an unknown biological specimen and his or her relatives to further enhance the ability to make an accurate identification.