Determining the copy number of a target nucleic acid in a sample from a subject can provide useful information for a variety of clinical applications. However, some assays for determining copy number of a target nucleic acid can underestimate the copy number of the target nucleic acid if multiple copies of the target nucleic acid are on the same polynucleotide in the sample. For instance, in a digital polymerase chain reaction (dPCR), a spatially isolated polynucleotide comprising two copies of a target sequence can be counted as only having one copy of the target nucleic acid. There is a need for improved methods, compositions, and kits for copy number estimation of nucleic acid target sequences in assays, such as dPCR assays, that take into account the linkage of target nucleic acid sequences (e.g., whether multiple copies of a target nucleic acid sequence are on the same polynucleotide in a sample).
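The underestimation described above can be illustrated with a short calculation. This is a minimal sketch that assumes the commonly used Poisson model for how molecules distribute across dPCR partitions; the text does not specify a particular model. The function names and the example values for partitions, molecules, and copies per molecule are hypothetical and chosen only for illustration.

```python
import math


def estimated_targets(n_partitions, n_molecules):
    """Poisson-corrected target estimate from the expected positive fraction.

    Assumes each target-bearing molecule lands in a partition independently
    and that a partition reads as positive if it holds at least one molecule.
    """
    mean_per_partition = n_molecules / n_partitions
    positive_fraction = 1.0 - math.exp(-mean_per_partition)
    return -n_partitions * math.log(1.0 - positive_fraction)


def linked_vs_unlinked(n_partitions, n_molecules, copies_per_molecule):
    """Compare estimates when copies are linked versus physically separated."""
    true_copies = n_molecules * copies_per_molecule
    # Linked: each polynucleotide carries all of its copies into one partition,
    # so the assay effectively counts polynucleotides, not copies.
    linked_estimate = estimated_targets(n_partitions, n_molecules)
    # Unlinked: every copy segregates independently into partitions.
    unlinked_estimate = estimated_targets(n_partitions, true_copies)
    return true_copies, linked_estimate, unlinked_estimate


true_copies, linked, unlinked = linked_vs_unlinked(20000, 5000, 2)
print(f"true copies: {true_copies}")
print(f"estimate with linked copies: {linked:.0f}")
print(f"estimate with unlinked copies: {unlinked:.0f}")
```

Under these assumptions, the linked case recovers the number of polynucleotides rather than the number of target copies, which is why linkage must be taken into account when estimating copy number.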
Knowledge of haplotypes or phasing of neighboring polymorphisms can be useful in a variety of settings. Humans are diploid organisms: each chromosome type is represented twice, as a pair of individual chromosomes, in each of a person's somatic cells. One copy of the pair is inherited from the person's father and the other copy from the person's mother. Therefore, most genes exist as two copies in each diploid cell. However, the copies generally have many loci where sequence variation occurs between the copies to form distinct sequences known as alleles.
It can often be useful to know the pattern of alleles, the haplotype, for each individual chromosome of a chromosome pair. For example, if a person has inactivating mutations at two different loci within a gene, the mutations may be of limited consequence if present together on the same individual chromosome, but could exert a major effect if distributed between both individual chromosomes of a chromosome pair. In the first case, one copy of the gene is inactivated at two different loci, but the other copy is available to supply active gene product. In the second case, each copy of the gene has one of the inactivating mutations, so neither gene copy supplies active gene product. Depending on the gene in question, mutation of both copies of the gene could lead to a variety of physiological consequences including non-viable phenotypes, increased risk of disease, or inability to metabolize a class of medications, among others.
Haplotype information can be useful in life sciences applications, in the medical field, and in applied markets, such as forensics. The issue of haplotype determination arises in many contexts of human (and non-human) genetics. For example, many genetic associations are tied to, and thus predicted by, haplotypes. The HLA (human leukocyte antigen) region is one prominent instance where particular genetic diseases have been associated with various haplotypes of the major histocompatibility complex.
Conventional genotyping technologies interrogate different loci of sequence variation, such as SNPs (single nucleotide polymorphisms), in isolation from one another. Thus, the technologies can confidently determine that a pair of distinct alleles is present at each of two linked loci in a sample of genetic material. However, the technologies cannot tell which combination of alleles from the two loci is located on the same chromosome copy. A new approach to determining haplotypes is needed to overcome this obstacle.
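The ambiguity described above can be made concrete by enumerating the haplotype pairs consistent with a set of unphased genotypes. This is a minimal sketch only: the function name `possible_phasings` and the single-letter allele labels are hypothetical, and the code merely counts possibilities rather than resolving phase.

```python
from itertools import product


def possible_phasings(genotypes):
    """Return every haplotype pair consistent with unphased genotypes.

    genotypes: list of (allele, allele) tuples, one tuple per linked locus.
    Each result is a frozenset holding the two haplotypes as strings.
    """
    first, rest = genotypes[0], genotypes[1:]
    phasings = set()
    # Fix the orientation of the first locus so mirror images are not counted
    # twice, then try both orientations of every remaining locus.
    for flips in product([False, True], repeat=len(rest)):
        hap_a = [first[0]]
        hap_b = [first[1]]
        for (allele_1, allele_2), flip in zip(rest, flips):
            if flip:
                allele_1, allele_2 = allele_2, allele_1
            hap_a.append(allele_1)
            hap_b.append(allele_2)
        phasings.add(frozenset(["".join(hap_a), "".join(hap_b)]))
    return phasings


# Heterozygous at both loci: genotyping alone leaves two possible phasings.
for pair in sorted(sorted(p) for p in possible_phasings([("A", "a"), ("B", "b")])):
    print(pair)
```

For two heterozygous loci, the two results correspond to the alleles `A` and `B` lying on the same chromosome copy or on opposite copies. A genotype that is homozygous at either locus yields a single phasing, so the ambiguity arises only when both loci are heterozygous.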
Similarly, new methods are needed for determining the probability of fragmentation between two target nucleic acid sequences in a sample, determining levels of degradation in a sample of nucleic acids (e.g., DNA, RNA), assessing alternative splicing, or detecting inversions, translocations, or deletions.