Cell-free DNA (cfDNA) in the circulation is typically fragmented (generally in the range of 100-200 base pairs in length), and thus methods for cfDNA analysis have traditionally focused upon biological signals that can be found within these short DNA fragments. Examples include detecting single-nucleotide variants within individual molecules, or performing ‘molecular counting’ across a large number of sequenced fragments to indirectly infer the presence of large-scale chromosomal abnormalities, e.g. tests for foetal chromosomal trisomies that assess foetal DNA within the maternal circulation (a form of so-called ‘non-invasive prenatal testing’, or NIPT).
A large variety of methods to analyse circulating cell-free DNA have been described previously. Depending upon the specific application area, these assays may employ different terminology for a broadly similar set of sample types and technical methods, such as circulating tumour DNA (ctDNA), cell-free foetal DNA (cffDNA), liquid biopsy, and/or non-invasive prenatal testing. In general, these methods comprise a laboratory protocol to prepare samples of circulating cell-free DNA for sequencing, a sequencing reaction itself, and then an informatic framework to analyse the resulting sequences to detect a relevant biological signal. The methods involve a DNA purification and isolation step prior to sequencing, which means that the subsequent analysis must rely solely on the information contained in the DNA itself. Following sequencing, such methods generally employ one or more informatic or statistical frameworks to analyse various aspects of the sequence data, such as detecting specific mutations therein, and/or detecting selective enrichment or selective depletion of particular chromosomes or sub-chromosomal regions (which might, for example, be indicative of a chromosomal aneuploidy in a developing foetus).
Many of these methods are for use in NIPT (e.g. in U.S. Pat. Nos. 6,258,540 B1, 8,296,076 B2, 8,318,430 B2, 8,195,415 B2, 9,447,453 B2, and 8,442,774 B2). The most common methods for performing non-invasive prenatal testing for the detection of foetal chromosomal abnormalities (such as trisomies, and/or sub-chromosomal abnormalities such as microdeletions) involve sequencing a large number of molecules of cfDNA and mapping the resulting sequences to the genome (i.e. to determine which chromosome and/or which part of a given chromosome the sequences derive from). Then, for one or more such chromosomal or sub-chromosomal regions, the amount of sequence that maps thereto is determined (e.g. in the form of absolute numbers of reads or relative numbers of reads). This amount is then compared to one or more normal or abnormal threshold or cutoff values, and/or subjected to a statistical test, to determine whether said region(s) may be overrepresented in amount of sequence (which may, for example, correspond to a chromosomal trisomy) and/or whether said region(s) may be underrepresented in amount of sequence (which may, for example, correspond to a microdeletion).
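By way of illustration only, the counting-and-comparison step described above may be sketched as follows. The sketch assumes that reads have already been mapped and tallied per region, and it uses a simple z-score against a set of reference samples as the statistical test. The function names, the cutoff of 3.0 and all read counts are hypothetical values chosen for the example and are not taken from any of the cited documents.

```python
from statistics import mean, stdev


def region_fraction(read_counts, region):
    """Relative number of reads mapping to `region` (reads in region / all reads)."""
    return read_counts[region] / sum(read_counts.values())


def z_score(test_counts, reference_samples, region):
    """Compare the test sample's read fraction to a reference distribution.

    `reference_samples` is a list of per-region read-count dicts from samples
    assumed to be unaffected (an assumption of this sketch).
    """
    ref_fractions = [region_fraction(s, region) for s in reference_samples]
    mu = mean(ref_fractions)
    sigma = stdev(ref_fractions)  # sample standard deviation
    return (region_fraction(test_counts, region) - mu) / sigma


def classify(z, cutoff=3.0):
    """Apply a symmetric cutoff; 3.0 is an illustrative value only."""
    if z > cutoff:
        return "overrepresented"   # e.g. consistent with a trisomy
    if z < -cutoff:
        return "underrepresented"  # e.g. consistent with a microdeletion
    return "within reference range"


def make_sample(chr21_reads, total_reads=1_000_000):
    """Build a made-up sample with reads split into chr21 and everything else."""
    return {"chr21": chr21_reads, "other": total_reads - chr21_reads}


# Entirely invented read counts, for demonstration purposes.
reference = [make_sample(n) for n in (13000, 13050, 12950, 13020, 12980)]
elevated_sample = make_sample(13400)
typical_sample = make_sample(13010)

z_elevated = z_score(elevated_sample, reference, "chr21")
z_typical = z_score(typical_sample, reference, "chr21")
print(classify(z_elevated), classify(z_typical))
```

In practice such methods may additionally correct for sequence-composition and mapping biases before counting; those corrections are omitted here for brevity.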
A variety of additional or modified approaches to analysing cell-free DNA using data from unlinked, individual molecules have also been described (e.g. WO2016094853 A1, US2015344970 A1 and US20150105267 A1).
Despite the existence of such a wide range of methods, there remains a need for new methods of analysing cfDNA that would allow the reliable detection of long-range genetic information (e.g. phasing), and also for methods with greater sensitivity. For example, in the case of NIPT, foetal cfDNA only represents a minor fraction of the overall cfDNA in pregnant individuals (the majority of circulating DNA being normal maternal DNA). Therefore, a considerable technical challenge for NIPT revolves around differentiating foetal cfDNA from maternal cfDNA. Similarly, in a patient with cancer, tumour-derived cfDNA (ctDNA) only represents a tiny fraction of the overall circulating DNA. Therefore, a similar technical challenge exists in relation to the use of cfDNA analysis for the diagnosis or monitoring of cancer.
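To illustrate why a small minor fraction limits sensitivity, the following sketch computes the expected relative change in read count for a region that is present in three copies in the foetus and two copies in the mother. It assumes a euploid mother and ideal, unbiased sampling of fragments; the function name and the example foetal fractions are hypothetical and chosen for illustration only.

```python
def expected_relative_signal(minor_fraction, minor_copies=3, normal_copies=2):
    """Expected read count for a region, relative to a fully normal sample.

    `minor_fraction` is the share of cfDNA derived from the minor source
    (e.g. the foetus); the remainder is assumed to carry `normal_copies`.
    """
    background = (1.0 - minor_fraction) * normal_copies
    minor = minor_fraction * minor_copies
    return (background + minor) / normal_copies


# Illustrative foetal fractions only.
for fraction in (0.04, 0.10, 0.20):
    ratio = expected_relative_signal(fraction)
    print(f"minor fraction {fraction:.2f}: expected ratio {ratio:.3f}")
```

Under these assumptions, a foetal fraction of 10% shifts the expected read count for a trisomic chromosome by only 5%, so the signal to be detected is half the minor fraction itself.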