Randomised-nucleotide fragmentation is an essential step in DNA sequence library construction for massively parallel short-read sequencing instruments (Knierim et al. 2011). Fragmentation generates random read-initiation points in the template nucleic acids. The sequence of the template nucleic acids may then be decoded through computational assembly of the short reads.
Physical shearing is generally recommended by the manufacturers of next-generation massively parallel DNA sequencing systems because of the reproducibility and randomness of the fragmentation. For example, the Covaris system uses sound waves to fragment the nucleic acids. However, these systems are time-consuming and expensive, and are likely to require dedicated instruments.
The Nextera technology (Illumina) and the NEBNext dsDNA Fragmentase kit (New England Biolabs) are alternative random DNA fragmentation methods that require only standard laboratory instruments (Syed et al. 2009a; Syed et al. 2009b; Knierim et al. 2011).
The Nextera technology uses a transposase and transposon complex to fragment the template DNA at random and to attach the appended transposon ends at the cleaved sites. The appended transposon end sequences permit PCR amplification and sequencing on second-generation sequencing systems.
With the NEBNext dsDNA Fragmentase kit, double-stranded template DNA is fragmented in two sequential steps: nicks are enzymatically introduced into the double-stranded DNA, and the DNA is then cleaved at the nicked sites. These enzyme-based methods, however, require DNA sample preparation (buffer replacement and DNA concentration adjustment) for effective digestion, and the size of the generated fragments is sensitive to DNA sample quality and reaction duration. All of these factors must be optimised for each sample in order to achieve the desired output.
MspJI is a recently characterized modification-dependent endonuclease (Zheng et al. 2010). The enzyme was identified in Mycobacterium sp. JLS and recognizes CNNR sites (R = G or A) in which the first base is 5-methylcytosine (mC) or 5-hydroxymethylcytosine, cleaving the DNA N12/N16 bases away from the modified cytosine on the 3′ side. Enzyme activity can be enhanced by adding a short double-stranded DNA molecule (enzyme activator) that includes the MspJI recognition site but is too short to be digested. Digestion of a range of genomic DNAs with MspJI typically generates 32 to 34 bp fragments with an mCpG or mCNG site at the centre of the fragment. To date, however, this endonuclease has generally been used for detecting methylcytosine bases, detecting changes in the methylation status of nucleic acids, or assembling nucleic acids. For example, the methylation status of the human genome has been analysed by sequencing the 32 to 34 bp fragments (Cohen-Karni et al. 2012).
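The recognition and cleavage rule described above can be illustrated with a minimal Python sketch. It is not taken from the cited references. The function name `mspji_cut_sites`, the 0-based coordinate convention, and the reading of "N12/N16" as a cut immediately after the 12th base (given strand) and the 16th base (opposite strand) counted 3′ of the modified cytosine are all assumptions made for illustration. The sketch scans one strand only and takes the positions of the modified cytosines as an input rather than inferring them.

```python
PURINES = {"A", "G"}   # R in the CNNR recognition site
TOP_OFFSET = 12        # assumed: cut after the 12th base 3' of the modified C
BOTTOM_OFFSET = 16     # assumed: cut after the 16th base on the opposite strand


def mspji_cut_sites(seq, modified_positions):
    """Return (site, top_cut, bottom_cut) tuples for one strand.

    seq                -- DNA sequence written 5'->3'
    modified_positions -- 0-based indices of modified cytosines (hypothetical input)
    top_cut/bottom_cut -- 0-based index of the first base after each cut
    """
    seq = seq.upper()
    cuts = []
    for i in sorted(modified_positions):
        # The site must fit in the sequence: C at i, any two bases, then a purine.
        if i < 0 or i + 3 >= len(seq):
            continue
        if seq[i] != "C" or seq[i + 3] not in PURINES:
            continue
        top_cut = i + TOP_OFFSET + 1
        bottom_cut = i + BOTTOM_OFFSET + 1
        # Skip sites whose cut positions fall beyond the end of the sequence.
        if bottom_cut >= len(seq):
            continue
        cuts.append((i, top_cut, bottom_cut))
    return cuts


# Usage: a modified C at index 2 inside the site "CGGA".
example = "AA" + "CGGA" + "T" * 20
print(mspji_cut_sites(example, {2}))  # [(2, 15, 19)]
```

In this sketch, a cytosine that is not flagged as modified, or one that is not followed by a purine three bases downstream, produces no cut. This mirrors the modification dependence of the enzyme.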
It is an object of the present invention to overcome, or at least alleviate, one or more of the difficulties or deficiencies associated with the prior art.