Paired-End diTag (PET) analysis directly links the 5′ terminal tags (~18-20 bp each) of genomic DNA fragments or cDNA molecules to their corresponding 3′ terminal tags for high-throughput sequencing. It has led to a number of important discoveries, including fusion gene identification (Ng P et al. “Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation” Nat Methods 2005, 2(2):105-111; Zhao X D et al. “Whole-genome mapping of histone H3 Lys4 and 27 trimethylations reveals distinct genomic compartments in human embryonic stem cells” Cell Stem Cell 2007, 1(3):286-298).
A robust method that uses barcoded adaptors to generate barcoded Paired-End Ditag (bPED) libraries from genomic, chromatin immunoprecipitation (ChIP)-enriched, or transcriptomic sequences is described in US Patent Application Publication No. 2011/0015096, which is incorporated herein by reference in its entirety. That publication demonstrates how various bPED libraries, each labeled with a unique internal barcode, can be combined to form a multiplex barcoded Paired-End Ditag (mbPED) library for ultra-high-throughput sequencing. The advantages of the mbPED approach include: 1) it dramatically simplifies the experimental procedure, because multiple bPED libraries can be handled as a single mbPED library during sequencing library preparation and sequencing; 2) it is highly cost-effective, especially for sequencing, because sequencing each library separately would cost considerably more; 3) it saves time and labor; and 4) it reduces cross-library bias, because all bPED libraries in the mbPED library undergo the same procedure.
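To illustrate how reads from a pooled mbPED library might be assigned back to their source bPED libraries, the following is a minimal Python sketch. It is not taken from the cited publication: the read layout (5′ tag, then barcode, then 3′ tag), the 18 bp tag length, the four-base barcodes, and the names `demultiplex` and `barcode_to_library` are all assumptions made for illustration.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_library, tag_len=18):
    """Group ditag reads by their internal barcode.

    Assumed (hypothetical) read layout: 5' tag | barcode | 3' tag.
    All barcodes in barcode_to_library are assumed to share one length.
    """
    barcode_len = len(next(iter(barcode_to_library)))
    libraries = defaultdict(list)
    for read in reads:
        # The barcode is assumed to sit immediately after the 5' tag.
        barcode = read[tag_len:tag_len + barcode_len]
        library = barcode_to_library.get(barcode)
        if library is None:
            # Unknown barcode: keep the raw read for inspection.
            libraries["unassigned"].append(read)
            continue
        tag5 = read[:tag_len]
        start3 = tag_len + barcode_len
        tag3 = read[start3:start3 + tag_len]
        libraries[library].append((tag5, tag3))
    return dict(libraries)


# Hypothetical barcodes and synthetic reads, for illustration only.
barcode_to_library = {"ACGT": "library_1", "TGCA": "library_2"}
reads = [
    "A" * 18 + "ACGT" + "C" * 18,
    "G" * 18 + "TGCA" + "T" * 18,
    "A" * 18 + "NNNN" + "C" * 18,
]
pooled = demultiplex(reads, barcode_to_library)
```

In this sketch each recognized read is reduced to its pair of terminal tags and filed under the library that its barcode identifies, while reads with an unrecognized barcode are set aside rather than discarded.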
However, there remains a need in the art to improve the efficiency of bPED library construction, especially in connection with the design and preparation of barcoded adaptors.