Recent developments have led to an increase in sequencing output. It is now possible to determine the entire genetic code of an organism. There are two general methods used for genomic sequencing: a BAC-by-BAC approach, and whole genome shot-gun sequencing.
Standard cloning vectors, such as the high-copy pUC-based plasmids and the medium-copy pBR-based plasmids, can generally accommodate inserts of about 2 kb and 10 kb, respectively. Derivatives of pUC, such as pUC18, can accept inserts of up to perhaps 4 kb to 5 kb without instability. Derivatives of pBR, such as pBR322, can take inserts of up to about 15 kb without instability. Previous work has shown that such vectors are not readily usable for larger inserts, such as inserts of 25 kb, 40 kb, 50 kb and 60 kb. Inserts of these sizes are generally not stable, and the instability worsens as the insert size increases. In addition, there is generally a marked variation in colony size, and plasmid preparations show a wide variation of insert sizes with skewing toward lower molecular weight, as well as some vector without inserts.
Existing shot-gun sequencing and assembly methods rely on end sequence reads from cloned inserts of approximately 2 kb and 10 kb, as well as end sequence reads from BAC clones (approximately 150 kb). Because the two end reads of a clone are separated by the approximate length of its insert, the use of these different fragment sizes provides sequence distance anchors that can be used to assemble a genome from the sequence reads (Myers et al., Science March 24;287(5461):2196-204 (2000); Weber and Myers, Genome Res. May;7(5):401-9 (1997)).
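By way of illustration only, the role of end reads as distance anchors can be sketched in a few lines of Python. The sketch is not taken from the cited assemblers. The library names, the 25% tolerance, and the helper functions `is_consistent` and `estimate_gap` are hypothetical and are chosen only to show how a known insert size constrains the placement of two end reads.

```python
from dataclasses import dataclass

# Hypothetical clone libraries: nominal insert size in bp and an assumed
# fractional tolerance. The sizes follow the text; the tolerance is illustrative.
LIBRARIES = {
    "plasmid_2kb": (2_000, 0.25),
    "plasmid_10kb": (10_000, 0.25),
    "bac_150kb": (150_000, 0.25),
}


@dataclass
class MatePair:
    library: str    # key into LIBRARIES
    fwd_start: int  # leftmost assembly coordinate of the forward end read
    rev_end: int    # rightmost assembly coordinate of the reverse end read


def is_consistent(pair: MatePair) -> bool:
    """Return True if the span between the end reads matches the insert size."""
    nominal, tolerance = LIBRARIES[pair.library]
    span = pair.rev_end - pair.fwd_start
    return abs(span - nominal) <= tolerance * nominal


def estimate_gap(nominal: int, dist_to_end_a: int, dist_from_start_b: int) -> int:
    """Estimate the gap between contig A and contig B linked by one clone.

    dist_to_end_a: bases from the start of the forward read to the end of contig A.
    dist_from_start_b: bases from the start of contig B to the end of the reverse read.
    """
    return nominal - dist_to_end_a - dist_from_start_b


if __name__ == "__main__":
    # A 2 kb clone whose end reads span 2,100 bp falls within the tolerance.
    print(is_consistent(MatePair("plasmid_2kb", 100, 2_200)))
    # A 10 kb clone whose end reads span only 2,000 bp suggests a misassembly.
    print(is_consistent(MatePair("plasmid_10kb", 0, 2_000)))
    # A 10 kb clone bridging two contigs implies a gap of roughly this size.
    print(estimate_gap(10_000, 3_000, 4_500))
```

In such a scheme, the shorter inserts would constrain local read order, while the longer inserts, including BAC ends, would link contigs across larger distances.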
One of the limitations of the shot-gun approach is the need to produce a set of BAC clones that tile the genome of the organism. This process is both time-consuming and expensive. There is therefore a need in the art for an alternative to BAC end sequencing as it is applied to genome sequencing and assembly.