The Human Genome Project is currently approaching the sequencing phase of the human genome, and completion of this milestone is expected in the year 2005. The hope is that, at the conclusion of the sequencing phase, a comprehensive representation of the human genome will be available for biomedical analysis. However, the sequence data resulting from the Human Genome Project will typically correspond to human genomic sequence, and the actual genes represented in that sequence might not be obvious even with the use of sophisticated computer-assisted exon identification programs. The availability of cDNA information will therefore contribute significantly to the value of the sequenced human genome, since cDNAs directly indicate the presence of transcribed sequences. Thus, the sequencing of cDNA libraries to obtain expressed sequence tags (ESTs) that identify exons expressed within a given tissue, cell, or cell line is currently in progress. As a consequence of these efforts, a large number of EST sequences are presently compiled in public and privately held databases. However, the present EST paradigm is inherently limited by the levels and extent of mRNA production within a given cell. A related problem is the lack of cDNA sources representing specific tissue and developmental expression profiles. In addition, some genes are active only under certain physiological conditions, or are generally expressed at levels below or near the threshold necessary for cDNA cloning and detection, and are therefore not effectively represented in current cDNA libraries.
Researchers have partially addressed these issues by using phage vectors to clone genomic sequences such that internal exons are trapped (Nehls et al., 1994, Current Biology, 4(1):983-989, and Nehls et al., 1994, Oncogene, 9:2169-2175). However, such libraries require the random cloning of genomic DNA into a suitable cloning vector in vitro, followed by reintroduction of the cloned DNA in vivo so that the cloned genes are expressed and spliced before the cDNA library is produced. Additionally, such methods are limited to "trapping" genes having internal exons.