1. Technical Field
The invention relates to methods and materials involved in identifying and isolating a nucleic acid molecule that contains an open reading frame.
2. Background Information
The genomes of higher organisms, including the human genome and those of most crop and livestock species, are complex and contain greater than 90% non-genic sequences. In such cases, genes have been identified by cloning mRNA species as cDNAs into plasmid vectors to form a cDNA library. The cDNA library is then analysed for the presence of open reading frames, that is, polynucleotide regions that encode proteins. This technique is referred to as the EST (expressed sequence tag) approach. Although a cDNA library should in theory represent all genes that are expressed by a cell at a given time, in practice the library is biased toward genes expressed at high levels. Genes that are highly expressed, or that are expressed under “standard” conditions, are well represented in the cellular mRNA pool, are therefore well represented in the cDNA library, and so are readily identified. Genes that are expressed at low levels, however, are poorly represented in the cellular mRNA pool and may not be recovered. Furthermore, genes expressed only under “unusual” conditions will not be recovered if those conditions cannot be duplicated in the laboratory. In contrast to the cellular mRNA pool, all genes are represented in equimolar concentrations in the genome. For this reason, a genomic DNA library is more advantageous than a cDNA library for gene discovery, provided that a method can be found for differentiating clones containing genic sequences from those containing non-genic sequences.
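The abundance bias described above can be illustrated with a simple sampling calculation. The sketch below is not part of the described method; it assumes that each clone in a cDNA library is an independent random draw from the mRNA pool, so that a transcript making up a fraction f of the pool is recovered at least once among N clones with probability 1 - (1 - f)^N. The function names, abundance fractions and library size are hypothetical values chosen only for illustration.

```python
import math


def recovery_probability(fraction: float, clones: int) -> float:
    """Probability that at least one of `clones` randomly sampled clones
    derives from a transcript that makes up `fraction` of the mRNA pool.
    Assumes independent, unbiased sampling (an idealisation)."""
    return 1.0 - (1.0 - fraction) ** clones


def clones_needed(fraction: float, confidence: float) -> int:
    """Smallest library size giving at least `confidence` probability of
    recovering a transcript present at `fraction` of the mRNA pool."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


# Hypothetical abundances: an abundant transcript (1 in 1,000 mRNAs)
# and a rare transcript (1 in 1,000,000 mRNAs).
LIBRARY_SIZE = 10_000  # hypothetical number of clones examined

abundant = recovery_probability(1e-3, LIBRARY_SIZE)
rare = recovery_probability(1e-6, LIBRARY_SIZE)
needed_for_rare = clones_needed(1e-6, 0.99)

print(f"abundant transcript recovered with probability {abundant:.4f}")
print(f"rare transcript recovered with probability {rare:.4f}")
print(f"clones needed for 99% recovery of the rare transcript: {needed_for_rare}")
```

Under these assumed numbers, the abundant transcript is almost certain to appear in the library while the rare transcript is recovered only about 1% of the time. In a genomic library, where each gene is present in equimolar amounts, the chance of recovery does not depend on expression level.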