In general, the invention features a cloning vector useful, for example, for mRNA expression pattern analysis.
Messenger RNA expression pattern comparison between different cells or tissues is becoming increasingly important in biomedical research. For example, conclusions about errors in gene regulation can be made from a comparison between healthy and diseased tissue. In addition, comparisons between pharmaceutically treated and untreated tissues, cells, or control animals permit conclusions to be drawn about the mechanisms of action of pharmaceuticals. Comparisons between different tissues or cell types also permit the identification of differentiation or control genes.
Various methods have been developed for representing mRNA expression patterns, but all generally possess certain disadvantages. For example, methods based on subtractive cDNA libraries typically detect only large differences in expression patterns. Techniques based on differential display RT-PCR (and further developments thereof) are able to analyze only a restricted subset of all genes and are generally very time-consuming and error-prone.
The expressed sequence tag (EST) approach analyzes expression patterns by sequencing many clones from cDNA libraries. Even short sequences of 3' cDNA ends (that is, marker or "tag" sequences) may be used to unambiguously identify a gene. In addition, different frequencies of cDNAs in different libraries permit conclusions to be drawn about changes in gene expression. Although this approach provides very accurate quantitative information, it is very labor-intensive. Further developments of this method have concentrated primarily on increasing the throughput by means of serial or parallel sequencing of many short markers.
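The frequency comparison described above can be illustrated with a minimal sketch. It is not part of any cited method: the function name, the pseudocount smoothing, and the example tag labels are assumptions introduced here purely for illustration. Each library is treated as a list of tag identifiers, one per sequenced clone, and the relative frequency of each tag in one library is divided by its relative frequency in the other.

```python
from collections import Counter


def frequency_ratios(library_a, library_b, pseudocount=1):
    """Compare relative tag frequencies between two cDNA libraries.

    library_a, library_b: lists of tag identifiers, one per sequenced clone.
    pseudocount: hypothetical smoothing value added to every count so that
                 tags absent from one library do not cause division by zero.
    Returns a dict mapping each tag to (frequency in B) / (frequency in A).
    """
    counts_a = Counter(library_a)
    counts_b = Counter(library_b)
    all_tags = set(counts_a) | set(counts_b)

    # Smoothed totals: clones sequenced plus one pseudocount per distinct tag.
    total_a = len(library_a) + pseudocount * len(all_tags)
    total_b = len(library_b) + pseudocount * len(all_tags)

    ratios = {}
    for tag in all_tags:
        freq_a = (counts_a[tag] + pseudocount) / total_a
        freq_b = (counts_b[tag] + pseudocount) / total_b
        ratios[tag] = freq_b / freq_a
    return ratios


# Illustrative data only: ten clones per library, two hypothetical genes.
healthy = ["geneA"] * 8 + ["geneB"] * 2
diseased = ["geneA"] * 2 + ["geneB"] * 8
for tag, ratio in sorted(frequency_ratios(healthy, diseased).items()):
    print(tag, round(ratio, 2))
```

A ratio well above 1 suggests that the tag is more abundant in the second library, and a ratio well below 1 suggests the opposite. With so few clones the ratios are only indicative, which reflects why the approach requires sequencing many clones to yield reliable quantitative information.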
A number of techniques for gene expression analysis have been described. For example, U.S. Pat. No. 5,695,937 describes serial analysis of gene expression (SAGE) in which short cDNA sequences are first prepared from mRNAs. They are then dimerized and multimerized and, after cloning, manually sequenced. The disadvantage of this method is that only a small part (<20 bp) of the cDNA may generally be cloned and identified by sequencing.
Another technique is described in U.S. Pat. No. 5,459,037. This patent describes a method for simultaneous sequence-specific identification of mRNAs in an mRNA population in which a primer mixture is used to synthesize corresponding cDNAs. The cDNAs are in turn transcribed into cRNAs with the aid of RNA polymerases, and PCR amplification is then carried out. The expression pattern is analyzed by comparing the intensities of the bands. The disadvantage of this method is that the PCR step frequently gives erroneous results.
U.S. Pat. No. 5,712,126 describes the selective PCR amplification of the 3' ends of cDNA fragments. This technique does not use a primer mixture; instead, 12 different cDNA syntheses are carried out, which adds corresponding complexity. Moreover, the expression patterns are analyzed by comparing the intensities of the bands, with a corresponding range of error.
Another problem in the analysis of gene expression patterns is that cDNA libraries generally contain a high percentage of clones containing only incomplete or no cDNAs. These reduce the analysis throughput and may falsify the results of the analysis.