During the 1970s and 1980s, a number of techniques became available for identifying and isolating nucleic acids that encode proteins. Such techniques, or cloning methods, traditionally involved lengthy multistep processes that relied on knowledge of protein structure, on the availability of an antibody specific to the protein, or on the ability to isolate an mRNA species corresponding to the protein. The traditional cloning methods were labor-intensive and were highly dependent on the abundance of the protein or nucleic acid within the cell.
U.S. Pat. No. 4,675,285 discloses a cloning method which is based solely on the ability to detect expression of a protein in an assay. In this method, a cDNA library is prepared from a cell that expresses the desired protein. The cDNA library is then inserted into an isolation (or transient) expression vector plasmid which is capable of directing DNA replication in bacterial cells in addition to being capable of directing both DNA replication and protein expression in mammalian cells. The isolation expression vector plasmid containing the cDNA library is transformed into bacteria, and individual colonies of bacteria containing the plasmids are obtained by diluting the bacterial culture and spreading the dilutions onto a solid surface of nutrient agar contained within a Petri dish or plate. By virtue of the dilutions, single bacteria can adhere to the agar surface and can be grown to yield discrete and easily identified colonies. Each bacterial colony contains a single isolation expression vector/cDNA plasmid. In the method of U.S. Pat. No. 4,675,285, this step yields about 2000 bacterial colonies per plate.
In the method of U.S. Pat. No. 4,675,285, the bacterial colonies are then lifted onto nitrocellulose master filters and replica plated. A predetermined number of bacterial colonies on each replica filter is combined to form a heterogeneous pool of colonies, and the plasmid DNA is isolated from each pool. The plasmid DNA is then transfected or microinjected into a mammalian host cell and the proteins encoded by the cDNAs are expressed. Expression of a particular desired protein is detected using a detection system or assay. However, for the assay to be sensitive and specific, the cell type used to express the desired protein must itself be devoid of the activity of interest, which severely limits the range of assays that can be used to detect the desired protein.
In the method of U.S. Pat. No. 4,675,285, the number of bacterial colonies per heterogeneous pool is determined on the basis of the yield of protein from the mammalian host cell and the sensitivity of the detection system or assay used to detect the expressed product. A tenfold reduction factor is applied to compensate for variability in the size of the individual bacterial colonies within the heterogeneous pools and for day-to-day variability in growth of the mammalian host cells. Using this method for predetermining pool sizes, U.S. Pat. No. 4,675,285 discloses detection of single colonies containing cDNAs encoding several desired proteins within pools of 500-1000 heterogeneous colonies. U.S. Pat. No. 4,675,285 discloses that isolation of the plasmid DNA from each individual bacterial colony is generally impractical, since large numbers of cDNA clones generally must be assayed in order to isolate a particular cDNA, especially if the cDNA corresponds to a rare mRNA.
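The pool-size reasoning above can be illustrated with a short calculation. The following is only a sketch under stated assumptions, not the procedure of U.S. Pat. No. 4,675,285: it assumes that the signal from one positive clone is diluted in proportion to the number of colonies in the pool, and it uses the standard library-screening estimate N = ln(1 - P) / ln(1 - f) for the number of clones needed to find a cDNA of frequency f with probability P. The function names, parameter names, and example values are hypothetical.

```python
import math


def pool_size(yield_pure_clone: float, detection_limit: float,
              reduction_factor: float = 10.0) -> int:
    """Estimate how many colonies can be combined in one pool.

    Assumption (not from the patent): signal from a single positive
    clone falls linearly with the number of colonies in the pool.
    yield_pure_clone and detection_limit are in the same assay units.
    """
    if yield_pure_clone <= 0 or detection_limit <= 0 or reduction_factor <= 0:
        raise ValueError("all arguments must be positive")
    # Largest dilution at which one positive clone is still detectable.
    max_dilution = yield_pure_clone / detection_limit
    # Apply the reduction factor (tenfold in the text) as a safety margin.
    return max(1, math.floor(max_dilution / reduction_factor))


def clones_to_screen(frequency: float, probability: float = 0.99) -> int:
    """Clones needed to include a cDNA of the given frequency with the
    given probability, using N = ln(1 - P) / ln(1 - f)."""
    if not 0 < frequency < 1 or not 0 < probability < 1:
        raise ValueError("frequency and probability must lie in (0, 1)")
    return math.ceil(math.log(1 - probability) / math.log1p(-frequency))


def pools_to_screen(frequency: float, colonies_per_pool: int,
                    probability: float = 0.99) -> int:
    """Number of pools to assay for the same coverage."""
    return math.ceil(clones_to_screen(frequency, probability) / colonies_per_pool)


if __name__ == "__main__":
    # Hypothetical values: a pure clone gives 10,000 assay units and the
    # assay detects 1 unit, so up to 1000 colonies can be pooled.
    size = pool_size(10_000, 1)
    # Hypothetical rare mRNA: 1 in 100,000 clones.
    print(size, clones_to_screen(1e-5), pools_to_screen(1e-5, size))
```

Under these illustrative numbers, a cDNA present at one in 100,000 clones requires several hundred thousand colonies, which is why assaying pools of hundreds of colonies rather than individual colonies is attractive.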
Many known expression cloning methods, including that of U.S. Pat. No. 4,675,285, are dependent on transient gene expression in a mammalian host cell. Cells which support transient gene expression allow extrachromosomal replication of the isolation expression vector/cDNA plasmid, resulting in high copy numbers of the plasmid and correspondingly high levels of protein expressed from the cell. Most commonly, the African green monkey kidney COS cell line is used for this purpose. As set forth in U.S. Pat. No. 4,675,285, COS cells exhibit variability in their day-to-day behavior which can affect protein expression levels and thus the outcome of the expression cloning experiments. In addition, transient gene expression in cells requires use of large amounts of highly purified cDNA to transfect the cells, making the technique laborious and expensive. Furthermore, the technique is time-consuming, since the cells require several days of growth to express detectable levels of protein. Successful expression of a particular gene in COS cells is not guaranteed, especially if expression of the gene is subject to strict regulatory controls derived from the cell in which it naturally occurs, or if the gene is not derived from a mammal. Many transfectable eukaryotic cells do not support transient gene expression, and thus use of transient gene expression limits the number of genes which may be identified by the expression cloning approach.
In vitro expression systems contain little or no endogenous mRNA, allowing specific labeling of the proteins encoded by cDNAs added to the systems. Such systems also lack many of the biological activities that can be present in cells, allowing a broad range of assays to be employed to detect expressed proteins. Although use of in vitro expression of the isolation expression vector/cDNA is suggested in U.S. Pat. No. 4,675,285, this technique has not been generally adopted by those of skill in molecular biology for expression cloning. A principal reason for this is that the number of heterogeneous cDNAs in the large pools of U.S. Pat. No. 4,675,285 would not yield discrete and easily detectable expressed proteins in an in vitro translation system.