The present invention relates generally to the field of molecular biology and in particular to the creation and use of gene libraries containing cloned cDNAs that encode expressed genes.
A common practice in molecular biology is to create “gene libraries,” which are collections of cloned fragments of DNA that represent genetic information in an organism, tissue or cell type. To construct a library, desired DNA fragments are prepared and inserted by molecular techniques into self-replicating units generally called cloning vectors. Each DNA fragment is therefore represented as part of an individual molecule, which can be reproduced in a single bacterial colony or bacteriophage plaque. Individual clones of interest can be identified by various screening methods, and then grown and purified in large quantities to allow study of gene organization, structure and function.
Only a small fraction of the genetic information for an organism is actually used in an individual cell or tissue at a particular time. A cDNA library is a type of gene library in which only DNA for actively expressed genes is cloned. These active genes can be selectively cloned over silent genes because the DNA for active genes is transcribed into messenger RNA (mRNA) as part of the pathway by which proteins are made. RNA molecules are polar in nature, i.e. the constituent nucleoside bases are linked via phosphodiester bonds between the 3′ ribosyl position of one nucleoside and the 5′ ribosyl position on the following nucleoside. RNA is synthesized in the 5′ to 3′ direction, and mRNAs are read by ribosomes in the same direction, such that proteins are synthesized from N-terminus to C-terminus. Over the past decade, cDNA libraries have become the standard source from which thousands of genes have been isolated for further study.
cDNA libraries may be expression libraries, whereby the cDNAs are transcribed and translated, resulting in the production of polypeptides corresponding to mRNA-encoded proteins. The activity of cDNA expression products may be assayed, and the function of corresponding mRNAs and proteins encoded thereby may be determined.
Full length cDNA, which comprises the entire open reading frame (ORF) of an mRNA, is desirable for many applications. Alternatively, partial cDNA and cDNA fragments are useful in some applications, for example, identifying functional domains within proteins. Interestingly, microdomains can exert unique biological effects compared to the parental molecules from which they are derived (Lorens et al., Mol. Therapy, 1:438-447, 2000). The ability to express protein microdomains can be a powerful means to subtly perturb cellular physiology in ways that reveal new paths for therapeutic intervention.
The use of retroviruses is desirable for the stable transduction of genetic material into host cells, particularly host cells which are poorly transfectable, such as myoblasts and lymphocytes.
One object of the present invention is to provide methods and compositions for stably expressing genetic effectors, comprising random cDNAs, in host cells.
An additional object of the invention is to provide methods and compositions to screen for genetic effectors, comprising random cDNAs, that alter cell phenotype in a desirable way.
The present invention provides methods and compositions for producing directional random cDNA libraries. Directional random cDNA libraries comprising pluralities of directional random cDNA expression vectors, and methods of using these libraries, are also provided.
In one aspect of the invention, directional random cDNA expression vector libraries are provided. Each library comprises a plurality of directional random cDNA expression vectors. In a preferred embodiment, libraries comprising expression vectors with random cDNA in sense orientation are provided. In another embodiment, libraries comprising expression vectors with random cDNA in antisense orientation are provided. In another embodiment, libraries comprising a mixture of expression vectors with random cDNAs in sense orientation and antisense orientation are provided. As discussed below, the methods provided herein for making random cDNA libraries involve the directional cloning of random cDNAs into expression vectors. Accordingly, the orientation of a random cDNA in each vector is predetermined, facilitating construction of sense libraries, antisense libraries, and mixtures thereof. Such a scheme provides for the expression of antisense nucleic acid and nucleic acid corresponding in sequence to mRNA, as desired.
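By way of illustration only, the relationship between sense and antisense orientation described above can be sketched in a few lines of Python. The function names and the example sequence are hypothetical and are not part of the disclosed methods; the sketch assumes only that an antisense insert reads, 5′ to 3′ downstream of the transcriptional regulatory sequence, as the reverse complement of the sense cDNA.

```python
# Illustrative sketch only; names and sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def insert_as_cloned(cdna: str, orientation: str) -> str:
    """Return the insert as read 5' to 3' downstream of the promoter.

    'sense' keeps the mRNA-like strand; 'antisense' places its
    reverse complement in that position instead.
    """
    if orientation == "sense":
        return cdna.upper()
    if orientation == "antisense":
        return reverse_complement(cdna)
    raise ValueError("orientation must be 'sense' or 'antisense'")


print(insert_as_cloned("ATGGCC", "sense"))
print(insert_as_cloned("ATGGCC", "antisense"))
```

Because the orientation of each insert is predetermined by the directional cloning step, a library may be modeled as a collection of inserts that share a single orientation value, or as a mixture of the two.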
It will be understood that the cDNA libraries of the present invention comprise vectors, which comprise random cDNAs, which random cDNAs are directionally positioned in expression vectors in sense orientation, or antisense orientation. These libraries are sometimes referred to herein as directional random cDNA libraries. For the ease of description, the terms “directional” and “random” will often be omitted when referring herein to these libraries and methods of making the same.
In a preferred embodiment, the present invention provides cDNA expression vector libraries, each comprising a plurality of expression vectors, each vector comprising a) a first nucleic acid comprising a cDNA; b) a second nucleic acid which is a fusion partner; and c) a transcriptional regulatory sequence recognized by a host cell, wherein the first and second nucleic acids form a fusion nucleic acid which is operably linked to the transcriptional regulatory region (sometimes referred to herein as a transcriptional regulatory sequence). In some embodiments, the vectors also comprise a translational regulatory region (sometimes referred to herein as a translational regulatory sequence or start site) which forms part of the fusion nucleic acid and initiates translation of the fusion nucleic acid.
Preferred cDNAs for use in the present invention comprise sequences complementary to complete or near complete 5′ mRNA ends, including native translational start sites, which facilitate translation of cDNA-encoded transcript in a host cell.
Other cDNAs may be used, however, as will be appreciated by those in the art. For example, cDNAs lacking native translation start sequences, and comprising sequences complementary to 3′ mRNA ends, also find use in some embodiments of the present invention.
In a preferred embodiment, the fusion partner encodes a detectable protein. In a preferred embodiment, the detectable protein is an autofluorescent protein. In a further preferred embodiment, the autofluorescent protein is a green fluorescent protein (GFP). In a further preferred embodiment, the autofluorescent protein is a GFP from Aequorea, or one of the well-known variants thereof including red fluorescent protein (RFP), blue fluorescent protein (BFP), and yellow fluorescent protein (YFP). In another further preferred embodiment, the autofluorescent protein is a GFP from Renilla. In another further preferred embodiment, the autofluorescent protein is a GFP from Ptilosarcus. In another preferred embodiment, the autofluorescent protein is a GFP homologue from Anthozoa species (Matz et al., Nat. Biotech., 17:969-973, 1999).
In a preferred embodiment, the first nucleic acid is fused to the 5′ end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid wherein cDNA-encoded sequence is located at the 5′ end and nucleic acid sequence encoding detectable protein is located at the 3′ end. Expression products also include a fusion protein that comprises an N-terminal polypeptide encoded by cDNA and a C-terminal polypeptide which is a detectable protein moiety. In embodiments where cDNA is inserted in antisense orientation, the expression products include a fusion nucleic acid wherein antisense nucleic acid is located at the 5′ end and nucleic acid sequence encoding detectable protein is located at the 3′ end.
In a preferred embodiment, the expression vector does not comprise a heterologous translation start site for the initiation of cDNA transcript translation.
In another embodiment, the expression vector comprises a heterologous translation start site for initiating translation of a cDNA transcript. In embodiments where cDNA is in antisense orientation, the heterologous translation start site provides for the translation of antisense cDNA transcripts. In embodiments where cDNA is in sense orientation, cDNA transcripts may be translated in frame or out of frame, depending on the positioning of the cDNA relative to the heterologous translation start site. cDNAs translated out of frame, and cDNA antisense transcripts, encode what are herein referred to as “random peptides”.
Translation of cDNA transcripts out of frame may present internal “stop” codons (TAA, TGA, TAG), interrupting or inhibiting cDNA translation. Stop codons may also be encountered in antisense transcripts. For clarity of description, the occurrence of internal translational “stop” codons within cDNA antisense transcripts and cDNAs translated out of frame is not treated in every relevant embodiment discussed herein, though it is understood that such “stop” codons may occur.
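The effect of reading frame on the occurrence of internal stop codons may be illustrated with the following minimal Python sketch. The function names and the example sequence are hypothetical and serve only to illustrate the point; the sketch assumes that the frame is fixed by the offset of the cDNA relative to the translation start site, and it scans the DNA sequence for the codons TAA, TGA and TAG named above.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
STOP_CODONS = {"TAA", "TGA", "TAG"}


def first_stop(seq, frame):
    """Return the 0-based index of the first stop codon in `frame`, or None."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stops_by_frame(seq):
    """Map each of the three forward reading frames to its first stop codon."""
    return {frame: first_stop(seq, frame) for frame in range(3)}


# Frame 0 reads ATG GCT AAG GCC without interruption, whereas
# frame 2 reads GGC TAA and terminates at the internal TAA.
print(stops_by_frame("ATGGCTAAGGCC"))
```

An antisense transcript can be examined in the same manner by passing the reverse complement of the cDNA sequence to `stops_by_frame`.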
In one embodiment, the first nucleic acid is fused to the 3′ end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid wherein cDNA-encoded sequence is located at the 3′ end and nucleic acid sequence encoding detectable protein is located at the 5′ end. Expression products may also include a fusion protein that comprises a C-terminal polypeptide encoded by cDNA and an N-terminal polypeptide which is a detectable protein moiety. Some cDNAs will be translated in frame while others will translate out of frame, encoding what are herein referred to as “random peptides”. In embodiments where cDNA is in antisense orientation, the expression products include a fusion nucleic acid wherein antisense nucleic acid is located at the 3′ end and nucleic acid sequence encoding detectable protein is located at the 5′ end. In addition, antisense transcripts may be translated, yielding fusion proteins comprising an N-terminal polypeptide which is a detectable protein moiety and a C-terminal peptide which is encoded by antisense cDNA transcript.
In another embodiment, the first nucleic acid is positioned within the second nucleic acid (i.e., the second nucleic acid comprises the first nucleic acid). Expression products of such vectors include fusion nucleic acids wherein cDNA-encoded sequence is located within nucleic acid sequence encoding detectable protein. Expression products also include fusion proteins that comprise cDNA-encoded peptides within detectable proteins, preferably in the surface-exposed loop region of a detectable protein, as described herein. Some cDNAs will be translated in frame while others will translate out of frame, encoding what are referred to herein as random peptides. In embodiments where cDNA is inserted in antisense orientation, the expression products include fusion nucleic acids wherein antisense nucleic acid is located within nucleic acid sequence encoding detectable protein. In addition, antisense nucleic acids may be translated if stop codons are not encountered, yielding fusion proteins that comprise antisense-encoded peptide within detectable protein.
In a preferred embodiment, expression vectors additionally comprise a third nucleic acid sequence, referred to herein as a linker, which is interposed between the first and second nucleic acids. In this embodiment, the linker may encode a linking peptide that joins cDNA encoded peptide to the detectable protein moiety in a fusion protein. Alternatively, as outlined, the linker may be a separation sequence that provides for the expression of separate cDNA encoded peptide and detectable protein moieties.
In a preferred embodiment, the linker connecting the first and second nucleic acids comprises an internal ribosome entry site (IRES). Such a linker may be used to fuse the first nucleic acid to the 5′ end or the 3′ end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid and two separate polypeptides translated from a fusion nucleic acid, particularly a first polypeptide which is encoded by a cDNA, and a second polypeptide which is a detectable protein.
In another embodiment, the linker connecting the first and second nucleic acids comprises a cleavage site. Such a linker may fuse the first nucleic acid to the 5′ end or the 3′ end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid, and a fusion protein wherein the cDNA-encoded polypeptide moiety and the detectable protein moiety are separated by an intervening cleavage site which is a polypeptide sequence that is recognized by a protease. This site provides for cleavage of the covalent peptide linkage which fuses the cDNA-encoded polypeptide moiety to the detectable protein moiety in the fusion protein and thereby provides for the expression of two separate polypeptides.
In another embodiment, the linker comprises a 2a sequence. Such a linker may fuse the first nucleic acid to the 5′ end or the 3′ end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid and two separate polypeptides translated from a fusion nucleic acid, particularly a first polypeptide which is encoded by a cDNA, and a second polypeptide which is a detectable protein.
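The linker embodiments described above differ chiefly in the polypeptide products they yield. The following Python sketch summarizes that relationship as a simple lookup. The class and function names are hypothetical labels chosen for illustration, and the sketch restates only what is said above about each linker type; it is not a model of the underlying biochemistry.

```python
# Illustrative sketch only; the names below are hypothetical labels.
from enum import Enum
from typing import List


class Linker(Enum):
    LINKING_PEPTIDE = "linking peptide"
    IRES = "internal ribosome entry site"
    CLEAVAGE_SITE = "protease cleavage site"
    SEQ_2A = "2a sequence"


def final_polypeptides(linker: Linker) -> List[str]:
    """List the polypeptide products expected for a given linker type."""
    if linker is Linker.LINKING_PEPTIDE:
        # The linking peptide joins both moieties in a single fusion protein.
        return ["cDNA-encoded peptide + linking peptide + detectable protein"]
    # IRES, cleavage-site and 2a linkers each provide two separate polypeptides.
    return ["cDNA-encoded polypeptide", "detectable protein"]


def made_as_fusion_protein(linker: Linker) -> bool:
    """True when a fusion protein is produced first (kept fused or cleaved later)."""
    return linker in (Linker.LINKING_PEPTIDE, Linker.CLEAVAGE_SITE)


for linker in Linker:
    print(linker.value, final_polypeptides(linker), made_as_fusion_protein(linker))
```

In each case a single fusion nucleic acid is transcribed; the linker determines only whether the cDNA-encoded moiety and the detectable protein remain joined at the polypeptide level.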
In a preferred embodiment, cDNA expression vectors comprise a fusion partner, in addition to the second nucleic acid encoding a detectable protein. The fusion partner may be fused or linked to the first or second nucleic acid, or both.
In some embodiments, the second nucleic acid is a fusion partner other than a fusion partner encoding a detectable protein.
In some especially preferred embodiments, the cDNA expression vectors provided are retroviral vectors. Accordingly, retroviral cDNA expression vectors and libraries comprising the same are provided herein. In a preferred embodiment, retroviral vectors comprising random cDNAs which are operably linked to transcriptional regulatory sequence in sense orientation are provided. In another embodiment, retroviral vectors comprising random cDNAs which are operably linked to transcriptional regulatory sequence in antisense orientation are provided. In another embodiment, libraries comprising a mixture of retroviral vectors with random cDNAs in sense orientation and antisense orientation are provided.
In a preferred embodiment, the present invention provides retroviral expression vector libraries, each comprising a plurality of retroviral expression vectors, each vector comprising a) a first nucleic acid comprising a cDNA; b) a second nucleic acid which is a fusion partner; and c) a transcriptional regulatory sequence recognized by a host cell, wherein the first and second nucleic acids form a fusion nucleic acid which is operably linked to the transcriptional regulatory region. In some embodiments, the vectors also comprise a translational regulatory region which forms part of the fusion nucleic acid and initiates translation of the fusion nucleic acid.
In a preferred embodiment, the retroviral cDNA expression vectors provided herein comprise a self-inactivating 3′ long terminal repeat (LTR) region which is located 3′ of the first and second nucleic acids. These vectors are sometimes referred to as SIN vectors.
In a preferred embodiment, the retroviral cDNA expression vectors provided herein comprise a tetracycline-inducible (tet-inducible) promoter with an orientation opposite to the LTR and are SIN vectors. Preferred tet-inducible promoters comprise multiple copies of the tet operator operably linked to a minimal human cytomegalovirus (CMV) promoter (for example, see Gossen et al., PNAS 89:5547-5551, 1992).
In one aspect of the present invention, methods for producing random cDNA expression vectors, and libraries comprising the same, are provided. The methods involve the directional cloning of random cDNAs into expression vectors using particular adaptors and cloning sites, described below. In a preferred embodiment, the expression vectors are retroviral expression vectors. Accordingly, in a preferred embodiment, methods for producing retroviral random cDNA expression vectors, and libraries comprising the same, are provided.
In one aspect of the present invention, methods of screening for a bioactive agent capable of altering the phenotype of a cell in a desirable way are provided. In a preferred embodiment, the methods comprise the steps of a) introducing a cDNA expression vector library into a plurality of cells; and b) screening the plurality of cells for a cell exhibiting a phenotype which is altered in a desirable way, wherein the altered phenotype is due to the expression of a cDNA. The methods may also comprise any of the steps of c) isolating at least one cell exhibiting an altered phenotype; d) isolating a nucleic acid comprising the cDNA from the cell exhibiting an altered phenotype; e) identifying the bioactive agent; and f) identifying and/or isolating the molecule(s) to which the agent binds. Additionally, in some preferred embodiments, the methods involve stimulating the plurality of cells in a manner known to produce a disease-like response or a phenotype of the disease process. In an especially preferred embodiment, retroviral cDNA libraries provided herein are used.
In another preferred embodiment of this aspect of the invention, the methods comprise the steps of a) introducing a cDNA expression vector library into a first plurality of cells; b) contacting the first plurality of cells with a second plurality of cells; and c) screening the second plurality of cells for a cell exhibiting a phenotype which is altered in a desirable way, wherein the altered phenotype is due to contact with the first plurality of cells and expression of cDNA in the first plurality of cells. The method may also comprise any of the steps of d) isolating a cell from the first plurality of cells which is contacted with at least one cell in the second plurality of cells exhibiting an altered phenotype; e) isolating a nucleic acid comprising the cDNA from the cell isolated from the first plurality of cells; f) identifying the bioactive agent; and g) identifying and/or isolating the molecule(s) to which the agent binds. In an especially preferred embodiment, retroviral cDNA libraries provided herein are used.
In preferred embodiments of this aspect of the invention, methods of screening for bioactive agents capable of modulating the following physiological processes or biochemical activities are provided: IgE production in B cells; mast cell activation by IgE binding; mast cell degranulation; B cell activation and antibody secretion in response to antigen receptor stimulation; T cell activation in response to antigen receptor stimulation; epithelial cell activation; E3 ubiquitin ligase activity; inflammation induced by E3 ubiquitin ligase activity; inflammation induced by TNF activity; apoptosis in activated T cells; angiogenesis; uncontrolled cell proliferation; uncontrolled cell proliferation mediated by E3 ubiquitin ligase activity; and translation of Hepatitis C-encoded proteins.
Bioactive agents interact with target molecules to modulate cell phenotype. Provided herein are methods for isolating and identifying a target molecule using either the cDNA insert of a cDNA expression vector or an expression product thereof, including nucleic acids and polypeptides. Target molecules may be used to characterize signaling pathways, provide lead compounds for pharmaceutical development, and to screen for bioactive agents, including small molecule chemical compounds, capable of modulating target molecule activity.