This invention relates to cloning systems using marker inactivation for the identification of recombinants containing the insertion of a nucleic acid molecule. More particularly, the present invention relates to lacZα gene fragments having improved accuracy and reliability in detecting the insertion of a nucleic acid molecule therein.
The industrial applications of genetic engineering are becoming evident in the production of pharmaceuticals, of foods having improved properties, and of chemical products (including enzymes) to facilitate manufacturing processes. The process of genetic engineering may begin by cloning a gene of interest which encodes a protein with the desired properties for the particular industrial application. Typically, cloning a gene is done by either breaking up a genome into manageably sized fragments, or generating cDNA fragments from isolated mRNA, and then cloning those genomic or cDNA fragments into a vector and introducing the resultant recombinant vectors into a competent host cell. Commonly used methods for screening transformants, to identify a transformant that contains a recombinant vector with a nucleic acid molecule inserted therein, include marker inactivation systems, which utilize various indicator or reporter genes including lacZ or lacZα, galK, the gene for chloramphenicol acetyltransferase, the gene for the green fluorescent protein (GFP) and mutant forms thereof (see Cubitt et al., 1995, Trends in Biochem. 20:448-455), the gene for luciferase and the like; and positive selection systems, which utilize lethal genes including ccdB (Bernard et al., 1994, Gene 148:71-74), the gene for mouse transcription factor GATA-1 (Trudel et al., 1996, BioTechniques 20:684-693), the gene for thymidine kinase, the gene for β-lactamase and the like.
The lac operon marker inactivation system is employed in one of the most widely used color selection systems for plasmids and single-stranded DNA (ssDNA) vectors (see, e.g., Messing et al., 1977, Proc. Natl. Acad. Sci. USA 74:3642-3646; Messing et al., 1981, Nucl. Acids Res. 9:309-321; Messing, 1983, Methods Enzymol. 101:20-78; and Yanisch-Perron et al., 1985, Gene 33:103-119). Essentially, the lac operon marker inactivation system functions by intracistronic complementation between the α-peptide encoded by the lacZα gene fragment, and a β-galactosidase molecule that most commonly carries a deletion of amino acids 12 through 42.
lacZα is a gene fragment, comprising the proximal portion of the Escherichia coli lacZ gene, which encodes approximately 60 of the amino terminal amino acids of the β-galactosidase polypeptide chain. The encoded product, the “α-peptide”, complements the defective activity of the gene product of lacZM15, an allele that carries a spontaneous deletion of the codons for amino acids 12 through 42 of β-galactosidase. Thus, to identify a transformant that contains a recombinant vector with a nucleic acid molecule inserted therein, a vector having a cloning site in the lacZα gene fragment is introduced into a host cell expressing a β-galactosidase having a deletion of amino acids 12 through 42. Transformants, presumably containing vector carrying an intact lacZα gene fragment, produce blue colonies or plaques when applied onto media containing a chromogenic β-galactosidase substrate. This is because functional β-galactosidase activity is achieved by complementation between the α-peptide and a β-galactosidase molecule carrying the deletion, thereby cleaving a chromogenic substrate such as 5-bromo-4-chloro-3-indolyl-β-D-galactoside (“X-gal”) to produce deep blue dibromodichloroindigo. In contrast, transformants containing vector carrying a lacZα gene fragment having an insertion produce colorless (white) colonies or plaques when similarly plated. Colorless colonies result when the inserted nucleic acid molecule interrupts expression of the lacZα gene fragment so that the complementing α-peptide is not produced.
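By way of illustration only, the idealized color outcome described above may be sketched as a simple decision rule. The class and field names below are hypothetical and are not part of the disclosure; the sketch ignores partial activity, host background activity, and the other sources of error discussed in this specification.

```python
from dataclasses import dataclass


@dataclass
class Transformant:
    # Hypothetical fields used only for this illustration.
    alpha_peptide_made: bool  # a functional alpha-peptide is expressed from the vector
    host_has_acceptor: bool   # host expresses the deleted beta-galactosidase


def colony_color(t: Transformant) -> str:
    """Idealized screen: blue only when complementation can occur."""
    if t.alpha_peptide_made and t.host_has_acceptor:
        return "blue"
    return "white"


# Intact lacZ-alpha in a complementing host: blue.
print(colony_color(Transformant(True, True)))
# Insert abolishes alpha-peptide production: white.
print(colony_color(Transformant(False, True)))
```

Under this simplified rule a colony is white whenever either partner in the complementation is missing, whether or not an insert is present.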
Currently, all lacZα-based vectors (e.g. Messing et al., 1977, supra; Yanisch-Perron et al., 1985, supra; Guan et al., 1987, Gene 67:21-30; Short et al., 1988, Nucl. Acids Res. 16:7583-7600; Alting-Mees and Short, 1989, Nucl. Acids Res. 17:9494; Evans et al., 1995, BioTechniques 19:130-135; and U.S. Pat. No. 4,766,072) employ the same mechanism for color selection. This mechanism involves placement of restriction sites for insertion of a nucleic acid molecule upstream of the codon for amino acid 7 of β-galactosidase, wherein the inserted nucleic acid molecule (“insert”) results in interference with the expression, but not the activity, of the lacZ α-peptide. The current marker inactivation configuration has the disadvantage that problems arise in the detection of recombinant molecules. More specifically, false positives (white colonies or plaques containing vector not having an insert) and false negatives (colored colonies or colored plaques containing vector that has an insert) may be generated (see, e.g., Messing, 1983, supra; unpublished observations; and Table 2 herein).
Although false positive results are difficult to eliminate owing to the fact that they arise to a large extent out of factors which are extraneous to the selection system itself, these do not generally constitute a problem since they are selected alongside actual positives and are subjected to further scrutiny before their fate is decided. Among the external factors responsible for generating false positives are (i) contamination of restriction and modification enzymes with exonucleases, polymerases or other restriction enzymes; (ii) spontaneous mutations; and (iii) loss of the F′ episome carrying the lacZM15 allele.
False negatives, on the other hand, represent a problem as they are rarely carried forward for further examination and, as a result, are responsible for numerous erroneous conclusions. Such erroneous conclusions include, at least in part, the general phenomenon referred to as “non-clonable sequences”, and the presence of an excessive number of gaps in shotgun DNA sequencing results. False negatives are caused both by extrinsic factors and by factors which are intrinsic to the architecture of the color selection mechanism itself. In the currently available lacZα-based vectors, there are two principal causes of false negatives: (i) in-frame insertion of DNA fragments containing one or more open reading frames; and (ii) reinitiation of translation within the mRNA transcribed from the inserted DNA fragment at any in-frame AUG, GUG or even UUG and CUG preceded by a pseudo Shine-Dalgarno box. Events arising out of either of these two instances result in the synthesis of α-peptides bearing amino-terminal fusions. Since neither amino- nor carboxy-terminal fusions to the α-peptide usually impair its activity (see, e.g., Slilaty et al., 1990, Eur. J. Biochem. 194:103-108), blue colonies or blue plaques indistinguishable from those colonies or plaques produced by vectors not carrying an insert are formed. The number of false negatives produced in like manner is further augmented by the fact that even the less frequent fusions, having diminished levels of α-peptide activity, produce blue colonies or blue plaques due to the hypersensitivity of the X-gal assay system. The hypersensitivity of the X-gal system reflects the fact that very little β-galactosidase activity is needed for a complete color-producing reaction to take place.
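By way of illustration only, the two principal causes of false negatives described above may be sketched as sequence checks on an insert. The sketch assumes that the insert sequence is given as DNA in the sense orientation, that it is placed at a codon boundary so that its first nucleotide begins a codon, and that the α-peptide coding sequence resumes immediately after its last nucleotide. The "pseudo Shine-Dalgarno" motifs and the 20-nucleotide upstream window are arbitrary assumptions chosen for the example, not values taken from this disclosure, and all function names are hypothetical.

```python
import re

# DNA equivalents of the AUG, GUG, UUG and CUG start codons named in the text.
START_CODONS = {"ATG", "GTG", "TTG", "CTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumed purine-rich cores standing in for a "pseudo Shine-Dalgarno box".
SD_LIKE = re.compile(r"AGGA|GGAG|GAGG")
SD_WINDOW = 20  # assumed number of upstream nucleotides searched for the motif


def codons(seq, start):
    """Complete codons of seq read from index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def reads_through_in_frame(insert):
    """Cause (i): insert keeps the reading frame and contains no stop codon."""
    insert = insert.upper()
    return len(insert) % 3 == 0 and not any(
        c in STOP_CODONS for c in codons(insert, 0)
    )


def reinitiation_sites(insert):
    """Cause (ii): start codons in frame with the downstream coding sequence,
    followed by no stop codon, and preceded by an SD-like motif."""
    insert = insert.upper()
    sites = []
    for i in range(len(insert) - 2):
        if insert[i:i + 3] not in START_CODONS:
            continue
        if (len(insert) - i) % 3 != 0:  # not in frame with downstream codons
            continue
        if any(c in STOP_CODONS for c in codons(insert, i)):
            continue
        if SD_LIKE.search(insert[max(0, i - SD_WINDOW):i]):
            sites.append(i)
    return sites


print(reads_through_in_frame("GCTGCAGCT"))
print(reinitiation_sites("TAAGGAGGTTTAAAATGGCTGCA"))
```

An insert flagged by either check would be expected, under these assumptions, to permit synthesis of an α-peptide fusion and therefore to risk a colored colony despite carrying an insert.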
Hypersensitivity of the X-gal assay system is also responsible for another source of false negatives. This source of false negatives arises as a result of β-galactosidase-like activity produced by the ebg locus of the host cell. The ebg (evolved β-galactosidase) operon is located directly across the chromosome from lacZ and codes for an enzyme that has low level β-galactosidase-like activity (Hall et al., 1989, Genetics 123:635-648). In wild-type strains, this enzyme does not have enough activity to allow growth on lactose. However, in typical screening protocols, host cells suspected of being transformants are grown in the presence of an inducer of lacZα gene expression. In such circumstances, the enzyme typically having a low level β-galactosidase-like activity has enough activity in the presence of such inducers (e.g., isopropyl thiogalactoside or “IPTG”) to cleave the chromogenic substrate X-gal, thus yielding bluish colonies, or more frequently white colonies with blue centers (unpublished observations). The effects of the ebg locus on blue color formation, in colonies that otherwise would be white, may be minimized by keeping the incubation of plated cells short (less than 18 hours), or completely eliminated by using hosts carrying a defective ebg locus.
Thus, there is a need for a cloning vector utilizing the lacZα marker inactivation system, wherein the cloning vector is based on a configuration which minimizes the generation of false negatives. Such a novel cloning vector allows for improved accuracy and reliability in detecting the inactivation of the lacZα gene fragment caused by insertion of a nucleic acid molecule. The novel cloning vector may be used for general cloning purposes, as well as for gap-free shotgun sequencing, in facilitating industrial applications of gene isolation, genetic engineering and development of ordered genomic libraries.
In accordance with the present invention, disclosed is a marker inactivation system which utilizes lacZα in a configuration which minimizes the generation of false negatives during screening processes for recombinant clones.
In the development of the vector according to the present invention, it was an unexpected result to find that accurate and reliable inactivation of lacZα occurs only when a nucleic acid molecule is inserted in the region of the lacZα gene fragment that encodes amino acids 8 to 38 of β-galactosidase. Thus, of the amino acids encoded by a lacZα gene fragment, residues corresponding to amino acids 8 to 38 of β-galactosidase have been found to be required for functional α-peptide activity for complementation in vivo.
Thus, in one embodiment of the present invention, the vector has at least one promoter operatively linked to a DNA sequence encoding an α-peptide, wherein the resultant α-peptide is capable of complementation with a defective β-galactosidase molecule (e.g. one that carries a deletion of the amino acids 12 through 42) thereby resulting in β-galactosidase activity. At least one cloning site, and preferably multiple cloning sites cleaved by distinct restriction enzymes, is included within the region of the DNA sequence encoding the α-peptide, wherein the region corresponds to the DNA encoding amino acids 8 to 38 of β-galactosidase as shown in SEQ ID NO:1. As appreciated by one skilled in the art from the disclosure of the present invention, modifying the wild type lacZα gene fragment to encode functional α-peptides having altered codons as well as conservative and/or nonconservative substitutions included within, but not limited to, the region of amino acids 8 to 38 of β-galactosidase, can produce DNA sequences with one or more restriction enzyme sites contained therein. Additional embodiments of the present invention include the inclusion in the vector of other features useful for protein expression and other molecular manipulations including, but not limited to, DNA sequences selected from the group consisting of one or more antibiotic resistance genes or auxotrophic genes to aid in selection of recombinants, a ribosome binding site, regulatory elements, at least one origin of replication (“replicon”), a transcription terminator, at least one phage promoter, a phage origin of replication and combinations thereof.
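By way of illustration only, the region encoding amino acids 8 to 38 may be expressed in nucleotide coordinates by simple codon arithmetic. The sketch below assumes 1-based nucleotide numbering that begins at the first nucleotide of the codon for amino acid 1; the numbering actually used in SEQ ID NO:1 may differ, and the function names are hypothetical.

```python
def codon_span(first_aa, last_aa):
    """1-based inclusive nucleotide span covering codons first_aa..last_aa,
    numbered from the first nucleotide of codon 1 (an assumed convention)."""
    return 3 * (first_aa - 1) + 1, 3 * last_aa


def in_required_region(nt_position, first_aa=8, last_aa=38):
    """True if the nucleotide position lies within the codons for the
    given amino acid range (default: amino acids 8 to 38)."""
    start, end = codon_span(first_aa, last_aa)
    return start <= nt_position <= end


print(codon_span(8, 38))       # (22, 114)
print(in_required_region(10))  # False: within codon 4, upstream of the region
print(in_required_region(60))  # True: within codon 20
```

Under this convention a cloning site positioned anywhere from nucleotide 22 through nucleotide 114 falls within the codons for amino acids 8 to 38, whereas a site upstream of the codon for amino acid 7, as in the earlier vectors discussed above, does not.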
Those skilled in the art will recognize that the teachings provided herein can readily be applied to indicator, marker, reporter, or positive selection genes other than lacZ or lacZα to produce cloning vectors which minimize the generation of false negatives during screening processes for recombinant clones as detailed herein for lacZα.
A preferred plasmid vector constructed in accordance with the present invention, designated pTrueBlue™, was constructed using commercially available plasmids, and using standard methods known to those skilled in the art including restriction enzyme digestion and site-directed mutagenesis.
A preferred phage vector constructed in accordance with the present invention, designated M13TrueBlue™, was constructed using commercially available phage, and using standard methods known to those skilled in the art including restriction enzyme digestion and site-directed mutagenesis.
A preferred bacterial artificial chromosome vector constructed in accordance with the present invention, designated TrueBlue-BAC™, was constructed using commercially available vector and using standard methods known to those skilled in the art including enzyme digestions and ligations.
The vector according to the present invention is utilized by cleaving the vector with at least one restriction enzyme that is specific to at least one selected restriction site which has been introduced in the region corresponding to the DNA encoding amino acids 8 to 38 of β-galactosidase as illustrated in SEQ ID NO:1. A nucleic acid molecule is then cloned into the cleaved vector. The resultant recombinant vectors are introduced into competent host cells, and transformed host cells are then selected for and screened by growth in the presence of a chromogenic substrate (e.g., X-gal or MacConkey agar) which can be acted upon by β-galactosidase. Clones containing vector carrying an intact lacZα gene fragment produce colored colonies or plaques when grown in the presence of media containing a chromogenic β-galactosidase substrate. Clones containing vector carrying a lacZα gene fragment according to the present invention and having an insertion therein produce colorless (white) colonies or plaques when similarly plated.
In a further embodiment of the plasmid vector according to the present invention, the plasmid vector has been designed to provide capabilities for in vitro preparation of RNA probes, creation of nested deletions through ExoIII protection sites, manipulation of large DNA inserts via sites for 8-base cleaving restriction enzymes, preparation of ssDNA, and protein expression.
These and other objects, features, and advantages of the present invention will become apparent from the following drawings and detailed description.