1. Field of the Invention
The invention is specifically directed to efficient, random, simple insertion of a transposon or derivative transposable element into DNA in vivo or in vitro. The invention is particularly directed to mutations in ATP-utilizing regulatory transposition proteins that permit insertion with less target-site specificity than wild-type. The invention encompasses gain-of-function mutations in TnsC, an ATP-utilizing regulatory transposition protein that activates the bacterial transposon Tn7. Such mutations enable the insertion of a Tn7 transposon or derivative transposable element in a non-specific manner into a given DNA segment. Insertion can be effected in plasmid and cosmid libraries, cDNA libraries, PCR products, bacterial artificial chromosomes, yeast artificial chromosomes, mammalian artificial chromosomes, genomic DNAs, and the like. Such insertion is useful in DNA sequencing methods, for genetic analysis by insertional mutagenesis, and alteration of gene expression by insertion of a given genetic sequence.
2. Description of the Background Art
Transposable elements are discrete segments of DNA capable of mobilizing nonhomologously from one genetic location to another. They typically carry sequence information important for two main functions that confer the ability to mobilize: they encode the proteins necessary to carry out the catalytic activity associated with transposition, and they contain the cis-acting sequences, located at the transposon termini, that act as substrates for these proteins. The same proteins can participate in the selection of the target site for insertion.
The selection of a new insertion site is usually not a random process; instead, many transposons show characteristic preferences for certain types of target sites. One broad characteristic that differentiates the wide variety of transposable elements known is the nature of the target site selectivity (1). A component of this selectivity can be the target sequence itself. The bacterial transposon Tn10 preferentially selects a relatively highly conserved 9 bp motif as the predominant site for transposon insertion and less often selects other more distantly related sites in vivo (2). The Tc1 and Tc3 mariner elements of C. elegans insert preferentially at a TA dinucleotide such that each end of the element is flanked by a TA duplication (3) (4) (5). A lower specificity consensus sequence, N-Y-G/C-R-N, has been determined from populations of both in vivo and in vitro insertions for the bacteriophage Mu (7). In contrast to these elements, the bacterial transposon Tn5 exhibits markedly lower insertion site specificity, although some isolated “hotspots” have been detected (8).
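The sequence preferences described above can be illustrated with a short scan for candidate sites. The following is a minimal sketch only: it assumes that the Mu consensus N-Y-G/C-R-N can be written with the standard nucleotide ambiguity codes as NYSRN (S meaning G or C), and the function names are hypothetical. It reports where a motif could match on one strand; it does not predict actual insertion frequencies.

```python
import re

# Standard nucleotide ambiguity codes needed for this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any base
    "Y": "[CT]",    # pyrimidine
    "R": "[AG]",    # purine
    "S": "[CG]",    # G or C
}

def consensus_to_regex(consensus):
    """Translate a degenerate consensus (e.g. 'NYSRN') into a compiled regex.

    A lookahead is used so that overlapping matches are all reported.
    """
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")

def find_sites(sequence, consensus):
    """Return 0-based start positions of every match of the consensus."""
    pattern = consensus_to_regex(consensus)
    return [match.start() for match in pattern.finditer(sequence.upper())]

MU_CONSENSUS = "NYSRN"  # assumed encoding of N-Y-G/C-R-N from the text
TA_SITE = "TA"          # dinucleotide preferred by Tc1 and Tc3

if __name__ == "__main__":
    example = "ATAGTACGAT"  # hypothetical sequence
    print("TA sites:", find_sites(example, TA_SITE))
    print("Mu consensus sites:", find_sites(example, MU_CONSENSUS))
```

Because a loose consensus such as NYSRN matches a large share of all five-base windows, a scan of this kind mainly shows how little such a motif restricts the choice of site compared with a conserved 9 bp motif.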
Another selection mechanism relies on structural features or presence of cellular protein complexes at the target sites. The yeast transposon Ty3 preferentially inserts into the promoters of genes transcribed by RNA polymerase III, responding to signals from cellular proteins TFIIIB and TFIIIC (9).
Understanding how these factors modulate transposase activity to impose target site preferences will lend insight into the spread of transposons and viruses, and may suggest ways to manipulate those target preferences. The bacterial transposon Tn7 is distinctive in that it uses several element-encoded accessory proteins to evaluate potential target DNAs for positive and negative features, and to select a target site (1). Tn7 encodes five genes whose protein products mediate its transposition (10) (11).
Two of the proteins, TnsA and TnsB, constitute the transposase activity, collaborating to execute the catalytic steps of strand breakage and joining (12). The activity of this transposase is modulated by the remaining proteins, TnsC, TnsD, and TnsE, and also by the nature of the target DNA.
TnsC, TnsD, and TnsE interact with the target DNA to modulate the activity of the transposase via two distinct pathways. TnsABC+TnsD directs transposition to attTn7, a discrete site on the E. coli chromosome, at a high frequency, and to other loosely related “pseudo att” sites at low frequency (13). The alternative combination TnsABC+E directs transposition to many unrelated non-attTn7 sites in the chromosome at low frequency (13) (10) (11) and preferentially to conjugating plasmids (14). Thus, attTn7 and conjugable plasmids contain positive signals that recruit the transposon to these target DNAs. The alternative target site selection mechanisms enable Tn7 to inspect a variety of potential target sites in the cell and select those most likely to ensure its survival.
The Tn7 transposition machinery can also recognize and avoid targets that are unfavorable for insertion. Tn7 transposition occurs only once into a given target molecule; repeated transposition events into the same target are specifically inhibited (15) (16). Therefore, a pre-existing copy of Tn7 in a potential target DNA generates a negative signal which renders that target “immune” to further insertion. The negative target signal affects both TnsD- and TnsE-activated transposition reactions and is dominant to any positive signals present on a potential target molecule (16). Several other transposons, such as Mu and members of the Tn3 family, also display this form of negative target regulation (17) (18) (19) (7).
Target selection could be an early or late event in the course of a transposition reaction. For example, a transposon could constitutively excise from its donor position, and the excised transposon could then be captured at different frequencies by different types of target molecules. Tn10 appears to follow this course of events in vitro, excising from its donor position before any interactions with target DNA occur (20) (21). Alternatively, the process of transposon excision could itself be dependent on the identification of a favorable target site. Tn7 transposition shows an early dependence on target DNA signals in vitro: neither transposition intermediates nor insertion products are seen in the absence of an attTn7 target (22). Thus, the nature of the target DNA appears to regulate the initiation of Tn7 transposition in vitro.
An important question is how positive and negative target signals are communicated to the Tn7 transposase. Reconstitution of the TnsABC+TnsD reaction in vitro has provided a useful tool for detailed dissection of Tn7 transposition (22) (23). This reaction has been instrumental in delineating the role that each of the individual proteins plays in target site selection. Dissection of the TnsABC+D reaction in vitro has implicated TnsC as a pivotal connector between the TnsAB transposase and the target DNA. TnsC is an ATP-dependent DNA-binding protein with no known sequence specificity (24). However, TnsC can respond to signals from attTn7 via an interaction with the site-specific DNA-binding protein TnsD. In a standard in vitro transposition reaction TnsD is required for transposition to the attTn7 site on a target DNA molecule. This site-specific insertion process is tightly regulated by TnsC, but does not occur in the absence of TnsD. Additional evidence for a TnsC-TnsD interaction comes from DNA protection and band shift analysis with attTn7 DNA (23). Direct interaction between TnsC and the TnsAB transposase has also recently been observed (25) (26).
Therefore, TnsC may serve as a “connector” or “matchmaker” between the transposase and the TnsD+attTn7 target complex (23) (27). This connection is not constitutive, but instead appears to be regulated by the ATP state of TnsC. Only the ATP-bound form of TnsC is competent to interact with target DNAs and activate the TnsA+B transposase; the ADP-bound form of TnsC has neither of these activities and cannot participate in Tn7 transposition (24) (23). TnsC hydrolyzes ATP at a modest rate (25), and therefore can switch from an active to an inactive state. The modulation of the ATP state of TnsC may be a central mechanism for regulating Tn7 transposition.
The possibility that TnsC regulates the connection between the TnsA+B transposase and the target site prompted the inventor to predict that TnsC mutants can be isolated that would constitutively activate Tn7 transposition.
TnsC therefore became an excellent candidate for mutagenesis, to search for a gain-of-function protein capable of circumventing the requirement for targeting proteins. The inventor therefore identified gain-of-function TnsC mutants which can activate the TnsA+B transposase in the absence of TnsD or TnsE. The inventor has characterized the ability of these mutants to promote insertions into various targets, and to respond to regulatory signals on those targets.
One class of TnsC mutants activates transposition in a way that is still sensitive to target signals, whereas a second class of TnsC mutants activates transposition in a way that appears to bypass target signals. As had been observed in vitro, the critical communication between the transposon and the target DNA appears to be an early event in the Tn7 reaction pathway in vivo, preceding the double-strand breaks at the transposon ends that initiate transposition.
A particular mutant isolated from the random mutagenesis is TnsCA225V, a mutant capable of an impressive activation of Tn7 transposition in the absence of TnsD (25). The single amino acid substitution made to generate TnsCA225V has altered the protein such that it no longer requires an interaction with the target-associated TnsD, enabling it to activate transposition to a variety of target molecules very efficiently (25) (26). The inventor concluded that TnsCA225V could promote transposition to target DNAs with low specificity based on results where transposition driven by the TnsABCA225V machinery was directed to either F plasmids containing an attTn7 site, F plasmids lacking an attTn7 site, or the E. coli chromosome with no apparent preference.
DNA Sequencing
Sequencing DNA fragments cloned into vectors requires provision of priming sites at distributed locations within the fragment of interest, if the fragment is larger than the sequence run length (amount of sequence that can be determined from a single sequencing reaction). At present there are three commonly used methods of providing these priming sites:
A) Design of a new primer from sequence determined in a previous run from vector-encoded primer or other previously determined primer (prime and run, primer walking)
B) Random fragmentation and recloning of smaller pieces, followed by determination of the sequence of the smaller pieces from vector-encoded (universal) priming sites, followed by sequence assembly by overlap of sequence (random shotgun sequencing).
C) Deletion of variable amounts of the fragment of interest from an end adjacent to the vector, to bring undetermined fragment sequence close enough to the vector-encoded (universal) primer to allow sequence determination.
All of these methods have disadvantages.
Method A is time-consuming and expensive because of the delay involved in design of new primers and their cost. Moreover, if the fragment contains DNA repeats longer than the sequence run, it may be impossible to design a unique new primer; sequence runs made with primers within the repeat sequence will display two or more sequences that cannot be disentangled.
Method B requires recloning; random fragmentation is difficult to achieve because fragments that are efficiently clonable (restriction enzyme digestion) do not have ends randomly distributed (Adams, M. D., Fields, C. and Venter, J. C. editors Automated DNA Sequencing and Analysis Academic Press 1994; Chapter 6, Bodenteich, K. et al.), and fragmentation methods that provide randomly distributed ends (shearing, sonication) do not provide DNA ends that are efficiently clonable (with 5′ phosphate and 3′ OH moieties). Sequence assembly is also difficult or impossible when two or more repetitive sequences longer than the sequence run are present in the starting fragment.
Method C depends on providing randomly distributed end points for enzymatically determined deletions. There are many methods for making such deletions (especially those involving exonuclease digestions, typically Exonuclease III), none of which provide entirely random endpoints and which depend on the presence of unique suitable restriction enzyme sites at one or both ends of the cloned fragment. However, because the deletion series in principle allows construction of a map (of nested remaining fragment lengths in deletion derivatives) that is independent of the sequence itself, this method can allow repetitive sequence longer than the sequence run to be located within the fragment at appropriate locations.
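The repeat problem noted for Method A can be made concrete with a uniqueness check on a candidate primer. The following is a hedged sketch, not a primer-design tool: it assumes exact matching only, treats the fragment as a single known string, and uses hypothetical function names. A primer that occurs more than once in the fragment (on either strand) would be expected to give superimposed sequence runs.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))

def count_priming_sites(fragment, primer):
    """Count exact, possibly overlapping occurrences of the primer sequence
    or its reverse complement within the fragment."""
    fragment = fragment.upper()
    total = 0
    # A set avoids double counting when the primer is its own reverse complement.
    for query in {primer.upper(), reverse_complement(primer)}:
        start = fragment.find(query)
        while start != -1:
            total += 1
            start = fragment.find(query, start + 1)
    return total

def is_unique_primer(fragment, primer):
    """True if the primer has exactly one exact-match site in the fragment."""
    return count_priming_sites(fragment, primer) == 1

if __name__ == "__main__":
    # Hypothetical fragment in which ACCTGA is repeated.
    fragment = "ACCTGAGGTCACCTGA"
    print(is_unique_primer(fragment, "ACCTGA"))  # primer inside the repeat
    print(is_unique_primer(fragment, "GAGGTC"))  # primer outside the repeat
```

A priming site carried on an inserted element avoids this check altogether, since the primer anneals to the element rather than to the repeated fragment sequence.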
A method for introduction of universal priming sites at randomly distributed locations within a fragment of interest is therefore a useful advance in sequencing technology.
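How well a set of distributed priming sites covers a fragment can be estimated with a simple calculation. The sketch below rests on stated assumptions that are not taken from the text: each inserted priming site is read outward in both directions for one full run length, reads are error-free, and the function name and example numbers are hypothetical.

```python
import random

def covered_fraction(fragment_length, insertion_sites, run_length):
    """Fraction of fragment positions reached by at least one sequence run.

    Each insertion site is assumed to provide two outward-facing primers,
    so it covers up to run_length bases on each side of the site.
    """
    covered = [False] * fragment_length
    for site in insertion_sites:
        low = max(0, site - run_length)
        high = min(fragment_length, site + run_length)
        for position in range(low, high):
            covered[position] = True
    return sum(covered) / fragment_length

if __name__ == "__main__":
    # Hypothetical example: 40 uniformly random sites in a 5000 bp fragment,
    # with an assumed run length of 500 bases.
    random.seed(0)
    sites = [random.randrange(5000) for _ in range(40)]
    print(covered_fraction(5000, sites, 500))
```

Under this model, sites clustered at a few preferred positions leave gaps that additional insertions do not fill, which is why the randomness of the insertion distribution matters as much as the number of insertions.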
Transposition and the sequencing problem.
Previous efforts have been made to provide distributed priming sites by means of transposable elements. These methods have fallen short of this goal in three ways: first, the transposable elements have not provided a sufficiently random distribution of priming sites; second, the transposition method (carrying out transposition in vivo, followed by recovery of the targeted DNA and repurification) has been time-consuming and laborious; third, the systems have been prone to produce undesired products. These undesired products include but are not limited to: a) cointegrates (replicon fusions) between the donor of the transposon and the target plasmid; b) insertions in which the two ends of the transposon act at different positions (leading to deletion of the intervening target); c) insertions of multiple copies of the transposon into the target, so that priming from one end of the transposon yields two superimposed sequences. The method has been laborious in two ways: the majority of insertions have been into chromosomal DNA of the host, and even for those insertions into the plasmid the recovery method has entailed loss of independence of insertions. In vitro methods of insertion have suffered from both the non-random location of insertion sites and the undesired products, and also from poor efficiency, so that it has been impractical to obtain large numbers of insertions into the target of interest without excessive labor.
Increasing interest in large-scale sequencing projects and a concomitant search for highly efficient in vitro mutagenesis methods has promoted the adaptation of several in vitro transposon systems as tools to study genomes. An in vivo reaction for the bacterial transposon Tn3 has been used to efficiently sequence plasmid inserts of variable lengths; however, only approximately 37% of the nucleotides were found to be capable of serving as sites for insertion (Davies, 1995). A similar, more random system has been developed for yeast retrotransposon Ty1, employing synthetic transposons with U3 ends as substrates and Ty1 virus-like particles (VLPs) supplying transposition functions (28) to sequence plasmids with yeast and human DNA inserts. A disadvantage of this method is the requirement for the cumbersome preparation of VLPs. In vitro transposition with an MLV integrase system has been utilized as a tool to dissect some of the mysteries of chromatin packaging (29) (30) (31) and as a tool for functional genetic footprinting (32). However, the MLV insertions do not appear to be completely random. An object of the invention therefore is to provide a transposon and transposition reaction with more random target site specificity. Therefore, the inventor examined the target site selectivity of the TnsCA225V machinery in vitro and explored the viability of this reaction as an effective tool for random insertional mutagenesis.
Accordingly, a general object of the invention is to provide a transposable system that achieves efficient, simple, non-specific or random insertion into any given DNA segment.
A further object of the invention is to provide a transposable system that achieves efficient random insertional mutagenesis via simple insertion.
Therefore, a specific object of the invention is to provide a transposable system that achieves efficient target site specificity that is reduced from wild-type and preferably random, via simple insertion.
A more particular object of the invention is to provide a transposon containing a mutation in a transposon-derived protein that allows efficient, simple insertion and target site selectivity that is reduced from the wild-type, and preferably random.
A more particular object of the invention is to provide a transposable system with a mutation in a transposon-derived ATP-utilizing regulatory protein. The mutation allows the efficient, simple, non-specific or random insertion of the transposable element into a DNA segment or at least provides reduced target site specificity from the wild-type.
A preferred object of the invention is to provide a Tn7 transposable system that achieves simple, efficient, non-specific or random insertion into a given DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.
A preferred object of the invention is to provide a mutation in the Tn7 transposon that confers efficient, simple, non-specific insertion into a given DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.
A preferred object of the invention is to provide a Tn7 transposable system with a mutation in the TnsC protein encoded in the Tn7 transposon, which mutation allows efficient, simple insertion with reduced target site specificity compared to the wild-type, and preferably allows non-specific insertion into a DNA segment.
Objects of the invention include methods for using the above compositions.
Accordingly, a general object of the invention is to provide a method for efficient, simple, random insertion of a transposable element into a given DNA segment.
A further object of the invention is to provide a method for efficient, simple, random insertional mutagenesis by a transposable element.
A specific object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment, or in which the specificity of transposition is reduced compared to wild-type.
A more particular object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment in which the specificity of transposition is reduced compared to the wild-type by using a transposable system containing a mutation that confers efficient, simple insertion with reduced target site specificity compared to the wild-type, and preferably random insertion.
A more particular object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment or in which the specificity of transposition is reduced compared to wild-type, by using a transposable system with a mutation in an ATP-utilizing regulatory protein, the mutation allowing the efficient, simple, non-specific insertion of the transposable element into a DNA segment or at least providing for reduced target site specificity compared to the wild-type.
A preferred object of the invention is to provide a method for efficient, simple transposition of a transposable element into a DNA segment in which the specificity of transposition is reduced compared to wild-type, or is preferably random, by providing a Tn7 transposable system that is capable of non-specific insertion into a DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.
A further object of the invention is to provide a method for efficient, simple transposition of a transposable element into a DNA segment in which specificity of transposition is reduced compared to wild-type or is preferably random by providing a Tn7 mutation that confers the efficiency, ability to make a simple insertion, and the randomness or reduced specificity.
A further object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment, or in which the specificity of transposition is reduced compared to the wild-type, by providing a mutation in the TnsC protein encoded in the Tn7 transposon, the mutation allowing a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposable element into a DNA segment.
A further object of the invention is to provide a method for DNA sequencing using a transposable system to introduce priming sites at randomly-distributed locations within a fragment of interest where the fragment is larger than the sequence run length.
A preferred object of the invention is to provide a method for DNA sequencing using a transposable system with a mutation that allows efficient and simple insertion and target site selectivity that is reduced from the wild-type and preferably random.
A preferred object of the invention is to provide a mutation in an ATP-utilizing regulatory protein. The mutation allows the efficient, simple, non-specific insertion of the transposon into a DNA segment or at least provides reduced target site specificity over wild-type.
A highly preferred object of the invention is to provide a method for DNA sequencing using a Tn7 transposable system that allows efficient, simple, non-specific insertion into a DNA segment or at least reduced target site specificity compared to the wild-type Tn7.
A highly preferred object of the invention is to provide a method for DNA sequencing using a Tn7 transposable system with a mutation in the TnsC protein, the mutation allowing efficient, simple insertion and a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposable element into the DNA segment.
A further object of the invention is to provide methods as described above that can be applied to any given DNA segment. These include, but are not limited to, plasmids, cellular genomes, including prokaryotic and eukaryotic, bacterial artificial chromosomes, yeast artificial chromosomes, and mammalian artificial chromosomes, and subsegments of any of these.
An object of the invention is to provide these methods in vitro or in vivo.
A further object of the invention is to provide kits for carrying out the above-described methods using the above-described transposons or parts thereof.
The inventor has accordingly developed a transposable system and methods that improve on in vitro and in vivo transposition methods previously described in that the methods are efficient for transposition, provide relatively random insertion, and almost all products recovered are simple insertions at a single site which thus provide useful information.
In a general embodiment of the invention, the invention is directed to a transposable system that achieves simple, efficient, random insertion into a given DNA segment.
In a further embodiment of the invention, the invention is directed to a transposable system that is capable of efficient random insertional mutagenesis, preferably by means of a simple insertion.
In a specific embodiment of the invention, the invention is directed to a transposable system with target site specificity that is reduced from the wild-type and preferably random, which allows simple and efficient insertion.
In a further specific embodiment of the invention, the invention is directed to a transposable system containing a mutation that allows target site specificity that is reduced from the wild-type and is preferably random.
In a preferred embodiment of the invention, the invention is directed to a transposable system with a mutation in an ATP-utilizing regulatory protein, the mutation allowing the efficient, simple, non-specific insertion of the transposon into a DNA segment or at least providing reduced target site specificity from the wild-type.
In a highly preferred embodiment of the invention, the invention is directed to a Tn7 transposable system that achieves efficient, simple, non-specific insertion into a given DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.
In a highly preferred embodiment of the invention, the invention is directed to a mutation in a Tn7 transposon that confers the capability of efficient, simple, non-specific insertion into a DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.
In a highly preferred embodiment of the invention, the invention is directed to a mutation in the TnsC protein encoded in the Tn7 transposon, the mutation allowing simple, efficient insertion and a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposon into a DNA segment.
In a specific disclosed embodiment of the invention, the invention is directed to a Tn7 mutant designated TnsCA225V, which is a mutant having an alanine to valine substitution at amino acid number 225 of the TnsC protein.
The invention also embodies methods for using all of the above compositions. Methods are directed to transposition or insertion of the transposable elements described above.
Accordingly, in one embodiment, the invention provides generally for efficient, simple, random insertion of a transposon into a given DNA segment, or at least insertion with reduced specificity compared to the wild-type.
In a further embodiment of the invention, the invention is directed to methods for insertional mutagenesis using a transposable system that is capable of efficient, simple, random insertion or at least insertion with reduced specificity compared to wild-type.
In a further embodiment of the invention, the invention is directed to methods for insertion of a transposable element into a DNA segment in which target site specificity is reduced from wild-type and is preferably random, where insertion is efficient and simple.
In a further embodiment of the invention, the invention is directed to methods for insertion of a transposable element into a DNA segment, by providing a transposable element containing a mutation that allows efficient and simple insertion and target site specificity that is reduced from the wild-type and is preferably random.
In a preferred embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment by providing a transposable system with a mutation in an ATP-utilizing regulatory protein, the mutation allowing simple, efficient, and non-specific insertion of the transposon into a DNA segment, or at least providing reduced target site specificity from the wild-type.
In a highly preferred embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment by providing a Tn7 transposable system allowing efficient, simple, non-specific insertion into a given DNA segment or at least reduced target site specificity compared to the wild-type Tn7.
In a highly preferred embodiment of the invention, the invention is directed to a Tn7 transposable system with a mutation that allows simple, efficient, and non-specific insertion of a transposable element into a DNA segment or at least provides reduced target site specificity from the wild-type Tn7.
In a highly preferred embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment by providing a Tn7 transposable system with a mutation in the TnsC protein, the mutation allowing efficient and simple insertion and a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposon into a DNA segment.
In a specific disclosed embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment, by providing the Tn7 mutant TnsCA225V.
The invention also provides kits for performing the above-described methods and the methods further described herein. In a preferred embodiment, a kit is supplied whose components comprise a mutant ATP-utilizing regulatory protein derived from a transposon, the mutation allowing efficient, simple, non-specific insertion of the transposon into a given DNA segment. The kit also provides a transposable element which can be found as part of a larger DNA segment; for example, a donor plasmid. The kit can further comprise a buffer compatible with insertion of the transposable element. The kit can further comprise a control target sequence, such as a control target plasmid, for determining that all of the ingredients are functioning properly. For DNA sequencing, the kit can further comprise sequencing extension primers with homology to one or more sites in the transposable element. Primers can have homology to sequences outside the transposable element (i.e. in a target vehicle).
In the kits, the mutant protein may be added as a purified protein product, may be encoded in the transposable element and produced therefrom, or encoded on vectors separate from the transposable segment, to be produced in vivo.
It is to be understood that the invention encompasses transposable systems with varying degrees of reduction of target site specificity from the wild-type which are useful for the purposes of the invention described herein.