This invention provides for an improved generation of novel nucleic acid modifying enzymes. The improvement is the joining of a sequence-non-specific nucleic-acid-binding domain to the enzyme in a manner that enhances the ability of the enzyme to bind and catalytically modify the nucleic acid.
The efficiency of a nucleic acid modifying enzyme, i.e., the amount of modified product generated by the enzyme per binding event, can be enhanced by increasing the stability of the modifying enzyme/nucleic acid complex. The prior art has suggested that attachment of a high probability binding site, e.g., a positively charged binding tail, to a nucleic acid modifying enzyme can increase the frequency with which the modifying enzyme interacts with the nucleic acid (see, e.g., U.S. Pat. No. 5,474,911). The present invention now provides novel modifying enzymes in which the double-stranded conformation of the nucleic acid is stabilized and the efficiency of the enzyme increased by joining a sequence-non-specific double-stranded nucleic acid binding domain to the enzyme, or its catalytic domain. The modifying proteins that are processive in nature exhibit increased processivity when joined to a binding domain compared to the enzyme alone. Moreover, both processive and non-processive modifying enzymes exhibit increased efficiency at higher temperatures when joined to a typical binding domain described herein.
The present invention provides a protein consisting of at least two heterologous domains wherein a first domain that is a sequence-non-specific double-stranded nucleic acid binding domain is joined to a second domain that is a catalytic nucleic acid modifying domain having a processive nature, where the presence of the sequence-non-specific double-stranded nucleic acid binding domain enhances the processive nature of the nucleic acid modifying domain compared to an identical protein not having a sequence-non-specific nucleic acid binding domain joined thereto. In one aspect of the invention, the nucleic acid modifying domain can have a polymerase activity, which can be thermally stable, e.g., a Thermus polymerase domain. In alternative embodiments, the catalytic domain is an RNA polymerase, a reverse transcriptase, a methylase, a 3′ or 5′ exonuclease, a gyrase, or a topoisomerase.
In a particular embodiment, a sequence-non-specific nucleic acid binding domain of the protein can specifically bind to polyclonal antibodies generated against Sac7d or Sso7d. Alternatively, the sequence-non-specific nucleic acid binding domain can contain a 50 amino acid subsequence that has 50% amino acid similarity to Sso7d. The nucleic acid binding domain can also be Sso7d.
In another embodiment, a protein of the invention contains a sequence-non-specific double-stranded nucleic acid binding domain that specifically binds to polyclonal antibodies generated against a PCNA homolog of Pyrococcus furiosus, or can be a PCNA homolog of Pyrococcus furiosus. 
The invention also provides a protein consisting of at least two heterologous domains, wherein a first domain that is a sequence-non-specific double-stranded nucleic acid binding domain is joined to a second domain that is a catalytic nucleic-acid-modifying domain, where the presence of the sequence-non-specific nucleic-acid binding domain stabilizes the double-stranded conformation of a nucleic acid by at least 1° C. compared to an identical protein not having a sequence-non-specific nucleic acid binding domain joined thereto. The nucleic acid modifying domain of such a protein can have polymerase activity, which can be thermally stable. The nucleic-acid-modifying domain can also have RNA polymerase, reverse transcriptase, methylase, 3′ or 5′ exonuclease, gyrase, or topoisomerase activity.
In further embodiments, the sequence-non-specific nucleic-acid-binding domain can specifically bind to polyclonal antibodies generated against either Sac7d or Sso7d, frequently Sso7d, or contains a 50 amino acid subsequence containing 50% or 75% amino acid similarity to Sso7d. Often, the sequence-non-specific nucleic-acid-binding domain is Sso7d.
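By way of illustration only, the 50 amino acid subsequence criterion could be screened computationally along the following lines. This is a minimal sketch under stated assumptions: it scores exact residue identity over ungapped windows as a simplified stand-in for amino acid similarity (a real similarity measure would typically use a substitution matrix and an alignment algorithm), and the function name `best_window_identity` and the toy sequences are hypothetical, not taken from the specification.

```python
def best_window_identity(candidate: str, reference: str, window: int = 50) -> float:
    """Return the highest fraction of identical residues found between any
    ungapped `window`-residue stretch of `candidate` and any ungapped
    `window`-residue stretch of `reference`.

    Assumption: identity is used as a simplified proxy for similarity.
    Returns 0.0 if either sequence is shorter than `window`.
    """
    best = 0.0
    for i in range(len(candidate) - window + 1):
        cand_win = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            ref_win = reference[j:j + window]
            # Count positions where the two windows carry the same residue.
            matches = sum(a == b for a, b in zip(cand_win, ref_win))
            best = max(best, matches / window)
    return best


# Toy sequences (hypothetical, not real protein sequences).
toy_reference = "A" * 25 + "G" * 25
toy_candidate = "A" * 50
meets_50_percent = best_window_identity(toy_candidate, toy_reference) >= 0.50
```

With the toy inputs above, the single 50-residue window shares 25 identical positions, so the 50% threshold is just met. A 75% threshold would be checked the same way by changing the comparison value.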
Proteins of the invention include a protein wherein the sequence-non-specific nucleic-acid-binding domain specifically binds to polyclonal antibodies generated against the PCNA homolog of Pyrococcus furiosus; often the binding domain is the PCNA homolog of Pyrococcus furiosus. 
In another aspect, the invention provides methods of modifying nucleic acids using the proteins. One embodiment is a method of modifying a nucleic acid in an aqueous solution by: (i) contacting the nucleic acid with a protein comprising at least two heterologous domains, wherein a first domain that is a sequence-non-specific nucleic-acid-binding domain is joined to a second domain that is a catalytic nucleic-acid-modifying domain having a processive nature, where the sequence-non-specific nucleic-acid-binding domain: a. binds to double-stranded nucleic acid, and b. enhances the processivity of the enzyme compared to an identical enzyme not having the sequence non-specific nucleic-acid-binding domain fused to it, and wherein the solution is at a temperature and of a composition that permits the binding domain to bind to the nucleic acid and the enzyme to function in a catalytic manner; and (ii) permitting the catalytic domain to modify the nucleic acid in the solution.
In another aspect, the invention provides a method of modifying a nucleic acid by: (i) contacting the nucleic acid with an aqueous solution containing a protein having at least two heterologous domains, wherein a first domain that is a sequence-non-specific double-stranded nucleic-acid-binding domain is joined to a second domain that is a catalytic nucleic-acid-modifying domain, where the presence of the sequence-non-specific nucleic-acid-binding domain stabilizes the formation of a double-stranded nucleic acid compared to an otherwise identical protein not having the sequence-non-specific nucleic-acid-binding domain joined to it; and, wherein the solution is at a temperature and of a composition that permits the binding domain to bind to the nucleic acid and the enzyme to function in a catalytic manner; and (ii) permitting the catalytic domain to modify the nucleic acid in the solution. The methods of modifying a nucleic acid can employ any of the protein embodiments described herein.
“Archaeal small basic DNA-binding protein” refers to a protein of 50 to 75 amino acids that either has 50% homology to a natural Archaeal small basic DNA-binding protein, such as Sso-7d from Sulfolobus solfataricus, or binds to antibodies generated against a native Archaeal small basic DNA-binding protein.
“Catalytic nucleic-acid-modifying domains having a processive nature” refers to a protein sequence or subsequence that performs as an enzyme having the ability to slide along the length of a nucleic acid molecule and chemically alter its structure repeatedly. A catalytic domain can include an entire enzyme, a subsequence thereof, or can include additional amino acid sequences that are not attached to the enzyme or subsequence as found in nature.
“Domain” refers to a unit of a protein or protein complex, comprising a polypeptide subsequence, a complete polypeptide sequence, or a plurality of polypeptide sequences where that unit has a defined function. The function is understood to be broadly defined and can be ligand binding, catalytic activity or can have a stabilizing effect on the structure of the protein.
“Efficiency” in the context of a nucleic acid modifying enzyme of this invention refers to the ability of the enzyme to perform its catalytic function under specific reaction conditions. Typically, “efficiency” as defined herein is indicated by the amount of modified bases generated by the modifying enzyme per binding to a nucleic acid.
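As a purely illustrative sketch, efficiency in the sense just defined can be expressed as the average number of modified bases per binding event. The function name `efficiency_per_binding` and the input values are hypothetical; how the per-event counts would be measured experimentally is not addressed here.

```python
def efficiency_per_binding(modified_bases_per_event: list[int]) -> float:
    """Average number of modified bases generated per binding event.

    Assumption: each list entry is the number of bases modified during one
    observed enzyme/nucleic acid binding event.
    """
    if not modified_bases_per_event:
        raise ValueError("at least one binding event is required")
    return sum(modified_bases_per_event) / len(modified_bases_per_event)


# Hypothetical counts for three binding events.
example_efficiency = efficiency_per_binding([10, 20, 30])
```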
“Enhances” in the context of an enzyme refers to improving the activity of the enzyme, i.e., increasing the amount of product per unit enzyme per unit time.
“Fused” refers to linkage by covalent bonding.
“Heterologous”, when used with reference to portions of a protein, indicates that the protein comprises two or more domains that are not found in the same relationship to each other in nature. Such a protein, e.g., a fusion protein, contains two or more domains from unrelated proteins arranged to make a new functional protein.
“Join” refers to any method known in the art for functionally connecting protein domains, including without limitation recombinant fusion with or without intervening domains, intein-mediated fusion, non-covalent association, and covalent bonding, including disulfide bonding; hydrogen bonding; electrostatic bonding; and conformational bonding, e.g., antibody-antigen, and biotin-avidin associations.
“Methylase” refers to an enzyme that can modify a nucleic acid by the addition of a methyl group to a nucleotide.
“Nuclease” refers to an enzyme capable of cleaving the phosphodiester bonds between nucleotide subunits of nucleic acids.
“Nucleic-acid-modifying enzyme” refers to an enzyme that covalently alters a nucleic acid.
“Polymerase” refers to an enzyme that performs template-directed synthesis of polynucleotides.
“Processivity” refers to the ability of a nucleic acid modifying enzyme to remain attached to the template or substrate and perform multiple modification reactions. Typically “processivity” refers to the ability to modify relatively long tracts of nucleic acid.
“Restriction Endonuclease” refers to any of a group of enzymes, produced by bacteria, that cleave molecules of DNA internally at specific base sequences.
“Sequence-non-specific nucleic-acid-binding domain” refers to a protein domain which binds with significant affinity to a nucleic acid, for which there is no known nucleic acid which binds to the protein domain with more than 100-fold more affinity than another nucleic acid with the same nucleotide composition but a different nucleotide sequence.
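The 100-fold criterion in this definition can be sketched as a simple check over measured affinities. The following is illustrative only and rests on assumptions not stated in the specification: affinity is represented by a dissociation constant (lower value meaning tighter binding), the fold difference is taken as the ratio of dissociation constants, and the function name `is_sequence_non_specific` and all numerical values are hypothetical.

```python
from collections import defaultdict


def is_sequence_non_specific(kd_by_sequence: dict[str, float],
                             max_fold: float = 100.0) -> bool:
    """Return True if no nucleic acid binds more than `max_fold` times more
    tightly than another nucleic acid of the same nucleotide composition.

    Assumption: values are positive dissociation constants in the same units.
    """
    # Group sequences that share a nucleotide composition (same letters,
    # same counts) regardless of the order of the nucleotides.
    groups = defaultdict(list)
    for sequence, kd in kd_by_sequence.items():
        composition = tuple(sorted(sequence.upper()))
        groups[composition].append(kd)
    # Within each composition group, compare the weakest and tightest binder.
    return all(max(kds) / min(kds) <= max_fold for kds in groups.values())


# Hypothetical measurements: same composition, 5-fold difference in affinity.
non_specific = is_sequence_non_specific({"ACGT": 1e-6, "TGCA": 5e-6})
# Hypothetical measurements: same composition, 1000-fold difference.
specific = not is_sequence_non_specific({"ACGT": 1e-6, "TGCA": 1e-9})
```

Sequences of differing composition are never compared with one another, which mirrors the wording of the definition; a domain with only one measured sequence per composition trivially passes the check.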
“Thermally stable polymerase” as used herein refers to any enzyme that catalyzes polynucleotide synthesis by addition of nucleotide units to a nucleotide chain using DNA or RNA as a template and has an optimal activity at a temperature above 45° C.
“Thermus polymerase” refers to a family A DNA polymerase isolated from any Thermus species, including without limitation Thermus aquaticus, Thermus brockianus, and Thermus thermophilus; any recombinant enzymes deriving from Thermus species, and any functional derivatives thereof, whether derived by genetic modification or chemical modification or other methods known in the art.
FIGS. 1A, 1B, and 1C show the results of PCR amplification reactions performed using primers of different lengths to compare the efficiency of Sso7d-modified polymerase with the unmodified full-length polymerase. FIG. 1A: PCR amplification with a 22 nt forward primer; FIG. 1B: PCR amplification with a 15 nt primer; FIG. 1C: PCR amplification with a 12 nt primer.
FIG. 2 shows the results of a PCR amplification reaction using a 12 nt forward primer to evaluate the PCR products generated using Sac7d-ΔTaq compared to Taq.