The present invention is generally directed to the fields of genetic and protein engineering and molecular biology. In particular, the invention provides methods for identifying and purifying double-stranded polynucleotides lacking base pair mismatches, insertion/deletion loops and nucleotide gaps.
Synthetic oligonucleotides are commonly used to construct nucleic acids, including polypeptide coding sequences and gene constructs. However, even the best oligonucleotide synthesizer has a 1% to 5% error rate. These errors can result in improper base pair sequences, which can lead to generation of erroneous protein sequences. These errors can also result in sequences that cannot be properly transcribed or translated, for example because of premature stop codons. To detect these errors, the oligonucleotides or the sequences generated using the oligonucleotides are sequenced. However, sequencing to detect errors in nucleic acid synthetic techniques is time-consuming and expensive.
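To illustrate why such errors become harder to avoid as constructs get longer, the following is a minimal sketch and not part of the described methods. It assumes, purely for illustration, that the stated error rate applies independently at each nucleotide position; the text above does not say how the rate is defined, and the function name fraction_error_free is hypothetical.

```python
def fraction_error_free(length_nt: int, per_position_error_rate: float) -> float:
    """Expected fraction of synthesized strands that contain no error.

    Illustrative assumption: each position fails independently with the
    same probability, so the error-free fraction is (1 - rate) ** length.
    """
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    if not 0.0 <= per_position_error_rate <= 1.0:
        raise ValueError("per_position_error_rate must be between 0 and 1")
    return (1.0 - per_position_error_rate) ** length_nt
```

Under this assumed model, calling fraction_error_free(100, 0.01) and fraction_error_free(50, 0.01) shows that the error-free fraction falls as strand length grows, which is the practical motivation for removing defective duplexes without sequencing every one.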
The invention provides a method for purifying double-stranded polynucleotides lacking base pair mismatches, insertion/deletion loops and/or nucleotide gaps comprising the following steps: (a) providing a plurality of polypeptides that specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps within a double stranded polynucleotide; (b) providing a sample comprising a plurality of double-stranded polynucleotides; (c) contacting the double-stranded polynucleotides of step (b) with the polypeptides of step (a) under conditions wherein a polypeptide of step (a) can specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded polynucleotide of step (b); and (d) separating the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound, thereby purifying double-stranded polynucleotides lacking base pair mismatches, insertion/deletion loops and/or nucleotide gaps. In one aspect, the double-stranded polynucleotide comprises a double-stranded oligonucleotide.
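Steps (a) through (d) can be pictured with a toy in-silico model. This is a sketch under stated assumptions, not a description of the laboratory procedure: the Duplex class, the has_defect predicate (a stand-in for binding by the polypeptide of step (a)) and the separate function are all hypothetical names, and each duplex is written with its two strands aligned position by position rather than in 5' to 3' orientation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Watson-Crick partners for DNA. '-' in a strand marks a missing nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass(frozen=True)
class Duplex:
    """Toy duplex: top[i] is paired with bottom[i] (aligned, not reversed)."""
    top: str
    bottom: str


def has_defect(duplex: Duplex) -> bool:
    """Stand-in for step (c): True where a binding polypeptide would bind."""
    if len(duplex.top) != len(duplex.bottom):
        return True  # unequal lengths modeled as an insertion/deletion loop
    for t, b in zip(duplex.top, duplex.bottom):
        if t == "-" or b == "-":
            return True  # nucleotide gap
        if COMPLEMENT.get(t) != b:
            return True  # base pair mismatch
    return False


def separate(
    sample: List[Duplex],
    binds: Callable[[Duplex], bool] = has_defect,
) -> Tuple[List[Duplex], List[Duplex]]:
    """Stand-in for step (d): split the sample into (unbound, bound)."""
    unbound = [d for d in sample if not binds(d)]
    bound = [d for d in sample if binds(d)]
    return unbound, bound


sample = [
    Duplex("ACGT", "TGCA"),   # fully paired
    Duplex("ACGT", "TGTA"),   # G:T mismatch at the third position
    Duplex("AC-T", "TGCA"),   # gap in the top strand
    Duplex("ACGGT", "TGCA"),  # extra nucleotide, modeled as a loop
]
purified, retained = separate(sample)
```

In this model only the first duplex ends up in purified; the other three are retained, mirroring the way polypeptide-bound duplexes are held back while unbound duplexes are collected.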
In alternative aspects, the double-stranded polynucleotide is between about 3 and about 300 base pairs in length; between 10 and about 200 base pairs in length; and, between 50 and about 150 base pairs in length. In alternative aspects, the gaps in the double-stranded polynucleotide are between about 1 and 30, about 2 and 20, about 3 and 15, about 4 and 12 and about 5 and 10 nucleotides in length.
In alternative aspects, the base pair mismatch comprises a C:T mismatch, a G:A mismatch, a C:A mismatch or a G:U/T mismatch.
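The mismatch types listed above can be expressed as a small lookup. This is an illustrative sketch only; the table name LISTED_MISMATCHES and the function classify_pair are hypothetical, and the lookup covers just the four mismatch types named in this paragraph.

```python
from typing import Optional

# The four mismatch types named in the text, keyed by the unordered base pair.
LISTED_MISMATCHES = {
    frozenset("CT"): "C:T",
    frozenset("GA"): "G:A",
    frozenset("CA"): "C:A",
    frozenset("GT"): "G:U/T",
    frozenset("GU"): "G:U/T",
}


def classify_pair(base1: str, base2: str) -> Optional[str]:
    """Return the listed mismatch type for two opposed bases, or None.

    None is returned both for Watson-Crick pairs and for any mismatch
    that is not among the four types listed in the text.
    """
    key = frozenset((base1.upper(), base2.upper()))
    return LISTED_MISMATCHES.get(key)
```

The lookup is order-insensitive, so classify_pair("T", "C") and classify_pair("C", "T") both report a C:T mismatch.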
In one aspect, the polypeptide that specifically binds to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded polynucleotide comprises a DNA repair enzyme. In alternative aspects, the DNA repair enzyme is a bacterial DNA repair enzyme, a MutS DNA repair enzyme, a Taq MutS DNA repair enzyme, an Fpg DNA repair enzyme, a MutY DNA repair enzyme, a hexA DNA mismatch repair enzyme, a Vsr mismatch repair enzyme, a mammalian DNA repair enzyme and natural or synthetic variations and isozymes thereof. In one aspect, the DNA repair enzyme is a DNA glycosylase that initiates base-excision repair of G:U/T mismatches. The DNA glycosylase can comprise a bacterial mismatch-specific uracil-DNA glycosylase (MUG) DNA repair enzyme or a eukaryotic thymine-DNA glycosylase (TDG) enzyme.
In one aspect, the separating of the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound of step (d) comprises use of an immunoaffinity column, wherein the column comprises immobilized antibodies capable of specifically binding to the specifically bound polypeptide or an epitope bound to the specifically bound polypeptide, and the sample is passed through the immunoaffinity column under conditions wherein the immobilized antibodies are capable of specifically binding to the specifically bound polypeptide or the epitope bound to the specifically bound polypeptide.
In one aspect, the separating of the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound of step (d) comprises use of an antibody, wherein the antibody is capable of specifically binding to the specifically bound polypeptide or an epitope bound to the specifically bound polypeptide and the antibody is contacted with the specifically bound polypeptide under conditions wherein the antibodies are capable of specifically binding to the specifically bound polypeptide or an epitope bound to the specifically bound polypeptide. The antibody can be an immobilized antibody. The antibody can be immobilized onto a bead or a magnetized particle or a magnetized bead.
In one aspect, the separating of the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound of step (d) comprises use of an affinity column, wherein the column comprises immobilized binding molecules capable of specifically binding to a tag linked to the specifically bound polypeptide and the sample is passed through the affinity column under conditions wherein the immobilized binding molecules are capable of specifically binding to the tag linked to the specifically bound polypeptide. The immobilized binding molecules can comprise an avidin or a natural or synthetic variation or homologue thereof and the tag linked to the specifically bound polypeptide can comprise a biotin or a natural or synthetic variation or homologue thereof.
In one aspect, the separating of the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound of step (d) comprises use of a size exclusion column, such as a spin column. Alternatively, the separating can comprise use of a size exclusion gel, such as an agarose gel.
In one aspect, the double-stranded polynucleotide comprises a polypeptide coding sequence. The polypeptide coding sequence can comprise a fusion protein coding sequence. The fusion protein can comprise a polypeptide of interest upstream of an intein, wherein the intein comprises a polypeptide. The intein polypeptide can comprise an enzyme, such as one used to identify vector- or insert-positive clones, such as LacZ. The intein polypeptide can comprise an antibody or a ligand. In one aspect, the intein polypeptide comprises a polypeptide selectable marker, such as an antibiotic. The antibiotic can comprise a kanamycin, a penicillin or a hygromycin.
The invention provides a method for assembling double-stranded oligonucleotides to generate a polynucleotide lacking base pair mismatches, insertion/deletion loops and/or nucleotide gaps comprising the following steps: (a) providing a plurality of polypeptides that specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded polynucleotide; (b) providing a sample comprising a plurality of double-stranded oligonucleotides; (c) contacting the double-stranded oligonucleotides of step (b) with the polypeptides of step (a) under conditions wherein a polypeptide of step (a) can specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded oligonucleotide of step (b); (d) separating the double-stranded oligonucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded oligonucleotides to which a polypeptide of step (a) has specifically bound, thereby purifying double-stranded oligonucleotides lacking base pair mismatches, insertion/deletion loops and/or a nucleotide gap or gaps; and (e) joining together the purified double-stranded oligonucleotides lacking base pair mismatches and insertion/deletion loops, thereby generating a polynucleotide lacking base pair mismatches, insertion/deletion loops and/or nucleotide gaps.
The invention provides a method for generating a polynucleotide lacking base pair mismatches, insertion/deletion loops and/or nucleotide gaps comprising the following steps: (a) providing a plurality of polypeptides that specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded polynucleotide; (b) providing a sample comprising a plurality of double-stranded oligonucleotides; (c) joining together the double-stranded oligonucleotides of step (b) to generate a double-stranded polynucleotide; (d) contacting the double-stranded polynucleotide of step (c) with the polypeptides of step (a) under conditions wherein a polypeptide of step (a) can specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded polynucleotide of step (c); and (e) separating the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound, thereby purifying double-stranded polynucleotides lacking base pair mismatches, insertion/deletion loops and/or nucleotide gaps.
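The two assembly methods above differ only in the order of screening and joining. The toy model below contrasts them; it is a sketch under stated assumptions, not the laboratory workflow. Each copy of an oligonucleotide segment is reduced to a boolean defect flag, copies are joined by position in the pool, and the names Pool, join, screen_then_join and join_then_screen are hypothetical.

```python
from typing import List

# pool[s][k] is True when copy k of segment s carries a defect
# (mismatch, insertion/deletion loop or gap).
Pool = List[List[bool]]


def join(pool: Pool) -> List[bool]:
    """Join the k-th copy of every segment into product k.

    A product is flagged defective if any of its segments is defective.
    zip() stops at the segment with the fewest copies.
    """
    return [any(copies) for copies in zip(*pool)]


def screen_then_join(pool: Pool) -> List[bool]:
    """Screen the oligonucleotides first, then join the survivors."""
    screened = [[c for c in copies if not c] for copies in pool]
    return join(screened)


def join_then_screen(pool: Pool) -> List[bool]:
    """Join first, then screen the assembled polynucleotides."""
    return [p for p in join(pool) if not p]


pool = [
    [False, True, False],   # segment 1: second copy defective
    [False, False, True],   # segment 2: third copy defective
    [False, False, False],  # segment 3: all copies clean
]
products_a = screen_then_join(pool)
products_b = join_then_screen(pool)
```

In this toy model both routes return only defect-free products; how many products each route yields here is an artifact of the position-wise pairing and should not be read as a statement about real yields.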
The invention provides a method for generating a base pair mismatch-free, insertion/deletion loop-free and/or gap-free double-stranded polypeptide coding sequence comprising the following steps: (a) providing a plurality of polypeptides that specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps within a double stranded polynucleotide; (b) providing a sample comprising a plurality of double-stranded polynucleotides encoding a fusion protein, wherein the fusion protein coding sequence comprises a coding sequence for a polypeptide of interest upstream of and in frame with a coding sequence for a marker or a selection polypeptide; (c) contacting the double-stranded polynucleotides of step (b) with the polypeptides of step (a) under conditions wherein a polypeptide of step (a) can specifically bind to a base pair mismatch, an insertion/deletion loop and/or a nucleotide gap or gaps in a double stranded polynucleotide of step (b); (d) separating the double-stranded polynucleotides lacking a specifically bound polypeptide of step (a) from the double-stranded polynucleotides to which a polypeptide of step (a) has specifically bound, thereby purifying double-stranded polynucleotides lacking base pair mismatches, insertion/deletion loops and/or a nucleotide gap or gaps; and (e) expressing the purified double-stranded polynucleotides and selecting the polynucleotides expressing the selection marker polypeptide, thereby generating a base pair mismatch-free, insertion/deletion loop-free and/or gap-free double-stranded polypeptide coding sequence.
In one aspect, the marker or selection polypeptide comprises a self-splicing intein, and the method further comprises the self-splicing out of the intein marker or selection polypeptide from the upstream polypeptide of interest. The marker or selection polypeptide can comprise an enzyme, such as an enzyme used to identify insert- or vector-positive clones, such as a LacZ enzyme. The marker or selection polypeptide can also comprise an antibiotic, such as a kanamycin, a penicillin or a hygromycin.
In alternative aspects of the invention, the methods generate a sample or “batch” of purified oligonucleotides and/or polynucleotides that are 90%, 95%, 96%, 97%, 98%, 99%, 99.5% and 100% or completely free of base pair mismatches, insertion/deletion loops and/or a nucleotide gap or gaps.