Modern polynucleotide synthesis techniques are utilized in high-throughput automated processes capable of generating large numbers of individual polynucleotide sequences at relatively high yields. The size of the polynucleotide sequences made using such methods may range from small primer-sized polynucleotides containing well below 50 nucleotides to massive gene, plasmid or chromosome-sized polynucleotides containing more than 1×10^10 nucleotides. The high yields attained by most modern polynucleotide synthesis techniques are, in part, the result of synthesis methods that join many smaller polynucleotide sequences to create a larger individual polynucleotide sequence.
Importantly, libraries comprising very large numbers of polynucleotides with unique sequences encoding variant versions of a given recombinantly expressed protein can be synthesized using these methods. Such libraries are particularly desirable when developing protein-based therapeutics such as monoclonal antibodies and other recombinant proteins. This is because the variant proteins encoded by the individual polynucleotides in the library can later be screened for properties, such as improved in vivo half-life or binding affinity, that are sought in therapeutic proteins.
Typically, such libraries are known in the art as “fully combinatorial” libraries. This means that if two different nucleotide residues present in a parent sequence are to be changed relative to the parent sequence, all possible combinations of the two changed positions will be represented by the sequences of the individual polynucleotides in the library (FIG. 1). Stated differently, the number of unique variant sequences represented by such a library is the product of the binomial coefficients for each position to be varied, minus one (one is subtracted from the product because the parent sequence is one of the resulting combinations but cannot be considered a variant). For example, in a “fully combinatorial” library where only two nucleotide residues are to be varied (e.g., by mutation of positions 74 and 128 to G residues) relative to the parent sequence, the number of unique variant sequences (V) in the library would be described by the following equation:
$$V = \left[ C(n,k)_{\text{varied position 1}} \cdot C(n,k)_{\text{varied position 2}} \right] - 1$$
where the binomial coefficient for varied position 1 is
$$C(n,k)_{\text{varied position 1}} = \frac{n!}{k!\,(n-k)!}$$
and the binomial coefficient for varied position 2 is
$$C(n,k)_{\text{varied position 2}} = \frac{n!}{k!\,(n-k)!}$$
where n is the number of possible unique nucleotide residues at the position and k is the number of positions where the residues can be placed. Here, n = 2 (the parent residue or the G residue) and k = 1 for each varied position, so
$$C(n,k)_{\text{varied position 1}} = \frac{2!}{1!\,(2-1)!} = 2$$
and
$$C(n,k)_{\text{varied position 2}} = \frac{2!}{1!\,(2-1)!} = 2$$
so
$$V = [(2)(2)] - 1 = 3.$$
The number of unique variants produced in this simple “fully combinatorial” library example is also easily seen to be 3 upon examination of FIG. 1.
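The count in the two-position example can be checked with a short Python sketch. The sketch is illustrative only and is not part of the disclosed methods: the function names `count_variants` and `enumerate_variants`, and the four-nucleotide parent sequence used in the usage lines, are hypothetical and were chosen to keep the example small. It assumes each varied position either keeps the parent residue or takes a single substituted residue, so that n = 2 and k = 1 at every varied position.

```python
from itertools import product
from math import comb


def count_variants(residues_per_position):
    """Return V, the product of C(n, 1) over the varied positions, minus one.

    residues_per_position holds n for each varied position (assumed k = 1).
    """
    total = 1
    for n in residues_per_position:
        total *= comb(n, 1)
    return total - 1  # the parent sequence is a combination, not a variant


def enumerate_variants(parent, substitutions):
    """List every fully combinatorial variant of a parent sequence.

    substitutions maps a 1-based position to its single replacement residue.
    """
    positions = sorted(substitutions)
    for p in positions:
        if parent[p - 1] == substitutions[p]:
            raise ValueError(f"position {p} already carries {substitutions[p]}")
    # Each varied position offers two choices: parent residue or replacement.
    choices = [(parent[p - 1], substitutions[p]) for p in positions]
    variants = []
    for combo in product(*choices):
        seq = list(parent)
        for p, residue in zip(positions, combo):
            seq[p - 1] = residue
        candidate = "".join(seq)
        if candidate != parent:  # skip the unchanged parent sequence
            variants.append(candidate)
    return variants


# Hypothetical 4-nucleotide parent with positions 2 and 4 varied to G.
print(count_variants([2, 2]))                       # 3
print(enumerate_variants("ACTA", {2: "G", 4: "G"}))  # ['ACTG', 'AGTA', 'AGTG']
```

The enumeration returns the two single-position variants and the one double-position variant, which agrees with V = 3 from the equation.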
One problem with “fully combinatorial” libraries is that this approach to library construction and synthesis is not compatible with rational mutagenesis strategies. In rational mutagenesis, only individual polynucleotides containing specifically varied positions that have been rationally designed are sought, not all possible combinations of these varied positions. Synthesizing all possible combinations of variations, when all that is sought is a collection of rationally designed, unique, partially identical individual polynucleotide sequences, uses synthesis reagents and time inefficiently.
Thus, a need exists for methods that facilitate the high-throughput synthesis of a collection of unique, partially identical individual polynucleotide sequences.