Dual-domain polypeptides or dual-domain nucleic acids encoding such polypeptides may have new, advantageous properties compared to the original polypeptides or nucleic acids after which they are patterned. Such polypeptide domains are generally linked using a linker region or linker domain. A generic designation of such a polypeptide construct is D1-L-D2, wherein D1 and D2 are two structural domains that are identical or different and L is the linker. For example, two cytosolic domains of the membrane-spanning protein adenylyl cyclase coupled with a linker domain form a soluble protein (Tang et al., Science, 268: 1769-1772 (1995)). An advantage of this soluble form of adenylyl cyclase, which retains enzymatic activity, is that it can be produced in much higher quantities than the native enzyme (Dessauer et al., J. Biol. Chem., 16967-16974 (1996)).
Another type of polypeptide generated by linking two domains is a single chain antibody or scFv. These single chain polypeptides include the variable (V) regions from the heavy (H) and light (L) chains of a selected immunoglobulin (Ig) and recreate the antigen binding site of the native Ig while being a fraction of its size (Skerra, A. et al. (1988) Science, 240: 1038-1041; Pluckthun, A. et al. (1989) Methods Enzymol. 178: 497-515; Winter, G. et al. (1991) Nature, 349: 293-299; Bird et al. (1988) Science 242:423; Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879; U.S. Pat. Nos. 4,704,692, 4,853,871, 4,946,778, 5,260,203, 5,455,030). A number of U.S. patents and international patent publications of J. Huston and colleagues describe various two chain or two domain proteins, including single chain antibodies, joined by linker peptides and optionally including cleavable sites (U.S. Pat. Nos. 5,888,773, 5,877,305, 5,861,156, 5,837,846, 5,753,204, 5,534,254, 5,525,491, 5,482,858, 5,476,786, 5,330,902, 5,302,526, 5,258,498, 5,132,405, 5,091,513, 5,013,653, WO 9323537A1 (25 Nov. 1993)).
An scFv is composed of a VH domain at its N-terminus and a VL domain at its C-terminus (or vice versa) linked by a peptide linker. Correct folding of the VH and VL regions is crucial for retention of antigen binding capacity by the scFv. The length and sequence of the linker region are critical parameters for correct folding and biological function. scFv chains are easier to express than the larger Fv fragments or even larger Ig molecules (which are four chain complexes).
A ribozyme is a catalytic RNA molecule that cleaves other RNA molecules that contain nucleic acid sequences complementary to particular targeting sequences in the ribozyme. Two identical or different nucleic acid domains, such as two ribozyme domains, can be joined to create a bifunctional ribozyme that can act on more than one RNA substrate structure. General methods for constructing ribozymes, including hairpin ribozymes, hammerhead ribozymes and RNase P ribozymes, are known in the art. Castanotto et al. (1994) Advances in Pharmacology, 25: 289-317, reviews ribozymes (including group I, hammerhead, axhead, hairpin and RNase P). Ribozymes that can advantageously target desired specific sequences, such as HIV sequences, have been described (Ho, A. et al., WO 9426877 (1994); Yu et al. (1993) Proc. Natl. Acad. Sci. USA, 90:6340-6344; and Dropulic et al. (1992) J. Virol., 66:1432-1441).
The hammerhead ribozyme and the hairpin ribozyme are catalytic molecules with antisense and endoribonucleotidase activity. Their intracellular expression can confer significant resistance to, for example, HIV infection. Hammerhead ribozymes are described in Rossi et al. (1991) Pharmac. Ther., 50:245-254; Forster et al. (1987) Cell, 48:211-220; Uhlenbeck, O. C. (1987) Nature, 328:596-600; Haseloff, J. et al. (1988) Nature, 334:585; Dropulic et al., supra; and Castanotto et al., supra, and references cited therein. Hairpin ribozymes are disclosed in Hampel et al. (1990) Nucl. Acids Res., 18:299-304; Hampel et al., EP 0360257 (1990); Haseloff, J. P. et al., U.S. Pat. No. 5,254,678 (1993); Kraus, G. et al., U.S. Pat. No. 5,958,768 (1999); Ho, A. et al., WO 9426877 (1994); Ojwang et al. (1992) Proc. Natl. Acad. Sci. USA, 89: 10802-10806; Yamada et al. (1994) Gene Therapy 1: 39-45; Leavitt et al. (1995) Proc. Natl. Acad. Sci. USA, 92: 699-703; Leavitt et al., Human Gene Therapy, 5: 1151-1120; and Yamada et al. (1994) Virology, 205: 121-126.
For convenience, the conventional single letter nucleotide code to designate positions wherein more than one base may be present is provided in Table 1.
TABLE 1

Code    For RNA              For DNA
r =     g or a               g or a               (purine)
y =     u or c               t or c               (pyrimidine)
s =     g or c               g or c
w =     a or u               a or t
v =     a, g or c            a, g or c
x =     c, u, or a           c, t, or a
n =     a, g, c, or u        a, g, c, or t

(Obviously, in an r:y pairing, if r = g then y = c, etc.)
The typical substrate sequence for hairpin ribozymes is nnng/cn*gucnnnnnnnn (where n*g is the cleavage site). The hammerhead ribozyme cleaves at any nux sequence. Thus, the same substrate target within the hairpin leader sequence, guc, is targetable by the hammerhead ribozyme.
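As an illustration only, the degenerate codes of Table 1 can be turned into a simple pattern search that locates candidate hammerhead (nux) and hairpin substrate sites in an RNA sequence. The following is a minimal sketch, not part of any cited method. The function names, the example sequence, and the reading of "g/c" in the hairpin substrate as "g or c" (code s) are assumptions made for this example.

```python
import re

# Degenerate single-letter codes for RNA, taken from Table 1.
# Note: "x" (c, u, or a) follows Table 1 of this document.
RNA_CODES = {
    "r": "ga", "y": "uc", "s": "gc", "w": "au",
    "v": "agc", "x": "cua", "n": "agcu",
}

def motif_to_regex(motif):
    """Translate a degenerate motif such as 'nux' into a regex string."""
    parts = []
    for ch in motif.lower():
        if ch in RNA_CODES:
            parts.append("[" + RNA_CODES[ch] + "]")
        else:
            parts.append(re.escape(ch))  # a, g, c, u match themselves
    return "".join(parts)

def find_sites(rna, motif):
    """Return 0-based start positions of every match, including overlaps."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(rna.lower())]

# Hammerhead target: any nux triplet.
HAMMERHEAD_MOTIF = "nux"
# Hairpin substrate nnng/cn*gucnnnnnnnn, with "g/c" written as s (assumption);
# the cleavage site n*g lies between motif positions 5 and 6.
HAIRPIN_MOTIF = "nnnsngucnnnnnnnn"

if __name__ == "__main__":
    example = "aaagcgucaaaaaaaa"  # hypothetical sequence for illustration
    print("hammerhead sites:", find_sites(example, HAMMERHEAD_MOTIF))
    print("hairpin sites:", find_sites(example, HAIRPIN_MOTIF))
```

In this sketch, a guc triplet that lies inside a hairpin substrate match is also reported as a hammerhead nux site, which mirrors the overlap in targetable sequence noted above.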
Two DNA domains can also be linked to form a dual-domain DNA molecule. Certain DNA domains bind to proteins such as DNA polymerases, endonucleases, and transcription factors. Thus, two such DNA domains can be linked to form a dual-domain DNA molecule that binds one or more DNA binding proteins.
Those skilled in the art will know of the existence of other nucleic acid or polypeptide domains that may be advantageously linked to form a dual-domain nucleic acid or polypeptide with one or more functions. Those of skill will also recognize the general desirability of methods that yield such products.
The desired property of a dual-domain DNA, ribozyme or protein molecule can be optimized by modifying the nucleic acid that (1) constitutes the DNA domain, (2) encodes the ribozyme sequence or (3) encodes the protein domain. This is achieved through a variety of conventional techniques. In one approach, the sequence or length of the linker region is varied in an effort to optimize the dual-domain molecule. The length and sequence of the linker region may indeed be critical to the function of a dual-domain protein.
Methods for generating an scFv dual-domain protein with linkers of varying peptide length are known in the art (e.g., U.S. Pat. No. 5,837,242). Changes in sequence or length of the linker can adversely affect the stability, protease susceptibility, binding activity and expression levels of the scFv. Because the effect of a change in linker sequence or length on the function(s) of the dual-domain polypeptide is generally unpredictable, the consequences for bioactivity of varying particular amino acid residues in the linker, or of changing its overall length, generally cannot be determined a priori.
There is thus a need for methods that permit creation of a nucleic acid library that encodes D1-L-D2 (or higher order) structures wherein L has random length and sequence. The dual-domain protein can be expressed from the library and the properties of interest can be analyzed. Once a protein is identified as having “optimal” properties, its sequence can be determined by resolving the nucleotide sequence of the clone that encodes that protein. This approach obviates the necessity of creating and testing individual clones until finding one with the desired property.
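To make the idea of a library with random linker length and sequence concrete, the following is a purely in silico sketch that assembles D1-L-D2 nucleotide strings in which both the length and the sequence of L vary. It is not the method of any cited reference. The function names, the placeholder domain sequences, the uniform choice of bases, and the restriction of linker length to whole codons (to keep D2 in frame) are all assumptions made for this example. Random linkers generated this way may contain stop codons, which the sketch does not filter.

```python
import random

BASES = "acgt"  # DNA alphabet; every linker position is "n" in Table 1 terms

def random_linker(min_codons, max_codons, rng):
    """Draw a linker whose length (in codons) and sequence are both random."""
    n_codons = rng.randint(min_codons, max_codons)  # inclusive bounds
    return "".join(rng.choice(BASES) for _ in range(3 * n_codons))

def build_library(d1, d2, size, min_codons, max_codons, seed=0):
    """Return `size` D1-L-D2 sequences, each with an independent random linker."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [d1 + random_linker(min_codons, max_codons, rng) + d2
            for _ in range(size)]

def linker_diversity(min_codons, max_codons):
    """Count distinct nucleotide linkers over the allowed length range."""
    return sum(4 ** (3 * k) for k in range(min_codons, max_codons + 1))

if __name__ == "__main__":
    # Hypothetical placeholder domains; real D1 and D2 would be full
    # coding sequences.
    library = build_library("atggcc", "ggctaa", size=5,
                            min_codons=1, max_codons=4)
    for clone in library:
        print(clone)
    print("theoretical linker diversity:", linker_diversity(1, 4))
```

The diversity count grows as 64 to the power of the codon number, which illustrates why screening a pooled library is more practical than constructing and testing individual clones one at a time.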
The polymerase chain reaction (PCR) has been used to generate libraries of nucleic acid products that have two domains connected by a linker having different sequences or different lengths. No currently available method permits simultaneous introduction of both random length and random sequence into the linker region of a population of nucleic acids.
Expression Systems
Many expression systems for heterologous proteins are known in the art. These include bacterial systems which have the advantages of rapid and abundant production, but are limited in many instances by their inability to produce properly folded and soluble proteins (unless the proteins are subjected to cycles of denaturation and renaturation). Baculovirus systems drive expression through the secretory pathways of insect cells, thereby increasing the probability of improved protein solubility (Kretzschmar, T. et al. (1996) J. Immunol. Methods 195:93-101; Brocks, B. et al. (1997), Immunotechnology 3:173-184). Because manipulating the virus and growing insect cells can be time consuming and costly, the system is less suitable for expression of certain types of proteins, for example tumor-specific or individual-specific proteins such as idiotypic scFv polypeptides. There is therefore a need in the art for suitable rapid and economical expression systems to produce useful dual-domain proteins, one example of which is an idiotypic scFv vaccine for treating B-cell lymphoma. The present invention addresses this need.