1. Field of the Invention
This invention relates to compositions, methods and kits for generating libraries of recombinant expression vectors and using these libraries in screening of affinity-binding pairs, and, more particularly, for generating libraries of recombinant human antibodies and screening for their affinity binding with target antigens.
2. Description of Related Art
Antibodies are a diverse class of molecules. Delves, P. J. (1997) “Antibody production: essential techniques”, New York, John Wiley and Sons, pp. 90-113. It is estimated that even in the absence of antigen stimulation a human makes at least 10^15 different antibody molecules, its preimmune antibody repertoire. The antigen-binding sites of many antibodies can cross-react with a variety of related but different antigenic determinants, and the preimmune repertoire is apparently large enough to ensure that there will be an antigen-binding site to fit almost any potential antigenic determinant, albeit with low affinity.
Structurally, antibodies or immunoglobulins (Igs) are composed of one or more Y-shaped units. For example, immunoglobulin G (IgG) has a molecular weight of 150 kDa and consists of just one of these units. Typically, an antibody can be proteolytically cleaved by the proteinase papain into two identical Fab (fragment antigen binding) fragments and one Fc (fragment crystallizable) fragment. Each Fab contains one binding site for antigen, and the Fc portion of the antibodies mediates other aspects of the immune response.
A typical antibody contains four polypeptides: two identical copies of a heavy (H) chain and two copies of a light (L) chain, forming a general formula H2L2. Each L chain is attached to one H chain by a disulfide bond. The two H chains are also attached to each other by disulfide bonds. Papain cleaves N-terminal to the disulfide bonds that hold the H chains together. Each of the resulting Fabs consists of an entire L chain plus the N-terminal half of an H chain; the Fc is composed of the C-terminal halves of two H chains. Pepsin cleaves at numerous sites C-terminal to the inter-H disulfide bonds, resulting in the formation of a divalent fragment [F(ab′)2] and many small fragments of the Fc portion. IgG heavy chains contain one N-terminal variable (VH) plus three C-terminal constant (CH1, CH2 and CH3) regions. Light chains contain one N-terminal variable (VL) and one C-terminal constant (CL) region each. The different variable and constant regions of either heavy or light chains are of roughly equal length (about 110 amino acid residues per region). Fabs consist of one VL, VH, CH1, and CL region each. The VL and VH portions contain hypervariable segments (complementarity-determining regions or CDRs) that form the antibody combining site.
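The domain counts above allow a rough consistency check on the 150 kDa figure quoted for IgG. The following sketch assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in this document, and ignores the hinge and any glycosylation; the function names are illustrative only.

```python
# Rough mass estimate from the domain counts given in the text.
RESIDUES_PER_REGION = 110   # "about 110 amino acid residues per region" (from the text)
AVG_RESIDUE_MASS_DA = 110   # ASSUMPTION: rule-of-thumb average residue mass, not from the text

HEAVY_CHAIN_REGIONS = 4     # VH, CH1, CH2, CH3
LIGHT_CHAIN_REGIONS = 2     # VL, CL
FAB_REGIONS = 4             # VL, CL, VH, CH1


def igg_mass_kda(heavy_copies: int = 2, light_copies: int = 2) -> float:
    """Approximate mass (kDa) of an H2L2 antibody, ignoring hinge and glycans."""
    regions = heavy_copies * HEAVY_CHAIN_REGIONS + light_copies * LIGHT_CHAIN_REGIONS
    return regions * RESIDUES_PER_REGION * AVG_RESIDUE_MASS_DA / 1000.0


def fab_mass_kda() -> float:
    """Approximate mass (kDa) of a single Fab fragment."""
    return FAB_REGIONS * RESIDUES_PER_REGION * AVG_RESIDUE_MASS_DA / 1000.0


print(igg_mass_kda())  # 145.2
print(fab_mass_kda())  # 48.4
```

Under these assumptions the estimate for the intact molecule falls close to the 150 kDa stated for IgG.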
The VL and VH portions of a monoclonal antibody have also been linked by a synthetic linker to form a single chain protein (scFv) which retains the same specificity and affinity for the antigen as the monoclonal antibody itself. Bird, R. E., et al. (1988) “Single-chain antigen-binding proteins” Science 242:423-426. A typical scFv is a recombinant polypeptide composed of a VL tethered to a VH by a designed peptide, such as (Gly4-Ser)3 (SEQ ID NO: 80), that links the carboxyl terminus of the VL to the amino terminus of the VH sequence. The construction of the DNA sequence encoding a scFv can be achieved by using a universal primer encoding the (Gly4-Ser)3 linker by polymerase chain reactions (PCR). Lake, D. F., et al. (1995) “Generation of diverse single-chain proteins using a universal (Gly4-Ser)3 (SEQ ID NO: 80) encoding oligonucleotide” Biotechniques 19:700-702.
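The VL-linker-VH arrangement described above can be sketched at the amino acid level as follows. This is an illustration only: the function name is hypothetical, and the VL and VH strings are short placeholders rather than real variable-region sequences.

```python
# (Gly4-Ser)3 written in one-letter amino acid code.
GLY4SER_X3 = "GGGGS" * 3


def make_scfv(vl: str, vh: str, linker: str = GLY4SER_X3) -> str:
    """Tether the C-terminus of VL to the N-terminus of VH through the linker."""
    return vl + linker + vh


# Placeholder fragments (hypothetical, not real variable regions).
vl_placeholder = "AAAA"
vh_placeholder = "CCCC"

scfv = make_scfv(vl_placeholder, vh_placeholder)
print(scfv)       # AAAAGGGGSGGGGSGGGGSCCCC
print(len(scfv))  # 23
```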
The mammalian immune system has evolved unique genetic mechanisms that enable it to generate an almost unlimited number of different light and heavy chains in a remarkably economical way by joining separate gene segments together before they are transcribed. For each type of Ig chain (κ light chains, λ light chains, and heavy chains) there is a separate pool of gene segments from which a single peptide chain is eventually synthesized. Each pool is on a different chromosome and usually contains a large number of gene segments encoding the V region of an Ig chain and a smaller number of gene segments encoding the C region. During B cell development a complete coding sequence for each of the two Ig chains to be synthesized is assembled by site-specific genetic recombination, bringing together the entire coding sequences for a V region and the coding sequence for a C region. In addition, the V region of a light chain is encoded by a DNA sequence assembled from two gene segments: a V gene segment and a short joining or J gene segment. The V region of a heavy chain is encoded by a DNA sequence assembled from three gene segments: a V gene segment, a J gene segment and a diversity or D segment.
The large number of inherited V, J and D gene segments available for encoding Ig chains makes a substantial contribution on its own to antibody diversity, but the combinatorial joining of these segments greatly increases this contribution. Further, imprecise joining of gene segments and somatic mutations introduced during the V-D-J segment joining at the pre-B cell stage greatly increase the diversity of the V regions.
After immunization against an antigen, a mammal goes through a process known as affinity maturation to produce antibodies with higher affinity toward the antigen. Such antigen-driven somatic hypermutation fine-tunes antibody responses to a given antigen, presumably due to the accumulation of point mutations specifically in both heavy- and light-chain V region coding sequences and a selected expansion of high-affinity antibody-bearing B cell clones.
Great efforts have been made to mimic such a natural maturation of antibodies against various antigens, especially antigens associated with diseases such as autoimmune diseases, cancer, AIDS and asthma. In particular, phage display technology has been used extensively to generate large libraries of antibody fragments by exploiting the capability of bacteriophage to express and display biologically functional protein molecules on its surface. Combinatorial libraries of antibodies have been generated in bacteriophage lambda expression systems which may be screened as bacteriophage plaques or as colonies of lysogens (Huse et al. (1989) Science 246: 1275; Caton and Koprowski (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87: 6450; Mullinax et al. (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87: 8095; Persson et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 2432). Various embodiments of bacteriophage antibody display libraries and lambda phage expression libraries have been described (Kang et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 4363; Clackson et al. (1991) Nature 352: 624; McCafferty et al. (1990) Nature 348: 552; Burton et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 10134; Hoogenboom et al. (1991) Nucleic Acids Res. 19: 4133; Chang et al. (1991) J. Immunol. 147: 3610; Breitling et al. (1991) Gene 104: 147; Marks et al. (1991) J. Mol. Biol. 222: 581; Barbas et al. (1992) Proc. Natl. Acad. Sci. (U.S.A.) 89: 4457; Hawkins and Winter (1992) J. Immunol. 22: 867; Marks et al. (1992) Biotechnology 10: 779; Marks et al. (1992) J. Biol. Chem. 267: 16007; Lowman et al. (1991) Biochemistry 30: 10832; Lerner et al. (1992) Science 258: 1313). Also see review by Rader, C. and Barbas, C. F. (1997) “Phage display of combinatorial antibody libraries” Curr. Opin. Biotechnol. 8:503-508.
Various scFv libraries displayed on bacteriophage coat proteins have been described. Marks et al. (1992) Biotechnology 10: 779; Winter G and Milstein C (1991) Nature 349: 293; Clackson et al. (1991) op.cit.; Marks et al. (1991) J. Mol. Biol. 222: 581; Chaudhary et al. (1990) Proc. Natl. Acad. Sci. (USA) 87: 1066; Chiswell et al. (1992) TIBTECH 10: 80; and Huston et al. (1988) Proc. Natl. Acad. Sci. (USA) 85: 5879.
Generally, a phage library is created by inserting a library of random oligonucleotides or a cDNA library encoding antibody fragments such as VL and VH into gene 3 of M13 or fd phage. Each inserted gene is expressed at the N-terminus of the gene 3 product, a minor coat protein of the phage. As a result, peptide libraries that contain diverse peptides can be constructed. The phage library is then affinity screened against an immobilized target molecule of interest, such as an antigen, and specifically bound phages are recovered and amplified by infection into Escherichia coli host cells. Typically, the target molecule of interest such as a receptor (e.g., polypeptide, carbohydrate, glycoprotein, nucleic acid) is immobilized by covalent linkage to a chromatography resin (to enrich for reactive phage by affinity chromatography) and/or labeled to screen plaques or colony lifts. This procedure is called biopanning. Finally, amplified phages can be sequenced to deduce the specific peptide sequences. Owing to the inherent nature of phage display, the antibodies displayed on the surface of the phage may not adopt their native conformation under such in vitro selection conditions as they would in a mammalian system. In addition, bacteria do not readily process, assemble, or express/secrete functional antibodies.
Transgenic animals such as mice have been used to generate fully human antibodies by using the XENOMOUSE™ technology developed by companies such as Abgenix, Inc., Fremont, Calif. and Medarex, Inc., Annandale, N.J. Strains of mice are engineered by suppressing mouse antibody gene expression and functionally replacing it with human antibody gene expression. This technology utilizes the natural power of the mouse immune system in surveillance and affinity maturation to produce a broad repertoire of high affinity antibodies. However, the breeding of such strains of transgenic mice and selection of high affinity antibodies can take a long period of time. Further, the antigen against which the pool of human antibodies is selected has to be recognized by the mouse as a foreign antigen in order to mount an immune response; antibodies against a target antigen that is not immunogenic in a mouse may not be selectable by using this technology. In addition, there may be regulatory issues regarding the use of transgenic animals, such as transgenic goats (developed by Genzyme Transgenics, Framingham, Mass.) and chickens (developed by Geneworks, Inc., Ann Arbor, Mich.), to produce antibodies, as well as safety issues concerning containment of transgenic animals infected with recombinant viral vectors.
Antibodies and antibody fragments have also been produced in transgenic plants. Plants, such as corn plants (developed by Integrated Protein Technologies, St. Louis, Mo.), are transformed with vectors carrying antibody genes, which results in stable integration of these foreign genes into the plant genome. In comparison, most microorganisms transformed with plasmids can lose the plasmids during a prolonged fermentation. Transgenic plants may be used as a cheaper means to produce antibodies on a large scale. However, due to the long growth cycles of plants, screening for antibodies with high binding affinity toward a target antigen may not be efficient or feasible for high throughput screening in plants.
The present invention provides compositions, methods, and kits for efficiently generating and screening protein complexes for their ability to bind to other proteins or oligonucleotide sequences. One feature of the present invention is the production of two or more polypeptides which self-assemble to form a protein complex in vivo. The in vivo formed protein complex is then tested in the same in vivo system for the complex's ability to bind to either a protein or a nucleotide sequence (DNA or RNA). The ability to express polypeptides, form protein complexes of those polypeptides, and screen the protein complexes all in the same intracellular system enables the present invention to screen large populations of protein complexes for binding with high throughput.
In one aspect of the present invention, compositions are provided. These compositions may be used for screening affinity-binding pairs between a tester protein complex and a target molecule in vitro or in vivo. The target molecule may be a protein, peptide, DNA, RNA, or small molecule.
In one embodiment, a library of yeast expression vectors is provided which express the protein complex to be screened. The yeast expression vectors forming the library comprise a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors.
According to this embodiment, the first polypeptide subunit and the second polypeptide subunit can be expressed as separate proteins or peptides. This may be accomplished by expressing the first and second polypeptide subunits from separate promoters, or by expressing the polypeptide subunits bicistronically from the same promoter via an internal ribosomal entry site (IRES) or via a splicing donor-acceptor mechanism.
Also according to the embodiment, the yeast expression vector may be a 2μ plasmid or a yc-type (centromeric) vector, preferably a yeast-bacterial shuttle vector which contains a bacterial origin of replication.
Also according to the embodiment, the first polypeptide subunit and/or the second polypeptide subunit can be expressed as a fusion protein with a cell wall/membrane protein, such as the yeast agglutinin cell wall protein. Such a fusion allows transportation of the protein complex (e.g. antibody) formed between the first and second subunits to the cell wall/membrane, thus effectively mimicking the cell surface display of antibodies by B cells in the immune system for affinity maturation in vivo.
Alternatively, the first polypeptide subunit or the second polypeptide subunit can be expressed as a fusion protein with a nuclear protein, such as the nuclear transport domain of a transcription factor. Such a fusion allows transportation of the protein complex (e.g. antibody) formed between the first and second subunits to the nucleus, where interaction of the antibody with nuclear target(s) occurs.
In another embodiment, a library of expression vectors is provided. The expression vectors forming the library comprise: a transcription sequence encoding an activation domain or a DNA binding domain of a transcription activator; a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors.
The activation domain or the DNA binding domain of the transcription activator and the first polypeptide subunit are expressed as a single fusion protein. The second polypeptide subunit is expressed as a separate protein or peptide from the first polypeptide.
According to this embodiment, the expression vector may be a bacterial, phage, yeast, mammalian or viral expression vector, preferably a yeast expression vector, and more preferably a 2μ plasmid yeast expression vector.
Also according to this embodiment, the transcription activator sequence may be located 5′ relative to the first nucleotide sequence. Alternatively, the transcription activator sequence may be located 3′ relative to the first nucleotide sequence.
In yet another embodiment, a library of transformed yeast cells is provided. The library of yeast cells comprises a library of yeast expression vectors. The expression vectors in the library of transformed yeast cells comprise: a transcription sequence encoding an activation domain or a DNA binding domain of a transcription activator; a first nucleotide sequence encoding a first polypeptide subunit; and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The activation domain or the DNA binding domain of the transcription activator and the first polypeptide subunit are expressed as a single fusion protein. The second polypeptide subunit is expressed as a separate protein or peptide from the first polypeptide.
According to this embodiment, the yeast cells may be diploid yeast cells. Alternatively, the yeast cells may be haploid cells, such as the a and α strains of yeast haploid cells.
In another aspect of the present invention, methods are provided for generating a library of yeast expression vectors that may be used for screening protein-protein or protein-DNA binding pairs.
In one embodiment, the method comprises: transforming into yeast cells a library of insert nucleotide sequences that are linear and double-stranded, and a library of linearized yeast expression vectors, each having a 5′- and 3′-terminus sequence at the site of linearization.
The linearized yeast expression vectors of the vector library comprise a first polynucleotide sequence encoding a first polypeptide subunit which varies within the vector library. The insert sequences of the insert library comprise a second nucleotide sequence encoding a second polypeptide subunit which varies within the insert library. Each of the insert sequences also comprises a 5′- and 3′-flanking sequence at the respective ends of the insert sequence. The 5′- and 3′-flanking sequences of the insert sequence are sufficiently homologous to the 5′- and 3′-terminus sequences of the linearized yeast expression vector, respectively, to enable homologous recombination to occur.
Homologous recombination occurring between the vector and the insert sequence results in inclusion of the insert sequence into the vector in the transformed yeast cells. Since the first and second nucleotide sequences vary independently within the vector library (having a complexity of 10^y) and the insert library (having a complexity of 10^x), respectively, the complexity of the library formed as a result of homologous recombination should theoretically be 10^(x+y).
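The multiplication behind the theoretical complexity can be illustrated with a toy example. The sketch below is not part of the disclosed method; the function names and the clone labels are hypothetical, and the libraries are kept tiny so that every pairing can be enumerated.

```python
from itertools import product


def theoretical_complexity(x: int, y: int) -> int:
    """Pairings available from an insert library of 10**x and a vector library of 10**y."""
    return 10 ** x * 10 ** y  # equals 10 ** (x + y)


def enumerate_pairings(vector_library, insert_library):
    """List every vector/insert combination for small illustrative libraries."""
    return list(product(vector_library, insert_library))


# Hypothetical labels standing in for independently varying sequences.
toy_vectors = ["first_seq_1", "first_seq_2", "first_seq_3"]
toy_inserts = ["second_seq_1", "second_seq_2"]

pairings = enumerate_pairings(toy_vectors, toy_inserts)
print(len(pairings))                 # 6
print(theoretical_complexity(4, 3))  # 10000000
```

The figure is an upper bound: the diversity actually realized cannot exceed the number of independent transformants recovered, which is presumably why the complexity is described as theoretical.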
In this embodiment, the first polypeptide subunit and the second polypeptide subunit are expressed as separate proteins or peptides. This may be accomplished by expressing the first and second polypeptide subunits from separate promoters on the vector, or by expressing the polypeptide subunits bicistronically from the same promoter on the vector via an internal ribosomal entry site (IRES) or via a splicing donor-acceptor mechanism.
According to the embodiment, the 5′- and 3′-flanking sequences of the insert sequence are preferably between about 30-120 bp in length, more preferably between about 40-90 bp in length, and most preferably between about 45-55 bp in length.
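As an illustration of how the flanking sequences relate to the vector termini, the sketch below copies homology arms from either side of a linearization site onto an insert and classifies the arm length against the ranges given above. The function and parameter names are hypothetical, the sequences are placeholders, and the "about" ranges are treated as hard bounds purely for simplicity.

```python
# Length ranges from the text, narrowest first. ASSUMPTION: treated as hard bounds.
FLANK_RANGES = [
    ("most preferred", 45, 55),
    ("more preferred", 40, 90),
    ("preferred", 30, 120),
]


def classify_flank(length_bp: int) -> str:
    """Return the narrowest stated range that contains the flank length."""
    for label, low, high in FLANK_RANGES:
        if low <= length_bp <= high:
            return label
    return "outside stated ranges"


def add_flanks(insert: str, upstream_of_cut: str, downstream_of_cut: str,
               flank_bp: int = 50) -> str:
    """Flank the insert with arms copied from the vector sequence adjoining the cut site."""
    if flank_bp > min(len(upstream_of_cut), len(downstream_of_cut)):
        raise ValueError("vector terminus shorter than requested flank")
    five_prime_arm = upstream_of_cut[-flank_bp:]    # last bases before the cut
    three_prime_arm = downstream_of_cut[:flank_bp]  # first bases after the cut
    return five_prime_arm + insert + three_prime_arm


# Placeholder sequences (hypothetical).
flanked = add_flanks("GGCC", "A" * 60, "T" * 60, flank_bp=50)
print(len(flanked))        # 104
print(classify_flank(50))  # most preferred
```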
According to the embodiment, the vector library comprising the second nucleotide sequences may be constructed by directional cloning of a library of the second nucleotide sequence inserts into a yeast expression vector in bacteria. Alternatively, the vector library may be constructed by inserting a library of the second nucleotide sequence inserts into a yeast expression vector via homologous recombination in yeast. Homologous recombination in yeast is preferred due to its higher transformation efficiency.
In yet another aspect of the present invention, methods are provided for selecting tester protein complexes capable of binding to a target peptide, protein, or DNA.
In an embodiment where the target molecule is a target peptide or protein, the method comprises:
expressing a library of tester protein complexes in yeast cells, each tester protein complex being formed between a first polypeptide subunit whose sequence varies within the library, and a second polypeptide subunit whose sequence varies within the library independently of the first polypeptide;
expressing one or more target fusion proteins in the yeast cells expressing the tester proteins, each of the target fusion proteins comprising a target peptide or protein; and
selecting those yeast cells in which a reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target fusion protein.
According to this embodiment, expression of the reporter gene may be activated by a functional transcription activator being formed by the binding of the tester protein complex to the target peptide or protein as in a yeast two-hybrid system.
In a variation of the embodiment employing the yeast two-hybrid system, the tester protein forms a portion of a fusion protein with either a DNA binding domain or an activation domain of a transcriptional activator. The target protein meanwhile forms a portion of a fusion protein comprising the DNA binding domain or the activation domain of the transcriptional activator which is not present in the fusion protein comprising the tester protein. If the tester protein is able to bind to the target protein, a functional transcriptional activator is formed.
According to this variation, the step of expressing the library of tester protein complexes may include transforming a library of tester expression vectors into the yeast cells which contain a reporter construct comprising the reporter gene whose expression is under transcriptional control of a transcription activator comprising an activation domain and a DNA binding domain.
Each of the tester expression vectors comprises a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The domain encoded by the first transcription sequence and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).
Optionally, the step of expressing the target fusion proteins includes transforming a target expression vector into the yeast cells simultaneously or sequentially with the library of tester expression vectors. The target expression vector comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.
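The two-hybrid logic described above can be reduced to a simple truth condition: the reporter is expressed only when the tester and target fusions carry complementary domains and the tester complex binds the target. The sketch below is a conceptual model, not a description of any laboratory protocol; the class, the function names, and the binding table are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    domain: str   # "AD" (activation domain) or "BD" (DNA binding domain)
    partner: str  # label of the fused tester subunit or target


# Hypothetical table of tester/target pairs assumed to bind.
ASSUMED_BINDERS = {("tester_complex_7", "target_X")}


def binds(tester_label: str, target_label: str) -> bool:
    return (tester_label, target_label) in ASSUMED_BINDERS


def reporter_expressed(tester: Fusion, target: Fusion) -> bool:
    """A functional activator needs one AD, one BD, and tester-target binding."""
    complementary = {tester.domain, target.domain} == {"AD", "BD"}
    return complementary and binds(tester.partner, target.partner)


print(reporter_expressed(Fusion("AD", "tester_complex_7"), Fusion("BD", "target_X")))  # True
print(reporter_expressed(Fusion("AD", "tester_complex_7"), Fusion("AD", "target_X")))  # False
print(reporter_expressed(Fusion("AD", "tester_complex_1"), Fusion("BD", "target_X")))  # False
```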
In another variation of the embodiment involving the yeast two-hybrid system, the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types.
The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a first transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The domain encoded by the first transcription sequence and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).
The second population of haploid yeast cells comprises a target expression vector. The target expression vector comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.
Either the first or second population of haploid yeast cells comprises a reporter construct comprising the reporter gene whose expression is under transcriptional control of the transcription activator.
In this variation, the haploid yeast cells of opposite mating types may preferably be α and a type strains of yeast. The mating between the first and second populations of haploid yeast cells of α and a type strains may be conducted in a rich nutritional culture medium.
Optionally, a plurality of target fusion proteins may be expressed and screened against the library of tester proteins at the same time. According to this variation, the population of haploid yeast cells comprising the expression vector encoding a target protein comprises a plurality of expression vectors encoding a plurality of target proteins. Each target protein forms a portion of a fusion protein which also comprises either an activation domain or a DNA binding domain.
According to this variation, members of the library of tester expression vectors may be arrayed as individual yeast clones in one or more multiple-well plates.
Also according to this variation, the plurality of the target expression vectors may be arrayed as individual yeast clones in one or more multiple-well plates.
Also according to this variation, mating may be based on clonal mating in which each yeast clone containing a member of the tester expression vectors is mated individually with each of the plurality of target expression vectors.
Also according to this variation, the plurality of the target expression vectors may be a library of expression vectors containing a collection of human EST clones or a collection of domain structures.
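The clonal mating scheme, in which each arrayed tester clone is mated individually with each arrayed target clone, amounts to an all-against-all pairing of wells. The sketch below assumes a 96-well plate layout (8 rows by 12 columns), which the text does not specify; the function names are hypothetical.

```python
from itertools import product
from string import ascii_uppercase


def plate_wells(rows: int = 8, cols: int = 12):
    """Well names for one plate in row-major order (ASSUMPTION: 96-well format)."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def clonal_mating_plan(tester_wells, target_wells):
    """Pair every tester clone with every target clone."""
    return list(product(tester_wells, target_wells))


tester_wells = plate_wells()[:4]  # A1..A4
target_wells = plate_wells()[:3]  # A1..A3

plan = clonal_mating_plan(tester_wells, target_wells)
print(len(plan))           # 12
print(plan[0], plan[-1])   # ('A1', 'A1') ('A4', 'A3')
print(len(plate_wells()))  # 96
```

The number of matings grows as the product of the two array sizes, so full plates of testers and targets would call for 96 x 96 = 9216 individual matings under this layout.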
According to any of the above-described methods for selecting protein-protein binding pairs, the target fusion protein may comprise an antigen associated with a disease state, such as a tumor-surface antigen. Optionally, the target fusion protein may comprise a human growth factor receptor, such as a receptor for epidermal growth factor, transferrin, insulin-like growth factor, transforming growth factors, interleukin-1, or interleukin-2.
In another embodiment, a method is provided for screening protein-DNA binding pairs in a yeast one-hybrid system. The method comprises: expressing a library of tester protein complexes in yeast cells which contain a reporter construct comprising a reporter gene whose expression is under the transcriptional control of a target DNA sequence; and selecting the yeast cells in which the reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target DNA sequence.
In a variation of the embodiment, the step of expressing the library of tester protein complexes includes transforming into the yeast cells a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a transcription sequence encoding an activation domain of a transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The transcriptional activation domain and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).
In another variation of the embodiment, the step of expressing a library of tester protein complexes in yeast cells includes causing mating between first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes described above. The second population of haploid yeast cells comprises the reporter construct.
According to the variation, the haploid yeast cells of opposite mating types may preferably be α and a type strains of yeast. The mating between the first and second populations of haploid yeast cells of α and a type strains is preferably conducted in a rich nutritional culture medium.
According to any of the above-described methods for selecting protein-DNA binding pairs, the target DNA sequence in the reporter construct is preferably positioned in 2-6 tandem repeats 5′ relative to the reporter gene.
The target DNA sequence in the reporter construct is preferably between about 15-75 bp in length and more preferably between about 25-55 bp in length.
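A tandem-repeat target element of the kind described above can be sketched as simple string construction, with the stated preferences reported as notes rather than enforced. The function names and the 32 bp site are hypothetical placeholders, and the "about" length ranges are treated as hard bounds for simplicity.

```python
def build_tandem_target(site: str, repeats: int) -> str:
    """Concatenate the target DNA site head-to-tail the requested number of times."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return site * repeats


def design_notes(site: str, repeats: int):
    """Compare a design with the preferred repeat count and site lengths in the text."""
    notes = []
    if not 2 <= repeats <= 6:
        notes.append("repeat count outside preferred 2-6")
    n = len(site)
    if 25 <= n <= 55:
        notes.append("site length in more preferred 25-55 bp range")
    elif 15 <= n <= 75:
        notes.append("site length in preferred 15-75 bp range")
    else:
        notes.append("site length outside stated ranges")
    return notes


placeholder_site = "ACGT" * 8  # hypothetical 32 bp site, not a real binding sequence
element = build_tandem_target(placeholder_site, 3)
print(len(element))                       # 96
print(design_notes(placeholder_site, 3))  # ['site length in more preferred 25-55 bp range']
```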
In yet another embodiment, a method is provided for screening protein-protein binding pairs in a yeast one-hybrid system. The method comprises: expressing a library of tester protein complexes in yeast cells which contain a reporter construct comprising a reporter gene whose expression is under the transcriptional control of a specific DNA binding site; expressing a target protein in the yeast cells expressing the tester protein complexes, where the target protein binds to the specific DNA binding site; and selecting the yeast cells in which the reporter gene is expressed, the expression of the reporter gene being activated by binding of the tester protein complex to the target protein.
In a variation of the embodiment, the step of expressing the library of tester protein complexes includes transforming into the yeast cells a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a transcription sequence encoding an activation domain of a transcription activator, a first nucleotide sequence encoding the first polypeptide subunit, and a second nucleotide sequence encoding the second polypeptide subunit, the first and second nucleotide sequences varying independently within the library of tester expression vectors. The transcriptional activation domain and the first polypeptide subunit are expressed as a fusion protein. The first and second polypeptide subunits are expressed as separate proteins, and form the tester protein complex upon binding with each other through non-covalent interactions (e.g. hydrophobic interactions) or covalent interactions (e.g. disulfide bonds).
In another variation of the embodiment, the steps of expressing the library of tester protein complexes and expressing the target fusion protein include causing mating between first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester protein complexes described above. The second population of haploid yeast cells comprises a target expression vector comprising a target sequence encoding the target protein. Either the first or second population of haploid yeast cells comprises the reporter construct.
In any of the above-described methods for selecting tester proteins capable of binding to a target peptide, protein, or DNA, the method may further comprise isolating the tester expression vectors from the selected yeast cells; and mutagenizing the first and second nucleotide sequences in the isolated tester expression vectors to form a library of mutagenized expression vectors.
Examples of mutagenesis methods include, but are not limited to, error-prone PCR mutagenesis, site-directed mutagenesis, DNA shuffling and combinations thereof. The library of mutagenized expression vectors may be screened against the same or different target peptide, protein or DNA by following similar procedures used for screening the tester expression vectors.
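As a purely illustrative, in-silico sketch of the random point mutagenesis that error-prone PCR introduces, the following toy model substitutes bases at a fixed per-base rate to form a small library of mutagenized sequences. The function names, the uniform substitution model, and the rate and seed parameters are assumptions made for illustration only; they are not part of the described method and do not model any particular polymerase.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Return a copy of seq in which each base is substituted with probability `rate`.

    Hypothetical toy model: substitutions are uniform over the three other bases.
    """
    out = []
    for base in seq:
        if rng.random() < rate:
            # Pick a replacement that differs from the original base.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def mutagenized_library(seq, size, rate, seed=0):
    """Build `size` independently mutagenized variants of one parent sequence."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [mutate(seq, rate, rng) for _ in range(size)]


parent = "ATGGCC" * 5  # placeholder coding sequence, not a real antibody gene
library = mutagenized_library(parent, size=10, rate=0.05)
```

Each variant keeps the parent's length, so such a library could in principle be screened by the same procedure as the original tester expression vectors.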
In yet another aspect of the present invention, methods are provided for producing a library of assembled antibodies. Examples of the assembled antibodies include, but are not limited to, a double-chain protein complex (dcFv) formed between the variable regions of the light chain (VL) and heavy chain (VH), the Fab (fragment antigen-binding) fragments, and a fully assembled antibody having both the variable and constant regions of the light chain and heavy chain.
In an embodiment, the method comprises expressing in cells a library of expression vectors. Each of the expression vectors comprises a first nucleotide sequence encoding a first polypeptide subunit comprising an antibody heavy chain variable region, and a second nucleotide sequence encoding a second polypeptide subunit comprising an antibody light chain variable region. The first and second polypeptide subunits are expressed as separate proteins and self-assemble to form a dcFv, Fab, or full antibody upon interacting with each other. Also, the first and second nucleotide sequences each independently vary within the library of expression vectors to generate a library of assembled antibodies with a diversity of at least 10⁷.
According to the embodiment, the diversity of the library of assembled antibodies is preferably between 10⁶ and 10¹⁶, more preferably between 10⁸ and 10¹⁶, and most preferably between 10¹⁰ and 10¹⁶.
The cells may be prokaryotic or eukaryotic cells, such as bacterial, yeast, insect, plant and mammalian cells. In a preferred embodiment, the cells in which the library of antibodies is expressed are yeast cells.
In yet another aspect of the present invention, a kit is provided for selecting tester proteins capable of binding to a target peptide, protein, or DNA.
In an embodiment, a kit is provided which comprises: a library of tester expression vectors and a yeast cell line. Each of the tester expression vectors comprises a first transcription sequence encoding either an activation domain or a DNA binding domain of a transcription activator, a first nucleotide sequence encoding a first polypeptide subunit, and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The first and second polypeptide subunits are expressed as separate proteins and form a protein complex upon interacting with each other. A reporter construct may be contained in the yeast cell line. The reporter construct comprises a reporter gene whose expression is under a transcriptional control of a specific DNA binding site.
Optionally, the kit may further comprise a target expression vector which comprises a second transcription sequence encoding either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide.
In another embodiment, the kit comprises: first and second populations of haploid yeast cells of opposite mating types. The first population of haploid yeast cells comprises a library of tester expression vectors for the library of tester fusion proteins. Each of the tester expression vectors comprises a first transcription sequence encoding either an activation domain or a DNA binding domain of a transcription activator, a first nucleotide sequence encoding a first polypeptide subunit, and a second nucleotide sequence encoding a second polypeptide subunit, the first and second nucleotide sequences each independently varying within the library of expression vectors. The first and second polypeptide subunits are expressed as separate proteins and form a protein complex upon interacting with each other. The second population of haploid yeast cells comprises a target expression vector. The target expression vector encodes either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising a reporter gene whose expression is under transcriptional control of the transcription activator.
Optionally, the second population of haploid yeast cells comprises a plurality of target expression vectors. Each of the target expression vectors encodes either the activation domain or the DNA binding domain of the transcription activator which is not expressed by the library of tester expression vectors; and a target sequence encoding the target protein or peptide. Either the first or second population of haploid yeast cells comprises a reporter construct comprising a reporter gene whose expression is under transcriptional control of the transcription activator.
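The selection logic of the mating-based kit can be sketched as a toy model: a diploid forms only between haploids of opposite mating types, and the reporter is expressed only when the activation domain (AD) and DNA binding domain (BD) are both present, a reporter construct is carried by either parent, and the tester complex binds the target. The class, function names, and the `known_binders` lookup are hypothetical stand-ins for illustration; in practice binding is what the screen discovers, not an input.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Haploid:
    mating_type: str            # "a" or "alpha"
    domain: str                 # "AD" or "BD", fused to the encoded protein
    protein: str                # name of the tester complex or target
    has_reporter: bool = False  # whether this cell carries the reporter construct


def mate(x, y):
    """Return a diploid (as a pair) only if the mating types differ."""
    if x.mating_type == y.mating_type:
        return None
    return (x, y)


def reporter_on(diploid, known_binders):
    """Reporter is expressed only if AD and BD are reunited by a binding pair."""
    x, y = diploid
    both_domains = {x.domain, y.domain} == {"AD", "BD"}
    has_reporter = x.has_reporter or y.has_reporter
    binds = (x.protein, y.protein) in known_binders
    return both_domains and has_reporter and binds


def screen(testers, targets, known_binders):
    """Mate every tester with every target and collect reporter-positive pairs."""
    hits = []
    for tester, target in product(testers, targets):
        diploid = mate(tester, target)
        if diploid is not None and reporter_on(diploid, known_binders):
            hits.append((tester.protein, target.protein))
    return hits


# Hypothetical example: three tester complexes screened against two targets.
testers = [Haploid("a", "AD", f"complex{i}", has_reporter=True) for i in range(3)]
targets = [Haploid("alpha", "BD", name) for name in ("targetX", "targetY")]
known_binders = {("complex1", "targetX"), ("complex2", "targetY")}
hits = screen(testers, targets, known_binders)
```

Using a plurality of target expression vectors in the second population corresponds here to passing more than one entry in `targets`.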
According to any of the above-described compositions, methods and kits, the diversity of the first and/or the second polypeptide subunit encoded by the first and second nucleotide sequences within the library of expression vectors is preferably between 10³ and 10⁸, more preferably between 10⁴ and 10⁸, and most preferably between 10⁵ and 10⁸.
Also according to any of the above-described compositions, methods and kits, the diversity of the protein complexes encoded by the library of expression vectors may preferably be at least 10⁶-10¹⁸, more preferably at least 10⁹-10¹⁸, and most preferably at least 10¹⁰-10¹⁸.
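Because the first and second nucleotide sequences vary independently, the number of distinct subunit pairings grows as the product of the two subunit diversities. The short sketch below illustrates that arithmetic under the idealizing assumption that every first-subunit variant can be paired with every second-subunit variant; the function names and the placeholder variant labels are hypothetical.

```python
from itertools import product


def combined_diversity(n_first, n_second):
    """Theoretical number of distinct pairings of two independently varying subunits."""
    return n_first * n_second


def enumerate_pairs(first_variants, second_variants):
    """Explicitly list every pairing for a small library (for illustration only)."""
    return list(product(first_variants, second_variants))


# Placeholder labels standing in for heavy- and light-chain variants.
heavy = ["VH1", "VH2", "VH3"]
light = ["VL1", "VL2"]
pairs = enumerate_pairs(heavy, light)

# 10^5 variants of each subunit would give 10^10 theoretical pairings.
large = combined_diversity(10**5, 10**5)
```

For the small example, three heavy-chain and two light-chain variants give six pairings, matching `combined_diversity(3, 2)`.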
Also according to any of the above-described compositions, methods and kits, the diversities of the first and second polypeptide subunits may be each independently derived from libraries of precursor sequences that are not specifically designed for the target peptide, protein or DNA.
Also according to any of the above-described compositions, methods and kits, the diversities of the first and second polypeptide subunits optionally are not derived from one or more proteins that are known to bind to the target peptide, protein or DNA.
Also according to any of the above-described compositions, methods and kits, the diversities of the first and second polypeptide subunits optionally are not generated by mutagenizing one or more proteins that are known to bind to the target peptide, protein or DNA.
Also according to any of the above-described compositions, methods and kits, the first and the second polypeptide subunits may be subunits of a multimeric protein whose sequence varies within a library of multimeric proteins. Examples of multimeric proteins include, but are not limited to, growth factor receptors, T cell receptors, cytokine receptors, tyrosine kinase-associated receptors, and MHC proteins.
Also according to any of the above-described compositions, methods and kits, the first nucleotide sequence in the library of expression vectors comprises a coding sequence of an antibody heavy-chain variable region (VH) or an antibody heavy chain including both the variable and constant regions (VH+CH, CH including CH1, CH2, and CH3). The second nucleotide sequence comprises a coding sequence of an antibody light-chain variable region (VL) or an antibody light chain including both the variable and constant regions (VL+CL).
Alternatively, the first nucleotide sequence in the library of expression vectors comprises a coding sequence of an antibody light-chain variable region (VL) or an antibody light chain including both the variable and constant regions (VL+CL). The second nucleotide sequence comprises a coding sequence of an antibody heavy-chain variable region (VH) or an antibody heavy chain including both the variable and constant regions (VH+CH, CH including CH1, CH2, and CH3).
The source of the coding sequences of the antibody light-chain and heavy-chain variable and constant regions is preferably from human, non-human primate, or rodent. Optionally, the source of the coding sequences of the antibody light-chain and heavy-chain variable and constant regions may be from one or more non-immunized animals. Preferably, the source of the coding sequences of the antibody light-chain and heavy-chain variable and constant regions may be from human fetal spleen, lymph nodes or peripheral blood cells.
Also according to any of the above-described compositions, methods and kits, the first and second polypeptide subunits may each further comprise a plurality of cysteine residues, preferably 2-8 Cys residues, at or adjacent to the N- or C-terminus of the polypeptide. It is believed that by adding more cysteine residues near the termini of the subunits, the intermolecular interactions between the two subunits should be enhanced through formation of Cys-Cys disulfide bonds, thus further stabilizing the assembly of the protein complex formed by the two subunits.
Alternatively, the first and second polypeptide subunits may each further comprise a "zipper" domain at or adjacent to the N- or C-terminus of the polypeptide. As used herein, a "zipper domain" refers to a protein or peptide structural motif that can interact with another "zipper domain" with a different sequence to form a hetero-polymer such as a heterodimer. It is believed that by adding a zipper domain near the termini of the subunits, the intermolecular interactions between the two subunits should be enhanced through non-covalent interactions (e.g. hydrophobic interactions), thus further stabilizing the assembly of the protein complex formed by the two subunits.
In addition, the first or the second polypeptide subunit may further comprise a "bundle" domain at or adjacent to the C-terminus of the polypeptide. As used herein, a "bundle domain" refers to a protein or peptide structural motif that can interact with itself to form a homo-polymer such as a homopentamer. The bundle domains bring the protein complexes together by polymerization through non-covalent interactions such as coiled-coil interactions. It is believed that polymerization of the protein complex should enhance the avidity of the protein complexes for their binding target through multivalent binding. For example, the avidity of an antibody of the present invention may be dramatically increased by fusing a bundle domain (e.g. the coiled-coil domain of the cartilage oligomeric matrix protein) to the C-terminus of the heavy chain via a semi-rigid linker.
Also, the first or second polypeptide subunit may further comprise a signaling domain for screening the library of the protein complexes based on non-conventional two-hybrid methods such as the SRS (Sos Recruitment System) and RRS (Ras Recruitment System). Examples of such signaling domains include, but are not limited to, a Ras guanyl nucleotide exchange factor (e.g. human SOS factor), a membrane targeting signal such as a myristoylation sequence or farnesylation sequence, mammalian Ras lacking the carboxy-terminal domain (the CAAX box), and a ubiquitin sequence.
Also according to any of the above-described compositions, methods and kits, each of the expression vectors may further comprise a sequence encoding an affinity tag. Examples of affinity tags include, but are not limited to, polyhistidine tags, polyarginine tags, glutathione-S-transferase, maltose binding protein, staphylococcal protein A tag, and EE-epitope tags.
Also according to any of the above-described compositions, methods and kits, the transcription activator may be any transcription activator having separable DNA-binding and transcriptional activation domains. Examples of transcription activators include, but are not limited to, GAL4, GCN4, and ADR1 transcription activators.
Also according to any of the above-described compositions, methods and kits, the reporter gene may be any reporter gene whose expression shows a distinct genotype or phenotype in a cell. Examples of the reporter protein encoded by such a reporter gene include, but are not limited to, β-galactosidase, α-galactosidase, luciferase, β-glucuronidase, chloramphenicol acetyl transferase, secreted embryonic alkaline phosphatase, green fluorescent protein, enhanced blue fluorescent protein, enhanced yellow fluorescent protein, and enhanced cyan fluorescent protein.