The present invention relates to methods for selecting functional members of a repertoire of polypeptides using generic and target ligands. In particular, the invention describes a method for isolating a functional subset of a repertoire of antibody polypeptides with a generic ligand.
The antigen binding domain of an antibody comprises two separate regions: a heavy chain variable domain (VH) and a light chain variable domain (VL, which can be either Vκ or Vλ). The antigen binding site itself is formed by six polypeptide loops: three from the VH domain (H1, H2 and H3) and three from the VL domain (L1, L2 and L3). A diverse primary repertoire of V genes that encode the VH and VL domains is produced by the combinatorial rearrangement of gene segments. The VH gene is produced by the recombination of three gene segments, VH, D and JH. In humans, there are approximately 51 functional VH segments (Cook and Tomlinson, 1995, Immunol. Today, 16: 237-242), 25 functional D segments (Corbett et al., 1997, J. Mol. Biol., 268: 69) and 6 functional JH segments (Ravetch et al., 1981, Cell, 27: 583-591), depending on the haplotype. The VH segment encodes the region of the polypeptide chain which forms the first and second antigen binding loops of the VH domain (H1 and H2), while the VH, D and JH segments combine to form the third antigen binding loop of the VH domain (H3). The VL gene is produced by the recombination of only two gene segments, VL and JL. In humans, there are approximately 40 functional Vκ segments (Schäble and Zachau, 1993, Biol. Chem. Hoppe-Seyler, 374: 1001-1022), 31 functional Vλ segments (Williams et al., 1996, J. Mol. Biol., 264: 220-232; Kawasaki et al., 1997, Genome Res., 7: 250-261), 5 functional Jκ segments (Hieter et al., 1982, J. Biol. Chem., 257: 1516) and 4 functional Jλ segments (Vasicek and Leder, 1990, J. Exp. Med., 172: 609-620), depending on the haplotype. The VL segment encodes the region of the polypeptide chain which forms the first and second antigen binding loops of the VL domain (L1 and L2), while the VL and JL segments combine to form the third antigen binding loop of the VL domain (L3).
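The scale of this combinatorial rearrangement can be illustrated with a short calculation. The following Python sketch simply multiplies the approximate segment counts quoted above; it assumes that every segment combination is possible and ignores junctional diversity and somatic hypermutation, so the figures are a rough lower-bound illustration rather than a measured repertoire size. The function and dictionary names are chosen here for illustration only.

```python
# Approximate numbers of functional human germline segments quoted in the text.
SEGMENTS = {
    "VH": 51, "D": 25, "JH": 6,      # heavy chain: VH, D and JH
    "Vkappa": 40, "Jkappa": 5,       # kappa light chain: V and J
    "Vlambda": 31, "Jlambda": 4,     # lambda light chain: V and J
}


def heavy_chain_combinations(s=SEGMENTS):
    """Number of VH-D-JH segment combinations."""
    return s["VH"] * s["D"] * s["JH"]


def light_chain_combinations(s=SEGMENTS):
    """Number of V-J combinations, summed over kappa and lambda."""
    kappa = s["Vkappa"] * s["Jkappa"]
    lam = s["Vlambda"] * s["Jlambda"]
    return kappa + lam


def paired_combinations(s=SEGMENTS):
    """Heavy/light pairings, assuming any heavy chain can pair with any light chain."""
    return heavy_chain_combinations(s) * light_chain_combinations(s)


if __name__ == "__main__":
    print("heavy chain combinations:", heavy_chain_combinations())   # 7650
    print("light chain combinations:", light_chain_combinations())   # 324
    print("heavy/light pairings:   ", paired_combinations())         # 2478600
```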
Antibodies selected from this primary repertoire are believed to be sufficiently diverse to bind almost all antigens with at least moderate affinity. High affinity antibodies are produced by “affinity maturation” of the rearranged genes, in which point mutations are generated and selected by the immune system on the basis of improved binding.
Analysis of the structures and sequences of antibodies has shown that five of the six antigen binding loops (H1, H2, L1, L2, L3) possess a limited number of main-chain conformations or canonical structures (Chothia and Lesk, 1987, J. Mol. Biol., 196: 901-917; Chothia et al., 1989, Nature, 342: 877-883). The main-chain conformations are determined by (i) the length of the antigen binding loop, and (ii) particular residues, or types of residue, at certain key positions in the antigen binding loop and the antibody framework. Analysis of the loop lengths and key residues has enabled us to predict the main-chain conformations of H1, H2, L1, L2 and L3 encoded by the majority of human antibody sequences (Chothia et al., 1992, J. Mol. Biol., 227: 799-817; Tomlinson et al., 1995, EMBO J., 14: 4628-4638; Williams et al., 1996, supra). Although the H3 region is much more diverse in terms of sequence, length and structure (due to the use of D segments), it also forms a limited number of main-chain conformations for short loop lengths which depend on the length and the presence of particular residues, or types of residue, at key positions in the loop and the antibody framework (Martin et al., 1996, J. Mol. Biol., 263: 800-815; Shirai et al., 1996, FEBS Letters, 399: 1-8).
A similar analysis of side-chain diversity in human antibody sequences has enabled the separation of the pattern of sequence diversity in the primary repertoire from that created by somatic hypermutation. The two patterns are complementary: diversity in the primary repertoire is focused at the center of the antigen binding site, whereas somatic hypermutation spreads diversity to regions at the periphery that are highly conserved in the primary repertoire (Tomlinson et al., 1996, J. Mol. Biol., 256: 813-817; Ignatovich et al., 1997, J. Mol. Biol., 268: 69-77). This complementarity seems to have evolved as an efficient strategy for searching sequence space, given the limited number of B cells available for selection at a given time. Thus, antibodies are first selected from the primary repertoire based on diversity at the centre of the binding site. Somatic hypermutation is then left to optimize residues at the periphery without disrupting favorable interactions established during the primary response.
The recent advent of phage-display technology (Smith, 1985, Science, 228: 1315-1317; Scott and Smith, 1990, Science, 249: 386-390; McCafferty et al., 1990, Nature, 348: 552-554) has enabled the in vitro selection of human antibodies against a wide range of target antigens from “single pot” libraries. These phage-antibody libraries can be grouped into two categories: natural libraries, which use rearranged V genes harvested from human B cells (Marks et al., 1991, J. Mol. Biol., 222: 581-597; Vaughan et al., 1996, Nature Biotech., 14: 309), and synthetic libraries, in which germline V gene segments are ‘rearranged’ in vitro (Hoogenboom and Winter, 1992, J. Mol. Biol., 227: 381-388; Nissim et al., 1994, EMBO J., 13: 692-698; Griffiths et al., 1994, EMBO J., 13: 3245-3260; De Kruif et al., 1995, J. Mol. Biol., 248: 97) or in which synthetic CDRs are incorporated into a single rearranged V gene (Barbas et al., 1992, Proc. Natl. Acad. Sci. USA, 89: 4457-4461). Although synthetic libraries help to overcome the inherent biases of the natural repertoire which can limit the effective size of phage libraries constructed from rearranged V genes, they require the use of long degenerate PCR primers which frequently introduce base-pair deletions into the assembled V genes. This high degree of randomization may also lead to the creation of antibodies which are unable to fold correctly and are therefore also non-functional. Furthermore, antibodies selected from these libraries may be poorly expressed and, in many cases, will contain framework mutations that may affect the antibodies' immunogenicity when used in human therapy.
Recently, in an extension of the synthetic library approach, it has been suggested (WO97/08320, Morphosys) that human antibody frameworks can be pre-optimized by synthesizing a set of ‘master genes’ that have consensus framework sequences and incorporate amino acid substitutions shown to improve folding and expression. Diversity in the CDRs is then incorporated using oligonucleotides. Since it is desirable to produce artificial human antibodies which will not be recognized as foreign by the human immune system, the use of consensus frameworks which, in most cases, do not correspond to any natural framework is a disadvantage of this approach. Furthermore, since it is likely that the CDR diversity will also have an effect on folding and/or expression, it would be preferable to optimize the folding and/or expression (and remove any frame-shifts or stop codons) after the V gene has been fully assembled. To this end, it would be desirable to have a selection system which could eliminate non-functional or poorly folded/expressed members of the library before selection with the target antigen is carried out.
A further problem with the libraries of the prior art is that, because the main-chain conformation is heterogeneous, three-dimensional structural modeling is difficult, since suitable high resolution crystallographic data may not be available. This is a particular problem for the H3 region, where the vast majority of antibodies derived from natural or synthetic libraries have medium length or long loops and therefore cannot be modeled.
According to the first aspect of the present invention, there is provided a method for selecting, from a repertoire of polypeptides, a population of functional polypeptides which bind a target ligand in a first binding site and a generic ligand in a second binding site, which generic ligand is capable of binding functional members of the repertoire regardless of target ligand specificity, comprising the steps of:
a) contacting the repertoire with the generic ligand and selecting functional polypeptides bound thereto; and
b) contacting the selected functional polypeptides with the target ligand and selecting a population of polypeptides which bind to the target ligand.
The invention accordingly provides a method by which a repertoire of polypeptides is preselected, according to functionality as determined by the ability to bind the generic ligand, and the subset of polypeptides obtained as a result of preselection is then employed for further rounds of selection according to the ability to bind the target ligand. Although, in a preferred embodiment, the repertoire is first selected with the generic ligand, it will be apparent to one skilled in the art that the repertoire may be contacted with the ligands in the opposite order, i.e. with the target ligand before the generic ligand.
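The two-step selection described above can be sketched as a pair of filters. The following Python example is an idealized illustration, not an experimental protocol: each repertoire member is reduced to a hypothetical `folded` flag (standing in for correct expression and folding, and hence generic ligand binding) and a hypothetical `specificity` label (standing in for the target ligand its first binding site recognizes). All names are assumptions introduced for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Member:
    name: str
    folded: bool        # hypothetical: correctly expressed and folded
    specificity: str    # hypothetical: ligand recognized by the target binding site


def binds_generic(member: Member) -> bool:
    """Idealized generic ligand: binds every functional member, whatever its specificity."""
    return member.folded


def make_target_binder(target: str) -> Callable[[Member], bool]:
    """Idealized target ligand: binds functional members of matching specificity only."""
    def binds_target(member: Member) -> bool:
        return member.folded and member.specificity == target
    return binds_target


def select(repertoire: Iterable[Member], binds: Callable[[Member], bool]) -> List[Member]:
    """Keep the members that bind the given ligand."""
    return [m for m in repertoire if binds(m)]


def generic_then_target(repertoire: Iterable[Member], target: str) -> List[Member]:
    functional = select(repertoire, binds_generic)           # step a)
    return select(functional, make_target_binder(target))    # step b)


def target_then_generic(repertoire: Iterable[Member], target: str) -> List[Member]:
    bound = select(repertoire, make_target_binder(target))   # opposite order
    return select(bound, binds_generic)


REPERTOIRE = [
    Member("A", folded=True, specificity="CEA"),
    Member("B", folded=True, specificity="LPS"),
    Member("C", folded=False, specificity="CEA"),   # e.g. a frame-shift mutant
    Member("D", folded=False, specificity="LPS"),   # e.g. a folding mutant
]

if __name__ == "__main__":
    print([m.name for m in select(REPERTOIRE, binds_generic)])      # functional subset
    print([m.name for m in generic_then_target(REPERTOIRE, "CEA")])
```

In this idealized model both orders of contact yield the same final population; the practical benefit of contacting the generic ligand first is that non-functional members are removed before any target selection is carried out.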
The invention permits the person skilled in the art to remove, from a chosen repertoire of polypeptides, those polypeptides which are non-functional, for example as a result of the introduction of frame-shift mutations, stop codons, folding mutants or expression mutants which are incapable of binding to a target ligand. Such non-functional mutants are generated by the normal randomization and variation procedures employed in the construction of polypeptide repertoires. At the same time the invention permits the person skilled in the art to enrich a chosen repertoire of polypeptides for those polypeptides which are functional, well folded and highly expressed.
Preferably, two or more subsets of polypeptides are obtained from a repertoire by the method of the invention, for example, by prescreening the repertoire with two or more generic ligands, or by contacting the repertoire with the generic ligand(s) under different conditions. Advantageously, the subsets of polypeptides thus obtained are combined to form a further repertoire of polypeptides, which may be further screened by contacting with target and/or generic ligands.
Preferably, the library according to the invention comprises polypeptides of the immunoglobulin superfamily, such as antibody polypeptides or T-cell receptor polypeptides. Advantageously, the library may comprise individual immunoglobulin domains, such as the VH or VL domains of antibodies, or the Vβ or Vα domains of T-cell receptors. In a preferred embodiment, therefore, repertoires of, for example, VH and VL polypeptides may be individually prescreened using a generic ligand and then combined to produce a functional repertoire comprising both VH and VL polypeptides. Such a repertoire can then be screened with a target ligand in order to isolate polypeptides comprising both VH and VL domains and having the desired binding specificity.
In an advantageous embodiment, the generic ligand selected for use with immunoglobulin repertoires is a superantigen. Superantigens are able to bind to functional immunoglobulin molecules, or subsets thereof comprising particular main-chain conformations, irrespective of target ligand specificity. Alternatively, generic ligands may be selected from any ligand capable of binding to the general structure of the polypeptides which make up any given repertoire, such as antibodies themselves, metal ion matrices, organic compounds including proteins or peptides, and the like.
In a second aspect, the invention provides a library wherein the functional members have binding sites for both generic and target ligands. Libraries may be specifically designed for this purpose, for example by constructing antibody libraries having a main-chain conformation which is recognized by a given superantigen, or by constructing a library in which substantially all potentially functional members possess a structure recognizable by an antibody ligand.
In a third aspect, the invention provides a method for detecting, immobilizing, purifying or immunoprecipitating one or more members of a repertoire of polypeptides previously selected according to the invention, comprising binding the members to the generic ligand.
In a fourth aspect, the invention provides a library comprising a repertoire of polypeptides of the immunoglobulin superfamily, wherein the members of the repertoire have a known main-chain conformation.
In a fifth aspect, the invention provides a method for selecting a polypeptide having a desired generic and/or target ligand binding site from a repertoire of polypeptides, comprising the steps of:
a) expressing a library according to the preceding aspects of the invention;
b) contacting the polypeptides with generic and/or target ligands and selecting those which bind the generic and/or target ligand;
c) optionally amplifying the selected polypeptide(s) which bind the generic and/or target ligand; and
d) optionally repeating steps a)-c).
Repertoires of polypeptides are advantageously both generated and maintained in the form of a nucleic acid library. Therefore, in a sixth aspect, the invention provides a nucleic acid library encoding a repertoire of such polypeptides.
As used herein, the term “repertoire” refers to a heterogeneous population of molecules, for example nucleic acid molecules which vary in nucleotide sequence or polypeptide molecules which vary in amino acid sequence. A library according to the invention will encompass a repertoire of polypeptides or nucleic acids. According to the present invention, a repertoire of polypeptides is designed to possess a binding site for a generic ligand and a binding site for a target ligand. The binding sites may overlap, or be located in the same region of the molecule, but their specificities will differ.
As used herein with regard to the generic ligand, the term “select” refers to binding of the generic ligand to its binding site on library members, to exclude non-functional library members (such as those bearing frameshift, stop-codon (nonsense) or substitution (missense) mutations, or any other mutations which render them unable to fold properly and/or bind the target ligand). Similarly, when used with regard to the target ligand, the term “select” refers to binding of the target ligand to polypeptides which possess a functional target-ligand-binding site, such that non-functional library members are lost. Together, retention of library members which bind the generic and target ligands is referred to as selection from a functional polypeptide repertoire of a subset of repertoire members.
As used herein, the term “organism” refers to all cellular life-forms, such as prokaryotes and eukaryotes, as well as non-cellular, nucleic acid-containing entities, such as bacteriophage and viruses.
As used herein, the term “functional” refers to a polypeptide which possesses either the native biological activity of the naturally-produced proteins of its type, or any specific desired activity, for example as judged by its ability to bind to ligand molecules, defined below. Examples of “functional” polypeptides include an antibody binding specifically to an antigen through its antigen-binding site, a receptor molecule (e.g. a T-cell receptor) binding its characteristic ligand and an enzyme binding to its substrate. In order for a polypeptide to be classified as functional according to the invention, it follows that it first must be properly processed and folded so as to retain its overall structural integrity, as judged by its ability to bind the generic ligand, also defined below.
For the avoidance of doubt, functionality is not equivalent to the ability to bind the target ligand. For instance, a functional anti-CEA monoclonal antibody will not be able to bind specifically to target ligands such as bacterial LPS. However, because it is capable of binding a target ligand (i.e. it would be able to bind to CEA if CEA were the target ligand) it is classed as a “functional” antibody molecule and may be selected by binding to a generic ligand, as defined below. Typically, non-functional antibody molecules will be incapable of binding to any target ligand.
As used herein, the term “generic ligand” refers to a ligand that binds a substantial proportion of functional members in a given repertoire. Thus, the same generic ligand can bind many members of the repertoire regardless of their target ligand specificities (see below). In general, the presence of a functional generic ligand binding site indicates that the repertoire member is expressed and folded correctly. Thus, binding of the generic ligand to its binding site provides a method for preselecting functional polypeptides from a repertoire of polypeptides.
As used herein, the term “target ligand” refers to a ligand for which a specific binding member or members of the repertoire is to be identified and/or isolated. Where the members of the repertoire are antibody molecules, the target ligand may be an antigen and where the members of the repertoire are enzymes, the target ligand may be a substrate. Binding to the target ligand is dependent upon both the member of the repertoire being functional, as described above under generic ligand, and upon the precise specificity of the binding site for the target ligand.
As used herein, the term “subset” refers to a part of a repertoire. In the terms of the present invention, it is often the case that only a subset of the repertoire is functional and therefore possesses a functional generic ligand binding site. Furthermore, it is also possible that only a fraction of the functional members of a repertoire (yet significantly more than would bind a given target ligand) will bind the generic ligand. These subsets are able to be selected according to the invention. Subsets of a library may be combined or pooled to produce novel repertoires which have been preselected according to desired criteria. Combined or pooled repertoires may be simple mixtures of the polypeptide members preselected by generic ligand binding, or may be manipulated to combine two polypeptide subsets. For example, VH and VL polypeptides may be individually prescreened, and subsequently combined at the genetic level onto single vectors such that they are expressed as combined VH-VL dimers, such as scFv.
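The combination of individually prescreened subsets can be sketched as a Cartesian product of the two preselected pools. The Python example below is an illustration under stated assumptions: each domain is a hypothetical record carrying a `functional` flag that stands in for generic ligand binding, and the pairing step stands in for combination of VH and VL at the genetic level. None of the names or example records come from the specification.

```python
from itertools import product
from typing import Dict, List, Tuple

Domain = Dict[str, object]   # hypothetical record: {"name": str, "functional": bool}


def prescreen(domains: List[Domain]) -> List[Domain]:
    """Keep only the domains that bind the generic ligand (modeled by the 'functional' flag)."""
    return [d for d in domains if d["functional"]]


def combine(vh_pool: List[Domain], vl_pool: List[Domain]) -> List[Tuple[str, str]]:
    """Pair every VH in one pool with every VL in the other, as VH-VL (scFv-like) pairs."""
    return [(vh["name"], vl["name"]) for vh, vl in product(vh_pool, vl_pool)]


VH_REPERTOIRE = [
    {"name": "VH-1", "functional": True},
    {"name": "VH-2", "functional": False},   # e.g. contains a stop codon
    {"name": "VH-3", "functional": True},
]
VL_REPERTOIRE = [
    {"name": "VL-1", "functional": False},   # e.g. frame-shifted
    {"name": "VL-2", "functional": True},
    {"name": "VL-3", "functional": False},   # e.g. misfolded
]

if __name__ == "__main__":
    unscreened = combine(VH_REPERTOIRE, VL_REPERTOIRE)
    prescreened = combine(prescreen(VH_REPERTOIRE), prescreen(VL_REPERTOIRE))
    print(len(unscreened), "pairs without prescreening")
    print(len(prescreened), "pairs after prescreening:", prescreened)
```

In this toy example only two of the nine unscreened pairs consist of two functional domains, whereas every pair in the prescreened combination does, which is the point of prescreening each pool before combining.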
As used herein, the term “library” refers to a mixture of heterogeneous polypeptides or nucleic acids. A library is composed of members, each of which has a single polypeptide or nucleic acid sequence. To this extent, “library” is synonymous with “repertoire”. Sequence differences between library members are responsible for the diversity present in the library. The library may take the form of a simple mixture of polypeptides or nucleic acids, or may be in the form of organisms or cells, for example bacteria, viruses, animal or plant cells and the like, transformed with a library of nucleic acids. Preferably, each individual organism or cell contains only one member of the library. Advantageously, the nucleic acids are incorporated into expression vectors, in order to allow expression of the polypeptides encoded by the nucleic acids. In a preferred aspect, therefore, a library may take the form of a population of host organisms, each organism containing one or more copies of an expression vector containing a single member of the library in nucleic acid form which can be expressed to produce its corresponding polypeptide member. Thus, the population of host organisms has the potential to encode a large repertoire of genetically diverse polypeptide variants.
As used herein, the term “immunoglobulin superfamily” refers to a family of polypeptides which retain the immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which contains two β sheets and, usually, a conserved disulfide bond. Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signaling (for example, receptor molecules, such as the PDGF receptor). The present invention is applicable to all immunoglobulin superfamily molecules, since variation therein is achieved in similar ways. Preferably, the present invention relates to immunoglobulins (antibodies).
As used herein, the term “main-chain conformation” refers to the Cα backbone trace of a structure in three dimensions. When individual hypervariable loops of antibodies or TCR molecules are considered, the main-chain conformation is synonymous with the canonical structure. As set forth in Chothia and Lesk, 1987, J. Mol. Biol., 196: 901 and Chothia et al., 1989, Nature, 342: 877, antibodies display a limited number of canonical structures for five of their six hypervariable loops (H1, H2, L1, L2 and L3), despite considerable side-chain diversity in the loops themselves. The precise canonical structure exhibited depends on the length of the loop and the identity of certain key residues involved in its packing. The sixth loop (H3) is much more diverse in both length and sequence and therefore only exhibits canonical structures for certain short loop lengths (Martin et al., 1996, J. Mol. Biol., 263: 800; Shirai et al., 1996, FEBS Letters, 399: 1). In the present invention, all six loops will preferably have canonical structures and hence the main-chain conformation for the entire antibody molecule will be known.
As used herein, the term “antibody” refers to an immunoglobulin that is produced by a B cell and which forms a central part of the host immune defense system in vertebrates. An “antibody polypeptide”, as used herein, is a polypeptide which either is an antibody or is a part of an antibody, modified or unmodified. Thus, the term antibody polypeptide includes a heavy chain, a light chain, a heavy chain-light chain dimer, a Fab fragment, a F(ab′)2 fragment, a Dab fragment, or an Fv fragment, including a single chain Fv (scFv). Methods for the construction of such antibody molecules are well known in the art.
As used herein, the term “superantigen” refers to an antigen, typically in the form of a toxin expressed in bacteria, which interacts with members of the immunoglobulin superfamily outside the conventional ligand binding sites for these molecules. Staphylococcal enterotoxins interact with T-cell receptors and have the effect of stimulating CD4+ T-cells. Superantigens for antibodies include the molecules Protein G that binds the IgG constant region (Bjorck and Kronvall, 1984, J. Immunol., 133: 969; Reis et al., 1984, J. Immunol., 132: 3091), Protein A that binds the IgG constant region and the VH domain (Forsgren and Sjoquist, 1966, J. Immunol., 97: 822) and Protein L that binds the VL domain (Bjorck, 1988, J. Immunol., 140: 1994).
As used herein, the term “detecting” refers to binding a labeled probe, comprising a generic and/or target ligand of the invention, to a member polypeptide of a repertoire, and then performing a step to determine the presence or absence of label, the former of which is indicative of the presence in a sample of the member polypeptide of interest.
As used herein, the term “immobilizing” refers to indirect binding of a member polypeptide of a repertoire of polypeptides to a solid or semi-solid support, via binding to the target and/or generic ligand of the invention, either of which specifically binds the polypeptide member and is, itself, linked to the support.
As used herein, the term “purifying” refers to the isolation of a polypeptide from a repertoire of polypeptides according to the invention away from polypeptides of unlike sequence. Such purifying may be accomplished, for example, via immunoprecipitation or affinity chromatography, using one or both of the target and generic ligands.
As used herein, the term “immunoprecipitating” refers to contacting a repertoire of polypeptide molecules with an antibody under conditions which permit the formation of specific antibody:antigen complexes and under which such complexes precipitate from solution, collecting the precipitate and resuspending the precipitated proteins in a buffer which permits dissociation of the antibody from the antigen. In the case of the present invention, immunoprecipitation may be performed using a generic and/or a target ligand which is an antibody.