The IgG isotype is the most abundant immunoglobulin found in sera. In all mammals, it is composed of two identical heavy (H) chains and two identical light (L) chains. Immunoglobulins harbouring this structure are therefore designated four-chain immunoglobulins. The H-chain of a 4-chain immunoglobulin contains four domains and a hinge region between the second and third domains. The L-chain has two domains. The N-terminal domains of both the L- and H-chains are more variable in sequence than the remaining domains and are known as V-domains (VL and VH respectively). Three loops within the VH and three loops within the VL are juxtaposed in the paired VH-VL domains and constitute the antigen-binding site. The loops are hypervariable in sequence and are named CDRs, for Complementarity Determining Regions. A description of the general structure of a 4-chain immunoglobulin is provided in “Immunology”, Roitt I. et al., Ed. MEDSI/McGRAW-HILL.
Much of the antigen-binding diversity of antibodies, and their success in generating a tight binder against virtually any foreign substance, comes from the random pairing of one out of thousands of possible VHs with one out of thousands of possible VLs. The second domain of the L-chain, which has a more conserved sequence and is denoted CL, associates with the second domain of the H-chain (CH1), which also has a conserved sequence.
A pathological disorder in humans, known as heavy chain disease, is characterised by the presence in the serum of antibodies that do not contain L-chains. Moreover, these antibodies also lack important parts of their VH and CH1, although the missing VH and CH1 regions can vary widely among different HCAbs (Heavy Chain Antibodies). The deletions in the H-chain are due to deletions of the rearranged H-chain involving part of the VH and the CH1 domain. These antibodies no longer recognise antigen, since the VL is absent and large parts of the VH are absent too. In a conventional antibody, chaperone proteins (such as BIP) that associate with the CH1 retain the H-chain in the endoplasmic reticulum until BIP is replaced by the L-chain. In the absence of the CH1 polypeptide domain, BIP can no longer retain the truncated H-chain in the endoplasmic reticulum, and the L-chain cannot bind either, so that the H-chains are immediately secreted from the B-cells as homodimers.
Similar non-functional HCAbs were also reported to emerge in mouse monoclonal cell lines.
In sera of Camelidae (camels, dromedaries and llamas) we found the 4-chain immunoglobulins and, in addition, large amounts of functional HCAbs. These functional HCAbs have been described in European Patent Application No. 0656946 and in various publications including Hamers-Casterman et al. (1993), Vu et al. (1997) and Muyldermans et al. (2001). They are distinct in several respects from the human/mouse HCAbs that arise from a pathological state. Firstly, they are functional in antigen binding. In this respect the HCAbs found in Camelidae are normal, functional immunoglobulins. Secondly, in Camelidae, the entire CH1 domain is missing, and the V domain is intact but has a sequence that deviates at a few sites from normal VH sequences. Said functional HCAbs occur as homodimeric molecules.
The CH1 is, however, encoded in the germline of all γ-genes in dromedaries (and llamas) and is removed from the mRNA coding for the functional HCAbs by splicing the 3′ end of the V-exon to the 5′ end of the hinge exon. Thus, the CH1 is part of the intron and is no longer recognised as an exon because of a single point mutation in the consensus splicing signal sequence. Llama and dromedary carry the same point mutation at the former CH1 exon, and this finding indicates that these γ-genes emerged before llamas and camels diverged from each other. This splicing of the mRNA is not alternative splicing, as all mRNA is spliced according to this scheme. Hence these γ-genes will always lead to an H-chain with its CH1 removed. Other γ-genes are used to produce the common H-chain with a CH1 domain.
The V-domain of the H-chain of functional HCAb (referred to as VHH, for Variable domain of the H-chain of a normal, i.e. immunologically-active HCAb) is expected to acquire adaptations versus the VH (i.e. V-domain of H-chain of conventional four-chain antibody) in the regions that are no longer contacting the VL (or the CH1) domain and in those participating in antigen binding (i.e. the paratope).
For instance, Chothia et al. (1985) have indicated in the above-referenced publication that crystallographic data revealed that the conserved Val 37, Gly 44, Leu 45 and Trp 47 are clustered in space in a conventional 4-chain IgG and make important hydrophobic contacts with the VL. They added that the VH amino acids Gln 39, Tyr 91, Trp 103 and Glu 105 are also recognised as important for VL association. Desmyter et al. (1996) further observed that the surface of the VHH domain present in Camelidae, which corresponds to the side of the VH of conventional IgG that interacts with a VL, is significantly reshaped. In the present invention, the numbering of the amino acid residues is given by reference to the Kabat numbering (Kabat E, 1991), used in accordance with the Kabat database available, for example, at the website of Dr. A.C.R. Martin's Bioinformatics Group at University College London, bioinf.org.uk/abs.
The most frequently occurring amino acid residues at twelve VH locations known to interact with VL have been determined for 332 vertebrate VH segments. For the purpose of the present invention, the protein domain of the variable heavy polypeptide chain is referred to as “VH”, and the corresponding DNA is designated VH-D-J, as it is assembled from a VH germline segment, a diversity (D) minigene and a J minigene. In fact, the CDR3 and FR4 are not encoded by the VH germline; they are provided by the D and J minigenes that are recombined with the VH (or VHH) germline.
For comparison, the amino acid consensus has been deduced for 42 dromedary germline VHH sequences at the corresponding locations. The preferred amino acid residues at four positions (39, 43, 60 and 91, Kabat numbering) are invariable in VH and VHH. In contrast, at four other sites (33, 35, 50 and 58) neither VH nor VHH sequences reveal a pronounced amino acid preference. At the latter VH sites, the possible contact with the VL depends on the actual angle between the VH and VL domains, and this explains the observed amino acid degeneracy. The only crucial differences between VH and VHH proteins in this area concern positions 37, 44, 45 and 47. These residues are highly conserved among VH sequences (i.e. Val37, Gly44, Leu45 and Trp47), but in the VHH the inventors most frequently observed Phe37 (or Tyr), Glu44, Arg45 (or Cys) and Gly47 (or Leu). These comparisons substantiate previous identifications of camel VHH-specific “hallmark” residues that arise in response to the absence of the L-chain.
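The comparison at positions 37, 44, 45 and 47 can be expressed as a simple lookup. The Python sketch below is illustrative only and is not part of the disclosed method: the names `count_hallmarks` and `classify` and the majority rule are assumptions introduced here, while the residue sets are those stated in the preceding paragraph (one-letter amino acid codes).

```python
# Residues stated in the text for Kabat positions 37, 44, 45 and 47.
VH_CONSENSUS = {37: {"V"}, 44: {"G"}, 45: {"L"}, 47: {"W"}}
VHH_HALLMARKS = {37: {"F", "Y"}, 44: {"E"}, 45: {"R", "C"}, 47: {"G", "L"}}


def count_hallmarks(residues):
    """Return (vh_matches, vhh_matches) for a {Kabat position: residue} mapping.

    Positions missing from `residues` count as a match for neither set.
    """
    vh = sum(residues.get(pos) in allowed for pos, allowed in VH_CONSENSUS.items())
    vhh = sum(residues.get(pos) in allowed for pos, allowed in VHH_HALLMARKS.items())
    return vh, vhh


def classify(residues):
    """Hypothetical majority rule: label by whichever set matches more positions."""
    vh, vhh = count_hallmarks(residues)
    if vhh > vh:
        return "VHH-like"
    if vh > vhh:
        return "VH-like"
    return "ambiguous"


if __name__ == "__main__":
    print(classify({37: "V", 44: "G", 45: "L", 47: "W"}))  # conventional VH pattern
    print(classify({37: "Y", 44: "E", 45: "R", 47: "L"}))  # VHH hallmark pattern
```

A real sequence would first have to be numbered according to Kabat before the four positions can be read off; that step is outside the scope of this sketch.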
From the results published by Nguyen et al. (2000), it is apparent that VHH and VH genes are imprinted in the dromedary genome. The VH and VHH genes most likely reside in the same locus. It was noticed that the VHH germline genes use the same D and J genes as the VH germline genes of the H-chain of conventional 4-chain antibodies. By PCR, around 50 VH and around 40 VHH germline genes were identified in dromedary. Each PCR fragment contains a leader signal exon and a V-exon that ends where the CDR3 should start. The CDR3 and FR4 are provided by the recombined D-J segments. The VH germline segments harbour codons for Val37, Gly44, Leu45 and Trp47, whereas the VHH germline minigenes possess Phe37 (11×), Tyr37 (30×) or, in one single case, Val37; Glu44 or Gln44 (8×); Arg45 (37×) or Cys45 (5×); and Gly47 (6×), Leu47 (24×), Trp47 (8×) or Phe47 (3×). In addition, these VHH germline genes always (except one) contain a Cys codon at position 45 or in the CDR1 region (codon 30, 32 or 33). Based on the length of the CDR2 (16 or 17 amino acids) and the location of the extra Cys, the VHH germline segments were grouped in subfamilies. Some subfamilies have several members while others are much scarcer in the genome. However, it should be noted that the frequency of occurrence of these VHH germline genes in expressed HCAbs is not at all related to their frequency of occurrence in the genome. The Cys at position 45 or around the CDR1 is normally maintained in the rearranged VHH-D-J segments, and these rearrangement products have also acquired an extra Cys in the CDR3. Conversely, VHH-D-J rearrangements that were unable to generate an extra Cys in their CDR3 will apparently knock out the Cys45 or the Cys in the CDR1 region, probably by somatic hypermutation or by B-cell receptor editing. B-cell receptor editing is a mechanism by which an upstream unrearranged V-segment is recombined into an existing V-D-J recombination product that was most likely not functional or that recognised a self antigen.
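The codon counts quoted above for the VHH germline segments can be turned into per-position frequencies. The following Python sketch is an illustration only: the names `GERMLINE_COUNTS`, `residue_fractions` and `most_common` are introduced here, and position 44 is left out because the text does not give a separate count for each residue at that position.

```python
# Codon counts per Kabat position, copied from the paragraph above.
# Position 44 is omitted: the text does not give a count for each residue there.
GERMLINE_COUNTS = {
    37: {"Phe": 11, "Tyr": 30, "Val": 1},
    45: {"Arg": 37, "Cys": 5},
    47: {"Gly": 6, "Leu": 24, "Trp": 8, "Phe": 3},
}


def residue_fractions(position):
    """Return {residue: fraction of the counted germline segments} for one position."""
    counts = GERMLINE_COUNTS[position]
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def most_common(position):
    """Return the residue with the highest stated count at a position."""
    counts = GERMLINE_COUNTS[position]
    return max(counts, key=counts.get)


if __name__ == "__main__":
    for pos in sorted(GERMLINE_COUNTS):
        fractions = residue_fractions(pos)
        summary = ", ".join(f"{aa} {frac:.0%}" for aa, frac in fractions.items())
        print(f"position {pos}: most common {most_common(pos)} ({summary})")
```

Fractions are normalised per position because the stated counts do not sum to the same total at every position.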
For dromedary, the VHH domains also carry a longer CDR3 than the VH domains (average length 17-18 versus 9). Three possibilities can be envisaged to generate a longer CDR3. The VHH may use two or more D minigenes; however, this is unlikely in view of the necessity to recombine two minigenes with a different recombination signal sequence (the 12-23 spacer rule). Alternatively, a more active terminal deoxynucleotidyl transferase during the D-J or V-D-J recombination might add several non-template-encoded nucleotides. Finally, it cannot be excluded that the length difference is only due to selection, in which VHH domains with a long CDR3, or VH domains with a short CDR3, are much more likely to become functional in interacting with antigen. A combination of the two latter explanations might also be relevant.
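The argument against the use of two D minigenes rests on the 12-23 spacer rule, which can be sketched in a few lines of Python. The spacer assignment in `HEAVY_CHAIN_RSS` (23 for V, 12 on both sides of D, 23 for J) is the arrangement known from human and mouse H-chain loci and is only assumed here to apply to dromedary; the function names are introduced for illustration.

```python
# Assumed spacer lengths of the recombination signal sequences (RSS),
# as in human and mouse H-chain loci; not confirmed here for dromedary.
HEAVY_CHAIN_RSS = {"V": 23, "D": 12, "J": 23}


def can_recombine(spacer_a, spacer_b):
    """12-23 rule: joining requires one 12-spacer RSS and one 23-spacer RSS."""
    return {spacer_a, spacer_b} == {12, 23}


def joining_allowed(segment_a, segment_b):
    """Apply the 12-23 rule to two segment types ('V', 'D' or 'J')."""
    return can_recombine(HEAVY_CHAIN_RSS[segment_a], HEAVY_CHAIN_RSS[segment_b])


if __name__ == "__main__":
    for pair in [("D", "J"), ("V", "D"), ("D", "D"), ("V", "J")]:
        print(pair, joining_allowed(*pair))
```

Under this assumption a direct D-to-D joining pairs two 12-spacer signals and is therefore disfavoured, which is why the first explanation is considered unlikely.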
It has been proposed repeatedly that the presence of the VHH hallmarks at positions 37, 44, 45 or 47, or the substitution of the VH residues by the VHH hallmarks, can lead to the formation of a soluble single-domain antibody fragment. Of these, the amino acid at position 45 was considered crucial, as the substitution of Leu45 of a human VH domain by Arg45 rendered the isolated domain more soluble. This camelised human VH adopts a properly folded immunoglobulin structure (Riechman, 1996. Rearrangement of the former VL interface in the solution structure of a camelised, single domain VH antibody).
However, the work of Chothia et al. (1985) revealed that the VH amino acids at positions 35, 37, 39, 44, 45, 47, 91 and 93 (encoded by the VH gene segment), 95, 100 and 101 (part of the CDR3), and 103 and 105 (encoded by the J gene segment) are the key participants in the VL interaction. Of these, amino acids 37, 45 and 47 differ largely between VH and VHH. Position 103 is occupied by a conserved Trp that is well buried in the VH-VL complex and provides the largest contact surface area with the VL after Leu45 and Trp47 (FIG. 2 in Chothia et al.). As this Trp103 is encoded by the J gene, and as the J gene is used in common in the VH-D-J and VHH-D-J recombination, it is logical to expect Trp at position 103 in VHHs as well. Since the VH-VL association is mediated by hydrophobic interactions, it is also clear that the substitution of the large aromatic and hydrophobic Trp103 residue by the charged and hydrophilic Arg will prevent the association with a VL, and with the surrogate light chain as well. WO92/01787 claims a single chain variable domain, being a synthetic variable immunoglobulin heavy chain domain, in which one or more of the amino acid residues at position 37, 39, 45, 47, 91, 93 or 103 is altered, whereby the tryptophan at position 103 is changed into glutamate, tyrosine or threonine. However, there is no indication that a substitution of tryptophan at 103 alone by arginine, glycine, lysine, proline or serine would be sufficient to obtain a functional heavy chain antibody, nor that this mutation could compensate for the absence of a charged amino acid or a cysteine at position 45, nor that said mutation may result in an increased solubility of a single domain heavy chain antibody fragment.
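The reasoning about position 103 depends on which gene segment encodes each VL-contacting residue. The short Python sketch below only restates the grouping given above as a lookup table; the names `INTERFACE_POSITIONS`, `VH_VHH_DIFFERENCES` and `encoding_source` are introduced for illustration.

```python
# Kabat positions taking part in the VL interaction, grouped by the
# element that provides them, as listed in the paragraph above.
INTERFACE_POSITIONS = {
    "VH gene segment": (35, 37, 39, 44, 45, 47, 91, 93),
    "CDR3": (95, 100, 101),
    "J gene segment": (103, 105),
}

# Positions stated to differ largely between VH and VHH.
VH_VHH_DIFFERENCES = (37, 45, 47)


def encoding_source(position):
    """Return the element providing an interface position, or None if not listed."""
    for source, positions in INTERFACE_POSITIONS.items():
        if position in positions:
            return source
    return None


if __name__ == "__main__":
    for pos in VH_VHH_DIFFERENCES + (103,):
        print(pos, encoding_source(pos))
```

The lookup shows why a Trp would be expected at position 103 in a VHH: the positions that differ between VH and VHH all lie in the V gene segment, whereas position 103 is provided by the J gene segment used by both.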
It is known in the art that the production of antibodies, for example by bacterial overexpression techniques or by phage display libraries, is technically difficult because the antibodies or fragments thereof are poorly expressed, insoluble or misfolded. It is also known that the screening of antibody libraries is restricted to those antibodies which are soluble, thereby excluding a large fraction of antibodies with potentially active antigen-binding regions. Thus binders which might be therapeutically useful would be precluded from screening. Researchers involved in discovering new therapeutic agents therefore need a method for producing functional antibodies and fragments thereof, antibody libraries comprising functional antibodies, and methods to functionalise antibodies.