The invention relates to a procedure for specific replacement of a copy of a gene present in the genome of a recipient eucaryotic organism by the integration of a gene different from the inactivated gene. Preferably, the recipient gene will be present in at least 2 copies in the transfected host cell. The recipient gene is defined as being the gene where the insertion of the different gene is made.
More particularly, the invention relates to the production of transgenic animals in which the foreign gene has been introduced in a targetted manner in order to make possible both the maintenance of the normal genetic functions of the animal and the expression of the foreign gene under the control of endogenous promoters.
By "different or foreign gene" is meant any nucleotide sequence corresponding to the totality or a part of a gene which is "foreign or different" from the recipient gene, such as is normally found in the genome (RNA or DNA); it may also correspond to an artificially modified sequence of the normal gene or to a fragment of this sequence.
The invention also relates to the process for the production of these transgenic animals.
In the production of transgenic animals, the conventional methods used for the introduction of heterologous DNA sequences into the germinal cell line do not make it possible to control the site of integration of the foreign gene into the genome nor the number of copies thus introduced. The integration of the foreign gene occurs at random and, usually, several copies of the gene are integrated at the same time, sometimes in the form of a head-to-tail tandem, the site of integration and the number of copies integrated varying from one transgenic animal to another.
Thus, it may happen that endogenous cellular genes, situated at the point of insertion, are thus inactivated without this being easily detectable on account of the many random insertions. If the product of these genes is important for the development of the animal, the latter will be seriously perturbed. Moreover, the random insertion of the foreign gene may occur at a site which is not suitable for the expression of the gene. In addition, the fact that there may be variation in the site and in the number of insertions from animal to animal makes the interpretation of the studies of expression extremely difficult.
A major problem encountered in the production of transgenic animals is the obtaining of the expression of the foreign gene. Generally speaking, two types of experiment have been made in mice.
The genes introduced into the germ line are:
either "complete" genes, comprising coding sequences flanked by their own regulatory sequences;
or composite genes, composed of the coding sequence of a gene fused to a promoter sequence of another gene, the two fragments even sometimes belonging to two different animal species.
Thus, it has been possible to confirm that the specificity of the expression of the genes in this or that tissue is determined by their regulatory sequence(s).
The choice of the suitable promoter for the expression of the foreign gene in the transgenic animal is thus of primordial importance.
Furthermore, the directed mutagenesis of mouse genes in embryonic stem cells has recently been carried out by resorting to a technique of "gene targetting" (Thomas et al., 1987; Thompson et al., 1989).
In the first case, the mouse HPRT gene was mutated by insertion and replacement and, in the second case, a mutated HPRT gene was corrected. Thompson et al. have extended their experiments to the production of chimeric mice and have observed the passage of the genetic modification in the germ cell line.
In each of the documents cited, the precise site of integration was targetted by homologous recombination between, on the one hand, exogenous sequences bearing the mutation or correction included in a vector under the control of an exogenous promoter and, on the other hand, their genomic homologue. This being so, it should be noted that the earlier authors carried out their experiments on a specific gene (HPRT), the mutation of which is accompanied by a detectable phenotype. The targetted mutation described by Thomas et al. had the effect of inactivating the HPRT gene and, consequently, of causing the normally detectable phenotype associated with HPRT to disappear. The selection gene NeoR, under the control of a TK promoter, was thus incorporated into the DNA to be inserted in order to make possible the selection of the transformants. It is to be noted that the experiments described in the prior art involved a selection by means of the recipient gene (e.g. HPRT) or by means of the inserted gene (e.g. NeoR). The site of the insertion and/or the type of gene inserted is thus limited to genes conferring a selectable character.
Furthermore, in the prior art, the exogenous sequences on the vector serve both to target the integration site and to introduce the modification. Subsequent to homologous recombination, the modified gene is always found in its normal genetic environment.
Let it be recalled that a problem which arises in the course of the production of transgenic animals is the danger of inactivating an endogenous cell gene which is located at the point of insertion of the foreign gene.
Depending on the function of the product of the inactivated gene, such an inactivation may lead to extensive morphological or physiological disorders in the transgenic animal, or may even prevent its survival.
On the other hand, the inactivation of a gene might be considered to be advantageous if the gene in question codes for a receptor of a virus or other infectious agent.
The inventors have studied the possibility of avoiding the disadvantages described above and associated, in some cases, with the possible inactivation of one or several endogenous cell genes with an important function in the course of the production of transgenic animals.
The object of the invention is a process for specific replacement, in particular by targetting of a DNA, called insertion DNA, constituted by a part of a gene capable of being made functional, or the function of which may be made more effective, when it is recombined with a complementing DNA in order thus to supply a complete recombinant gene in the genome of a eucaryotic cell, characterized in that:
the site of insertion is located in a selected gene, called the recipient gene, containing the complementing DNA and in that
eucaryotic cells are transfected with a vector containing an insert itself comprising the insertion DNA and two so-called "flanking" sequences on either side of the insertion DNA, respectively homologous to two genomic sequences which are adjacent to the desired insertion site in the recipient gene,
the insertion DNA being heterologous with respect to the recipient gene, and
the flanking sequences being selected from those which constitute the above-mentioned complementing DNA and which allow, as a result of homologous recombination with corresponding sequences in the recipient gene, the reconstitution of a complete recombinant gene in the genome of the eucaryotic cell.
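The outcome of the homologous recombination defined above can be sketched on toy sequences. The following is a minimal illustration only, assuming that a double crossover within the two flanking sequences can be modelled as a string replacement; the function name and all sequences are hypothetical and are not taken from the description.

```python
def targeted_replacement(genome: str, flank5: str, insertion: str, flank3: str) -> str:
    """Model a double crossover: the genomic segment lying between the two
    flanking homologies is replaced by the insertion DNA (toy model only)."""
    start5 = genome.find(flank5)
    if start5 == -1:
        raise ValueError("5' flanking sequence not found in the genome")
    end5 = start5 + len(flank5)
    # The 3' homology must lie downstream of the 5' homology.
    start3 = genome.find(flank3, end5)
    if start3 == -1:
        raise ValueError("3' flanking sequence not found downstream of the 5' flank")
    return genome[:end5] + insertion + genome[start3:]


# Hypothetical toy locus: outside / 5' homology / replaced segment / 3' homology / outside
genome = "TTTT" + "ACGTACGT" + "CCCCCC" + "GGATCCAA" + "TTTT"
recombinant = targeted_replacement(genome, "ACGTACGT", "ATGAAATAG", "GGATCCAA")
print(recombinant)
```

In this sketch the genomic sequences on either side of the insertion site are left untouched, so the insertion DNA ends up joined to the complementing DNA of the recipient gene, which is the feature the process relies on.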
The invention also relates to a procedure for the production of transgenic animals, characterized in that E.S. cells are transfected under the conditions described above and selected for the homologous recombination event, namely the correct integration of the foreign gene; the transfected cells are injected into embryos at a stage at which they are capable of integrating the transfected cells (for example at the blastocyst stage); the embryos are then reimplanted in a surrogate mother and the chimeric individuals obtained at the term of pregnancy are then mated. If the E.S. cells have colonized the germ line of the chimeric animal, transgenic animals heterozygous for the replaced gene will be obtained by mating (F1) in the progeny.
It is also possible to insert the gene, borne by the vector of the invention, into the egg shortly after (i.e. less than 24 hours) fertilization. In this manner, the insertion is effected while the egg is in the unicellular state.
The invention also relates to a plasmid capable of effecting the targetted insertion of a recombinant gene, called inserted gene, in the genome of a eucaryotic cell, characterized in that it contains an insert itself comprising the insertion gene and two so-called "flanking" sequences on either side of the insertion gene respectively homologous to the two genomic sequences which are adjacent to the desired insertion site in the recipient gene.
The invention also relates to transgenic animals in which at least one endogenous gene has been inactivated by the insertion of a gene which is different from the inactivated gene, the inserted gene being inserted in a position which makes possible the expression of this gene under the control of the regulatory sequences of the inactivated endogenous gene.
Hence, as a consequence of the phenomenon of homologous recombination, the process of the invention makes it possible to insert in a targetted manner foreign genes, in particular coding sequences lacking the promoter which is normally associated with them, into the genome of a eucaryotic organism at a site which allows their expression under the control of the endogenous promoter of the gene into which the insertion is made, and consequently, enables the targetted endogenous gene to be inactivated.
According to a preferred embodiment of the invention, the targetted recipient gene is a gene which is present in the genome in at least two copies. The utilization of the technique of electroporation (Ref. 11) ensures the introduction of one copy only of the foreign gene.
According to this variant of the invention, the targetted insertion of the gene of interest (i.e. the so-called insertion gene) has the effect of inactivating only that copy of the cellular endogenous gene at which the insertion is made and of leaving intact and functional the other copy or copies of this gene.
In this manner, the genetic functioning of the transgenic animal is not or is only slightly perturbed by the introduction of the foreign gene, even if the insertion inactivates a single copy of a recipient gene essential for the development of the animal. Thus either its development would not be affected by the insertion of the foreign gene, or the minor perturbations possible in the case of the inactivation of a critical gene would probably not be lethal for the animal. The effects of the insertion of the foreign gene in the homozygous state could be of any kind and would be observed in the 2nd generation (F2) after cross breeding of heterozygous individuals (F1) among themselves.
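The appearance of homozygous individuals in the F2 generation can be illustrated by a short enumeration. This is a sketch which assumes simple Mendelian segregation and equal viability of all genotypes, neither of which is asserted in the description; the allele symbols "+" (intact copy) and "m" (copy carrying the insertion) are hypothetical notation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Enumerate the offspring genotypes of a single-locus cross.

    Each parent is written as a two-character string, one character per allele.
    Assumes Mendelian segregation: every allele pairing is equally likely.
    """
    offspring = Counter("".join(sorted(a + b)) for a, b in product(parent_a, parent_b))
    total = sum(offspring.values())
    return {genotype: Fraction(count, total) for genotype, count in offspring.items()}


# F1 heterozygotes crossed among themselves give the F2 generation.
f2 = cross("+m", "+m")
for genotype, fraction in sorted(f2.items()):
    print(genotype, fraction)
```

Under these assumptions one quarter of the F2 animals would carry the insertion on both copies, which is the class in which the effects of the homozygous state would be looked for.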
If, on the contrary, the inactivation of all of the copies of a gene is desired, for example, in the case in which the gene codes for a receptor of an infectious agent, multiple copies of the foreign gene are introduced. The control of the quantity introduced may be ensured by having resort to known methods.
The targetted insertion of the foreign gene thus makes possible its introduction at a site at which its expression is under the control of the regulatory sequences of the endogenous gene where the insertion is made.
The process of the invention thus makes it possible to insert the foreign gene behind an endogenous promoter which has the desired functions (for example, specificity of expression in this or that tissue), and to do so, if necessary, without inactivating the other copies of the recipient gene.
According to a particularly preferred embodiment of the invention, the insertion DNA contains between the two flanking sequences, firstly a DNA sequence designed to be recombined with the complementing DNA in the recipient gene in order to provide a recombinant gene and, secondly, a sequence coding for a selective agent making possible the selection of the transformants and a promoter allowing the expression of the selective agent, the recipient gene and the recombinant gene coding for expression products which do not confer a selectable phenotype.
In this manner, the selection of the transformants is entirely independent of the nature of the recipient gene and of the inserted gene, in contrast to the procedures described hitherto in which the inserted gene or the recipient gene had, of necessity, to code for a product of expression making possible the selection of the transformants. The system developed by the inventors allows total flexibility with respect to the nature of the recipient gene and the inserted gene or the gene formed by homologous recombination. In a surprising manner, the inventors have observed that the insertion of sequences of considerable size (for example about 7.5 kb) does not affect the frequency of homologous recombination.
The effect that the insertion of the DNA sequence may have according to this aspect of the invention includes, for example, depending on the type of sequence inserted, the replacement of a coding sequence, the replacement of a regulatory sequence, the inactivation or reactivation of a gene by mutation or the improvement of the level of expression of a gene. It is possible, according to the invention, to replace a coding phase or a part of a coding phase by a heterologous sequence which commences at the initiation codon of the replaced gene in order that the expression of the inserted gene entirely replaces the expression of the replaced gene. This avoids the formation of fusion proteins which might be undesirable in a transgenic animal.
According to this embodiment of the invention, the inserted DNA may contain between the flanking sequences a heterologous coding sequence lacking a promoter, the coding sequence being other than a gene coding for a selection agent. The insertion DNA may contain in addition, downstream from the coding sequence and still between the flanking sequences, a gene coding for a selection agent, associated with a promoter making possible its expression in the target cell.
In this manner, the heterologous coding sequence may be inserted behind an endogenous promoter which has the desired properties, for example a certain specificity of expression, or range of transcription etc., the selectability of the transformed cells being entirely independent of the expression of the heterologous coding sequence. This type of construction makes it possible, for example, to select the transformants even though the gene replaced by the heterologous coding sequence is not normally expressed in the target cells. This is particularly important in the production of transgenic animals from embryonic stem cells since a considerable proportion of the genes remain inactive until a more advanced stage of development of the animal. The Hox-3.1 gene is an example of this type of gene. Furthermore, if the coding sequence codes for an easily detectable protein, for example β-Gal, the development of the transcription pattern of the replaced endogenous gene may be monitored. The vector pGN is an example of this type of construction.
In accordance with another embodiment of the invention, the inserted DNA may contain a foreign regulatory sequence. The insertion site and, consequently, the flanking sequences are selected as a function of the desired purpose, namely either the insertion of the foreign regulatory sequence in order to give a "double promoter" effect with the endogenous regulatory sequence, or the replacement of an endogenous promoter by the foreign promoter. The coding sequence which is situated under the control of the regulatory sequence may be endogenous.
Another possibility would be the targetted insertion of a foreign DNA which contains both a regulatory sequence and a coding sequence. It is possible that the regulatory sequence is that which is naturally associated with the coding sequence.
The procedure of the invention makes use of a vector containing two "flanking" sequences, one on either side of the foreign gene. These flanking sequences have at least 150 base pairs and are preferably shorter than the length of the recipient gene. It is essential that the two flanking sequences be homologous with the two genomic sequences which are adjacent to the desired insertion site. The flanking sequence of the vector which is situated upstream from the foreign gene to be introduced is normally homologous to the genomic sequence which is situated on the 5′ side of the insertion site. Similarly, the flanking sequence of the vector which is situated downstream from the foreign gene is normally homologous to the genomic sequence which is situated on the 3′ side of the insertion site.
It is possible to introduce "intercalating" sequences between one or other of the flanking sequences and the foreign gene, for example sequences making possible the selection of the transformants, markers, sequences making possible the cloning of the vector, etc.
The position of these intercalating sequences with respect to the foreign gene must, however, be selected so as not to prevent the expression of the foreign gene, in particular of the foreign coding DNA sequence under the control of the endogenous promoter or, inversely, the endogenous DNA coding sequence under the control of foreign regulatory elements supplied by the inserted sequence.
In spite of the presence of the flanking sequences, which promote homologous recombination, it is possible that a certain number of integrations occur at random. In order to verify that the targetted insertion has indeed occurred at the targetted site and not at another site, the technique of the "Polymerase Chain Reaction" (P.C.R.) (see Ref. 10) is used in order to amplify the DNA sequence of the locus at which the insertion should be made. In this manner, only the clones transformed following homologous recombination are selected.
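The logic of such a P.C.R. screen can be sketched in silico. The sketch below assumes one common primer design, which the text above does not specify: one primer anneals in the genomic sequence lying outside the flanking homology and the other anneals inside the inserted DNA, so that a product is expected only when the insert sits at the targetted locus. All sequences, primer choices and function names are hypothetical, and each string models a single contiguous locus.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pcr_product(template: str, fwd_primer: str, rev_primer: str) -> Optional[str]:
    """Return the amplified segment of the top strand, or None if either
    primer site is absent or the sites are not in a productive orientation."""
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand at its reverse complement.
    site = reverse_complement(rev_primer)
    end = template.find(site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical toy sequences.
outside = "GGCATTCAGT"      # genomic DNA beyond the 5' flanking homology
flank5 = "ACGTACGTAC"
endogenous = "CCCCCCCC"     # segment replaced by the insertion
insertion = "ATGACCATGATT"
flank3 = "GGATCCAAGG"

wild_type_locus = outside + flank5 + endogenous + flank3
targeted_locus = outside + flank5 + insertion + flank3
random_locus = "TTTTTTTT" + flank5 + insertion + flank3 + "TTTT"

fwd = outside                                # anneals outside the homology
rev = reverse_complement("ACCATGATT")        # anneals inside the insertion

for name, locus in [("wild type", wild_type_locus),
                    ("targeted", targeted_locus),
                    ("random", random_locus)]:
    product = pcr_product(locus, fwd, rev)
    print(name, "no product" if product is None else f"{len(product)} bp product")
```

With this design a randomly integrated copy carries the inserted DNA but lacks the adjacent genomic primer site, and the unmodified locus carries the genomic site but lacks the inserted DNA, so neither is amplified in the model.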
The flanking sequences of the vector are quite obviously selected as a function of the desired insertion site so that the homologous recombination may take place. Where appropriate, the flanking sequences may contain replica sequences of the endogenous promoter and/or modifications to the sequences which precede the initiation codon in order to improve the level of translation (sequences upstream) and replica sequences of the termination sequences, in particular poly-adenylation sites (sequences downstream).
The insertion gene may be any gene of interest. Mention should be made, as non-limiting examples, of the lac.Z gene (as in the model described below), the genes coding for interleukin or interferon, the genes for receptors, for example the retinoic acid, 3-beta adrenergic or H.I.V. receptor, and genes known to be associated with certain diseases, for example myopathy, etc.
In accordance with a preferred variant of the invention, the eucaryotic cells are embryonic stem cells (see Ref. 14 and 15).
In fact, a mutated E.S. cell may be injected into an immature embryo which, after reimplantation, will be born in a chimeric form. If the germ line is colonized by the mutated cell, the chimeric animal will transmit the mutation to its progeny. Subsequently, it will be possible to observe the effects of this mutation in the homozygous state in some individuals, on their development, their behaviour, their metabolism, their pathology, etc.
FIG. 1 shows the plasmid pGN.
FIGS. 2a and b show the molecules pGMA and pGMD, respectively, constructed from the plasmid pGN with respect to the Hox-3.1 gene. These plasmids are plasmids of mutagenesis. The two parts of the coding phase of the Hox-3.1 gene are represented on chromosome 15 by the black box "homeo". The corresponding sequences of Hox-3.1 were cloned in the plasmid pGN. (A: polyadenylation signal; Enh/Pro: enhancer-promoter). 07 and 08 illustrate the two oligonucleotides used in the PCR.
FIGS. 3 to 6 show the plasmids used in the construction of the pGN. FIG. 6 contains the following nucleotide sequences: SEQ ID NO: 14 CTGCAGGTCGACGGATCCGiGGGAATTCCC SEQ ID NO: 15 GGGGATCCCGTC SEQ ID NO: 16 AAATAATAATAACCGGGC SEQ ID NO: 17 AGGGGGGATCCGTCGACCTGCAG.
FIG. 7 illustrates the detection of homologous recombination with the Polymerase Chain Reaction (P.C.R) technique on transfected E.S. cells.
FIGS. 8(a) and (b) show Southern analyses of individual positive clones (L5 and F2) and E.S. cells (C.C.E.).
FIG. 8C depicts a restriction map of E.S. cells containing the mutated Hox-3.1 gene ("rec") in comparison with that containing the wild-type locus ("wt").
FIGS. 9A, 9B, and 9C depict chimeric embryos at 9.5 and 10.5 days p.c.
The procedure of the invention is of very wide industrial application and may vary according to the nature of the foreign gene introduced.
The genetics of mammals will be able to make considerable progress as a result of the recent possibility of mutagenizing specifically any gene, thus making it possible to better define its role. By means of this technology, which involves homologous recombination and E.S. cells, valuable information will be provided concerning oncogenes, growth factors, transcription factors, etc., genes which concern very topical subjects in fundamental or applied research. An important prospect for medical research is the possibility of reproducing a human disease whose genetic basis is known (certain human diseases with a defined pathology, such as Duchenne myopathy) in order to study its mechanisms better and to discover a treatment.
By applying the process of the invention, a gene known to be responsible for a certain disease is inserted in a targetted manner into the genome of a E.S. cell. The transgenic animal which is subsequently produced provides a useful model of this disease.
If necessary, and as described above, the normal genetic functions may be approximately maintained, in spite of the insertion of the foreign gene.
Another application of the process of the invention consists of inserting an insertion gene which is easily detectable, e.g. the lac.Z gene, and which can thus play the role of cell marker. In this manner, studies of lineage, e.g. in animals entered in competitions, are facilitated, and the pedigree may be monitored.
The insertion of the lac.Z gene as insertion gene also makes possible studies of the promoter. Owing to the possibility of detecting the β-galactosidase activity, the activity and specificity of various endogenous promoters may be studied by targetting different sites in the same or different types of cells. It will be possible to carry out the same studies on a whole organism, during development, or in the adult state by using the techniques of chimeric or transgenic animals.
The inventors have made the surprising observation that the frequency of homologous recombination is not affected by the insertion of fragments of large size, for example lac.Z. This observation suggested to the inventors that the technique of homologous recombination would be well adapted to the insertion of other heterologous genes which are of large size.
Owing to the possibility of being able to modify the genome of an animal, the process of the invention may also be used as "gene therapy". The most obvious uses would consist of inactivating the genes of receptors for infectious (viruses or bacteria) or toxic agents. If such mutagenesis were to prove lethal, it would be necessary to reestablish the lost function without reestablishing the sensitivity to the noxious agents. A modified gene coding for such a receptor could be reintroduced into the mutated cell provided that the modification could be brought about by homologous recombination. This modification of the genetic inheritance would confer on the animal an immunity against the disease under consideration.
This protocol may also be implemented in the context of auto-transplantation. Diseased or healthy cells taken from a patient could be treated and immunized, then reimplanted into the same individual.
The technique of the invention also lends itself to studies of the activity of pharmaceutical products presumed to have an activity towards the products of expression of a pathological gene associated with a disease. In this case, the inserted gene is constituted by the pathological gene and the pharmaceutical product is administered to the transgenic animal for the purpose of evaluating its activity on the disease.
The invention will be illustrated by making reference to the plasmid pGN and its use in the targetted insertion of a foreign gene (lac.Z, coding for the enzyme β-galactosidase of E. coli) into the genome of a mouse E.S. cell. The lac.Z gene was selected on account of the fact that its expression may be easily detected and is simply used for purposes of illustration.
The coding phase of the β-galactosidase enzyme of E. coli (lac.Z; 1-3057), fused with a genomic sequence (7292-3) of the mouse gene Hox-3.1 (Ref. 1), starts with the initiation codon for this gene. In fact, the sequence which precedes the initiation codon of Hox-3.1 is identical with the consensus sequence observed in vertebrates (Ref. 2), thus making possible an improved level of translation of β-galactosidase in the cells of vertebrates. The lac.Z gene is followed by a polyadenylation signal, for example that of the SV40 virus, like most eucaryotic genes, in order to stabilize the messenger RNAs.
The activity of the β-galactosidase of E. coli, which is functional in eucaryotic cells, may be detected in different ways. Cells expressing the lac.Z gene take on a blue colour, after fixation in the presence of X-Gal, which is a substrate for β-galactosidase (Ref. 3). A new substrate, FDG (fluorescein di-β-galactopyranoside), makes it possible to detect and determine the β-gal activity while keeping the cells alive (Ref. 4). The cells expressing lac.Z accumulate a fluorescent product and can be isolated with the aid of a cell sorter or FACS (fluorescence-activated cell sorter).
The transcription unit of the gene for resistance to neomycin is derived, in large part, from the plasmid pRSV neo (Ref. 5). The LTR (long terminal repeat) of the Rous sarcoma virus provides very powerful promoter and enhancer sequences in many eucaryotic cells (Ref. 6). From the bacterial transposon Tn5 are derived an active promoter in E. coli and the coding phase of the enzyme phosphotransferase (Ref. 7), which is followed by the polyadenylation signal of the SV40 virus. The same gene under the double control of the RSV and Tn5 promoters can confer resistance to neomycin or kanamycin on bacteria and resistance to G418 on eucaryotic cells.
As a result of the effect of a simple point mutation, the B unit of the enhancer sequences of the PyEC F9.1 strain of the polyoma virus became much more active in different types of cells, and in particular in embryo carcinoma (EC) cells (Ref. 8). Two copies of this enhancer Py F9.1 were inserted in tandem into the plasmid pGN, upstream from the LTR-RSV, and in the "late promoter" orientation of the regulatory region of polyoma.
In order to improve the level of translation of the phosphotransferase, the sequence preceding the initiation codon was modified during oligonucleotide mutagenesis. Thus the sequence T T C G C A U G became G C A C C A U G, corresponding much better to the consensus initiation sequence for translation in vertebrates (Ref. 2).
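The improvement in the initiation context can be checked position by position. The sketch below scores the five nucleotides preceding the AUG against the commonly cited vertebrate consensus gccRccAUGG (R = purine); taking this consensus as the one meant by Ref. 2 is an assumption, as is the simple match count used as a score.

```python
# Consensus for positions -5 to -1 relative to the A of AUG, taken from the
# commonly cited vertebrate context gccRccAUGG (assumption, see lead-in).
CONSENSUS_UPSTREAM = "CCRCC"
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "AG"}


def consensus_matches(site: str) -> int:
    """Count how many of the 5 bases preceding AUG agree with the consensus.

    `site` is an 8-nucleotide string ending in the initiation codon; T is
    treated as U so that DNA and RNA notation can be mixed.
    """
    site = site.upper().replace(" ", "").replace("T", "U")
    if len(site) != 8 or not site.endswith("AUG"):
        raise ValueError("expected 5 upstream nucleotides followed by AUG")
    upstream = site[:5]
    return sum(base in IUPAC[code] for base, code in zip(upstream, CONSENSUS_UPSTREAM))


original = "T T C G C A U G"
modified = "G C A C C A U G"
print("original:", consensus_matches(original), "of 5 positions match")
print("modified:", consensus_matches(modified), "of 5 positions match")
```

On this count the original sequence agrees with the consensus at one of the five positions and the modified sequence at four, including the purine at position -3.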
It was possible to evaluate the improvements introduced into the transcription unit of the gene for resistance to neomycin by transfecting embryonic stem cells (ES) of the mouse. At equal molarity of plasmid, a construction with the Py F9.1 enhancers produced 7.5× more G418-resistant clones than the pRSV neo and 2 to 3× more than the pMC1 Neo described by Capecchi et al. (Ref. 13). Again, the number of clones was increased 60×, that is 450× compared to the pRSV neo, by modifying the initiation sequence of translation. Homologous recombination may be a quite rare event, depending on the experimental conditions used (e.g. 1/1000 for HPRT, Ref. 13). A vector possessing a high efficacy of selection is thus very useful, all the more so since the conditions of electroporation mainly give rise to the integration of a single copy.
The pGN plasmid contains, in addition, a bacterial origin of replication of the colE1 type (pBR322), which makes cloning and preparation in E. coli possible.
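The fold increases quoted above combine multiplicatively, which can be verified with a few lines of arithmetic. The variable names are illustrative only, and the figure derived for pMC1 Neo relative to pRSV neo is an inference from the quoted ratios, not a value stated in the text.

```python
# Number of G418-resistant clones relative to pRSV neo (set to 1).
PRSV_NEO = 1.0
enhancer_fold = 7.5      # construction with the Py F9.1 enhancers vs pRSV neo
initiation_fold = 60.0   # further gain from the modified initiation sequence

with_enhancers = PRSV_NEO * enhancer_fold
with_both = with_enhancers * initiation_fold
print("enhancers + modified initiation vs pRSV neo:", with_both)

# The enhancer construction gives 2 to 3x more clones than pMC1 Neo, so the
# implied yield of pMC1 Neo relative to pRSV neo lies in this range (inference).
pmc1_low, pmc1_high = with_enhancers / 3, with_enhancers / 2
print("implied pMC1 Neo vs pRSV neo:", pmc1_low, "to", pmc1_high)
```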
Finally, a multiple cloning site (M.C.S.), synthesized in vitro, which only contains unique sites of cleavage in pGN, was inserted upstream from lac.Z., in order to facilitate the uses of this plasmid.
The "flanking" sequences which produce homologous recombination are added to the extremities of the pGN plasmid after linearization of the plasmid upstream from lac.Z through a site of the MCS (see FIG. 2). In this case, the flanking sequences selected are homologous with the chromosomal sequences derived from Hox-3.1 subsequently required to engage in homologous recombination.
FIG. 2 places the molecule constructed from the plasmid pGN with respect to the Hox-3.1 gene. In this case, recombination between the plasmid and chromosomal sequences of Hox-3.1 would result in an insertion at the start of the coding phase of this gene, hence in its total inactivation.
The pGN plasmid brings together several advantages for this methodology which is applicable to any gene. Since the event of homologous recombination may be quite rare (of the order of 1 for 1000 non-homologous integrations), it is necessary to be able to analyse a large number of clones whose resistance to G418 is sufficiently high as to be expressed in any part of the genome. The modifications introduced into the transcription unit of the phosphotransferase completely solve these problems. The method of mutagenesis by homologous recombination corresponds to inactivating a gene by an insertion or a substitution, but the plasmid pGN offers the additional advantage of being able to substitute the expression of β-galactosidase for that of the mutated gene. Finally, the MCS facilitates the clonings of genomic fragments.
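The need to analyse a large number of clones can be made concrete with a simple probability estimate. This sketch assumes that each resistant clone is, independently, a homologous recombinant with probability of about 1/1000 (an approximation of the "1 for 1000" order of magnitude given above); the function names and the 95% target are illustrative choices.

```python
import math


def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent clones is a homologous
    recombinant, each with probability p."""
    return 1.0 - (1.0 - p) ** n


def clones_needed(p: float, confidence: float) -> int:
    """Smallest number of clones giving at least `confidence` probability of
    recovering one or more homologous recombinants."""
    if not (0.0 < p < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


P_HOMOLOGOUS = 1 / 1000  # assumed frequency, order of magnitude only

n95 = clones_needed(P_HOMOLOGOUS, 0.95)
print("clones to screen for a 95% chance of at least one recombinant:", n95)
print("chance with 1000 clones:", round(prob_at_least_one(P_HOMOLOGOUS, 1000), 3))
```

Under these assumptions, screening as many clones as the inverse of the frequency still leaves a substantial chance of finding no recombinant, which is why a high yield of resistant clones per transfection matters.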