The object of the present invention is mycobacterial proteins and microorganisms producing them.
It also relates to the use of these proteins in vaccines or for the detection of tuberculosis.
Tuberculosis continues to be a public health problem throughout the world. The annual number of deaths directly related to tuberculosis is about 3 million and the number of new cases of tuberculosis is about 15 million. This number of deaths due to tuberculosis is high even for the developed countries; for example in France it is of the order of 1500 per year, a figure which is certainly underestimated by a factor of 2 or 3 if Roujeau's assessments of the differences between official figures and the results of systematic autopsies are taken into account. The recent increase in tuberculosis cases, or at least the leveling-off of the decrease in the frequency of this disease, must be considered in correlation with the development of the HIV/AIDS epidemic. In total, tuberculosis remains the leading infectious disease in terms of frequency in France and the developed countries, but above all in the developing countries for which it constitutes the principal source of human loss related to a single disease.
At present, a definite diagnosis made by the demonstration of cultivatable bacilli in a sample taken from the patient is only obtained in less than half the cases of tuberculosis. Even for pulmonary tuberculosis, which represents 80 to 90% of the tuberculosis cases, and which is the form of the disease for which the detection of the bacilli is the easiest, the examination of expectorations is only positive for less than half the cases.
More sensitive techniques such as PCR (amplification by polymerase chain reaction) still depend on obtaining a sample. Women and children do not normally spit, and samples from infants frequently require relatively specialized medical intervention (for example lymph node biopsy or sampling of the cerebrospinal fluid by lumbar puncture).
Moreover, the PCR reaction itself can be inhibited, so that a sample may be unusable by this technique because the origin of the inhibition cannot be controlled.
Finally, because of its limits of sensitivity (at best of the order of 10^4 to 10^5 bacilli in the sample), the classic bacteriological diagnosis, by microscopic examination and culture, requires that there has already been a relatively substantial development of bacilli and thus of the disease.
The detection of specific antibodies directed against Mycobacterium tuberculosis should thus be of assistance in the diagnosis of the common forms of the disease for which the detection of the bacilli themselves is difficult or impossible.
Successive generations of research workers have attempted to perfect a serological diagnostic technique for tuberculosis.
For a general review of studies carried out in this area, the application PCT WO-92/21758 may advantageously be referred to.
The techniques reported in the prior art are thus largely based on the preliminary isolation of proteins through their biochemical properties. It is not until after this isolation that the authors have tested the capacity of these proteins to detect those individuals affected by tuberculosis.
Application PCT WO-92/21758 describes a method for unambiguously selecting representative antigens of tubercular infection using sera originating from patients affected by tuberculosis or from guinea pigs immunized with live bacilli. This method, which is distinguished from the majority of the experiments described in the prior art, has led to the isolation of M. bovis proteins with molecular weights between 44.5 and 47.5 kD.
The seventeen amino acids of the N-terminal of one of these proteins were determined and are the following: (SEQ ID NO:5)
The article by ROMAIN et al. (1993, Infection and Immunity, 61, 742-750) recapitulates the substance of the results described in this international application. It more particularly describes a competitive ELISA assay using a polyclonal immune serum obtained by immunizing rabbits against the 45-47 kD protein complex described above.
In parallel, a gene library from Mycobacterium tuberculosis has been created by JACOBS et al. (1991, Methods Enzymol., 204, 537-557).
This library contains a large number of different clones.
A protein from another mycobacterial species, M. leprae, has moreover been identified by WIELES et al. (1994, Infection and Immunity, 62, 252-258). This protein, named 43 L, has a molecular weight, deduced from the nucleotide sequence, of about 25.5 kDa. Its N-terminus has 47% homology with that of the 45-47 kDa protein complex identified in Mycobacterium bovis BCG, whose 17 amino acid sequence is given above.
As stated above, there is a major interest in human medicine, as much from the therapeutic as the diagnostic point of view, in accurately identifying the proteins produced by the Mycobacteria and in particular by M. tuberculosis. 
The problem which remains unresolved is that of obtaining vaccines against a large number of diseases.
Another problem lies in the detection of diseases induced by the Mycobacteria, such as tuberculosis.
The applicant has thus pursued the determination of the sequence of a Mycobacterium tuberculosis protein, which is suspected of playing a major role in the immune response.
The applicant has demonstrated that the group of proteins corresponding to the 45-47 kD complex described above is coded by one and the same gene, and that the calculated molecular mass is different from the molecular mass estimated on polyacrylamide gel, because of its richness in proline.
The object of the present invention is thus a protein having at least a portion of one of the following sequences SEQ ID No 2 or SEQ ID No 3:
SEQ ID No 2:
Met His Gln Val Asp Pro Asn Leu Thr Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
SEQ ID No 3:
Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
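Comparing the two listings suggests that SEQ ID No 3 coincides with SEQ ID No 2 once an N-terminal stretch has been removed. The following Python sketch shows one way this could be checked. It works only on the opening residues of each listing, and the helper names `to_one_letter` and `find_offset` are hypothetical, introduced purely for illustration.

```python
# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a space-separated three-letter listing to a one-letter string."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


def find_offset(full, fragment):
    """Return the 0-based position of fragment in full, or -1 if absent."""
    return full.find(fragment)


# Excerpts only: the first 49 residues of SEQ ID No 2 and the first 10 of SEQ ID No 3.
SEQ2_START = (
    "Met His Gln Val Asp Pro Asn Leu Thr Arg Arg Lys Gly Arg Leu Ala Ala "
    "Leu Ala Ile Ala Ala Met Ala Ser Ala Ser Leu Val Thr Val Ala Val Pro "
    "Ala Thr Ala Asn Ala Asp Pro Glu Pro Ala Pro Pro Val Pro Thr"
)
SEQ3_START = "Asp Pro Glu Pro Ala Pro Pro Val Pro Thr"

offset = find_offset(to_one_letter(SEQ2_START), to_one_letter(SEQ3_START))
print(f"SEQ ID No 3 begins {offset} residues into SEQ ID No 2")
```

On these excerpts the start of SEQ ID No 3 is found 39 residues into SEQ ID No 2. That would be consistent with removal of an N-terminal signal peptide, although the text itself does not state the cleavage site.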
The invention also relates to hybrid proteins having at least a portion of the sequences SEQ ID No 2 or SEQ ID No 3 and a sequence of a peptide or a protein able to induce an immune response in man or in animals.
Advantageously, the antigenic determinant is such that it is able to induce a humoral and/or cellular response.
Such a determinant may be of a diverse nature and notably an antigenic protein fragment, advantageously a glycoprotein, utilized in order to obtain immunogenic compositions able to induce the synthesis of antibodies directed against multiple epitopes.
These hybrid molecules may also be constituted in part by a molecule carrying the sequences SEQ ID No 2 or SEQ ID No 3 combined with a portion, in particular an epitope, of diphtheria toxin, tetanus toxin, the HBs antigen of the hepatitis B virus (HBV), the VP1 antigen of the poliomyelitis virus or any other viral toxin or antigen.
The processes for synthesizing the hybrid molecules include the methods used in genetic engineering for producing hybrid DNA coding for the required protein or peptide sequences.
The present invention also includes proteins having secondary differences or limited variations in their amino acid sequences which do not functionally modify them by comparison with the proteins having the sequences SEQ ID No 2 and SEQ ID No 3, or with hybrid proteins containing at least a portion of these sequences.
It should be noted that the present invention has revealed a very large difference between the molecular weight calculated for the protein corresponding to the sequence SEQ ID No 3, which is 28779 Da, and that of the complex as evaluated by SDS gel, which is of the order of 45-47 kD. This difference is probably due to the high frequency (21.7%) of proline in the polypeptide chain.
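As an illustration of how a calculated mass and a proline frequency of this kind can be obtained from a listing, the following Python sketch sums average residue masses and counts proline residues. The residue masses are commonly tabulated average values and should be treated as approximate. The function names are hypothetical, and the example runs on a ten-residue excerpt of SEQ ID No 3 rather than the full sequence, so it does not reproduce the figures quoted above.

```python
# Average residue masses in Da (amino acid minus one water), commonly
# tabulated values; treat them as approximate.
RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
}
WATER_MASS = 18.0153  # one water is added back for the free chain termini


def proline_fraction(three_letter_seq):
    """Fraction of residues in a space-separated listing that are proline."""
    residues = three_letter_seq.split()
    return residues.count("Pro") / len(residues)


def average_mass(three_letter_seq):
    """Approximate average molecular mass (Da) of an unmodified polypeptide."""
    residues = three_letter_seq.split()
    return sum(RESIDUE_MASS[res] for res in residues) + WATER_MASS


# Excerpt only: the first ten residues of SEQ ID No 3.
EXCERPT = "Asp Pro Glu Pro Ala Pro Pro Val Pro Thr"

print(f"proline fraction: {proline_fraction(EXCERPT):.1%}")
print(f"average mass: {average_mass(EXCERPT):.1f} Da")
```

Applying the same two functions to the complete listing of SEQ ID No 3 would give values comparable to the calculated mass and proline frequency stated in the text, provided the listing is free of transcription errors.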
Other objects of the invention are oligonucleotides, RNA or DNA, coding for the proteins defined above. One such oligonucleotide advantageously has at least a portion of the following sequence SEQ ID No 1:
GT GCTCGGGCCC AACGGTGCGG GCAACTCCAC CGCCCTGCAT GTTATCGCGG GGCTGCTTCG CCCCCGACGC GGGCTTGGTA CGTTTGGGGG ACCGGGTGTT GACCGACACC GAGGCCGGGG TGAATGTGGC GACCCACGAC CGTCGACTGC GGCTGCTGTT GCAAGACCCG TTGTTGTTTC CACACCTGAG CGTGGCCAAA AACGTGGCCT TCGGACCACA ATGCCGTCGC GGGATGTTTG GGTCCGGGCG CGCGCTAGGA CAAGGGCGTC GGCACTGCGA TGGCTGCGCG AGGTGAACTC CGAGCAGTTC GCCGACCGTA AGCCTCGTCA GCTATCCGGG GGCCAAGCCC AGCGCGTCGC CATCGCGCGA GCGTTGGCGG CCGAACCGGA TGTGTTGCTG CTCGACGAGC CGCTGACCGG ACTCGATGTG GCCGCGGCCG CGGGTATCCG TTCGGTGTTG CGTAGTGTCG TCGCGAGGAG CGGTTGCGCG GTAGTCCTGA CGACCCATGA CCTGCTGGAC GTGTTCACGC TGGCCGACCG GGTATTGGTG CTCGAGTCCG GCACGATCGC CGAGATCGGC CCGGTTGCCG ATGTGCTTAC CGCACCTCGC AGTCGTTTCG GAGCCCGTAT CGCCGGAGTC AACCTGGTCA ATGCCACCAT TGGTCCGGAC GGCTCGCTGC GCACCCAGTC CGGCGCCCAC TGGTACGGCA CCCCGGTCCA GGATTTGCCT ACTGGGCATG AGGCAATCGC GGTGTTCCCG CCGACGGCGG TGGCGGTGTA TCCGGAACCG CCGCACGGAA GCCCGCGCAA TATCGTCGGG CTGACGGTGG CGGAGGTGGA TACCCGCGGA CCCACGGTCC TGGTGCGCCG GCATGATCAG CCTGGTGGCG CGCCTGGCCT TGCCGCATGC ATCACCGTCS ATGCCGCCAC CGAACTGCGT GTGGCGCCCG GATCGCGCGT GTGGTTCAGC GTCAAGGCGC AGGAAGTGGC CCTGCACCCG GCACCCCACC AACACGCCAG TTCATGAGCC GACCCGCGCC GTCCTTGCGT CGCGCCGTTA ACACGGTAGG TTCTTCGCCA TGCATCAGGT GGACCCCAAC TTGACACGTC GCAAGGGACG ATTGGCGGCA CTGGCTATCG CGGCGATGGC CAGCGCCAGC CTGGTGACCG TTCCGGTGCC CGCGACCGCC AACGCCGATC CGGAGCCAGC GCCCCCGGTA CCGCCTCGCC CCGCCTCGCC GCCGTCGACC GCTGCAGCGC CACCCGCACC GGCGACACCT GTTGCCCCCC CACCACCGGC CGCCGCCAAC ACGCCGAATG CCCAGCCGGG CGATCCCAAC GCAGCACCTC CGCCGGCCGA CCCGAACGCA CCGCCGCCAC CTGTCATTGC CCCAAACGCA CCCCAACCTG TCCGGATCGA CAACCCGGTT GGAGGATTCA GCTTCGCGCT GCCTGCTGGC TGGGTGGAGT CTGACGCCGC CCACTTCGAC TACGGTTCAG CACTCCTCAG CAAAACCACC GGGGACCCGC CATTTCCCGG ACAGCCGCCG CCGGTGGCCA ATGACACCCG TATCGTGCTC GGCCGGCTAG ACCAAAAGCT TTACCCCACC GCCGAAGCCA CCGACTCCAA CCCCGCGGCC CGGTTGGGCT CGGACATGGG TGAGTTCTAT ATGCCCTACC CGGGCACCCG GATCAACCAG GAAACCGTCT CGCTCGACCC CAACCCGGTC TCTGGAAGCC CGTCGTATTA CGAAGTCAAG TTCAGCGATC CGAGTAAGCC GAACGGCCAG ATCTGGACGG GCGTAATCGG 
CTCGCCCGCG GCGAACGCAC CGGACGCCGG GCCCCCTCAG CGCTCGTTTG TGGTATGGCT CGGGACCGCC AACAACCCGG TGGACAAGGG CGCGGCCAAG GCGCTGGCCG AATCGATCCG GCCTTTGGTC GCCCCGCCGC CGCGGCCAAG GCGCTGGCCG AATCGATCCG GCCTTTGGTC GCCCCGCCGC GGGGAAGTCG CTCCTACCCC GACGACACCG ACACCGCAGC GGACCTTACC GGCCTGACC
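The correspondence between the nucleotide listing and the protein listing can be illustrated with a short translation sketch in Python. It assumes the standard genetic code and uses a hypothetical function name, `translate`. The 30-base fragment is taken from the SEQ ID No 1 listing, starting at the ATG that appears to open the reading frame of SEQ ID No 2.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1 into one-letter amino acids, stopping at a stop codon.

    Spaces are ignored; codons containing non-ACGT symbols are reported as 'X'.
    """
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Fragment of SEQ ID No 1, regrouped to start at the presumed initiation codon.
FRAGMENT = "ATGCATCAGG TGGACCCCAA CTTGACACGT"

print(translate(FRAGMENT))
```

The fragment translates to MHQVDPNLTR, that is Met His Gln Val Asp Pro Asn Leu Thr Arg, which matches the first ten residues of SEQ ID No 2.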
The present invention also relates to a microorganism producing one of the proteins described above, and in particular to a microorganism secreting such a protein.
The microorganism is preferentially a bacterium such as Mycobacterium bovis BCG. These bacteria are already used in man in order to obtain an immunity against tuberculosis.
The production of hybrid proteins according to the present invention in M. bovis BCG has specific advantages. M. bovis BCG is a strain widely used for vaccination purposes and which is accepted as being innocuous to man. After injection into the human body it develops slowly over 15 days to 1 month, which leads to excellent presentation of the antigen against which a response is desired from the organism.
By contrast, Mycobacterium leprae, which is the agent of leprosy in man, is poorly understood. This bacterium has not so far been successfully grown on a culture medium and has a very long growth period by comparison with M. bovis.
Its potential pathogenicity is moreover an obvious argument for not using it for vaccination purposes.
Proteins with the sequences SEQ ID No 2 or SEQ ID No 3 have the advantage of being recognized by the antibodies present in tuberculosis patients and thus constitute a priori highly immunogenic antigens.
The proteins originate from M. tuberculosis, which is a species very close to M. bovis, these two bacteria being responsible for tuberculosis in man and cattle respectively.
The proteins originating from M. tuberculosis are thus able to be expressed in M. bovis and to be excreted in the culture medium by cells possessing a signal peptide.
Since M. bovis has the advantages listed above for vaccination in man and since in addition the proteins corresponding to the SEQ ID No 2 and SEQ ID No 3 sequences induce a strong immune response in man, it is especially advantageous to produce hybrid proteins in M. bovis which carry a portion of the proteins originating from M. tuberculosis. 
It is well known that the pathogenic microbial antigens against which a vaccination is being sought can only induce a very weak response in man unless they are presented in a specific manner.
The present invention resolves this problem in two ways:
on the one hand, by presenting the hybrid protein on the surface of M. bovis BCG and/or having it excreted by the bacteria;
and on the other, by combining an antigenic determinant known to induce a strong immune response, i.e. the antigenic determinant of one of the proteins with SEQ ID No 2 or SEQ ID No 3, with an antigenic determinant that induces only a weak response when injected alone.
Combining the antigenic determinant of one of the proteins SEQ ID No 2 or SEQ ID No 3 with a second antigenic determinant allows an amplification of the immune response against that second determinant of the hybrid protein. This phenomenon can perhaps be compared to the hapten-carrier effect.
It is clear that such an operation cannot be envisaged with a protein originating from M. leprae, such as that described in the article by Wieles et al. (1994, cited above). On the one hand, because of the much larger difference between M. tuberculosis and M. leprae, such a protein might not be properly expressed; on the other, the immune response induced by this M. leprae protein is less well known. In addition, the introduction of a protein from a pathogenic species for vaccination purposes constitutes a potential risk to human health which the pharmaceutical industry is reluctant to accept.
All these arguments contribute to a distinction between the protein sequences SEQ ID No 2 and SEQ ID No 3 and the M. leprae protein described by Wieles et al. (1994, cited above), despite their apparent sequence homologies (see later in FIG. 17).
The present invention also relates to vaccines or drugs containing at least one protein or microorganism such as those previously defined.
Vaccines containing nongrafted proteins may be used to immunize individuals against tuberculosis. Grafted proteins carrying an epitope originating from a biological agent other than M. bovis may be used for immunization against other diseases.
As an indication, 1 to 500 µg of protein per dose for an individual, or 10^3 to 10^7 recombinant bacteria per individual, may be used intradermally.
Another object of the present invention is a pharmaceutical composition containing at least a pharmaceutically effective quantity of a protein or a microorganism such as previously described in combination with pharmaceutically compatible diluents or adjuvants.
Another object of the present invention is a process for detecting the specific tuberculosis antibodies, in which a biological fluid, in which the presence of said antibodies is sought, is brought into contact with a protein such as that described above.
Advantageously, said protein is fixed on a support.
Such detection could in particular be implemented by the Western Blot (immuno-imprint) method, by an enzyme immunoassay method (ELISA) or a radioimmunoassay method (RIA), by use of an assay kit, containing the proteins as well as in particular buffer solutions allowing the immunological reaction to be carried out and if necessary substances allowing the antibody-antigen complex formed to be revealed.