The Mycobacterium genus includes major human pathogens such as M. leprae and M. tuberculosis, the agents responsible for leprosy and tuberculosis, which remain serious public health problems world-wide.
M. bovis and M. tuberculosis, the causative agents of tuberculosis, are facultative intracellular bacteria. Despite the major health problems linked to these pathogenic organisms, little is known about their exported and/or secreted proteins. SDS-PAGE analyses of M. tuberculosis culture filtrate show at least 30 secreted proteins (1, 19, 38). Some of them have been characterized, and their genes cloned and sequenced (7, 35, 37). Others, although they are immunodominant antigens of major importance for inducing protective immunity (2, 21), have not been completely identified. In addition, it is probable that a great number of exported proteins remain attached to the cell membrane and, consequently, are not present in culture supernatants. It has been shown that proteins located at the outer surface of various pathogenic bacteria, such as the 103 kDa Yersinia pseudotuberculosis invasin (14) or the 80 kDa Listeria monocytogenes internalin (10), play an important role in interactions with the host cells and, consequently, in pathogenicity as well as in the induction of protective responses. Thus, a membrane-bound protein could be important for M. tuberculosis infection as well as for the induction of a protective response against this infection. These proteins could certainly be of interest for the preparation of vaccines.
The BCG (Bacille Calmette-Guérin), an avirulent strain derived from M. bovis, has been widely used as a vaccine against tuberculosis. It is also a very important vector for the construction of live recombinant vaccines, particularly because of its high immunogenicity. Consequently, the study of the molecular biology of mycobacteria is currently of great interest.
The development of new vaccines against pathogenic mycobacteria, or the improvement of available vaccines, requires the development of specific tools which make it possible to isolate or obtain immunogenic polypeptide sequences.
The inventors have defined and produced, for this purpose, new vectors allowing the screening of mycobacteria DNA sequences in order to identify, among these sequences, nucleic acids encoding proteins of interest.
Vectors have been defined for evaluating the efficacy of sequences for regulation of expression in mycobacteria.
The invention also relates to new mycobacteria polypeptides which may have been isolated by means of the preceding vectors and capable of entering into the production of compositions for the detection of a mycobacteria infection, or for protection against an infection due to mycobacteria.
The subject of the invention is therefore a recombinant screening and/or cloning and/or expression vector, characterized in that it replicates in mycobacteria, in that it contains
1) a replicon which is functional in mycobacteria;
2) a selectable marker;
3) a reporter cassette comprising
a) a multiple cloning site (polylinker),
b) a transcription terminator which is active in mycobacteria, upstream of the polylinker, and
c) a coding nucleotide sequence derived from a gene encoding a marker for expression and/or export and/or secretion of protein, said nucleotide sequence lacking its initiation codon and its regulatory sequences.
The marker for export and/or secretion is a nucleotide sequence whose expression followed by export and/or secretion depends on regulatory elements which control its expression.
“Sequences or elements for regulation of expression” is understood to mean a promoter sequence for transcription, a sequence comprising the ribosome-binding site (RBS), the sequences responsible for export and/or secretion such as the sequence termed signal sequence.
A first advantageous marker for export and/or expression is a coding sequence derived from the PhoA gene. Where appropriate, it is truncated such that the alkaline phosphatase activity is, nevertheless, capable of being restored when the truncated coding sequence is placed under the control of a promoter and of appropriate regulatory elements.
Other markers for exposure and/or export and/or secretion may be used. There may be mentioned by way of examples a sequence of the gene for β-agarase or for nuclease of a staphylococcus or for β-lactamase of a mycobacterium.
The transcription terminator should be functional in mycobacteria. An advantageous terminator is, in this regard, the T4 coliphage terminator (tT4). Other terminators appropriate for carrying out the invention may be isolated using the technique presented in the examples, for example by means of the vector pJN3.
A vector which is particularly preferred for carrying out the invention is the plasmid pJEM11 deposited at CNCM (Collection Nationale de Cultures de Microorganismes in Paris, France) under the No. I-1375, on Nov. 3, 1993.
For the selection or identification of mycobacteria nucleic acid sequences encoding products capable of being incorporated into immunogenic or antigenic compositions for the detection of a mycobacteria infection, the vector of the invention will comprise, in one of the polylinker sites, a nucleotide sequence from a mycobacterium in which the presence of regulatory sequences is being sought, which are associated with all or part of a gene of interest making it possible, when the vector carrying these sequences (recombinant vector) is integrated or replicates in a mycobacterium-type cellular host, to obtain the exposure at the level of the cell wall or membrane of the host, and/or export and/or secretion of the product of expression of the abovementioned nucleotide sequence.
The mycobacteria sequence in question may be any sequence for which attempts are made to detect if it contains elements for regulation of expression associated with all or part of a gene of interest and capable of allowing or promoting exposure at the level of the cell membrane of a host in which it might be expressed, and/or export and/or secretion of a product of expression of a given coding sequence and, by way of test, of the marker for export and/or secretion.
Preferably, this sequence is obtained by enzymatic digestion of the genomic DNA or of the DNA complementary to an RNA of a mycobacterium and preferably of a pathogenic mycobacterium.
According to a first embodiment of the invention, the enzymatic digestion is carried out on genomic DNA or complementary DNA from M. tuberculosis.
Preferably, this DNA is digested with an enzyme such as Sau3A.
Other digestion enzymes such as ScaI, ApaI, SacII, KpnI or alternatively exonucleases or polymerases, may naturally be used, as long as they allow fragments to be obtained whose ends may be inserted into one of the cloning sites of the polylinker of the vector according to the invention.
Where appropriate, digestions with different enzymes will be carried out simultaneously.
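By way of illustration only, the digestion step can be sketched in silico. The short Python example below cuts a linear DNA string immediately 5' of every GATC site, which is the recognition sequence of Sau3A. The function name and the toy sequence are hypothetical and are not part of the procedure described here; the sketch assumes a complete digest of a single strand and ignores partial digestion and fragment size selection.

```python
def sau3a_digest(dna: str) -> list[str]:
    """Cut a linear DNA string 5' of every GATC site (idealized complete digest)."""
    site = "GATC"  # Sau3A recognition sequence; the cut falls just before the G
    dna = dna.upper()
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        if pos > start:
            # Close the current fragment just before the recognition site.
            fragments.append(dna[start:pos])
            start = pos
        pos = dna.find(site, pos + 1)
    if dna[start:]:
        fragments.append(dna[start:])
    return fragments


if __name__ == "__main__":
    # Hypothetical toy sequence containing two GATC sites.
    print(sau3a_digest("AAGATCTTGATCGG"))
```

Every fragment after the first begins with GATC, which reflects the 5' GATC overhang left by the enzyme on each cut end.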
Preferred recombinant vectors for carrying out the invention are chosen among the following recombinant vectors deposited at CNCM on Aug. 8, 1994:
pExp53 deposited at CNCM under the No. I-1464
pExp59 deposited at CNCM under the No. I-1465
pExp410 deposited at CNCM under the No. I-1466
pExp421 deposited at CNCM under the No. I-1467.
The vectors of the invention may also be used to determine the presence of sequences of interest, according to what was stated above, in mycobacteria such as M. africanum, M. bovis, M. avium or M. leprae whose DNA or cDNA will have been treated with determined enzymes.
The subject of the invention is also a process for screening nucleotide sequences derived from mycobacteria, to determine the presence, in these sequences, of regulatory elements controlling the expression, in a cellular host, of nucleic acid sequences containing them, and/or exposure at the surface of the cellular host and/or export and/or secretion of the polypeptide sequences resulting from the expression of the abovementioned nucleotide sequences, characterized in that it comprises the following steps:
a) digestion of mycobacteria DNA sequences with at least one determined enzyme and recovery of the digests obtained,
b) insertion of the digests into a cloning site, compatible with the enzyme of step a), of the polylinker of a vector above,
c) if necessary, amplification of the digest contained in the vector, for example by replication of the latter after insertion of the vector thus modified into a determined cell, for example E. coli, 
d) transformation of cellular hosts by the vector amplified in step c), or in the absence of amplification, by the vector of step b),
e) culture of the transformed cellular hosts in a medium allowing visualization of the marker for export and/or secretion which is contained in the vector,
f) detection of the cellular hosts which are positive for the expression of the marker for exposure and/or export and/or secretion (positive colonies),
g) isolation of the DNA of the positive colonies and insertion of this DNA into a cell which is identical to that of step c),
h) selection of the inserts contained in the vector, which allow clones to be obtained which are positive for the marker for export and/or secretion,
i) isolation and characterization of the fragments of DNA of mycobacteria which are contained in these inserts.
The carrying out of this process allows the construction of DNA libraries containing sequences capable of being exported and/or secreted, when they are produced in recombinant mycobacteria.
Step i) of the process may comprise a step for sequencing the inserts selected.
Preferably, the vector used is the plasmid pJEM11 (CNCM I-1375) and the digestion is carried out by means of the enzyme Sau3A.
According to a preferred embodiment of the invention, the screening process is characterized in that the mycobacteria sequences are derived from a pathogenic mycobacterium, for example from M. tuberculosis, M. bovis, M. avium, M. africanum or M. leprae.
The subject of the invention is also the nucleotide sequences of mycobacteria selected after carrying out the process described above.
According to a specific embodiment of the invention, advantageous sequences are for example the mycobacteria DNA fragments contained in the vectors pIPX412 (CNCM I-1463 deposited on Aug. 8, 1994), pExp53, pExp59, pExp410 or pExp421.
When the coding sequence derived from the marker gene for export and/or secretion is a sequence derived from the PhoA gene, the export and/or secretion of the product of the PhoA gene, truncated where appropriate, is obtained only when this sequence is inserted in phase with the sequence placed upstream, which contains the elements controlling the expression and/or export and/or secretion which are derived from a mycobacteria sequence.
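The requirement that the marker sequence be inserted in phase with the upstream sequence can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the first codon of the truncated marker begins immediately after the last nucleotide of the insert and that the position of the start codon of the upstream open reading frame is known; the function name, the parameter orf_start and the toy inserts are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_is_in_phase(insert: str, orf_start: int) -> bool:
    """Check whether a reporter placed right after `insert` continues its frame.

    orf_start is the 0-based index of the first base of the start codon
    of the upstream open reading frame within the insert (assumed known).
    """
    coding = insert.upper()[orf_start:]
    # The junction is in phase only if the upstream coding part
    # consists of whole codons.
    if len(coding) % 3 != 0:
        return False
    codons = [coding[i:i + 3] for i in range(0, len(coding), 3)]
    # A stop codon before the junction would end translation
    # before the reporter is reached.
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    print(fusion_is_in_phase("CCATGAAACGT", 2))  # whole codons, no stop
    print(fusion_is_in_phase("CCATGAAACG", 2))   # one base short of a codon
```

On average only one random junction in three satisfies the frame condition, which is one reason a library of many independent inserts is screened.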
The subject of the invention is also recombinant mycobacteria containing a recombinant vector described above. A preferred mycobacterium is a mycobacterium of the M. smegmatis type.
M. smegmatis makes it possible, advantageously, to test the efficiency of mycobacteria sequences for controlling the expression and/or export and/or secretion of a given sequence, for example of a sequence encoding a marker such as alkaline phosphatase.
Another advantageous mycobacterium is a mycobacterium of the M. bovis type, for example the BCG strain currently used for vaccination against tuberculosis.
A subject of the invention is, moreover, a recombinant mycobacterium, characterized in that it contains a recombinant vector defined above.
The invention also relates to a nucleotide sequence derived from a gene encoding an exported M. tuberculosis protein, characterized in that it is chosen from the following sequences:
a sequence IA corresponding to the chain of nucleotides described in FIG. 6A, or a sequence IB corresponding to the chain of nucleotides described in FIG. 6B, or hybridizing under stringent conditions with these chains,
a sequence II comprising the chain of nucleotides IA or IB and encoding an M. tuberculosis P28 protein having a theoretical molecular weight of about 28 kDa and an observed molecular weight of 36 kDa, determined by denaturing acrylamide gel electrophoresis (SDS-PAGE),
a sequence III contained in the sequence IA or IB and encoding a polypeptide recognized by antibodies directed against the M. tuberculosis P28 protein,
a sequence IV comprising the regulatory sequences of the gene comprising the coding sequence IA or IB,
a sequence V corresponding to the chain between nucleotides 1 and 72 of the sequence IA or IB and corresponding to the signal sequence,
a sequence VI corresponding to the chain between nucleotides 62 and 67 of the sequence IA or IB,
a sequence VII corresponding to the chain between nucleotides 688 and 855 of the sequence IA or IB.
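The nucleotide ranges given for sequences V, VI and VII can be handled with a small helper, sketched below in Python. The sketch assumes that the positions are 1-based and inclusive, which is the usual convention for numbered sequence figures but is an assumption here; the function names and the dictionary are hypothetical, and the sequences of FIGS. 6A and 6B themselves are not reproduced.

```python
# Ranges quoted in the text, read as 1-based inclusive positions (assumption).
REGIONS = {
    "V (signal sequence)": (1, 72),
    "VI": (62, 67),
    "VII": (688, 855),
}


def subseq(seq: str, first: int, last: int) -> str:
    """Return nucleotides first..last of seq (1-based, inclusive)."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[first - 1:last]


def region_lengths() -> dict[str, int]:
    """Length in nucleotides of each named region."""
    return {name: last - first + 1 for name, (first, last) in REGIONS.items()}


if __name__ == "__main__":
    for name, length in region_lengths().items():
        print(name, length, "nt")
```

Under this reading, sequence V spans 72 nucleotides (24 codons), sequence VI spans 6 nucleotides and lies inside sequence V, and sequence VII spans 168 nucleotides (56 codons).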
Also entering within the framework of the invention is an M. tuberculosis polypeptide characterized in that it corresponds to the amino acid chain VIIIA or to the chain VIIIB represented in FIGS. 6A and 6B respectively or in that it comprises one of these chains.
A preferred polypeptide is characterized in that it has a theoretical molecular weight of about 28 kDa determined according to the technique described in the examples.
The M. tuberculosis p28 protein has been characterized by its capacity to be exported and therefore potentially located across the bacterial plasma membrane or the cell wall. Furthermore, as shown in the sequences presented in FIG. 6, some peptide units of the sequence are repeated. For these reasons, the M. tuberculosis p28 protein is now most often designated as ERP protein and the gene containing the coding sequence for this protein is called either irsa gene or erp gene.
The theoretical molecular weight of the ERP protein, evaluated at 28 kDa, corresponds to an experimentally observed molecular weight of about 36 kDa (electrophoretic migration on a denaturing polyacrylamide gel (SDS-PAGE)).
Another advantageous polypeptide within the framework of the invention comprises part of the amino acid chain VIIIA or VIIIB previously described and immunologically reacts with antibodies directed against the M. tuberculosis p28 protein.
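A theoretical molecular weight of this kind is commonly computed from the amino acid sequence alone, by summing average residue masses and adding one water molecule for the free termini. The Python sketch below shows that generic calculation; it is not necessarily the technique used in the examples, the function name is hypothetical, and the residue masses are standard rounded average values.

```python
# Average residue masses in daltons (amino acid minus one water), rounded.
RESIDUE_MASS = {
    "G": 57.052, "A": 71.079, "S": 87.078, "P": 97.117, "V": 99.133,
    "T": 101.105, "C": 103.139, "L": 113.159, "I": 113.159, "N": 114.104,
    "D": 115.089, "Q": 128.131, "K": 128.174, "E": 129.116, "M": 131.193,
    "H": 137.141, "F": 147.177, "R": 156.188, "Y": 163.176, "W": 186.213,
}
WATER = 18.015  # one water added for the free N- and C-termini


def theoretical_mass_kda(protein: str) -> float:
    """Sequence-based molecular weight in kDa (no modifications considered)."""
    total = sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER
    return total / 1000.0


if __name__ == "__main__":
    # Toy pentapeptide taken from the repeated units listed in the text.
    print(round(theoretical_mass_kda("PGLTS") * 1000, 3), "Da")
```

Such a value depends only on composition, whereas migration on a denaturing gel also depends on how the polypeptide behaves during electrophoresis, so the two figures need not coincide.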
Preferably, such a polypeptide is, in addition, characterized in that it does not immunologically react with the M. leprae p28 protein.
Particularly advantageous amino acid sequences within the framework of the invention are the sequences comprising one of the following chains or corresponding to one of these chains in one or more copies:
PGLTS (SEQ ID NO: 1), PGLT (SEQ ID NO: 2), PGLTP (SEQ ID NO: 3), PALTN (SEQ ID NO: 4), PALTS (SEQ ID NO: 5), PALGG (SEQ ID NO: 6), PTGAT (SEQ ID NO: 7), PTGLD (SEQ ID NO: 8), PVGLD (SEQ ID NO: 9).
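To illustrate how the number of copies of these chains in a polypeptide might be tallied, the following Python sketch counts every occurrence of each listed unit in an amino acid string. The function name and the toy polypeptide are hypothetical; the actual chains VIIIA and VIIIB of FIGS. 6A and 6B are not reproduced here.

```python
import re

# Repeated units as listed in the text (SEQ ID NO: 1 to 9).
MOTIFS = ["PGLTS", "PGLT", "PGLTP", "PALTN", "PALTS",
          "PALGG", "PTGAT", "PTGLD", "PVGLD"]


def count_motifs(protein: str) -> dict[str, int]:
    """Count occurrences of each unit, including overlapping ones."""
    protein = protein.upper()
    # A lookahead pattern matches at every start position without consuming
    # characters, so overlapping occurrences are all counted.
    return {m: len(re.findall(f"(?={m})", protein)) for m in MOTIFS}


if __name__ == "__main__":
    toy = "MPGLTSAAPALTNPGLTPGG"  # hypothetical sequence, for illustration only
    for motif, n in count_motifs(toy).items():
        print(motif, n)
```

Because PGLT is a prefix of both PGLTS and PGLTP, each copy of those two units is also counted as a copy of PGLT.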
Other advantageous sequences are, for example, the signal sequence between the positions of nucleotides 1 and 72 of the sequence of FIGS. 6A or 6B or alternatively the sequence between nucleotides 688 and 855 which is capable of behaving like a transmembrane sequence.
These polypeptide sequences may be expressed in the form of recombinant polypeptides. In these recombinant polypeptides, they may be replaced in part especially as regards the sequences of 5 amino acids previously described, by sequences of interest obtained from mycobacteria or the pathogenic organisms, it being possible for this replacement to lead to the inclusion, inside the recombinant polypeptides, of the epitopes or the antigenic determinants of a pathogenic organism or of a protein of interest against which it might be sought to obtain antibodies.
Thus, the polypeptides of the invention, while optionally exhibiting themselves the antigenic or even immunogenic properties, may be used as advantageous carrier molecules for preparing, where appropriate, vaccines having varying properties.
The subject of the invention is also monoclonal antibodies or polyclonal sera directed against a polypeptide as defined above.
As regards monoclonal antibodies, they are preferably directed specifically against a polypeptide of the invention and do not recognize, for example, the M. leprae p28 protein.
The subject of the invention is also a composition for the in vitro detection of an M. tuberculosis infection, characterized in that it comprises a polypeptide defined above, which is capable of immunologically reacting with antibodies formed in a patient infected with M. tuberculosis. 
Another composition for the in vitro detection of an M. tuberculosis infection is characterized by a nucleotide sequence containing at least 9 nucleotides, which is derived from a sequence defined above, or a nucleotide sequence containing at least 9 nucleotides and hybridizing, under stringent conditions, with M. tuberculosis DNA and not hybridizing, under the same conditions, with M. leprae DNA, this sequence being a DNA or RNA sequence, which is labeled where appropriate.
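A first computational filter for candidate probes of this kind can be sketched as follows. The Python example lists the stretches of at least 9 nucleotides that occur in a target sequence but have no exact match on either strand of an excluded sequence. Exact matching is only a crude stand-in for hybridization under stringent conditions, which must be verified experimentally; the function names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def kmers(dna: str, k: int) -> set[str]:
    """All substrings of length k."""
    return {dna[i:i + k] for i in range(len(dna) - k + 1)}


def candidate_probes(target: str, excluded: str, k: int = 9) -> list[str]:
    """k-mers of `target` absent from both strands of `excluded`."""
    if k < 9:
        raise ValueError("the text calls for at least 9 nucleotides")
    target, excluded = target.upper(), excluded.upper()
    forbidden = kmers(excluded, k) | kmers(reverse_complement(excluded), k)
    probes, seen = [], set()
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in forbidden and kmer not in seen:
            seen.add(kmer)
            probes.append(kmer)  # keep order of first appearance
    return probes


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for the two genomes.
    print(candidate_probes("AAACCCGGGTTT", "AAACCCGGG"))
```

In practice longer probes and a tolerance for mismatches would be needed, since hybridization can occur between sequences that are similar but not identical.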
The subject of the invention is also a prokaryotic or eukaryotic cellular host, characterized in that it is transformed by a nucleotide sequence as described in the preceding pages, under conditions allowing the expression of this sequence and/or its exposure at the level of the membrane of the cellular host and/or its export and/or its secretion from the abovementioned membrane.
Preferably, the cellular hosts are mycobacteria such as M. smegmatis or M. bovis BCG.
Other cellular hosts are for example E. coli, CHO, BHK, Sf9/Baculovirus cells, yeasts such as Saccharomyces cerevisiae, vaccinia virus.
The subject of the invention is also an immunogenic composition comprising a polypeptide as presented above or a cellular host as defined above.
The invention relates, moreover, to a vector for the screening and/or cloning and/or expression of nucleotide sequences which are functional in mycobacteria, and which is derived from a vector described above and characterized in that the coding sequence derived from a gene encoding a marker for export and/or secretion is replaced by a reporter gene or a reporter sequence.
Preferably, the reporter sequence or gene lacks its regulatory sequences, in particular its ribosome binding sequences and/or its sequences which allow the export and/or secretion of the marker produced when the vector is incorporated into a recombinant cellular host.
Preferably, the reporter sequence or gene contains the sequence encoding the lacZ gene or a part of this sequence which is sufficient for the polypeptide to exhibit a β-galactosidase activity.
A preferred vector of the invention is characterized in that it comprises at one of the cloning sites of the polylinker, a chain of nucleotides comprising a promoter and, where appropriate, regulatory sequences, for example for anchorage at the surface, the export or even the secretion of a polypeptide which might be produced under the control of the promoter, for which it is desired to evaluate the capacity to promote or regulate the expression of a reporter nucleotide sequence in mycobacteria.
Preferred vectors are plasmids chosen from the plasmids pJEM12, pJEM13, pJEM14, or pJEM15 as represented in FIG. 12.
Such a vector may be used to evaluate the value of sequences for regulation of expression or of promoters, for example, the pAN, pblaF*, psul3, pgroES/EL1 sequences.
The invention also comprises a process for determining the activity of a sequence containing at one of the cloning sites of the polylinker a chain of nucleotides comprising a promoter and, where appropriate, regulatory sequences, for example for the exposure, export or even secretion of a polypeptide which might be produced under the control of the promoter in mycobacteria, characterized in that it comprises the steps of:
transforming a mycobacterium strain, for example M. smegmatis or M. tuberculosis, with a vector described above,
detecting the activity normally associated with the presence of the reporter gene or of the reporter sequence.