1. Field of the Invention
The present invention relates to diagnostic and analytical assays, and in particular to a method and kit comprising reagents and means for the identification, detection and/or quantification of a large number of (micro)organisms of different groups (classes, families, genera, species, individuals, among others) through their identification, or the identification of a component thereof, on the same array.
The invention is especially suited for the simultaneous identification and/or quantification of groups and sub-groups of (micro)organisms or related genes present in the same biological sample.
The present invention also provides a two-step method that first detects the presence of any of the (micro)organisms searched for and then identifies it.
2. Description of the Related Art
Identification of an organism or microorganism can be performed based on the presence of specific sequences in its genetic material. A specific organism can be identified easily by amplifying a given sequence of the organism using specific primers and then detecting or identifying the amplified sequence.
However, in many applications, especially in diagnostics, the organisms possibly present in biological samples are numerous and belong to different families, genera, species, subspecies or even individuals. Amplification of each of the possible organisms is difficult and expensive. A simple method is thus required for such multi-parametric, multi-level analysis.
Amplification of a given sequence is performed by several methods, the most common being the polymerase chain reaction (PCR) (U.S. Pat. Nos. 4,683,195 and 4,683,202), the ligase chain reaction (LCR) (Wu and Wallace, 1989, Genomics 4: 560–569) and the Cycling Probe Reaction (CPR) (U.S. Pat. No. 5,011,769). One particular way to detect the presence of a given sequence, and thus of a particular organism, is to follow the appearance of amplicons during the amplification cycles. This method is called real-time PCR. A fluorescent signal appears as the amplicons are formed, and the amplification is considered positive when the signal reaches a threshold.
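By way of illustration only, the threshold criterion described for real-time PCR can be sketched as follows. The function name `threshold_cycle`, the fluorescence readings and the threshold value are hypothetical and are not taken from any of the cited documents; the sketch simply reports the first cycle at which the signal reaches the threshold.

```python
from typing import Optional, Sequence


def threshold_cycle(fluorescence: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based cycle at which the signal first reaches the threshold.

    Returns None when the threshold is never reached, i.e. the
    amplification would be considered negative under this simple criterion.
    """
    for cycle, signal in enumerate(fluorescence, start=1):
        if signal >= threshold:
            return cycle
    return None


# Hypothetical fluorescence readings, one value per amplification cycle.
readings = [0.02, 0.02, 0.03, 0.05, 0.11, 0.24, 0.52, 1.05, 1.90]
positive_at = threshold_cycle(readings, threshold=0.5)
print(positive_at)
```

In practice, instruments typically subtract a baseline and interpolate between cycles; those refinements are omitted here.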
Detection of the amplicons can also be performed after the amplification by methods based on the specific recognition of the amplicons by complementary sequences. The first supports used for such hybridization were nitrocellulose or nylon membranes. However, the methods were miniaturized, and new supports such as conducting surfaces, silica, and glass were proposed together with the miniaturization of the detection process. Microarrays or DNA chips are used for multiple analysis of DNA or RNA sequences, either after an amplification step or after a retro-transcription into a cDNA. The target sequences to be detected are labeled during the amplification or copying step and are then detected and possibly quantified on arrays. The presence of a specific target sequence on the arrays is indicative of the presence of a given gene or DNA sequence in the sample, and thus of a given organism, which may then be identified. Detection becomes difficult when several sequences are homologous to each other but have to be specifically discriminated on the same array. Solving this technical problem is a precondition for using arrays for many diagnostic purposes, since organisms or micro-organisms of interest are often very similar to others on a taxonomic basis and present almost identical DNA sequences.
The company Affymetrix Inc. has developed a method for direct synthesis of oligonucleotides upon a solid support, at specific locations, by using masks at each step of the processing. Said method comprises the addition of nucleotides to growing synthesized oligonucleotides in order to obtain the desired sequences at the desired locations. This method is derived from photolithographic technology and is coupled with the use of photoprotective groups, which are released before a new nucleotide is added (U.S. Pat. No. 5,510,270). However, only small oligonucleotides are present on the surface, and said method finds applications mainly for sequencing or for identifying a pattern of positive spots corresponding to each specific oligonucleotide bound on the array. The characterization of a target sequence is obtained by cutting this polynucleotide into small oligonucleotides and comparing the hybridization pattern with that of a reference sequence. Said technique was applied to the identification of the Mycobacterium tuberculosis rpoB gene (WO 97/29212), wherein the capture nucleotide sequence comprises fewer than 30 nucleotides, and to the analysis of two different sequences that may differ by a single nucleotide (the identification of SNPs, or genotyping). Small capture oligonucleotide sequences (having a length between 10 and 20 nucleotides) are preferred, since the discrimination between two oligonucleotides differing in one base is higher when their length is smaller.
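The preference for short capture sequences can be illustrated with a deliberately crude proxy: a single mismatched base represents a larger fraction of a short duplex than of a long one. The sketch below is not a thermodynamic model of hybridization, and the sequences and the function name `mismatch_fraction` are hypothetical. For simplicity, each capture sequence is compared with the strand it is designed to match, both written in the same orientation.

```python
def mismatch_fraction(probe: str, target: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(probe) != len(target):
        raise ValueError("sequences must have the same length")
    mismatches = sum(1 for a, b in zip(probe, target) if a != b)
    return mismatches / len(probe)


# Hypothetical 15-base sequence and a variant differing at one position.
short_probe = "ACGTACGTACGTACG"
short_variant = "ACGTACGAACGTACG"

# Hypothetical 30-base sequences carrying the same single difference.
long_probe = short_probe + short_probe
long_variant = short_variant + short_probe

print(mismatch_fraction(short_probe, short_variant))  # one base in 15
print(mismatch_fraction(long_probe, long_variant))    # one base in 30
```

Under this proxy, the single-base difference weighs twice as much on the 15-base sequence as on the 30-base sequence, which is consistent with the qualitative statement above.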
The method is complicated by the fact that it cannot directly detect amplicons resulting from genetic amplification (PCR). A double amplification is performed with primer(s) bearing T3 or T7 sequences, followed by transcription with an RNA polymerase. These RNAs are cut into pieces of about 40 bases before being detected on an array (example 1 of WO 97/29212). Each sequence requires the presence of 10 capture nucleotide sequences and 10 control nucleotide sequences to be identified on the array. The reason for this complex procedure is that long DNA or RNA fragments hybridize very slowly on the small oligonucleotide capture nucleotide sequences present on the surface. Said methods are therefore not suited for the detection of homologous sequences, since the homology varies along the sequences and so some of the pieces will hybridize on the same capture nucleotide sequences. Software is therefore incorporated in the method to allow interpretation of the obtained data. The main reason not to perform a single hybridization of the amplicons on the array is that the amplicons rehybridize in solution much faster than they hybridize on the small capture nucleotide sequences of the array.
One consequence of such constraints is that polynucleotides are analyzed on oligonucleotide-based arrays only after being cut into oligonucleotides. For gene expression arrays, which are based on the detection of a cDNA copy of the mRNA, the problem still exists but is less acute, since the cDNA is single stranded. The fragments are also cut into smaller species, and the method requires the use of several capture oligonucleotide sequences in order to obtain a pattern of signals which attests to the presence of a given gene. Said cutting also decreases the number of labeled nucleotides and thus reduces the obtained signal. In the case of cDNA analysis, the use of long capture polynucleotide sequences gives a much better detection sensitivity. In many gene expression applications, the use of long capture nucleotide sequences is not a problem when the cDNAs to be detected originate from genes having different sequences, since the difference in sequence is sufficient to avoid cross-reactions between them, even on a sequence longer than 100 bases, so that polynucleotides can be used as capture nucleotide sequences. Long capture nucleotide sequences give the required sensitivity, but they will also hybridize to other homologous sequences.
The detection of Single Nucleotide Polymorphisms (SNPs) in DNA is just one particular aspect of the detection of homologous sequences. The use of arrays has been proposed to discriminate two sequences differing by one nucleotide at a particular location of the sequence. Since DNA or RNA sequences are present in low copy numbers, they are first amplified, so that double-stranded sequences are analyzed on the array. Several methods have been proposed to detect such a base change at one location. The document WO 97/31256 proposes the use of two oligonucleotide sequences: the first with a specific part and an addressable part, the second with a specific part and a labeled part. After ligation in solution, the product is immobilized on an array bearing capture nucleotide sequences with at least a part complementary to the addressable part. The detection of SNPs is the basis for polymorphism determination of individual organisms, and also for their genotyping, since the genomes of individuals of the same species or subspecies differ from each other by said SNPs. The presence of particular SNPs affects the activities of enzymes such as the P450 enzymes and makes them more or less active in the metabolism of a drug.
The capture oligonucleotides present on the array can also be used as primers for extension once the target nucleotide sequence has hybridized. The document WO 96/31622 proposes to identify a nucleotide at a given location of a sequence by elongation of a capture nucleotide sequence with detectable modified nucleotides, in order to detect the spots where the target has been bound with the last nucleotide of the capture nucleotide sequence being complementary to the target sequence at this particular position. The document WO 98/28438 proposes to complete several cycles of hybridization-elongation steps to label a spot, in order to compensate for a low hybridization yield of the target sequence. This method allows identification of a nucleotide at a given location of a sequence by labeling of a spot of the elongated capture nucleotide sequence.
Prior to elongation, the capture nucleotide sequences present on the array can be digested by a nuclease in order to differentiate between the matched and the unmatched heteroduplexes (U.S. Pat. No. 5,753,439). The use of a nuclease for the identification of sequences has also been proposed (EP 0721016). It has also been proposed to add a second labeled nucleotide sequence, complementary to the targets, to the hybridized targets, said sequence being ligated to the capture nucleotide sequence if the last nucleotide of the targets is complementary to the targets at this position (WO 96/31622).
The document EP-0785280 proposes a detection of polymorphism based on the hybridization of the target nucleotides on blocks containing several oligonucleotide sequences differing by one base each, and on obtaining a ratio of intensities for determining which sequences are the perfect hybridization matches.
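A minimal sketch of such a ratio-based call is given below. The exact computation used in EP-0785280 is not described here, so the choice of comparing the strongest spot with the second strongest, the function name `call_best_match` and the intensity values are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def call_best_match(intensities: Dict[str, float]) -> Tuple[str, float]:
    """Return the variant with the strongest signal and its ratio to the runner-up.

    `intensities` maps each variant base of a block of capture sequences
    (differing by one base each) to its measured spot intensity.
    """
    if len(intensities) < 2:
        raise ValueError("at least two capture sequences are needed for a ratio")
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    best_base, best_signal = ranked[0]
    _, runner_up_signal = ranked[1]
    # Guard against division by zero when the runner-up spot shows no signal.
    ratio = best_signal / runner_up_signal if runner_up_signal > 0 else float("inf")
    return best_base, ratio


# Hypothetical spot intensities for one block of four capture sequences.
block = {"A": 120.0, "C": 95.0, "G": 1450.0, "T": 110.0}
base, ratio = call_best_match(block)
print(base, round(ratio, 2))
```

A high ratio suggests that one capture sequence is the perfect match; the cut-off above which a call would be accepted is left to the user and is not specified by the source.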
For membrane or nylon supports, it has been proposed to increase the sensitivity of the detection of polynucleotides on the solid support by incorporating a spacer between the support and the capture nucleotide sequences. Van Ness et al. (Nucleic Acids Research, vol. 19, p. 3345, 1991) describe a poly(ethyleneimine) arm for the binding of DNA on nylon membranes. The document EP-0511559 describes a hexaethylene glycol derivative as a spacer for the binding of small oligonucleotides upon a membrane. When membranes such as nylon are used as the support, there is no control of the site of binding between the solid support and the oligonucleotides, and it was observed that a poly dT tail increased the fixation yield and thus the resulting hybridization (WO 89/11548).
Guo et al. (Nucleic Acids Research 22, 5456, 1994) teach the use of poly dT of 15 bases as spacer for the binding of oligonucleotides on glass with increased sensitivity of hybridization.
The publication of Anthony et al. (Journal of Clinical Microbiology, vol. 38:2, p. 781–788) describes the use of a membrane array for the detection of 23S ribosomal DNA of various bacterial species after PCR amplification. The targets to be detected are rDNA amplified from bacteria by consensus PCR, and the detection is obtained on a nylon array containing capture nucleotide sequences for said bacteria. The capture nucleotide sequences have between 20 and 30 bases and are covalently linked to the nylon, and there is no control of the portion of the sequence which is available for hybridization. rDNA is multi-copy DNA, which is used in order to compensate for the low detection yield of the method. Also, because small capture nucleotide sequences are used, the method can only detect individual bacterial species by their specific sequence, and not the family or genus.
However, these documents neither describe nor suggest that it was possible to use a component of a (micro)organism, especially a genetic sequence, to identify said (micro)organism together with the group to which it belongs. Nor is there any indication or suggestion in the state of the art that polynucleotides can be used as capture sequences in microarrays in order to differentiate binding between homologous polynucleotide sequences and to permit identification of one target sequence among sequences of other species, genera or families of (micro)organisms.
Nor is there any indication or suggestion that homologous sequences differing by one nucleotide at one location of the sequence (such as observed in polymorphism analysis) could be detected by hybridization of the amplified sequences on corresponding capture nucleotide sequences.
Prior to the invention, it was unknown that it is possible to identify in a two-step process, i.e. an amplification followed by a direct hybridization of the amplicons on an array, organisms belonging to the same group, or to two or more groups, together with the specific identification of the groups as such. It was also unknown that it was possible to identify organisms belonging to a group and sub-group together with the specific identification of this group and sub-group, and that such identification could be obtained by using polynucleotides as capture sequences for all detections.
Also it was unknown that polynucleotides could be used for the identification of homologous polynucleotide sequences differing by one nucleotide present in a particular location of the sequence.
Also it was unknown that homologous polynucleotide sequences could be discriminated and detected on an array directly after amplification with a very high sensitivity.