It is known that mutations arising in the wild-type sequences of pathogenic organisms are responsible, for example, for therapeutic escape mechanisms, i.e., the capacity of viral or bacterial pathogenic organisms to resist a therapeutic treatment. The nucleotide and/or polypeptide sequences of the mutant strains of these organisms carry particular mutations relative to the nucleotide or polypeptide sequences of the wild-type strains.
Such mutations also determine functional changes in genes or proteins, and these changes in turn disrupt numerous biological processes, such as the triggering of the immune response, the infectivity of viruses, the development of cancers, etc.
It is known, for example, that the genetic information of the human immunodeficiency virus (HIV), which belongs to the retrovirus family, is carried by two RNA molecules. Upon infection, the viral genome therefore cannot be integrated directly into that of the host cells. The prior synthesis of a DNA copy from the genomic RNA of the virus is a determinant step of the infectious cycle. The enzyme responsible for this reverse transcription is a protein called Reverse Transcriptase (RT). The low fidelity of reverse transcription by this protein confers on the virus a large genomic variability. It is estimated that in an untreated seropositive individual, one mutation appears per replication and, thus, for the ten billion viruses produced per day, there would be ten billion new mutations. Such mutations can lead to resistance to one or more antiretroviral agents and, thus, generate strains that are more virulent because they are increasingly resistant.
Faced with this situation, practitioners prescribe very intense treatment regimens such as long-term triple drug combinations, more recently quadruple drug combinations, and perhaps even larger combinations in the future. These regimens take advantage of the absence of resistant virus that generally characterizes patients who have not yet been treated and are infected by a single form of the virus. The treatments cause a strong reduction of the viral load, which is defined as the quantity of viral particles circulating in the blood. Because the number of viral mutants is directly proportional to the viral load, it diminishes as well, thereby reducing the risk of therapeutic escape.
These extremely intense treatments are unfortunately accompanied by numerous side effects. They moreover require perfect compliance; any lapse is accompanied almost systematically by the emergence of resistant strains. The resistances selected in this way under the pressure of antiretroviral agents are at the origin of most therapeutic escapes.
Thus, although the choice of a combination of antiretroviral agents appears to be fundamental, the optimal combination of these agents is far from obvious. In addition to the multiple problems posed by the resistances described above, the incompatibility of certain drug combinations and the constantly increasing number of antiretroviral agents make the practitioner's work more and more difficult.
Physicians at present have available about twenty therapeutic agents directed essentially against two viral proteins: reverse transcriptase and protease. The most common therapeutic regimens involve triple drug combinations. A total of 252 possible combinations have been described, based only on the most common combinations. These calculations are statistical and do not take into account the different drug incompatibilities. Moreover, the appearance of new active ingredients stemming from pharmaceutical research will have the direct consequence of further complicating the selection of the drug combination.
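The way the number of candidate regimens grows with the number of agents, and shrinks once incompatibilities are taken into account, can be illustrated with a minimal sketch. It does not reproduce the figure of 252 cited above, which is restricted to the most common combinations. The agent names, the function `count_regimens` and the single incompatible pair are hypothetical and serve only as an illustration.

```python
from itertools import combinations
from math import comb


def count_regimens(agents, size=3, incompatible=()):
    """Count regimens of `size` agents that contain no incompatible pair."""
    banned = {frozenset(pair) for pair in incompatible}
    return sum(
        1
        for regimen in combinations(agents, size)
        if not any(frozenset(p) in banned for p in combinations(regimen, 2))
    )


# Hypothetical placeholder names for "about twenty" agents.
agents = [f"agent_{i}" for i in range(20)]

# Every triple drawn from twenty agents, ignoring incompatibilities.
unrestricted = count_regimens(agents)

# Same count when one (assumed) pair of agents must never be co-prescribed.
restricted = count_regimens(agents, incompatible=[("agent_0", "agent_1")])
```

With twenty agents the unrestricted count of triple combinations already exceeds one thousand, and each additional agent or each move to quadruple combinations enlarges it further, which is the selection problem described above.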
The activity of other pathogenic organisms is also of concern: the flu virus was responsible for 20 million deaths during the 20th century and the Ebola virus emerged in an alarming manner. The hepatitis A, B, C, D and E viruses constitute veritable public health priorities both because of their frequency and their potential gravity.
In all of these cases, there is a therapeutic and vaccinal vacuum which increases each year because of the great mutability of the viral genomes, especially those of the retroviruses and other RNA viruses such as HIV, flu, Ebola, hepatitis C, etc.
Many approaches have been proposed for attempting to resolve these multiresistance problems linked with the high degree of mutability of certain pathogenic organisms. The company Virco Tibotech, for example, developed a computer-implemented method that compares a given genotype with a databank of HIV sequences and then defines a list of the possible resistances to the antiretroviral agents.
Moreover, certain web sites such as that of the Los Alamos Library provide a large amount of data regarding the alignments of the HIV protein sequences as well as their mutations. This Library is provided online by the Division of AIDS of the National Institute of Allergy and Infectious Diseases (NIAID), a part of the National Institutes of Health (NIH).
Similarly, many publications by Ribeiro et al. disclose methods that calculate the frequency of appearance of resistant mutants using rather complex mathematical calculations.
Thus, methods for identifying the mutations of the constituent motifs of nucleotide or polypeptide sequences have been developed, e.g., those that made it possible during the 1980s to classify the immunoglobulins into classes and subclasses comprising constant domains and variable domains as a function of the variability of motifs of the different sequences that comprised them.
However, these methods do not enable identification of motifs whose mutation possibility is predetermined in relation to the set of sequences analyzed. This mutation possibility corresponds to a Boolean state of mutation.
It would therefore be advantageous to provide a method for identifying multiple motifs whose Boolean state of relative mutation is predetermined in relation to a set of given sequences. This method should be based on the identification either of motifs or combinations of motifs that have never mutated simultaneously, or of motifs or combinations of motifs that have mutated simultaneously at least once on at least one sequence of a set and have not mutated on other sequences of the set.
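The two kinds of identification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method itself: each motif is taken to be a single aligned position, a motif is considered mutated when it differs from a wild-type reference, and the names `mutation_matrix` and `classify_combinations` as well as the toy sequences are hypothetical.

```python
from itertools import combinations


def mutation_matrix(reference, sequences):
    """One row of booleans per sequence: True where it differs from the reference."""
    if any(len(seq) != len(reference) for seq in sequences):
        raise ValueError("all sequences must be aligned to the reference length")
    return [[a != b for a, b in zip(seq, reference)] for seq in sequences]


def classify_combinations(reference, sequences, size=2):
    """Split position combinations by whether they ever mutated together.

    Returns (never_together, co_mutated), each a list of tuples of
    0-based positions.
    """
    matrix = mutation_matrix(reference, sequences)
    never_together, co_mutated = [], []
    for combo in combinations(range(len(reference)), size):
        # True if at least one sequence carries a mutation at every position.
        if any(all(row[i] for i in combo) for row in matrix):
            co_mutated.append(combo)
        else:
            never_together.append(combo)
    return never_together, co_mutated


# Toy data: a hypothetical wild-type reference and four aligned variants.
reference = "MKVL"
sequences = ["MKVL", "MRVL", "MRVI", "AKVL"]
never_together, co_mutated = classify_combinations(reference, sequences, size=2)
```

In this toy set only positions 1 and 3 are mutated together on the same sequence (the third one), so that pair falls in `co_mutated` and every other pair falls in `never_together`. A fuller treatment would work with motifs longer than one position and would also record on which sequences each combination remained unmutated.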