The present invention relates to methods for sequencing Human Immunodeficiency Virus (HIV) nucleic acids. More specifically, the present invention relates to methods for obtaining information on the genetic sequences of HIV nucleic acid from a patient. That information can be used to genotype an HIV quasi-species present in the patient.
The detection of mutations conferring drug resistance in the HIV pol gene is significant in determining drug sensitivity of the virus. During the course of treatment of a disease, the infectious microorganism or virus, such as HIV, can become resistant due to a loss of sensitivity to the particular drug in use, which generally results in spread of the disease and increased morbidity. At the genetic level, important changes can occur within the virus in response to drug therapy. Specific changes in a nucleic acid sequence or sequences of the virus that correlate with drug resistance are defined as drug resistance mutations. The nucleic acid sequences may be target nucleic acid sequences that are affected by a drug or therapeutic agent. Such target nucleic acids may encode a viral protein, such as an enzyme.
Due to the emergence of drug resistance mutations, one should obtain information concerning the genetic sequence of target nucleic acids of the virus in the patient for proper diagnosis and for choosing an appropriate treatment. Once this information is obtained, failure of drug therapy can be monitored at the genetic level rather than waiting for the re-emergence or worsening of clinical symptoms. This may be accomplished by isolating the nucleic acid for the infectious organism (virus) from the patient, determining the sequence of the target nucleic acids of the organism, and identifying mutations known to confer drug resistance.
This approach can also be used to intelligently prescribe effective drug treatment. The nucleic acid sequence of the organism's target nucleic acids can be obtained from the patient prior to treatment, and the organism's resistance to a particular drug can be determined.
One way to obtain de novo sequence information for the target nucleic acids of an HIV quasi-species present in a patient is to obtain a sample of the patient's plasma or tissue. The viral RNA or DNA from that sample is then extracted. If the genetic information is RNA, it should typically be reverse-transcribed into DNA. The DNA of the HIV target nucleic acid is then amplified by PCR, and the PCR products are sequenced. This sequence data can then be compared to a reference sequence for HIV and with all known drug resistance mutations.
Many RNA-containing viruses, including HIV, rapidly mutate even in the absence of drug therapy. This is due to the lack of fidelity and proof-reading functions of the virus's RNA polymerase or, for retroviruses, reverse transcriptase. For HIV reverse transcriptase, for example, the estimated spontaneous mutation rate is 3×10⁻⁵ mutations per nucleotide per replication cycle (Mansky and Temin, J. Virol., 69:5087-94, 1995).
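To give a sense of scale for the mutation rate cited above, the following minimal sketch converts the per-nucleotide rate into an expected number of new mutations per replication cycle. The genome length of roughly 9,700 nucleotides is an assumption introduced here for illustration and is not stated in this document; the 1.57 kb figure is the pol region discussed below. The calculation treats every nucleotide as mutating independently, which is a simplification.

```python
# Expected new mutations per replication cycle, under an independence assumption.
MUTATION_RATE = 3e-5      # per nucleotide per replication cycle (Mansky and Temin, 1995)
GENOME_LENGTH_NT = 9_700  # ASSUMPTION: approximate HIV-1 genome length, not from this document
POL_REGION_NT = 1_570     # the 1.57 kb pol region discussed in this document


def expected_mutations(rate: float, length_nt: int) -> float:
    """Expected number of mutations in a stretch of length_nt nucleotides."""
    return rate * length_nt


def prob_at_least_one(rate: float, length_nt: int) -> float:
    """Probability that at least one nucleotide in the stretch mutates."""
    return 1.0 - (1.0 - rate) ** length_nt


per_genome = expected_mutations(MUTATION_RATE, GENOME_LENGTH_NT)
per_pol = expected_mutations(MUTATION_RATE, POL_REGION_NT)

if __name__ == "__main__":
    print(f"expected mutations per genome per cycle: {per_genome:.3f}")
    print(f"expected mutations in 1.57 kb region per cycle: {per_pol:.4f}")
    print(f"P(at least one in 1.57 kb region): {prob_at_least_one(MUTATION_RATE, POL_REGION_NT):.4f}")
```

Under these assumptions, roughly one genome in three acquires a new mutation in each replication cycle, which is consistent with the rapid accumulation of quasi-species described here.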
The frequent use of antiviral drugs in the treatment of HIV infection has led to the development of drug resistance in AIDS patients. In the case of HIV, the genetic sequence of the HIV pol gene (which encodes the viral protease and reverse transcriptase) is often the target nucleic acid (Wainberg and Friedland, J. Am. Med. Assn., 279:1977-93, 1998). Drug resistant HIV mutants have been isolated from infected individuals. The present inventors believe that a 1.57 kilobase (kb) region of the pol gene is a particularly important region containing clinically relevant mutations.
The high degree of enzyme-induced genetic variability, in addition to the selective pressures of drug therapy, makes genotypic assessment of HIV very complex. Typically, HIV infected individuals harbor multiple viral genotypes or quasi-species, whether due to random enzyme-induced mutations, drug resistance-related mutations, or a combination of such mutations. As drug resistant mutant HIV strains become more prevalent, individuals with no history of drug treatment are becoming infected with drug resistant viruses.
Presently, determining appropriate treatment of HIV infections does not typically involve genetic analysis of the HIV pol gene (protease and reverse transcriptase) from patient plasma HIV RNA. Thus, physicians typically can only diagnose drug resistance in a patient if the patient fails to respond to therapy. Moreover, without genetic analysis, if a patient is failing therapy, it is difficult, if not impossible, to determine to which drugs the patient's virus is still sensitive. By isolating, amplifying, and sequencing the patient's HIV pol gene from plasma, it will be possible to determine the number of drug resistance mutations and tailor further therapy accordingly.
Consequently, there exists a need for rapid, reliable methods for obtaining the de novo nucleic acid sequences from clinical samples from patients who are, or may be, infected with HIV. In addition to providing patient-specific genotype information for use in identifying an appropriate treatment and monitoring drug resistance, the public health community would benefit from rapid, standardized, and reliable sequence information to establish the significance and relevance of drug-associated resistance mutations.
Current HIV genotyping procedures include hybridization based assays using labeled oligonucleotide probes and “home brew” (internally created) sequencing based assays. Because of a high rate of mutation in HIV, technologies using labeled oligonucleotides to represent “mutant” or “wild-type” forms at a particular codon of the gene sequence will be adversely affected. For example, mutations which are not associated with drug resistance will frequently occur, and may affect the binding of either “wild-type” or “mutant” probes, giving an anomalous result. Therefore, de novo sequencing should be a more accurate way to represent genetic changes of these highly variable sequences. This is especially important for organisms such as HIV because of the inherent genetic variability due to the lack of proofreading activity of HIV reverse transcriptase. Since the understanding of HIV mutations and their association with drug resistance is continually being elucidated, obtaining the de novo sequence of the HIV pol gene from patient populations undergoing drug therapy is important in establishing the clinical relevance of drug resistance in HIV.
Although a variety of home brew sequencing based assays have been used in individual research labs, the present inventors are not aware of comprehensive, commercially available systems for determining the de novo sequence of infectious organisms. The generally poor quality and lack of proper controls seen with most home brew assays have hindered generation of accurate data, which are crucial for studying drug resistance.
Other HIV genotyping procedures involve polymerase chain reaction (PCR) using nested primers to amplify HIV nucleic acid sequences. The nested primer procedure typically requires a different set of primers for each PCR cycle, with each successive set of primers being selected to anneal within the fragment amplified by the prior PCR cycle. While this procedure is effective at amplifying a known sequence present at low copy number, the use of multiple sets of primers, each of which must hybridize successively to a target sequence in the gene of interest, can result in a loss of the ability to amplify highly variable nucleic acid sequences. This would occur, for example, where a mutation was located in any of the target sequences in a region where primers are designed to hybridize. This results in biased selection of HIV quasi-species wherein significant drug resistance mutations may remain entirely undetected until drug failure occurs in the patient.
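The bias described above can be illustrated with a toy probability model. The sketch below assumes that each nucleotide in a primer-binding site differs from the primer independently with a fixed probability, and that any single mismatch prevents amplification. Both assumptions are simplifications introduced here (real primers often tolerate some mismatches), and the divergence value and inner primer lengths are hypothetical.

```python
# Toy model: chance that every primer-binding site in a variant is mismatch-free.
# ASSUMPTIONS (not from the source): independent per-nucleotide divergence, and
# a single mismatch anywhere in a binding site abolishes amplification.

def prob_all_sites_intact(divergence_per_nt: float, primer_lengths: list[int]) -> float:
    """Probability that all binding sites are free of mismatches."""
    total_nt = sum(primer_lengths)
    return (1.0 - divergence_per_nt) ** total_nt


DIVERGENCE = 0.01  # hypothetical 1% per-nucleotide divergence between quasi-species

# One primer pair (lengths 23 and 29) versus a nested design that adds a
# hypothetical inner pair (lengths 24 and 27).
single_pair = prob_all_sites_intact(DIVERGENCE, [23, 29])
nested = prob_all_sites_intact(DIVERGENCE, [23, 29, 24, 27])

if __name__ == "__main__":
    print(f"single primer pair: {single_pair:.3f}")
    print(f"nested primers:     {nested:.3f}")
```

Because every additional primer adds binding-site nucleotides that must match, the fraction of variants amplified can only decrease as primer sets are added, which is the selection bias this passage describes.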
One goal of HIV genotyping is to monitor drug resistance at the genetic level by identifying as many different HIV quasi-species as possible. Among such quasi-species, there are likely to be mutations due to drug resistance as well as random mutations due to polymerase error in the virus population. The number of HIV quasi-species detectable by the genotyping assay should be maximized. Therefore, a need also exists for genotyping procedures which detect many different HIV quasi-species from a rapidly mutating virus population.
An object of the invention is to provide methods for obtaining de novo sequence information for different HIV quasi-species present in patient samples. This invention will allow effective diagnosis and treatment for patients and will also provide methods for monitoring drug therapy failures at the genetic level. Another object of the instant invention is to provide a standardized assay which will allow for rapid and accurate identification of mutations associated with drug resistance.
According to certain preferred embodiments, the inventors have achieved improved sensitivity and determined a greater number of HIV quasi-species than procedures that employ nested primers. Certain embodiments involve a streamlined assay, including a single-tube, two-step amplification procedure, coupled with automated sequence analysis and correlation with known drug resistance mutations. Such embodiments provide rapid, reliable assays that have acceptable sensitivity and specificity.
Additional objects and advantages of the invention will be set forth in part in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention.
According to certain embodiments, the invention comprises methods for amplifying a target nucleic acid sequence of HIV including combining a double-stranded nucleic acid template derived from HIV with certain specific PCR primers, a temperature-stable DNA polymerase, and deoxyribonucleotides, and amplifying the template to produce amplified target sequences. Certain embodiments include analysis of the amplified sequences. In certain embodiments, target sequence analysis is accomplished using a computer program which determines target gene sequences and then compares those sequences with HIV reference sequences and a table of known drug resistance mutations.
In other embodiments, the invention comprises methods for sequencing HIV nucleic acid including combining a double-stranded nucleic acid template derived from HIV and one or more specific sequencing primers, amplifying the HIV derived double-stranded nucleic acid template to produce amplified sequencing products, separating those amplified sequencing products to obtain nucleic acid sequence data, and analyzing the nucleic acid sequencing data.
In yet other embodiments of the present invention, target nucleic acid sequencing involves use of particular dye-terminator chemistry. Such chemistry is useful for automated sequence analysis and determination of heterozygosity at a given nucleotide.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The references discussed or cited in this application are all specifically incorporated by reference into this application.
For use in the present invention, typically, samples from infected or potentially infected patients are used as a source of HIV. Tissue samples or body fluid samples can be used. As used herein, body fluids include, but are not limited to, whole blood, plasma, serum, peripheral blood mononuclear cells (PBMC), other fractionated blood products, tears, saliva, semen, vaginal secretions, serous and pleural cavity fluids, washes from various mucous membranes such as the eye, nose or throat, and the like. Plasma, which may be obtained by methods well known in the art, is exemplary of a preferred source of proviral nucleic acid and will be used in the following examples.
According to certain embodiments, plasma samples, which may be frozen prior to use, are centrifuged under conditions that result in pelleting of the virions from the sample. After discarding the supernatant, the virus is lysed with lysis buffer. A particular buffer that may be used includes 4 M guanidine thiocyanate, 25 mM sodium citrate, 0.5% sodium lauroyl sarcosine, 100 mM dithiothreitol, and 80 µg/ml glycogen in sterile water. Viral RNA can be precipitated using absolute isopropanol and can be pelleted by centrifugation. The resulting supernatant is discarded and the pellet is washed and again pelleted by centrifugation. The wash solution can include 70% ethanol. HIV RNA samples are resuspended in an RNA diluent. According to certain embodiments, such a diluent can include 10 ng/ml polyrA from Pharmacia in RNase-free water. All steps can be performed using prechilled instruments and buffers, and the samples can be maintained at 4° C. or stored at −70° C.
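As a worked example of the lysis buffer recipe above, the following sketch computes the mass of each component for a chosen batch volume. Several details are assumptions introduced here, not stated in this document: the molecular weights, the choice of trisodium citrate dihydrate as the sodium citrate form, and the reading of 0.5% as weight per volume.

```python
# Mass of each lysis buffer component for a given batch volume.
# ASSUMPTIONS (not from the source): molecular weights below, trisodium citrate
# dihydrate as the citrate salt, and 0.5% interpreted as weight/volume.

MOLAR_COMPONENTS = {
    # name: (concentration in mol/L, molecular weight in g/mol)
    "guanidine thiocyanate": (4.0, 118.16),
    "sodium citrate": (0.025, 294.10),
    "dithiothreitol": (0.100, 154.25),
}
SARCOSINE_PERCENT_WV = 0.5   # g per 100 ml
GLYCOGEN_G_PER_ML = 80e-6    # 80 micrograms per ml


def lysis_buffer_grams(volume_ml: float) -> dict[str, float]:
    """Return grams of each component needed for volume_ml of buffer."""
    liters = volume_ml / 1000.0
    grams = {
        name: molar * liters * mol_weight
        for name, (molar, mol_weight) in MOLAR_COMPONENTS.items()
    }
    grams["sodium lauroyl sarcosine"] = SARCOSINE_PERCENT_WV / 100.0 * volume_ml
    grams["glycogen"] = GLYCOGEN_G_PER_ML * volume_ml
    return grams


if __name__ == "__main__":
    for name, mass in lysis_buffer_grams(100.0).items():
        print(f"{name}: {mass:.4f} g")
```

The components are dissolved and brought to the final volume with sterile water; the sketch does not model volume changes on dissolution.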
Double stranded DNA template is then created from the prepared HIV RNA. According to certain embodiments, random hexamers are used as primers at the 3′ end in conjunction with a reverse transcriptase (RT) procedure. An advantage of using random hexamers is that sequence variability will not affect cDNA generation. Random hexamers may also stabilize RNA secondary structure, which is known to be quite significant in HIV pol RNA. Appropriate conditions are used that result in cDNA generation from purified HIV RNA. In certain preferred embodiments, the double stranded templates produced in the RT reaction are then amplified in a PCR amplification method.
Certain embodiments of the present invention also involve sequencing particular regions of HIV pol. The present inventors targeted various regions of HIV pol, including particular 1.57 kb and 2.1 kb regions. To locate mutations at the 3′ end of HIV pol, primers were designed and tested to amplify a 2.1 kb region containing the entire coding region of the protease and reverse transcriptase enzymes from HIV pol. This assay will be used to amplify and sequence samples from the AIDS Clinical Trials Group (ACTG) 320 study, which will document the mutation patterns and clinical relevance of mutations in HIV pol in response to multi-drug therapy (protease and reverse transcriptase inhibitors). In addition to the 1.57 kb region, an additional 3′ RT region (approximately 0.7 kb) was targeted for amplification and sequencing. These two amplification and sequencing protocols cover the entire 2.1 kb of HIV protease and reverse transcriptase for the ACTG 320 study.
Novel PCR primers were developed and used to amplify the 0.7 kb, 1.57 kb, or 2.1 kb regions of HIV pol. The unique PCR primers hybridize to highly conserved regions of pol. According to certain embodiments, the PCR application employs a hot start enzyme, such as AmpliTaq Gold. The novel HIV pol amplification primers are:
0.7 kb primers-GGACTGTCMTGACATACAGMGTTAGTGG (SEQ ID NO:3), and
GGTTAAAATCACTAGCCATTGCTCTCC (SEQ ID NO:4);
1.57 kb primers-GGAAAAGGGCTGTTGGAAATGTG (SEQ ID NO:1), and
GGCTCTTGATAAATTTGATATGTCCATTG (SEQ ID NO:2),
2.1 kb primers-CTCATGTTCATCTTGGGCCTTATCTATTC (SEQ ID NO:13), and either
GCCAGGGAATTTTCTTCAGAGCAG (SEQ ID NO:12), or
GGCCAGGGAATTTTCTTCAGAGC (SEQ ID NO:14).
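Basic properties of the amplification primers listed above, such as length, GC content, and reverse complement, can be checked with a few lines of code. The sketch below is illustrative only and uses the 1.57 kb primer pair (SEQ ID NO:1 and SEQ ID NO:2) as input; the function names are hypothetical, and the complement table follows the standard IUPAC nucleotide codes so that degenerate bases are handled.

```python
# Simple primer checks: length, GC fraction, reverse complement.
# Function names are hypothetical; the complement map follows IUPAC codes.

_COMPLEMENT = str.maketrans("ACGTMRWSYKVHDBN", "TGCAKYWSRMBDHVN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (IUPAC-aware)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are unambiguous G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


PRIMERS_1_57_KB = {
    "SEQ ID NO:1": "GGAAAAGGGCTGTTGGAAATGTG",
    "SEQ ID NO:2": "GGCTCTTGATAAATTTGATATGTCCATTG",
}

if __name__ == "__main__":
    for name, seq in PRIMERS_1_57_KB.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```

Such checks do not replace empirical testing of primers; they only catch transcription errors and gross imbalances in base composition.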
According to certain embodiments, sequencing procedures can employ one or more novel sequencing primers specific for the amplified HIV pol fragments. The novel HIV pol sequencing primers are:
AGCCAACAGCCCCACCAG (SEQ ID NO:5),
CCATCCCTGTGGAAGCACATTG (SEQ ID NO:6),
GTTAAACAATGGCCATTGACAGAAGA (SEQ ID NO:7),
GGAACTGTATCCTTTAGCTTCCC (SEQ ID NO:8),
AAMTGCATATTGTGAGTCTG (SEQ ID NO:9),
GAAGAAGCAGAGCTAGAACTGGCAG (SEQ ID NO:10), and
AAGAAGCAGAGCTAGAACTGGGAGA (SEQ ID NO:11).
The following sequencing primers are also included in sequencing reactions, where appropriate:
GGGCCATCCATTCCTGGC (SEQ ID NO:15),
TGGAAAGGATCACCAGCAATATTCCA (SEQ ID NO:16), and
CTGTATTTCTGCTATTAAGTCTTTTGATG (SEQ ID NO:17).
To sequence the 0.7 kb HIV pol region one can use the primers:
AATGCATATTGTGAGTCTG (SEQ ID NO:9), and either
GAAGAAGCAGAGCTAGAACTGGCAG (SEQ ID NO:10), or
AAGAAGCAGAGCTAGAACTGGCAGA (SEQ ID NO:11).
Because these latter two sequencing primers appear to work equally well and provide similar results they can be used interchangeably. To obtain the sequence of the 0.7 kb region, one of the two sequencing primers is added to an aliquot of the 0.7 kb amplified target sequences and the second sequencing primer is added to a separate aliquot and the sequencing reaction is completed as described. The sequencing results from these two separate sequencing experiments are then combined and analyzed to provide the sequence for the 0.7 kb region.
Seven sequencing primers can be used to sequence the 1.57 kb HIV pol region:
AGCCAACAGCCCCACCAG (SEQ ID NO: 5),
CCATCCCTGTGGAAGCACATTG (SEQ ID NO:6),
GTTAAACAATGGCCATTGACAGAAGA (SEQ ID NO:7),
GGAACTGTATCCTTTAGCTTCCC (SEQ ID NO:8),
GGGCCATCCATTCCTGGC (SEQ ID NO:15),
TGGAAAGGATCACCAGCAATATTCCA (SEQ ID NO:16), and
CTGTATTTCTGCTATTAAGTCTTTTGATG (SEQ ID NO:17).
The procedure is the same as for the 0.7 kb region except that seven separate sequencing reactions, one for each sequencing primer, are performed and the seven sets of sequencing data are combined for analysis.
To sequence the 2.1 kb region, both the 0.7 kb sequencing primers (SEQ ID NO:9, and 10 or 11) and the 1.57 kb sequencing primers (SEQ ID NO:5, 6, 7, 8, 15, 16 and 17) can be used in nine separate sequencing reactions. The resulting nine sets of sequencing data are combined for analysis.
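Combining the data from several sequencing reactions can be pictured as laying each read at its position along the target region and checking that overlapping reads agree. The following is a minimal sketch of that idea, not the software described in this document. It assumes each read has already been placed at a known 0-based start position in a common orientation, and the toy reads are hypothetical.

```python
# Merge overlapping reads placed at known offsets into one contiguous sequence.
# ASSUMPTION: each read's 0-based start position in the target region is known.
from collections import Counter


def merge_reads(reads: list[tuple[int, str]], length: int) -> tuple[str, list[int]]:
    """Return (consensus, discordant_positions) for reads given as (start, sequence)."""
    columns = [Counter() for _ in range(length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < length:
                columns[pos][base] += 1

    consensus, discordant = [], []
    for pos, column in enumerate(columns):
        if not column:
            consensus.append("-")            # no read covers this position
        elif len(column) == 1:
            consensus.append(next(iter(column)))
        else:
            consensus.append("N")            # reads disagree; flag for review
            discordant.append(pos)
    return "".join(consensus), discordant


if __name__ == "__main__":
    toy_reads = [(0, "ACGTAC"), (4, "ACGG"), (6, "GGTT")]  # hypothetical reads
    print(merge_reads(toy_reads, 10))
```

Positions where reads disagree are flagged rather than resolved by vote, since in a mixed virus population a disagreement may reflect a real mixture rather than a sequencing error.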
According to certain embodiments of the sequencing procedure, the new dye-terminator chemistries (dRhodamine and Big-Dye) were employed in place of dye-labeled primers. The new dye chemistries allow for more even incorporation of nucleotides and a much improved signal to noise ratio over the rhodamine terminators. The new Big-Dye terminator (U.S. Pat. No. 5,800,996) was chosen over the dRhodamine terminators because of the increased signal strength and better signal to noise ratio. This allows for a faster throughput for sequencing with increased data quality.
In certain embodiments, the target nucleic acid sequences are automatically analyzed by software that was developed for assigning an HIV genotype. The software incorporates two new features in the basecalling function: using known features of the sequences as previously determined from a set of standards (Conrad et al., 1995) and using the base identified on the complementary strand to confirm the basecall. The use of experienced basecalling algorithms dramatically reduces the need for manual editing. The assembly of the basecalling primary data into a contiguous sequence is performed in a batch-wise manner by the software. The software then compares the derived sequence to a known HIV reference and a table of known resistance mutations for genotypic assignment. Positions are reported that are assigned differently by overlapping sequence segments, that differ from the HIV “wild-type” reference, or that are found in the table of sequence mutations known to be correlated with drug resistance. This system can be coupled with the use of a sequence database for higher level analysis and data management.
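The comparison step described above, in which a derived sequence is checked against a reference and a table of known resistance mutations, can be sketched as follows. This is an illustrative outline under stated assumptions, not the software of the invention: the sequences are assumed to be aligned and in frame, and the reference sequence and resistance table shown are toy values invented for the example, not real HIV data.

```python
# Compare a derived sequence to a reference codon by codon and flag changes
# listed in a resistance table. All sequences and table entries are TOY values.
from itertools import product

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def genotype(sample: str, reference: str,
             resistance_table: set[tuple[int, str]]) -> list[tuple[str, bool]]:
    """Return (change, is_known_resistance) for each codon differing from reference."""
    report = []
    usable = min(len(sample), len(reference))
    for i in range(0, usable - 2, 3):
        ref_aa = CODON_TABLE.get(reference[i:i + 3], "X")  # X = untranslatable codon
        smp_aa = CODON_TABLE.get(sample[i:i + 3], "X")
        if smp_aa != ref_aa:
            codon_no = i // 3 + 1
            change = f"{ref_aa}{codon_no}{smp_aa}"
            report.append((change, (codon_no, smp_aa) in resistance_table))
    return report


if __name__ == "__main__":
    toy_reference = "ATGAAAGTT"        # hypothetical reference: M K V
    toy_sample = "GTGAACGTT"           # hypothetical patient sequence: V N V
    toy_table = {(2, "N")}             # hypothetical resistance entry
    print(genotype(toy_sample, toy_reference, toy_table))
```

Keying the table by codon number and amino acid keeps the lookup independent of which synonymous codon the virus uses. Codons containing ambiguous bases translate to "X" in this sketch and would need separate handling to represent mixtures.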