This application relates to a method, reagent and kit for genotyping of human papillomavirus, and in particular to the sequencing of human papillomavirus for determination of viral type.
Cancer of the cervix is one of the most common malignancies in women around the world. Over 90% of both invasive cervical cancer lesions and precursor lesions are associated with the presence of human papillomavirus (HPV), and many epidemiological studies have established that HPV infection is the major risk factor for squamous intraepithelial lesions and cervical carcinoma. Recently, the involvement of HPV in the etiology of cervical cancer has been extended to prostate cancer. Epidemiological studies have shown that men with HPV infections in their 20's and 30's are five times more likely to develop prostate cancer in their 50's and 60's.
In view of the potential significance of HPV infection, it would clearly be of interest to be able to routinely test samples for the presence of HPV. However, of the more than 54 genetic types of HPV which have been described (an HPV isolate is designated as a new “type” when it has less than 90% nucleotide homology in the E6, E7 and L1 genes with previously characterized HPV types), only about 20% have been shown to be oncogenic. Thus, it is not sufficient to detect HPV. Meaningful diagnosis also requires the determination of the genetic type of any infecting virus.
U.S. Pat. No. 5,447,839, which is incorporated herein by reference, discloses a method for detection and typing of HPV. In this method, HPV DNA sequences in a sample are amplified by polymerase chain reaction (PCR) amplification using consensus primers which amplify both oncogenic and non-oncogenic HPV types. Thus, the presence of HPV in the sample is indicated by the formation of amplification products. HPV is then typed using type-specific DNA probes which hybridize with the amplified region of DNA. The type-specific hybridization probes disclosed in this patent are capable of identifying and distinguishing among five known oncogenic types of HPV, namely HPV-6, HPV-11, HPV-16, HPV-18 and HPV-33.
U.S. Pat. Nos. 4,849,331, 4,849,332, 4,849,334 and 4,908,306, which are incorporated herein by reference, relate to HPV-35, HPV-43, HPV-44, and HPV-56. According to these patents, these types may be identified by hybridization with type-specific probes, although no actual sequences for such probes are disclosed.
Identification of other HPV types is discussed in Schiffman, et al. (1993). “Epidemiologic evidence showing that human papillomavirus infection causes most cervical intraepithelial neoplasia”, J. Nat'l Cancer Inst. 85: 958-964; zur Hausen, H., (1994) “Molecular pathogenesis of cancer of the cervix and its causation by specific human papillomavirus types”, Curr. Top. Microbiol. Immunol. 186: 131-156; and de Villiers, E. (1994). “Human pathogenic papillomavirus types: an update”, Curr. Top. Microbiol. Immunol. 186: 1-12.
What is apparent from consideration of the art discussed above is that determination of HPV type using hybridization probes requires a substantial arsenal of distinct probe types and a battery of tests, which makes HPV typing by this approach both time consuming and expensive. Furthermore, since the number of identified types of HPV is continuing to expand, there is a need to keep developing new tests and reagents, and a risk that an existing hybridization probe is in fact unable to distinguish between a known genotype and a yet-to-be characterized genotype. Thus, it would be advantageous to perform the genotyping of HPV samples using reagents that are not type-specific. It is an object of the present invention to provide such a method.
This and other objects of the invention are achieved using a method for determining the sequence of human papillomavirus present in a sample comprising the steps of:
(a) amplifying a portion of the L1 open reading frame of human papillomavirus genome to form L1 amplicons containing plus and minus amplified strands using first and second amplification primers; and
(b) determining the positions of at least one species of nucleotide within at least one of the plus and minus amplified strands by extension of a sequencing primer which hybridizes with the plus or minus amplified strand in the presence of a chain-terminating nucleotide,
wherein the first amplification primer has the sequence given by Seq. ID. No. 1, and the sequencing primer has the sequence given by Seq. ID No. 3. The second amplification primer preferably has the sequence given by Seq. ID No. 2.
The present invention provides a method for sequencing, and thus for determining the genotype of, HPV that may be present in a sample. Suitable samples for use in the present invention include but are not limited to cervical swabs or scrapings, urethral swabs, vaginal/vulval swabs, urine and biopsied tissue samples.
In accordance with the present invention, a sample containing, or suspected of containing, HPV is combined with a pair of amplification primers effective to amplify a portion of the L1 open reading frame of the HPV genome via polymerase chain reaction (PCR) amplification. The procedures for PCR amplification have become well known, and will not be repeated at length here. Basically, however, the two primers are selected to flank a region of interest to be amplified, one primer binding to each of the strands of the DNA duplex such that template-dependent primer extension proceeds in the direction of the other primer binding site. Repeated cycles of annealing, extension and denaturation result in the production of many copies of both the plus and minus (sense and antisense) strands in the region flanked by the primers. The double stranded copies of the L1 region are referred to herein as L1 amplicons. Each such amplicon of course contains a plus and a minus strand.
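By way of illustration only, the flanking relationship between two amplification primers and the resulting amplicon can be sketched in a few lines of Python. The template and both primers in this sketch are invented toy sequences, not HPV sequence and not the primers of this application, and the function names are hypothetical. The sketch assumes exact primer matches, which real degenerate consensus primers do not require.

```python
# Toy illustration only: the template and primers below are invented,
# not HPV sequence and not the primers of this application.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand of seq, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(plus_strand, forward_primer, reverse_primer):
    """Return the plus-strand region bounded by the two primer sites, or None."""
    # The forward primer has the same sequence as its site on the plus strand.
    start = plus_strand.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the plus strand, so its site on the
    # plus strand reads as the reverse complement of the primer.
    site = reverse_complement(reverse_primer)
    end = plus_strand.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return plus_strand[start:end + len(site)]


template = "TTGACCTGAAACCCGGGTTTAGCATCGAA"   # hypothetical plus strand
plus = amplicon(template, "GACCTG", "GATGCT")
minus = reverse_complement(plus)             # the other strand of the amplicon
print(plus)
print(minus)
```

The two strings printed correspond to the plus and minus strands of the toy amplicon; each begins with one of the two primer sequences, as would be expected for a product whose ends are defined by the primers.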
Consensus amplification primer sequences for the L1 open reading frame of HPV have been previously described in U.S. Pat. No. 5,447,839 and in Ting et al., “Detection and Typing of Genital Human Papillomavirus”, PCR Protocols: A Guide to Methods and Applications, Academic Press, 1990, pp. 356-367. These primers, designated as MY11 and MY09, respectively, have the following sequences:
MY11, positive strand primer:
CGCMCAGGGWC ATAAYAATGG Seq. ID. No. 1
MY09, negative strand primer:
CGTCCMARRG GAWACTGATC Seq. ID No. 2.
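The consensus primers contain degenerate positions written in the standard IUPAC nucleotide codes (for example, M = A or C, R = A or G, W = A or T), so that each primer is in practice a mixture of related oligonucleotides. As an illustrative sketch, assuming the standard IUPAC code table and using hypothetical function names, the number of distinct oligonucleotides represented by the MY09 sequence given above can be enumerated as follows.

```python
from itertools import product

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides a degenerate primer represents."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def expand(primer):
    """List every non-degenerate sequence covered by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in primer))]


MY09 = "CGTCCMARRGGAWACTGATC"  # Seq. ID No. 2, spacing removed

print(len(MY09), degeneracy(MY09))
for oligo in expand(MY09)[:3]:
    print(oligo)
```

MY09 has four two-fold degenerate positions (M, R, R and W), so the sketch reports a 20-base primer representing 16 distinct sequences.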
A third primer, HMB01 (SEQ ID No. 4), is often used in combination with MY09 and MY11 to amplify HPV 51, which is not amplified efficiently with MY09 and MY11 alone. Hildesheim et al., J. Infect. Dis. 169: 235-240 (1994). This amplification primer, or other additional primers which increase amplification efficiency for difficult types, may be included in amplification mixtures when practicing the present invention. See Qu et al. (1997) J. Clin. Microbiol. 35: 1304-1310. In a preferred embodiment of the present invention, the MY11, MY09 and HMB01 primers are used to amplify HPV that may be present in the sample to be tested. This results in the production of L1 amplicons.
The next step in the method of the invention is the determination of the nucleic acid sequence of at least the minus strand of the L1 amplicons. This is accomplished using a chain termination sequencing method and a sequencing primer having the sequence:
ARRGGAWACT GATCWARDTC Seq. ID No. 3.
Like PCR, chain termination nucleic acid sequencing is a well known procedure, although many variations have been developed. In the basic procedure for chain-termination sequencing, a polynucleotide to be sequenced is isolated, rendered single stranded if necessary, and placed into four vessels. In each vessel are the necessary components to replicate the DNA strand, i.e., a template-dependent DNA polymerase, a short primer molecule complementary to a known region of the DNA to be sequenced, and the standard deoxynucleotide triphosphates (dNTPs) commonly represented by A, C, G and T, in a buffer conducive to hybridization between the primer and the DNA to be sequenced and chain extension of the hybridized primer. In addition, each vessel contains a small quantity of one type (i.e., one species) of dideoxynucleotide triphosphate (ddNTP), e.g. dideoxyadenosine triphosphate (ddA).
In each vessel, the primer hybridizes to a specific site on the isolated DNA. The primers are then extended, one base at a time to form a new nucleic acid polymer complementary to the isolated pieces of DNA. When a dideoxynucleotide triphosphate is incorporated into the extending polymer, this terminates the polymer strand and prevents it from being further extended. Accordingly, in each vessel, a set of extended polymers of specific lengths are formed which are indicative of the positions of the nucleotide corresponding to the dideoxynucleotide in that vessel. These sets of polymers are then evaluated using gel electrophoresis to determine the sequence.
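The logic of the four-vessel procedure just described can be sketched as an idealized simulation. The sketch assumes that every position of the extended strand terminates in at least some copies, measures fragment lengths from the 3′ end of the primer, and uses an invented ten-base sequence; the function names are hypothetical.

```python
def termination_lengths(extended_strand, dd_base):
    """Lengths of the extension products that end in the given ddNTP.

    Idealized: each occurrence of dd_base in the newly synthesized strand
    yields one fragment length, counted from the end of the primer.
    """
    return [i + 1 for i, base in enumerate(extended_strand) if base == dd_base]


def read_gel(lanes):
    """Reconstruct the sequence from four lanes of fragment lengths.

    lanes maps each base to the fragment lengths seen in its vessel.
    Reading the bands from shortest to longest gives the base order.
    """
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


extended = "GATTACAGGT"  # hypothetical newly synthesized strand
lanes = {base: termination_lengths(extended, base) for base in "ACGT"}
print(lanes["A"])        # fragment lengths in the ddA vessel
print(read_gel(lanes))   # sequence recovered from all four vessels
```

Each vessel on its own reports only the positions of one species of nucleotide; combining the four sets of lengths recovers the full sequence of the extended strand.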
In principle, any oligonucleotide primer which binds to a target DNA sequence can be used as the sequencing primer in this process to produce sequencing fragments for analysis. In practice, however, different primers provide quite different results. Some primers produce results with a substantial amount of “background,” i.e., undesirable noise or unknown signals included in the sequencing trace. Such signals may result from non-specific binding of primers to undesired regions of the sample or other unknown sources or contaminants which create undesired extension products from the amplification and sequencing steps. The undesired products may have similar lengths to the sequencing products and therefore their bands may overlap on a sequencing gel. In addition, some primers allow the sequencing of only limited portions of an amplified strand or give rise to “hard stops” in the sequencing results. Others allow sequencing of long regions of the same amplicon, without hard stops. It is difficult, if not impossible, to predict which primers will perform well, and which will perform poorly.
The sequencing primer of the present invention (Seq. ID. No. 3) is the culmination of a series of experiments to identify a sequencing primer which could be used to efficiently sequence the L1 amplicon produced by the MY11/MY09 primers. It was desirable to have a primer which would sequence at least 250 to 300 bases consistently, and which was not prone to hard stops or other anomalies such as background in the sequencing fragments. The sequencing primer of the invention meets these criteria. Other primers which were evaluated do not.
We first tried using the sense (positive strand) primer MY11, labeled with Cy5.5 fluorophore, for sequencing. This primer produced only poor quality sequencing results with high background and hard stops after about 100 bases. We next tried a nested primer based on MY11 called MY11-3 (shifted by three bases in the 3′ direction relative to MY11). This decreased the background and improved the sequence quality, but we still observed hard stops in most of the sequencing runs. Next, we tried the antisense primer MY09 labeled with Cy5.5. This primer yielded much improved sequence without the hard stops. Sequence lengths of over 300 bases were obtained, but there was still background with this primer. The primer of the invention, MY09-6 (SEQ ID No. 3), is based on MY09: it is a degenerate sequencing primer shifted by 6 bases in the 3′ direction relative to MY09. This primer gives much better sequence without the background, yielding over 300 bases called in most runs. An additional sequencing primer, which is HPV51-specific, may be included with the MY09-6 primer for the sequencing reaction to work with HPV51 isolates. The HPV51-specific primer sequence is:
5′-AAT GAC AAT TGG TCT AAA TC-3′ SEQ ID No. 5
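The stated relationship between the MY09 amplification primer (Seq. ID No. 2) and the MY09-6 sequencing primer (Seq. ID No. 3) can be checked directly from the two sequences as given in this application. The following sketch, with a hypothetical helper name, confirms that MY09-6 begins six bases in from the 5′ end of MY09 and extends six bases beyond its 3′ end.

```python
MY09 = "CGTCCMARRGGAWACTGATC"    # Seq. ID No. 2, spacing removed
MY09_6 = "ARRGGAWACTGATCWARDTC"  # Seq. ID No. 3, spacing removed


def offset_overlap(outer, nested, shift):
    """Return the shared bases if nested starts `shift` bases into outer.

    Returns None when the nested primer does not begin with the part of
    the outer primer that remains after dropping `shift` 5' bases.
    """
    tail = outer[shift:]
    return tail if nested.startswith(tail) else None


overlap = offset_overlap(MY09, MY09_6, 6)
extension = MY09_6[len(overlap):]  # bases lying 3' of the MY09 site
print(overlap, len(overlap))
print(extension, len(extension))
```

The two primers share 14 bases, and the remaining 6 bases of MY09-6 lie inside the amplicon, beyond the 3′ end of MY09.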
Because the sequencing primer of the invention hybridizes just inside the end of the plus strand of the L1 amplicon that is defined by the MY09 amplification primer, it will be appreciated that the sequence of the second primer is not critical to the invention. Thus, while the MY11 primer is preferred as the second amplification primer, other amplification primers that hybridize in a non-type-specific manner in the vicinity of the L1 region of the HPV genome can be used in combination with MY09. Examples of such primers might include variations of MY11 in which one or more bases is deleted from or added to the ends. Additional bases may be complementary to the HPV sequence or may be selected to introduce selected functionality to the amplified product. For example, a primer such as MY11 could be modified at the 5′-end of the sequence to add a complementary sequence to the M13 primer, thus permitting M13 sequencing primer to be used for sequencing in the reverse direction.
The method of the present invention can be performed where the amplification and sequencing reactions are discrete steps in which the L1 region of the HPV genome is first amplified and then, after optional purification, the sequence of one strand is determined. In this case, it may be desirable to include a capture-label such as biotin on one of the amplification primers. This would permit capture of the duplex DNA product after the final amplification cycle on an avidin or streptavidin coated support (for example avidin-coated magnetic beads), and washing to remove the amplification reagents, including unreacted primer. One strand of the DNA would then be eluted from the support to provide either a solution with the strand to be sequenced or a support with the strand to be sequenced immobilized thereon ready for sequencing.
Sequencing of the amplicon can be done using conventional sequencing in which one cycle of primer annealing, extension and denaturation is performed. Sequencing may also be done using a multiple cycle sequencing technique such as “cycle sequencing.” As described by Kretz et al. in PCR Methods and Applications, Cold Spring Harbor Laboratory Press 1994, pp. S107-S11, cycle sequencing involves combining a sequencing primer with a template and processing the template through multiple cycles (e.g., about 30 cycles) of thermal conditions adapted for denaturation, primer annealing and primer extension using a thermostable polymerase enzyme such as Taq polymerase.
As an alternative to the performance of the amplification and sequencing reactions as discrete steps, a combined process of the type described generally by Ruano in U.S. Pat. No. 5,427,911, which is incorporated herein by reference, may be used. In this method, some number of initial amplification cycles (e.g., 15-20) are performed using the amplification primer pair, including at least primer MY09 (Seq. ID No. 2). Then, the sequencing primer of the invention (Seq. ID No. 3) and a chain-terminating nucleotide triphosphate are added to the amplification mixture and some number of additional cycles (e.g., 15-20) are performed to produce sequencing fragments for analysis. Sequencing procedures such as those disclosed in commonly assigned U.S. Pat. No. 5,888,736, which is incorporated herein by reference, may also be employed.
The method of the invention may be used to explicitly determine the positions of all four species of nucleotide triphosphates by carrying out sequencing reactions in which chain-terminating nucleotides corresponding to each of the four types of bases are used. As explained in U.S. Pat. No. 5,834,189 and International Patent Publication No. WO 97/20202, which are incorporated herein by reference, however, the explicit determination of all of the bases is not always necessary for genotyping viruses with known sequences. In the case of HPV, determination of the positions of the A bases within the L1 region allows genotyping of all known oncogenic genotypes.
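The principle of genotyping from the positions of a single species of nucleotide can be illustrated with a minimal sketch. The three reference sequences below are invented toy strings, not HPV L1 sequence, and the type names and function names are hypothetical. The sketch also assumes an error-free read of the same length as the references, whereas real data would call for a tolerant comparison.

```python
def a_positions(seq):
    """Positions (1-based) of the A bases in a sequence."""
    return frozenset(i for i, base in enumerate(seq, start=1) if base == "A")


def genotype_by_a_track(observed, references):
    """Names of the reference types whose A positions equal the observed set."""
    return sorted(
        name for name, seq in references.items()
        if a_positions(seq) == observed
    )


# Invented toy references of equal length; NOT real HPV sequence.
TOY_REFERENCES = {
    "type-X": "GATTACAGGTCA",
    "type-Y": "GCTTACAGATCA",
    "type-Z": "GATCACTGGACA",
}

# A single ddA reaction reports only where the A bases fall.
observed = frozenset({5, 7, 9, 12})
print(genotype_by_a_track(observed, TOY_REFERENCES))
```

Because the three toy references differ in where their A bases fall, the A positions alone are enough to tell them apart, without determining the C, G and T positions.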
In order to detect the sequencing fragments, it is generally necessary to incorporate a detectable label into the fragments. Such labels can be, for example, radiolabels, chromophores or chromogenic labels, or fluorescent or fluorogenic labels. Preferred labels are fluorescent labels suitable for detection in existing DNA sequencing instrumentation, including fluorescein, Texas Red X, carboxy-X-rhodamine, carboxyfluorescein, carboxytetramethylrhodamine, carboxycyanine 5.0 (Cy5.0), and carboxycyanine 5.5 (Cy5.5).
The detectable label is preferably affixed to the sequencing primer of the invention (Seq. ID No. 3). Labels may also be affixed to the chain terminating nucleotide triphosphate or to bases which will be incorporated in the extending chain.