This invention relates to a diagnostic method for detecting genetic changes associated, inter alia, with the development of cancer. The method detects allelic imbalance (AI), such as loss of heterozygosity (LOH), in nucleic acid from an individual and can be applied to samples in which only a small proportion of the cells display AI, thereby allowing the early identification of mutations which lead to the progression of cancer. The invention also relates to amplification primers for use in the method and to diagnostic kits containing them.
The term "allelic imbalance" refers to the chromosomal loss or gain of a region of a chromosome when the partner chromosome (in a diploid cell) is unaltered. Allelic imbalance is typically found in tumour cells. Allelic imbalance may be due to selective loss of a region of DNA derived from a single chromosome and is referred to as loss of heterozygosity (LOH) when the partner chromosome is varied in some form, most commonly by microsatellite size. Other causes of AI include gene amplification (e.g. the myc oncogene may be amplified 5-10 fold), heteroplasmy or, at the transcript level, differential allelic expression (e.g. maternal allele vs paternal allele).
Heteroplasmy most commonly refers to imbalanced ratios of maternal mitochondrial DNA and may be associated with specific pathogenic effects. The phenotype is dependent not just on homo/heterozygosity, but on the relative frequencies of both "alleles". The inheritance of allele sequences does not occur in a Mendelian manner, and the ratio of maternal "mutant" and maternal "WT" alleles can vary between tissue/cell types. Heteroplasmy may be involved in a wide range of pathological conditions (e.g. mitochondrial, cardiac and encephalic myopathies) including a possible role in the pathology of Parkinson's and Alzheimer's disease (Suomalainen A, Annals of Medicine. 29:235-246, 1997).
Loss of a particular region of a chromosome is a frequent event in the development and progression of cancer. Different types of tumour have been found to exhibit loss of different chromosomal regions, strongly suggesting that the regions contain genes which are essential for the prevention of neoplasia. These genes, called tumour suppressor genes, function primarily in the regulation of the cell cycle. Inactivation of tumour suppressor genes is believed to be one of the earliest cellular events which lead to the development of cancer.
Identification of genetic changes involved in the development of cancer is a promising area for the early diagnosis of the disease, and improved techniques for detecting these genetic changes are required.
The loss of one variant form of the two copies of a given DNA, or potentially, transcribed RNA sequence, normally present in diploid cells is defined herein as loss of heterozygosity (LOH). LOH reflects part of the process which ultimately results in the inactivation of tumour suppressor genes. Somatic mutation, defined herein as acquired small DNA sequence changes such as point mutation, small insertions or deletions, frequently results in the inactivation of the remaining functional allele.
Detection of such alterations in nucleic acid sequence may be difficult for two reasons. Firstly, the affected cells may be very rare within the tissue sample of interest. Secondly, the exact position of the mutation within this sequence of interest may not easily be predicted, thus requiring detailed and time consuming analysis, a process incompatible with the development of a reliable and cost effective diagnostic test. Although several techniques have been used to detect LOH as a diagnostic indicator of cancer, available methods suffer from a number of disadvantages and generally lack sensitivity. As a consequence, there is a significant need for more sensitive methods for the identification of somatic nucleic acid sequence changes.
A fundamental requirement for any method to detect AI is a procedure to distinguish between the two alleles (chromosomal regions of interest) within a cell. Techniques to achieve this fall into two main categories, cytogenetic and molecular.
Cytogenetic techniques directly visualise, by microscopy based techniques, the loss of a chromosome or a chromosomal region within a cell. A normal cell should contain two copies of each autosome, one of maternal origin and one of paternal origin. The principal requirement to detect AI (such as LOH) cytogenetically is a probe which will specifically hybridise to the particular chromosome or region of interest. The identification of AI is achieved by counting the number of signals associated with each nucleus using microscopy.
In contrast, molecular techniques analyse a population of cells and require a probe or marker which differs between the maternal and paternal alleles. Traditionally, microsatellite markers have been used for this purpose. As long as a chromosome pair differs in the number of repeat units at the relevant microsatellite then loss of one of the chromosomes is indicated by loss of a microsatellite of the appropriate size.
With AI detection, in a situation where all of the cells in the sample to be analysed contain the alteration, the analysis is fairly straightforward, with the proviso that the region of interest is either large enough to allow cytogenetic analysis or appropriate sequence information is available for molecular analysis. Indeed, with substantially homogeneous populations of tumour cells, simple one round heteroduplex analysis has been proposed as a means of detecting LOH at tumour suppressor gene loci (Mansukhani et al., 1997, Diag. Mol. Pathol., 6, 229-237). However, AI analysis is much more difficult in mixed samples, i.e. those in which only a proportion of the cells comprising the sample display AI. For example, if only 5% of the cells in a population display LOH of, say, the maternal allele, then the ratio of paternal to maternal alleles in the sample would be 51:49. It is often the case that the clinical samples available for analysis in cancer diagnosis are not homogeneous and the proportion of cells containing any given mutation can vary from 100% to less than 1%.
Existing techniques are not well suited to LOH analysis in mixed populations. Although cytogenetic techniques have the potential to detect LOH in a small proportion of a sample, in practice they are technically difficult and not well developed for routine clinical use, and are limited to the detection of relatively large sequence changes. Currently available molecular techniques are more amenable to clinical application but they are not able to detect small changes in the relative level of maternal to paternal alleles within an under-represented population. Mansukhani et al. (1997, Diag. Mol. Pathol., 6, 229-237) developed a method based on the formation of heteroduplexes between PCR amplified alleles which could be used to identify recurrent mutations in the BRCA1 and BRCA2 genes. However, the analytical sensitivity of the method was very low, severely limiting its use in clinical diagnosis. Indeed, the authors report that use of heteroduplex analysis for detecting the loss of the remaining allele is only possible if the sample contains no more than 3-10% normal tissue.
Similarly, the sensitivity of other currently available molecular techniques (e.g. microsatellite LOH analysis) is such that when the rarer population, i.e. the population with LOH, is less than about 25% of the total sample, LOH is not easily detectable.
Denaturing high performance liquid chromatography (DHPLC) has been shown recently to be a useful method for detecting single nucleotide polymorphisms and inherited mutations by detecting heteroduplex DNA (Liu et al. Nucleic Acids Research. 26(6):1396-1400, 1998; O'Donovan et al. Genomics 52:44-49, 1998). U.S. Pat. No. 5,795,976 also discloses a method for separating heteroduplex and homoduplex molecules in a mixture using high performance liquid chromatography. Separation of heteroduplexes and homoduplexes by DHPLC is particularly useful in the method of the present invention.
The present invention is a novel molecular technique which can detect AI in nucleic acid from an individual by measuring the change in the ratio of allelic variants (of maternal or paternal origin for example) after one or more rounds of heteroduplex formation and removal. The method of the invention amplifies the initial difference between the levels of allelic variants, greatly increasing the sensitivity of the assay. The method can thus be used to detect allelic imbalance when only a small proportion of the cells have mutated, allowing early identification of the genetic changes which lead to the development of cancer.
Therefore in a first aspect of the invention we provide a method for detecting allelic imbalance (AI) in sample nucleic acid, which method comprises providing multiple copies of a target nucleic acid region present in the sample, the target nucleic acid region comprising a marker of heterozygosity, separating the multiple copies into individual strands then allowing the individual strands to reanneal under conditions which permit the formation of homoduplexes and heteroduplexes, removing any heteroduplexes so formed, subjecting the remaining homoduplexes to the above steps of separation, reannealing and heteroduplex removal one or more times so that any difference in the initial ratio of allelic variants is amplified, and detecting the presence or absence of AI by reference to any difference in allele ratio so detected.
The method of the invention enhances AI detection by amplifying the difference between the levels of variant forms derived from maternal and paternal chromosomes, or in the case of heteroplasmy the relative ratio of maternally inherited mitochondrial DNA, in the sample. This is achieved by first allowing molecules derived from each allelic variant to anneal to each other and then using a technique which will remove heteroduplexes formed from the annealing of the nucleic acid derived from both alleles. By repeating the process of heteroduplex formation and removal one or more times, the net effect is to steadily enrich (in relative not absolute levels) the sample for whichever allele is over-represented in the original nucleic acid population.
At the start of the process the levels of the two alleles may differ only slightly but, for example, after several rounds of enrichment the allele which was slightly over-represented at the start of the process may form greater than 95% of the total. By measuring the change in the ratio of the variant alleles during the enrichment procedure, the level of AI in the original sample can be determined.
Conveniently, the enrichment cycle of strand separation, reannealing and heteroduplex removal will be repeated 1-15 times. In a preferred aspect of the invention, the enrichment cycle will be repeated 2-10 times and in a most preferred aspect of the invention the enrichment cycle will be repeated 3-5 times.
The method of the invention uses any suitable difference between the variant alleles (i.e. between the maternal and paternal alleles) which can unambiguously differentiate between the two chromosomal regions but will still allow heteroduplexes to form between them. This difference in nucleotide sequence is often referred to as a marker of heterozygosity. Ideally, the sequence change is a polymorphism which may take the form of a single nucleotide difference or a small insertion or deletion, for example 1-10 nucleotides in length.
Allelic imbalance may be used as a marker for acquired DNA changes which underlie tumour formation. The method of the invention is therefore particularly useful in cancer management, including diagnosis, pre-symptomatic disease detection (screening), molecular staging and therapy monitoring.
A preferred target region of interest is the APC gene (adenomatous polyposis coli gene) located on chromosome 5q (5q21), a tumour suppressor gene which has been strongly implicated in the development of colorectal cancer. Other preferred regions of interest are the DCC gene (deleted in colorectal cancer gene) located on chromosome 18q; the tumour suppressor gene p53 located on chromosome 17p (17p13); the mannose 6-phosphate/insulin-like growth factor 2 receptor tumour suppressor gene located on chromosome 6q (6q26-27), (see Oates et al., Breast Cancer Res Treat. 47(3):269-81, 1998 and De Souza et al., Oncogene.10(9):1725-1729, 1995); and the tumour suppressor gene p16 located on chromosome 9p (9p21). Table 1 provides a non-comprehensive list of tumour suppressor genes, their chromosomal locations and types of tumours associated with AI to the genes. Mutations within these genes or at these chromosomal locations have been well documented. AI amongst these and other tumour suppressor genes can be detected using the method described herein.
A particularly preferred target region of interest is the region of human chromosome 10q bounded by DNA defined by the markers D10S541 and D10S215, which contains the tumour suppressor gene PTEN (PCT Application WO97/15686, Imperial Cancer Research Technology Ltd.).
As mentioned above, another cause of AI is amplification, particularly of oncogenes. Amplification represents one of the major molecular pathways through which the oncogenic potential of proto-oncogenes is activated during tumourigenesis (Schwab. BioEssays. 20:473-479, 1998). The following are examples of proto-oncogenes that are often amplified, resulting in AI, and thus (provided they contain a marker of heterozygosity) are detectable according to the method of this invention: MYC, ABL, RASK, RASW, MYB, ERBA, ERBB2 (also known as HER2 or NEU), MYCN and MYCL (see Schwab and Amler. Genes Chromosom. Cancer. 1:181-193, 1990; and Schwab. BioEssays. 20:473-479, 1998).
AI is detected in nucleic acid extracted from a clinical tissue or fluid specimen by measuring the change in the ratio of inherited alleles after one or more rounds of heteroduplex formation and removal. The invention therefore provides a method for detecting AI in nucleic acid extracted from a clinical sample, comprising providing multiple copies of a target nucleic acid region present in the sample, the target region comprising a marker of heterozygosity, and measuring the change in the ratio of allelic variants of said marker of heterozygosity after one or more rounds of heteroduplex formation and removal.
The sample nucleic acid from an individual may be either genomic DNA or cDNA generated from mRNA by reverse transcription. The sample nucleic acid is preferably one isolated from an animal, preferably a human tissue or fluid sample. Such a sample may conveniently be from a solid tissue, such as from a tumour or tumour margin, or other biopsy sample, or from a stool sample or bodily fluid sample (such as sputum, saliva, blood, semen, urine and the like). The sample may be fresh or one preserved by, for example, freezing, formalin, or other tissue fixation methods, and may then optionally be embedded in paraffin or the like. The sample nucleic acid may be one indirectly obtainable from the sample, i.e. one prepared by amplification (such as PCR) from the original sample.
Multiple copies of the target region of interest containing a marker of heterozygosity can be obtained by amplification or any convenient enrichment procedure.
Amplification of the target region of interest may be achieved using any convenient technique such as the polymerase chain reaction (PCR) (U.S. Pat. Nos. 4,683,195 and 4,683,202 Roche). The method of the invention can be used to separate and detect the existence of a single base mismatch in a DNA duplex containing up to about 2000 base pairs. The preferred length of target DNA region is between 30 and 1000 bp, more preferably between 50 and 500, most preferably between 100 and 150 bp. Because clinical tissue specimens such as paraffin embedded tissue biopsies are often found to be partially degraded, it can be technically difficult to amplify large fragments. Consequently, amplified PCR products from such samples should preferably be about 50-250 bp in size. Most preferably, the amplified PCR products should be about 100-150 bp in size. However, future development of improved clinical procedures for sampling and preserving tissue specimens may be expected to relax this size restriction and there is believed to be no theoretical limit to the preferred size range of the amplified products.
Other procedures which may conveniently be used to enrich the region of interest include magnetic Dynabeads (Dynal®, Norway) or DARAS® capture probe cartridges (Tepnel Life Sciences Ltd., UK).
With DHPLC separation of the molecular species (heteroduplex and homoduplex) no prior purification of the two molecular species would be required as unincorporated amplification reaction components (e.g. primers) would be eluted in different fractions. With other separation techniques, such as enzymatic, the amplified regions of interest may optionally be purified by any convenient method. A preferred method of purification is to degrade any remaining, unincorporated amplification primers by treatment with exonuclease I.
Duplexes are formed by separating and reannealing the amplified regions of interest. Typically, this will be achieved by heat denaturing a solution containing the amplified regions of interest, followed by cooling to allow the melted DNA strands to reanneal. However, it is not intended that the method of the invention should be restricted to thermal techniques, and any convenient method for duplex separation and formation may be used.
Heat denaturation is conveniently carried out by subjecting the nucleic acid sample to temperatures around 95° C., for example between 92° C. and 100° C., for a duration sufficient to ensure strand separation, nominally at least 1 minute, usually between 2 and 10 minutes. Annealing is generally carried out by allowing the temperature of the denatured solution to drop to 37° C. over a period of 1 to 2 min. A more gradual cooling rate of 1-4° C. per minute may be preferred. The optimum denaturing temperature and annealing rate will depend on the duplex composition and length. The optimum temperatures and times required to ensure denaturation and annealing can be determined by the person skilled in the art.
The method assumes that the relative number of heteroduplexes and homoduplexes formed will depend only on the relative frequency of the two alleles. In practice, it is possible that homoduplexes may form in preference to heteroduplexes because of the greater binding affinity of the perfectly matched sequences. The introduction of a thermostable region into the product so that any potential annealing bias due to the mismatch will be overpowered by the thermostable region should ensure random annealing of nucleic acid molecules. Therefore, in a preferred aspect of the invention the method will incorporate a technique to promote the formation of stable heteroduplexes. For example, a GC clamping sequence may be incorporated into the design of the PCR primers used to amplify the nucleic acid regions of interest. Also, allelic variants of minimal primary sequence difference may be preferential to large primary sequence differences. Single nucleotide polymorphisms are preferred as these will have the least influence on heteroduplex formation.
Heteroduplexes may be removed from the reaction mixture by any convenient method, for example, physical, enzymatic or chemical mismatch cleavage, or mismatch binding.
In a preferred aspect of the invention, heteroduplexes are removed by binding to prokaryotic or eukaryotic mismatch binding proteins. An example is MutS, a mismatch binding protein isolated from E. coli, which recognises regions of double-stranded DNA containing a single mismatched base pair (Wagner et al., 1995, Nucleic Acids Research, 22, 1541-1547). MutS is allowed to bind to the heteroduplexes and bound heteroduplex/MutS complexes are removed from the reaction mixture using, for example, powdered nitrocellulose. A convenient alternative is to use MutS conjugated to magnetic beads, allowing bound heteroduplexes to be removed from the reaction mixture with a magnet. MutS may also be conjugated to biotin and the bound heteroduplexes removed from the mixture using streptavidin coated beads.
In another preferred aspect of the invention mammalian or bacterial endonucleases are used to recognise and cleave the heteroduplexes at mismatched bases (see U.S. Pat. No. 5,824,4710). Examples of preferred enzymes include bacteriophage resolvases such as T4 endonuclease VII or T7 endonuclease I. In a particularly preferred aspect of the invention, thermostable cleavage enzymes would be used in order to avoid the necessity of adding fresh enzyme during each round of heteroduplex formation and removal.
The most preferred method of separating the heteroduplex and homoduplex molecules involves physical separation, such as that achieved by chromatography or electrophoresis. Suitable examples include denaturing high performance liquid chromatography (DHPLC) and chemical or temperature denaturing electrophoresis. Denaturing HPLC is a chromatographic technique capable of separating heteroduplex and homoduplex DNA molecules in a mixture. The mixture is applied to a stationary reverse-phase support and the homo and heteroduplex molecules are eluted (under thermal or chemical conditions capable of partially denaturing heteroduplexes) with a mobile phase containing an ion-pairing reagent (e.g. triethylammonium acetate; TEAA) and an organic solvent (e.g. acetonitrile; AcN). DHPLC can also allow the direct quantitation of relative homoduplex and heteroduplex concentrations by the detection of ultraviolet absorbance or fluorescent emission of/from the separated species. The area under the absorbance/emission peak is proportional to the amount of product, which therefore allows quantitative assessment of the relative proportions of each allele. DHPLC is described in Liu W et al. (Nucleic Acids Research. 26:1396-1400, 1998) and O'Donovan MC et al. (Genomics. 52:44-49, 1998).
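The peak-area quantitation described above can be illustrated with a minimal sketch. It assumes the chromatogram is available as paired lists of retention times and detector readings, that the peak windows are already known, and that the baseline is zero; the function names `peak_area` and `relative_proportions` and the synthetic trace are hypothetical and are not taken from any DHPLC instrument software.

```python
def peak_area(times, signal, start, end):
    """Trapezoidal area of the trace for samples lying within [start, end]."""
    area = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        if t0 >= start and t1 <= end:
            area += 0.5 * (signal[i - 1] + signal[i]) * (t1 - t0)
    return area


def relative_proportions(areas):
    """Convert named peak areas into fractions of the total area."""
    total = sum(areas.values())
    return {name: value / total for name, value in areas.items()}


# Synthetic trace (hypothetical data): two triangular peaks on a flat baseline.
times = [0, 1, 2, 3, 4, 5, 6]
signal = [0, 2, 0, 0, 6, 0, 0]

areas = {
    "aa": peak_area(times, signal, 0, 2),  # assumed window for the aa homoduplex
    "bb": peak_area(times, signal, 3, 5),  # assumed window for the bb homoduplex
}
print(relative_proportions(areas))
```

A real trace would normally need baseline correction and peak detection before integration; both are omitted here to keep the proportionality between peak area and product amount in view.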
A preferred method for use in the instant invention to separate heteroduplex and homoduplex molecules is as described in U.S. Pat. No. 5,795,976, incorporated herein by reference.
As mentioned above, certain physical separation techniques allow direct quantitation of the amounts of heteroduplexes and homoduplexes present after each round of separation, annealing and heteroduplex removal. The relative frequency of the aa and bb homoduplexes remaining in the mixture after heteroduplex removal can also be quantified using any convenient mutation quantification technique.
One method is to take samples of the solution containing the homoduplexes and measure the ratio of the two alleles by PCR/ELISA using allele specific PCR primers end-labelled with haptens such as digoxigenin and dinitrophenol.
In a preferred aspect of the invention, analysis of homoduplex ratio is carried out using real-time PCR, comprising a detection system such as Molecular Beacons (as described in WO95/13399) or Scorpions™ (as described in PCT/GB98/03521, Zeneca Ltd).
In a particularly preferred aspect of the invention, analysis of homoduplex ratio is carried out using real-time ARMS™ allele specific amplification (as described in EP-0332435, Zeneca Ltd).
For any given mixture of two distinct alleles a and b, the relative frequency of homoduplex (aa and bb) and heteroduplex (ab) formation after denaturing and reannealing is defined by the equation:
$a^2 + 2ab + b^2 = 1$
where
$a$ = frequency of allele a
$b$ = frequency of allele b
$a^2$ = frequency of the aa homoduplex
$b^2$ = frequency of the bb homoduplex
$2ab$ = frequency of the ab/ba heteroduplex
For example, if each allele is initially present at equal frequency, i.e. ratio of a:b is 0.5:0.5, then the relative frequency of homoduplexes and heteroduplexes after a single round of denaturing and reannealing is
$0.5^2 + 2(0.5 \times 0.5) + 0.5^2 = 1$
$0.25 + 0.5 + 0.25 = 1$
In other words, homoduplexes and heteroduplexes are formed in the ratio aa:ab:bb of 1:2:1.
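The duplex frequencies given by this equation can be computed for any starting mixture. The following is a minimal sketch, assuming random (unbiased) reannealing as the equation does; the function name `duplex_frequencies` is hypothetical.

```python
def duplex_frequencies(a, b):
    """Expected duplex fractions after one round of random reannealing.

    a and b are the relative amounts of the two alleles; they are
    normalised so that a + b = 1 before the equation is applied.
    """
    total = a + b
    a, b = a / total, b / total
    return {"aa": a * a, "ab": 2 * a * b, "bb": b * b}


print(duplex_frequencies(0.5, 0.5))
# {'aa': 0.25, 'ab': 0.5, 'bb': 0.25}
```

Passing unequal amounts, for example `duplex_frequencies(49, 51)`, shows the bb homoduplex slightly exceeding the aa homoduplex, which is the difference that repeated rounds of heteroduplex removal amplify.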
If alleles a and b are present at exactly equivalent frequencies the ratio of a to b will remain unchanged after any number of rounds of heteroduplex formation and removal. If however, one allele is under-represented, reflecting AI, such as LOH, then the rarer allele (a) will become increasingly less representative of the total DNA population in relation to the more frequent allele (b) with successive rounds of heteroduplex formation and removal. Therefore, the ratio of aa:bb will gradually decrease after each cycle of heteroduplex formation and removal (See Table 2 and graphical representation in FIG. 2).
The rate at which the ratio of aa:bb changes with successive rounds of heteroduplex formation and removal is governed by the initial aa:bb ratio in the original sample, so that even though the initial difference was too small to measure accurately, the amplified difference can be easily detected.
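The progressive enrichment can be simulated directly from the duplex equation. The sketch below is an idealised model that assumes random reannealing and complete removal of heteroduplexes in every round, so that after each round the fraction of allele a becomes a²/(a² + b²); the function name `enrich` is hypothetical.

```python
def enrich(a, b, rounds):
    """Fraction of allele a before enrichment and after each round.

    Idealised model: only aa and bb homoduplexes survive each round,
    in proportion a^2 : b^2.
    """
    a = a / (a + b)
    history = [a]
    for _ in range(rounds):
        aa = a * a
        bb = (1 - a) * (1 - a)
        a = aa / (aa + bb)
        history.append(a)
    return history


for n, frac in enumerate(enrich(0.49, 0.51, 6)):
    print(f"round {n}: a = {frac:.4f}, b = {1 - frac:.4f}")
```

Under these assumptions a starting ratio of 49:51 leaves the rarer allele below 10% of the total after six rounds, while an exactly balanced 50:50 mixture is unchanged however many rounds are run. Real yields will differ wherever annealing is biased or heteroduplex removal is incomplete.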
By plotting the rate of change of the aa:bb ratio and extrapolating backwards, the method provides an estimate of the level of AI in the original sample. In principle this technique should be able to detect AI when the difference in frequency of the two alleles is less than 1%.
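One way to picture the backward extrapolation is through the idealised model in which only homoduplexes survive each round. If r = a/b is the allele ratio, one round changes it to a²/b² = r², so after n rounds the ratio is r raised to the power 2ⁿ, and the starting ratio is recovered by taking the 2ⁿ-th root of the measured ratio. The sketch below assumes random annealing and complete heteroduplex removal; the function name `initial_ratio` is hypothetical and this is not presented as the extrapolation procedure actually used in practice.

```python
def initial_ratio(measured_ratio, rounds):
    """Estimate the starting a:b ratio from the ratio measured after
    a given number of idealised enrichment rounds (r_n = r_0 ** (2 ** n))."""
    return measured_ratio ** (1.0 / 2 ** rounds)


# Round trip: start at 49:51, apply four idealised rounds, then extrapolate back.
r0 = 49 / 51
r4 = r0 ** (2 ** 4)
print(f"measured after 4 rounds: {r4:.4f}")
print(f"estimated starting ratio: {initial_ratio(r4, 4):.4f}")
```

In this model the logarithm of the ratio doubles each round, so ratios measured at several rounds fall on a straight line when log|log r| is plotted against the round number, and the intercept gives the starting ratio.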
The following calculation demonstrates how the initial frequency of maternal and paternal alleles may be calculated for a clinical tissue specimen in which, for example, 5% of the cells display AI:
Consider a tissue sample in which 5% of the cells are tumour cells which display AI. The tissue sample will comprise 95% normal cells containing alleles a and b, and 5% tumour cells containing only allele b as a result of AI. Assume for the sake of argument, that the total number of cells present in the sample is 1000 cells. The normal cells will contain 950 a alleles and 950 b alleles and the tumour cells will contain 50 b alleles only. Thus the total number of b alleles in the sample is 1000 and the total number of a alleles in the sample is 950. The relative frequency of the two alleles (a:b) is thus 950:1000 which (after normalising by multiplying both sides by 1000/1950) is equivalent to a percentage frequency ratio of 49:51.
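The same arithmetic can be written as a short function for any tumour cell fraction. This sketch assumes, as in the example above, that each normal cell contributes one a allele and one b allele and that each tumour cell contributes a single b allele only; the function name `allele_frequencies` is hypothetical.

```python
def allele_frequencies(tumour_fraction, cells=1000):
    """Relative frequencies of alleles a and b in a mixed sample.

    Normal cells carry one a and one b allele; tumour cells are assumed
    to have lost allele a and to carry one b allele.
    """
    normal = cells * (1 - tumour_fraction)
    tumour = cells * tumour_fraction
    a_count = normal
    b_count = normal + tumour
    total = a_count + b_count
    return a_count / total, b_count / total


a, b = allele_frequencies(0.05)
print(f"a:b = {a:.1%} : {b:.1%}")
# a:b = 48.7% : 51.3%
```

Rounded to whole percentages this reproduces the 49:51 ratio of the worked example; the result does not depend on the assumed number of cells.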
The present method requires individuals to be heterozygous at a polymorphic marker within the region of interest. Individuals will therefore have to be typed for the existence of heterozygosity within the test genetic locus. Such typing could of course be done using DHPLC analysis, or any other convenient method. In order to identify markers of heterozygosity at a locus in the region suspected of AI, the target region of interest is analysed for allelic variation using a source of nucleic acid unaffected by AI. The source of nucleic acid is conveniently a blood sample, buccal swab or any other normal tissue obtained from an individual.
The target region of interest may optionally be amplified using any convenient technique, for example PCR.
The region of interest is then analysed for the presence of a marker of heterozygosity. A preferred method of analysis is to test the individual against a panel of markers of heterozygosity using the amplification refractory mutation system (ARMS). Other convenient methods include direct sequencing; cloning and sequencing; heteroduplex analysis methods such as denaturing gradient gel electrophoresis (DGGE) or DHPLC, and enzymatic or chemical mismatch cleavage; in situ hybridisation based methods such as FISH; comparative genome hybridisation (CGH); mini-sequencing; spectral karyotyping (SKY); and hybridisation based methods, including solid phase chip-based techniques. Each of these techniques is well known in the art. Many current methods for the detection of allelic variation (e.g. LOH) are reviewed by Nollau et al., Clin. Chem. 43, 1114-1120, 1997; and in standard textbooks, for example "Laboratory Protocols for Mutation Detection", Ed. by U. Landegren, Oxford University Press, 1996 and "PCR", 2nd Edition by Newton and Graham, BIOS Scientific Publishers Limited, 1997.
The invention has a significant number of uses. These include identification of novel gene sequences; delineation of the sequence of acquired mutations required for neoplasia, including identification of early genetic events associated with the initiation and progression of cancer; early identification of disease caused by acquired genetic change, optionally in association with an inherited variation; identification of changes in the expression of alleles in relation to disease states/therapeutic interventions; identification of novel therapeutic intervention points in a disease process; and identification of gene amplification.
The amplification primers and detection polynucleotides used in the method of the invention may be conveniently packaged with instructions and appropriate packaging and sold as a kit. The kit may also comprise suitable endonucleases (e.g. resolvases such as T4 endonuclease VII or T7 endonuclease I) or mismatch binding proteins (e.g. MutS), optionally conjugated to magnetic beads or other separable support, to facilitate enzymatic removal of heteroduplexes.