1. Field of the Invention
The present invention resides in the field of molecular genetics and diagnostics.
2. Description of Related Art
Virtually all substances introduced into the human body (xenobiotics) as well as most endogenous compounds (endobiotics) undergo some form of biotransformation in order to be eliminated from the body. Many enzymes contribute to the phase I and phase II metabolic pathways responsible for this bioprocessing. Phase I enzymes include reductases, oxidases and hydrolases. Among the phase I enzymes are the cytochromes P450, a superfamily of hemoproteins involved in the oxidative metabolism of steroids, fatty acids, prostaglandins, leukotrienes, biogenic amines, pheromones, plant metabolites and chemical carcinogens as well as a large number of important drugs (Heim and Meyer, Genomics 14, 49-58 (1992)). Phase II enzymes are primarily transferases responsible for transferring glucuronic acid, sulfate or glutathione to compounds already processed by phase I enzymes (Gonzales and Idle, Clin. Pharmacokinet. 26, 59-70 (1994)). Phase II enzymes include epoxide hydrolase, catalase, glutathione peroxidase, superoxide dismutase and glutathione S-transferase.
Many drugs are metabolized by biotransformation enzymes. For some drugs, metabolism occurs after the drug has exerted its desired effect, and results in detoxification of the drug and its elimination from the body. Similarly, the biotransformation enzymes also have roles in detoxifying harmful environmental compounds. For other drugs, metabolism is required to convert the drug to an active state before the drug can exert its desired effect.
Genetic polymorphisms of cytochromes P450 and other biotransformation enzymes result in phenotypically distinct subpopulations that differ in their ability to perform biotransformations of particular drugs and other chemical compounds. These phenotypic distinctions have important implications for selection of drugs. For example, a drug that is safe when administered to most humans may cause intolerable side-effects in an individual suffering from a defect in an enzyme required for detoxification of the drug.
Alternatively, a drug that is effective in most humans may be ineffective in a particular subpopulation because of lack of an enzyme required for conversion of the drug to a metabolically active form. Further, individuals lacking a biotransformation enzyme are often susceptible to cancers from environmental chemicals due to inability to detoxify the chemicals. Eichelbaum et al., Toxicology Letters 64/65, 155-122 (1992). Accordingly, it is important to identify individuals who are deficient in a particular P450 enzyme, so that drugs known or suspected of being metabolized by the enzyme are not used, or are used only with special precautions (e.g., reduced dosage, close monitoring), in such individuals. Identification of such individuals is also important so that they can be subjected to regular monitoring for the onset of cancers.
Existing methods of identifying deficiencies are not entirely satisfactory. Patient metabolic profiles are currently assessed with a bioassay after administration of a probe drug. For example, a poor drug metabolizer with a CYP2D6 defect is identified by administering one of the probe drugs, debrisoquine, sparteine or dextromethorphan, then testing urine for the ratio of unmodified to modified drug. Poor metabolizers (PM) exhibit physiologic accumulation of unmodified drug and have a high metabolic ratio of probe drug to metabolite. This bioassay has a number of limitations: lack of patient cooperation, adverse reactions to probe drugs, and inaccuracy due to coadministration of other pharmacological agents or disease effects. Genetic assays by RFLP (restriction fragment length polymorphism), ASO PCR (allele specific oligonucleotide hybridization to PCR products or PCR using mutant/wildtype specific oligo primers), SSCP (single stranded conformation polymorphism), TGGE/DGGE (temperature or denaturing gradient gel electrophoresis) and MDE (mutation detection electrophoresis) are time-consuming, technically demanding and limited in the number of gene mutation sites that can be tested at one time.
The difficulties inherent in previous methods are overcome by the use of DNA chips to analyze mutations in biotransformation genes. The development of VLSIPS™ technology has provided methods for making very large arrays of oligonucleotide probes in very small areas. See U.S. Pat. No. 5,143,854, WO 90/15070 and WO 92/10092, each of which is incorporated herein by reference.
Microfabricated arrays of large numbers of oligonucleotide probes, called "DNA chips," offer great promise for a wide variety of applications. The present application describes the use of such chips for, inter alia, analysis of polymorphisms and copy number variations in genes of interest, particularly biotransformation genes, such as cytochromes P450.
The invention provides methods for determining the copy number of a gene present in an individual. In such methods, a plurality of polymorphic sites from an individual are analyzed and the number of different polymorphic forms present at each site is thereby determined. Gene copy number is then assigned as the highest number of polymorphic forms present at a single site. Typically, the polymorphisms are in the gene whose copy number is being determined or in flanking sequences, although the polymorphisms can be present elsewhere provided they are on the same chromosome as the gene whose copy number is being determined. To illustrate, if a single polymorphic form is present at each of the plurality of sites, the copy number of the gene is assigned as 1. If two polymorphic forms are present at one site and a single polymorphic form is present at each other of the plurality of sites, the copy number of the gene is assigned as 2. If three polymorphic forms are present at a first polymorphic site, a single polymorphic form is present at a second polymorphic site and two polymorphic forms are present at a third polymorphic site, the copy number of the gene is assigned as 3.
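The assignment rule just described can be illustrated with a minimal Python sketch. The function name `assign_copy_number`, the site labels and the nucleotide letters are hypothetical and chosen only for illustration; the sketch assumes the polymorphic forms present at each site have already been determined.

```python
def assign_copy_number(forms_by_site):
    """Assign gene copy number as the highest number of distinct
    polymorphic forms observed at any single polymorphic site.

    forms_by_site: dict mapping a site label to the polymorphic forms
    observed at that site (hypothetical input format).
    """
    if not forms_by_site:
        raise ValueError("at least one polymorphic site is required")
    return max(len(set(forms)) for forms in forms_by_site.values())


# The three illustrations given in the text, with made-up site labels.
one_copy = {"site1": ["A"], "site2": ["C"], "site3": ["G"]}
two_copies = {"site1": ["A", "G"], "site2": ["C"], "site3": ["G"]}
three_copies = {"site1": ["A", "G", "T"], "site2": ["C"], "site3": ["G", "T"]}

for example in (one_copy, two_copies, three_copies):
    print(assign_copy_number(example))
```

Because the rule takes a maximum over sites, a single site showing three forms is enough to assign a copy number of 3 even when other sites show fewer forms.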
Often some or all of the polymorphisms analyzed are silent polymorphisms. Such silent polymorphisms can be present in a noncoding segment of the gene, such as an intronic segment, or in sequences flanking the gene. The more polymorphisms analyzed, the more likely one is to obtain an accurate result. Typically, analysis of about 10 or 50 polymorphisms is sufficient. Nucleic acids for analysis are typically prepared by obtaining a tissue sample from the individual containing the gene and amplifying the gene or a fragment thereof.
Polymorphisms are typically analyzed using probe arrays. Such analysis can be performed by contacting a nucleic acid comprising the gene or a fragment thereof with an array of oligonucleotides, the array comprising a plurality of subarrays, each subarray spanning a polymorphic site and being complementary to at least one polymorphic form of the gene at the site. Hybridization intensities of the nucleic acid to the oligonucleotides in the array are then detected. The pattern of hybridization indicates the number of polymorphic forms present at each polymorphic site. In some methods, subarrays are subdivided into probe groups, with different probe groups comprising probes complementary to different polymorphic forms at a site. In some methods, probe groups are subdivided into two or more probe sets. A first probe set comprises a plurality of probes spanning a polymorphic site of the gene, each probe comprising a segment of at least six nucleotides exactly complementary to a polymorphic form of the gene at the site, the segment including at least one interrogation position complementary to a corresponding nucleotide in the polymorphic form. A second probe set comprises a corresponding probe for each probe in the first probe set, the corresponding probe in the second probe set being identical to a sequence comprising the corresponding probe from the first probe set or a subsequence of at least six nucleotides thereof that includes the at least one interrogation position, except that the at least one interrogation position is occupied by a different nucleotide in each of the two corresponding probes from the first and second probe sets. In some methods, third and fourth probe sets are also present.
In such methods, the second, third and fourth probe sets each comprise a corresponding probe for each probe in the first probe set, the probes in the second, third and fourth probe sets being identical to a sequence comprising the corresponding probe from the first probe set or a subsequence of at least six nucleotides thereof that includes the at least one interrogation position, except that the at least one interrogation position is occupied by a different nucleotide in each of the four corresponding probes from the four probe sets.
Often, the methods also analyze a phenotype-determining polymorphic site in the same gene as the polymorphisms used to determine copy number, to determine which polymorphic form(s) are present at the site. This information can be used to diagnose a phenotype of the patient based on the polymorphic form(s) present at the phenotype-determining polymorphic site.
In some methods, analysis of polymorphisms for determination of copy number and analysis of phenotype-determining polymorphisms are performed using the same probe array. Such methods entail hybridizing a sample comprising a target nucleic acid comprising one or more alleles of the gene to an array of oligonucleotide probes immobilized on a solid support. Such an array comprises a first probe set comprising a plurality of probes, each probe comprising a segment of at least six nucleotides exactly complementary to a reference form of the gene, the segment including at least one interrogation position complementary to a corresponding nucleotide in the reference form of the gene, the reference form of the gene having a silent polymorphic site and a site of potential mutation associated with a phenotypic change. Such an array also contains a second, and often third and fourth, probe sets. The second, third and fourth probe sets each comprise a corresponding probe for each probe in the first probe set, the probes in the second, third and fourth probe sets being identical to a sequence comprising the corresponding probe from the first probe set or a subsequence of at least six nucleotides thereof that includes the at least one interrogation position, except that the at least one interrogation position is occupied by a different nucleotide in each of the four corresponding probes from the four probe sets. The method entails determining which probes, relative to one another, bind to the target nucleic acid, whereby the relative binding of probes having an interrogation position aligned with the silent polymorphism indicates the number of different alleles of the gene in the sample and the relative binding of probes having an interrogation position aligned with the mutation indicates whether the mutation is present in at least one of the alleles.
The invention further provides arrays of probes immobilized on a solid support for analyzing biotransformation genes. In a first embodiment, the invention provides a tiling strategy employing an array of immobilized oligonucleotide probes comprising at least two sets of probes. A first probe set comprises a plurality of probes, each probe comprising a segment of at least three nucleotides exactly complementary to a subsequence of a reference sequence from a biotransformation gene, the segment including at least one interrogation position complementary to a corresponding nucleotide in the reference sequence. A second probe set comprises a corresponding probe for each probe in the first probe set, the corresponding probe in the second probe set being identical to a sequence comprising the corresponding probe from the first probe set or a subsequence of at least three nucleotides thereof that includes the at least one interrogation position, except that the at least one interrogation position is occupied by a different nucleotide in each of the two corresponding probes from the first and second probe sets. The probes in the first probe set have at least two interrogation positions corresponding to two contiguous nucleotides in the reference sequence. One interrogation position corresponds to one of the contiguous nucleotides, and the other interrogation position to the other. In this, and other forms of array, biotransformation genes of particular interest for analysis include cytochromes P450, particularly 2D6 and 2C19, N-acetyl transferase II, glucose 6-phosphate dehydrogenase, pseudocholinesterase, catechol-O-methyl transferase, and dihydropyrimidine dehydrogenase.
In a second embodiment, the invention provides a tiling strategy employing an array comprising four probe sets. A first probe set comprises a plurality of probes, each probe comprising a segment of at least three nucleotides exactly complementary to a subsequence of a reference sequence from a biotransformation gene, the segment including at least one interrogation position complementary to a corresponding nucleotide in the reference sequence. Second, third and fourth probe sets each comprise a corresponding probe for each probe in the first probe set. The probes in the second, third and fourth probe sets are identical to a sequence comprising the corresponding probe from the first probe set or a subsequence of at least three nucleotides thereof that includes the at least one interrogation position, except that the at least one interrogation position is occupied by a different nucleotide in each of the four corresponding probes from the four probe sets.
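A four-probe-set tiling of this kind can be sketched in Python as follows. This is only an illustrative sketch: the function name `tile_four_sets`, the probe length of 5, the central interrogation position and the toy reference sequence are assumptions, and probes are written position-aligned to the reference (not reversed) for readability.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq):
    """Position-by-position complement (no reversal, for readability)."""
    return "".join(COMPLEMENT[base] for base in seq)


def tile_four_sets(reference, probe_len=5, interrogation_offset=2):
    """For each window of the reference, build four corresponding probes.

    One probe is exactly complementary to the window (first probe set);
    the other three are identical except at the interrogation position,
    which is occupied by each of the remaining three nucleotides.
    """
    columns = []
    for start in range(len(reference) - probe_len + 1):
        matched = complement(reference[start:start + probe_len])
        probes = {
            base: matched[:interrogation_offset] + base
            + matched[interrogation_offset + 1:]
            for base in "ACGT"
        }
        columns.append({
            "ref_position": start + interrogation_offset,
            "matched_base": matched[interrogation_offset],
            "probes": probes,
        })
    return columns


columns = tile_four_sets("ACGTACG")  # toy reference, not a real gene sequence
for column in columns:
    print(column["ref_position"], column["matched_base"], column["probes"])
```

Each entry of `columns` holds the four corresponding probes for one interrogated nucleotide of the reference, keyed by the nucleotide occupying the interrogation position.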
In a third embodiment, the invention provides arrays comprising first and second groups of probe sets, each group comprising first, second and optionally, third and fourth probe sets as defined above. The first probe sets in the first and second groups are designed to be exactly complementary to first and second reference sequences. For example, the first reference sequence can include a site of mutation rendering the gene nonfunctional, and the second reference sequence can include a site of a silent polymorphism.
In a fourth embodiment, the invention provides a block of oligonucleotide probes (sometimes referred to as an optiblock) immobilized on a support. The array comprises a perfectly matched probe comprising a segment of at least three nucleotides exactly complementary to a subsequence of a reference sequence from a biotransformation gene, the segment having a plurality of interrogation positions respectively corresponding to a plurality of nucleotides in the reference sequence. For each interrogation position, the array further comprises three mismatched probes, each identical to a sequence comprising the perfectly matched probe or a subsequence of at least three nucleotides thereof including the plurality of interrogation positions, except in the interrogation position, which is occupied by a different nucleotide in each of the three mismatched probes and the perfectly matched probe.
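The composition of such a block can be sketched in Python under stated assumptions. The function name `build_optiblock`, the toy subsequence and the chosen interrogation positions are hypothetical, and probes are again written position-aligned to the reference.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def build_optiblock(subsequence, interrogation_positions):
    """Build one perfectly matched probe plus three mismatched probes
    for each interrogation position within that probe."""
    matched = "".join(COMPLEMENT[base] for base in subsequence)
    mismatches = {}
    for pos in interrogation_positions:
        mismatches[pos] = [
            matched[:pos] + base + matched[pos + 1:]
            for base in "ACGT"
            if base != matched[pos]
        ]
    return {"perfect_match": matched, "mismatches": mismatches}


block = build_optiblock("ACGTA", [1, 2, 3])  # toy subsequence
print(block["perfect_match"])
for pos, probes in block["mismatches"].items():
    print(pos, probes)
```

In this sketch a single perfectly matched probe is shared by all interrogation positions, so a block with n interrogation positions contains 3n + 1 probes.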
In a fifth embodiment (sometimes referred to as deletion tiling), the invention provides an array comprising at least four probes. A first probe comprises first and second segments, each of at least three nucleotides and exactly complementary to first and second subsequences of a reference sequence from a biotransformation gene, the segments including at least one interrogation position corresponding to a nucleotide in the reference sequence, wherein either (1) the first and second subsequences are noncontiguous, or (2) the first and second subsequences are contiguous and the first and second segments are inverted relative to the complement of the first and second subsequences in the reference sequence. The array further comprises second, third and fourth probes, identical to a sequence comprising the first probe or a subsequence thereof comprising at least three nucleotides from each of the first and second segments, except in the at least one interrogation position, which differs in each of the probes.
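Option (1) of this deletion tiling, in which the two segments are complementary to noncontiguous subsequences, can be sketched in Python as follows. The function name `deletion_probes`, its parameters and the toy reference are hypothetical; the inverted-segment option (2) is not covered by this sketch.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def deletion_probes(reference, first_start, second_start, segment_len,
                    interrogation_offset):
    """Build four probes whose two segments are complementary to
    noncontiguous subsequences of the reference.

    The probes are identical except at interrogation_offset (an index
    into the joined probe), where each carries a different nucleotide.
    """
    if segment_len < 3:
        raise ValueError("each segment needs at least three nucleotides")
    if second_start <= first_start + segment_len:
        raise ValueError("subsequences must be noncontiguous")
    joined = (reference[first_start:first_start + segment_len]
              + reference[second_start:second_start + segment_len])
    matched = "".join(COMPLEMENT[base] for base in joined)
    return {
        base: matched[:interrogation_offset] + base
        + matched[interrogation_offset + 1:]
        for base in "ACGT"
    }


# Toy reference; the probe skips the six nucleotides between the segments.
probes = deletion_probes("AAACCCGGGTTT", first_start=0, second_start=9,
                         segment_len=3, interrogation_offset=2)
print(probes)
```

A probe built this way is exactly complementary only to a target in which the intervening nucleotides are absent, which is the situation a deletion produces.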
In a sixth embodiment, the invention provides a method of comparing a target nucleic acid with a reference sequence from a biotransformation gene. The method comprises hybridizing a sample comprising the target nucleic acid to one of the arrays of oligonucleotide probes described above. The method then determines which probes, relative to one another, specifically bind to the target nucleic acid, the relative specific binding of corresponding probes indicating whether a nucleotide in the target sequence is the same as or different from the corresponding nucleotide in the reference sequence.
For example, for the array of the second embodiment which has four probe sets, the array can be analyzed by comparing the relative specific binding of four corresponding probes from the first, second, third and fourth probe sets, assigning a nucleotide in the target sequence as the complement of the interrogation position of the probe having the greatest specific binding, and repeating these steps until each nucleotide of interest in the target sequence has been assigned.
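The assignment steps just described can be sketched in Python. The function name `call_bases`, the input format and the intensity values are assumptions made for illustration; each dictionary holds the measured binding of the four corresponding probes for one nucleotide of interest, keyed by the nucleotide at the probe's interrogation position.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_bases(intensities_by_position):
    """Assign each target nucleotide as the complement of the
    interrogation-position nucleotide of the brightest probe.

    Ties are resolved in favor of the first probe encountered, which is
    an arbitrary choice made only for this sketch.
    """
    calls = []
    for intensities in intensities_by_position:
        brightest = max(intensities, key=intensities.get)
        calls.append(COMPLEMENT[brightest])
    return "".join(calls)


# Made-up hybridization intensities for two nucleotides of interest.
measurements = [
    {"A": 0.1, "C": 0.9, "G": 0.2, "T": 0.1},
    {"A": 0.8, "C": 0.1, "G": 0.1, "T": 0.2},
]
print(call_bases(measurements))
```

Repeating the comparison for each group of four corresponding probes yields the assigned target sequence one nucleotide at a time.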
In some methods, the reference sequence includes a site of a mutation in the biotransformation gene and a silent polymorphism in or flanking the biotransformation gene, and the target nucleic acid comprises one or more different alleles of the biotransformation gene. In this situation, the relative specific binding of probes having an interrogation position aligned with the silent polymorphism indicates the number of different alleles and the relative specific binding of probes having an interrogation position aligned with the mutation indicates whether the mutation is present in at least one of the alleles.
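One way to read out both pieces of information from relative binding is sketched below in Python. The threshold of half the strongest signal, the function names `forms_present` and `interpret_sites`, and the intensity values are all assumptions introduced for illustration; the text does not specify how relative binding is scored.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def forms_present(intensities, relative_threshold=0.5):
    """Return target nucleotides judged present at one site.

    A probe counts as bound if its intensity is at least
    relative_threshold times the strongest intensity at that site
    (a hypothetical scoring rule).
    """
    top = max(intensities.values())
    if top <= 0:
        return set()
    return {COMPLEMENT[base] for base, value in intensities.items()
            if value >= relative_threshold * top}


def interpret_sites(silent_site, mutation_site, mutant_base):
    """Count distinct alleles from the silent polymorphism and report
    whether the mutant nucleotide is present in at least one allele."""
    allele_count = len(forms_present(silent_site))
    mutation_found = mutant_base in forms_present(mutation_site)
    return allele_count, mutation_found


# Made-up intensities, keyed by the probe's interrogation nucleotide.
silent = {"A": 0.90, "C": 0.05, "G": 0.85, "T": 0.04}
mutation = {"A": 0.10, "C": 0.80, "G": 0.10, "T": 0.70}
print(interpret_sites(silent, mutation, mutant_base="A"))
```

Here two probes bind strongly at the silent site, indicating two different alleles, and the probe complementary to the mutant nucleotide binds at the mutation site, indicating that at least one allele carries the mutation.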