Molecular genetic mechanisms responsible for the development and progression of many cancers remain largely unknown. Identification of sites of frequent and recurring allelic deletion or gain is a first step toward identifying some of the important genes involved in the malignant process. Previous studies in retinoblastoma (Friend, et al., Nature, 323:643-6 (1986)) and other cancers (Cawthon, et al., Cell, 62:193-201 (1990); Baker, et al., Science, 244:217-21 (1989); Shuin, et al., Cancer Res, 54:2832-5 (1994)) have amply demonstrated that regional chromosomal deletions occurring in the genomes of human tumors can serve as useful diagnostic markers for disease, and that defining them is an important initial step towards identification of critical genes. Similarly, regions of common chromosomal gain have been associated with amplification of specific genes (Visakorpi, et al., Nature Genetics, 9:401-6 (1995)).
Comparative genomic hybridization (CGH) is a relatively new molecular technique used to screen DNA from tumors for regional chromosomal alterations (Kallioniemi, et al., Science, 258:818-21 (1992) and WO 93/18186). Unlike microsatellite or Southern analysis allelotyping studies, which typically sample far less than 0.1% of the total genome, a significant advantage of CGH is that all chromosome arms are scanned for losses and gains. Moreover, because CGH does not rely on naturally occurring polymorphisms, all regions are informative, whereas polymorphism-based techniques are limited by homozygous (uninformative) alleles among a fraction of tumors studied at every locus.
Increases in copy number in the long arm of chromosome 3, in particular 3q25-3qter, have been associated with cancer. Increases in copy number in this area have been seen not only in ovarian tumors (Iwabuchi et al., Cancer Research 55:6172-6180 (1995)) but also in brain tumors, head and neck cancer, lung cancer, ductal breast cancer, renal cell and other urinary tract cancers, and cervical cancer. See Ried et al., Genes Chromosomes Cancer 15:234-45 (1996); Yeatman et al., Clin Exp Metastasis 14:246-52 (1996); Brzoska et al., Cancer Res 15:3055-9 (1995); Ried et al., Cancer Res 54:1801-6 (1994); and Speicher et al., Cancer Res 55:1010-3 (1995).
The identification of narrower regions of genetic alteration or genes associated with cancers such as ovarian cancer would be extremely useful in the early diagnosis or prognosis of these diseases. The present invention addresses these and other needs.
The present invention provides compositions and methods for detecting genetic alterations correlated with cancer. The invention can be used to detect alterations in a 2 MB region at 3q26.3 that are associated with a number of cancers. Examples include ovarian cancer, brain cancer, lung cancer, head and neck tumors, renal cell and other urinary tumors, cervical cancer, and ductal breast cancer. The invention is particularly useful for detecting alterations associated with ovarian cancer.
The methods comprise contacting a nucleic acid sample from a patient with a probe which binds selectively to a target nucleic acid sequence on 3q26.3 correlated with cancer. In particular, the invention provides sequences from genes encoding the catalytic subunit of phosphatidylinositol kinase type 3 (PIK3CA) or the glucose transporter, GLUT2. The probes of the invention are contacted with the sample under conditions in which the probe binds selectively with the target nucleic acid sequence to form a hybridization complex. The formation of the hybridization complex is then detected. Typically, the number of regions of hybridization is counted. Abnormalities are detected as increases above normal in the number of regions of hybridization. In some embodiments, the methods of the invention further comprise detection of amplifications at 19q13.1-13.2. This region includes AKT2, a putative oncogene.
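The counting step described above can be illustrated with a minimal sketch. It assumes hypothetical per-nucleus counts of hybridization signals and takes two copies per nucleus as the normal value for an autosomal locus; the function name and the example data are illustrative only and are not part of the described methods.

```python
def fraction_with_gain(signal_counts, normal_copies=2):
    """Return the fraction of nuclei showing more signals than the normal copy number.

    signal_counts: hybridization signals counted in each nucleus (hypothetical data).
    normal_copies: expected signals per normal diploid nucleus (assumed to be 2).
    """
    if not signal_counts:
        raise ValueError("at least one nucleus must be scored")
    gained = sum(1 for count in signal_counts if count > normal_copies)
    return gained / len(signal_counts)


# Hypothetical signal counts from ten interphase nuclei.
counts = [2, 2, 4, 5, 2, 3, 2, 6, 2, 4]
print(fraction_with_gain(counts))  # 0.5
```

In practice the fraction of nuclei required to call an abnormality would depend on the assay and its controls; no cutoff is implied here.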
Alternatively, sample DNA from the patient can be fluorescently labeled and competitively hybridized against fluorescently labeled normal DNA to normal lymphocyte metaphases or to arrays of nucleic acid molecules which map to 3q26.3. Alterations in DNA copy number in the sample DNA are then detected as increases in sample DNA as compared to normal DNA at the 3q26.3 region.
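The comparison of sample DNA to normal DNA described above can be sketched as a ratio calculation. The sketch below assumes hypothetical fluorescence intensities per locus, normalizes the sample-to-normal ratios so that their median is 1.0, and flags loci above an assumed threshold of 1.25. The locus names, intensities, threshold, and function names are illustrative assumptions, not values taken from this disclosure.

```python
from statistics import median


def copy_number_ratios(sample, reference):
    """Return sample/reference intensity ratios, scaled so the median ratio is 1.0."""
    raw = {locus: sample[locus] / reference[locus] for locus in sample}
    center = median(raw.values())
    return {locus: ratio / center for locus, ratio in raw.items()}


def gained_loci(ratios, threshold=1.25):
    """Return loci whose normalized ratio exceeds the (assumed) gain threshold."""
    return sorted(locus for locus, ratio in ratios.items() if ratio > threshold)


# Hypothetical intensities for sample (tumor) DNA and normal DNA.
sample = {"1p36": 100.0, "3q26.3": 240.0, "8q24": 110.0, "17p13": 95.0, "19q13": 105.0}
reference = {"1p36": 100.0, "3q26.3": 100.0, "8q24": 100.0, "17p13": 100.0, "19q13": 100.0}

ratios = copy_number_ratios(sample, reference)
print(gained_loci(ratios))  # ['3q26.3']
```

Median normalization is one simple way to compensate for differences in overall labeling efficiency between the two DNAs; other normalization schemes could be substituted.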
Definitions
A “nucleic acid sample” as used herein refers to a sample comprising DNA in a form suitable for hybridization to the probes of the invention. The nucleic acid may be total genomic DNA, total mRNA, genomic DNA or mRNA from particular chromosomes, or selected sequences (e.g. particular promoters, genes, amplification or restriction fragments, cDNA, etc.) within particular cancer-associated amplifications. The nucleic acid sample may be extracted from particular cells or tissues. The tissue sample from which the nucleic acid sample is prepared is typically taken from a patient suspected of having the disease associated with the amplification being detected. The sample may be prepared such that individual nucleic acids remain substantially intact and typically comprises interphase nuclei prepared according to standard techniques. A “nucleic acid sample” as used herein may also refer to a substantially intact condensed chromosome (e.g. a metaphase chromosome). Such a condensed chromosome is suitable for use as a hybridization target in in situ hybridization techniques (e.g. FISH). The particular usage of the term “nucleic acid sample” (whether as extracted nucleic acid or intact metaphase chromosome) will be readily apparent to one of skill in the art from the context in which the term is used. For instance, the nucleic acid sample can be a tissue or cell sample prepared for standard in situ hybridization methods described below. The sample is prepared such that individual chromosomes remain substantially intact and typically comprises metaphase spreads or interphase nuclei prepared according to standard techniques.
The sample may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose) for use in Southern or dot blot hybridizations and the like. In some embodiments, the probe may be a member of an array of nucleic acids as described, for instance, in WO 96/17958. In some cases, the nucleic acids may be amplified using standard techniques such as PCR, prior to the hybridization. The sample is typically taken from a patient suspected of having cancer associated with the abnormality being detected.
“Nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.
“Subsequence” refers to a sequence of nucleic acids that comprises a part of a longer sequence of nucleic acids.
A “probe” or a “nucleic acid probe”, as used herein, is defined to be a collection of one or more nucleic acid fragments whose hybridization to a target can be detected. The probe is typically labeled as described below so that its binding to the target can be detected. In some embodiments, the sample comprising the target nucleic acid is labeled and the probe is not labeled, for instance, when the probes are prepared as an array of nucleic acids which selectively bind a number of desired target sequences.
The probe is produced from a source of nucleic acids from one or more particular (preselected) portions of the genome, for example one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The probes of the present invention are produced from nucleic acids found in the regions of genetic alteration as described herein. The probe may be processed in some manner, for example, by blocking or removal of repetitive nucleic acids or enrichment with unique nucleic acids. Thus the word “probe” may be used herein to refer not only to the detectable nucleic acids, but to the detectable nucleic acids in the form in which they are applied to the target, for example, with the blocking nucleic acids, etc. The blocking nucleic acid may also be referred to separately. What “probe” refers to specifically is clear from the context in which the word is used.
“Hybridizing” refers to the binding of two single stranded nucleic acids via complementary base pairing.
“Bind(s) substantially” or “binds specifically” or “binds selectively” or “hybridizing specifically to” refers to complementary hybridization between a probe and a target sequence and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence. These terms also refer to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) of DNA or RNA. The term “stringent conditions” refers to conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. Typically, stringent conditions will be those in which the salt concentration is at least about 0.02 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 60° C. Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
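The relationship between Tm and the hybridization temperature can be illustrated with a short sketch. The Tm estimate used here is the Wallace rule of thumb for short oligonucleotides (2° C. per A or T, 4° C. per G or C), which is an assumption introduced for illustration and is not a method described in this disclosure; the rule does not account for ionic strength, pH, probe concentration, or formamide. The probe sequence and function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate (degrees C) for a short oligonucleotide: 2*(A+T) + 4*(G+C)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm_celsius, offset=5.0):
    """Hybridization temperature set about `offset` degrees C below the Tm."""
    return tm_celsius - offset


probe = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer probe
tm = wallace_tm(probe)
print(tm)                         # 60.0
print(stringent_temperature(tm))  # 55.0
```

For longer probes, such as the clone-derived probes discussed herein, an empirically determined Tm or a salt- and formamide-adjusted formula would be more appropriate than this rule of thumb.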
One of skill will recognize that the precise sequence of the particular probes described herein can be modified to a certain degree to produce probes that are “substantially identical” to the disclosed probes, but retain the ability to bind substantially to the target sequences. Such modifications are specifically covered by reference to the individual probes herein. The term “substantial identity” of nucleic acid sequences means that a nucleic acid comprises a sequence that has at least 90% sequence identity, more preferably at least 95%, compared to a reference sequence using the methods described below using standard parameters.
Two nucleic acid sequences are said to be “identical” if the sequence of nucleotides in the two sequences is the same when aligned for maximum correspondence as described below. The term “complementary to” is used herein to mean that the complementary sequence is identical to all or a portion of a reference nucleic acid sequence.
Sequence comparisons between two (or more) nucleic acids are typically performed by comparing the sequences over a “comparison window” to identify and compare local regions of sequence similarity. A “comparison window”, as used herein, refers to a segment of at least about 20 contiguous positions, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988), or by computerized implementations of these algorithms.
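As an illustration of one of the alignment approaches named above, the following is a minimal sketch of a Needleman and Wunsch style global alignment using dynamic programming. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name are assumptions chosen for illustration; they are not the "standard parameters" of any particular software package, and established implementations should be preferred for real comparisons.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


print(needleman_wunsch("ACGT", "AGT"))  # ('ACGT', 'A-GT', 2)
```

When several alignments share the best score, this traceback returns only one of them, preferring a diagonal (aligned pair) move over a gap.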
“Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the nucleic acid sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
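The calculation described above can be sketched as follows. The sketch assumes the two sequences have already been optimally aligned, that gaps are written as "-", and that the whole aligned length is treated as the comparison window; the function names and example sequences are hypothetical. The 90% cutoff is the lower bound given herein for substantial identity.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of window positions at which both aligned sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matched = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matched / len(aligned_a)


def is_substantially_identical(aligned_a, aligned_b, cutoff=90.0):
    """True when the percentage of sequence identity is at least the cutoff."""
    return percent_identity(aligned_a, aligned_b) >= cutoff


# Hypothetical 20-position window with one mismatch: 19 matched positions of 20.
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTACGAACGTACGT"
print(percent_identity(reference, candidate))            # 95.0
print(is_substantially_identical(reference, candidate))  # True
print(percent_identity("ACGT", "A-GT"))                  # 75.0
```

Gap positions count toward the total number of positions in the window but never as matches, so insertions and deletions lower the reported percentage.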
Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to the same sequence under stringent conditions. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions will be those as described above.
As used herein, an “antibody” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes or fragments of immunoglobulin genes. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively.
The phrase “specifically binds to a protein” or “specifically immunoreactive with”, when referring to an antibody, refers to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to a protein under such conditions may require an antibody that is selected for its specificity for a particular protein. For example, antibodies can be raised to the particular proteins disclosed here. Such antibodies will bind the proteins and not any other proteins present in a biological sample. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.