This invention relates to molecular biology, genetic diagnostics and array, or “biochip,” technology. In particular, the invention provides computer systems, computer program products and in silico, array-based methods for determining the relative amount of biological molecules (e.g., nucleic acid sequences) in two or more samples. The invention also provides novel arrays comprising immobilized calibration molecules (e.g., nucleic acids) for normalizing the results of array-based binding assays (e.g., hybridization reactions).
Comparative genomic hybridization (CGH) was first developed for genome-wide analysis of DNA sequence copy number in a single experiment; see, e.g., Pinkel (1998) Nat. Genet. 20:207-211. Genomic DNA microarray-based CGH has the potential to overcome many of the limitations of the traditional CGH method, which relies on comparative hybridization to individual metaphase chromosomes. CGH can be used to determine the relative copy number of nucleic acid sequences between two samples. CGH can also be used to precisely map chromosomal abnormalities associated with disease.
In metaphase CGH, multi-megabase fragments of different samples of genomic DNA are labeled and hybridized to a fixed chromosome. See, e.g., Breen (1999) J. Med. Genetics 36:511-517; Rice (2000) Pediatric Hematol. Oncol. 17:141-147. The CGH can compare known or normal DNA to a test sample, e.g., DNA from a possible tumor cell. Signal differences between known and test samples are detected and measured. In this way, missing, amplified, or unique sequences in the test sample, as compared to “normal,” can be detected by the fluorescence ratio of normal control to test genomic DNA. In metaphase CGH, the target sites (on the fixed chromosome) are saturated by an excess amount of soluble, labeled genomic DNA.
In contrast to metaphase CGH, where the immobilized genomic DNA is a metaphase spread, in array-based CGH methods the immobilized nucleic acids are arranged as an array on, e.g., a biochip or a microarray platform. See, e.g., U.S. Pat. No. 5,830,645. Another difference is that in array-based CGH the immobilized genomic DNA is in molar excess as compared to the copy number of labeled (test and control) genomic nucleic acid. In array-based CGH, test and control, or “normal,” nucleic acids are mixed together and applied to the array, or “biochip.” In traditional CGH, because test and control nucleic acids are mixed together before their application to the array, they must be differentially labeled. Both the mixing together of samples and the use of different labels in the samples to be compared can result in artifacts and erroneous results.
The invention provides computer systems, computer program products and methods, including computer-implemented methods, for an array-based determination of the relative amount of biological molecules, e.g., nucleic acid sequences or polypeptides, in two or more samples.
The invention provides an in silico, array-based method for determining the relative amount of a biological molecule, e.g., a nucleic acid sequence, in two or more samples, the method comprising: (a) providing a first array comprising a plurality of nucleic acid segments, wherein each nucleic acid segment is immobilized to a discrete and known spot on a first substrate surface to form a first array of nucleic acid segments; (b) providing (at least) a second array comprising a plurality of nucleic acid segments, wherein each nucleic acid segment is immobilized to a discrete and known spot on a second substrate surface to form a second array of nucleic acid segments, and the nucleic acid segments immobilized on the second array comprise substantially the same plurality of nucleic acid segments arrayed in step (a); (c) providing a first sample comprising a plurality of nucleic acid sequences comprising a detectable label; (d) providing a second sample comprising a plurality of nucleic acid sequences comprising a detectable label; (e) contacting the first sample of step (c) with the first array of step (a) under conditions wherein the labeled nucleic acid can specifically hybridize to a nucleic acid segment immobilized on the first array; (f) contacting the second sample of step (d) with the second array of step (b) under the same conditions as in step (e), thereby allowing the labeled nucleic acid to specifically hybridize to a nucleic acid segment immobilized on the second array; (g) identifying which spots on the first and the second substrate surfaces are specifically hybridized to a labeled nucleic acid segment and measuring the amount of label on each spot; and, (h) comparing the amount of labeled nucleic acid sequence bound by specific hybridization to a nucleic acid segment immobilized on the first array to the amount of labeled nucleic acid sequence bound by specific hybridization to the same nucleic acid segment immobilized on the second array, thereby determining the relative amount of a nucleic acid sequence complementary to the same nucleic acid segment in the first sample compared to the second sample.
In alternative aspects, the biological molecule comprises a nucleic acid, e.g., an oligonucleotide, a lipid, a polysaccharide, a polypeptide (e.g., a peptide), or an analog or a mimetic thereof, or a combination thereof. The nucleic acid can comprise a DNA (e.g., a genomic DNA or a cDNA), an RNA (e.g., an mRNA, rRNA, and the like) or an analog or a mimetic thereof or a combination thereof. The nucleic acid can further comprise a telomeric structure or a chromatin structure. Analogs and mimetics can include small molecules, as discussed below. In alternative aspects, the array-immobilized nucleic acid segments comprise cloned genomic nucleic acid, cDNA, synthetic nucleic acid, and the like. The array-immobilized genomic nucleic acid can comprise a substantially complete chromosome or a known subset of a chromosome. The genomic nucleic acid can comprise a substantially complete genome or a known subset of a genome.
The array-immobilized nucleic acid can be derived from the transcripts of or from the genome of any cell, for example, a genotypically and/or phenotypically normal cell. The nucleic acid can be derived from a genome of a mammalian cell, such as a human cell.
As noted above, in the methods a plurality of biological molecules (e.g., nucleic acids) in at least two samples are labeled, e.g., sample nucleic acids comprise a detectable label. In one aspect, a plurality of labeled nucleic acid sequences comprises sequences the same as or complementary to a subset or to substantially all of the transcripts expressed by a cell. The plurality of labeled nucleic acid sequences in one, several or all of the samples can comprise genomic sequences. In one aspect, the plurality of labeled nucleic acid sequences comprises a substantially complete chromosome or a known subset of a chromosome. The plurality of labeled nucleic acid sequences can comprise a substantially complete genome or a known subset of a genome. In one aspect, the array-immobilized nucleic acid comprises a substantially complete genome or a known subset of a genome and the plurality of labeled nucleic acid sequences from each sample comprises substantially all of the genome or a known subset of the genome, such that the method performs a comparative genomic hybridization (CGH).
In one aspect, biological molecules (e.g., nucleic acids) from the first sample are derived from a cell with a normal genotype and the nucleic acid from the second sample is derived from a cell with an abnormal genotype. Alternatively, the biological molecules (e.g., nucleic acids) from the first sample can be derived from a cell with a normal phenotype and the biological molecules (e.g., nucleic acids) from the second sample can be derived from a cell with an abnormal phenotype. The abnormal phenotype can comprise a disease phenotype or a neoplastic or hyperplastic phenotype. The neoplastic phenotype can be any cancer or neoplastic or hyperplastic condition, e.g., breast cancer, skin cancer, bone cancer.
In one aspect, the biological molecules (e.g., nucleic acids) from the first sample are derived from an unstimulated cell and the biological molecules (e.g., nucleic acids) from the second sample are derived from the unstimulated cell after stimulation. The biological molecules (e.g., nucleic acids) from the first sample can be derived from an undifferentiated cell and the biological molecules (e.g., nucleic acids) from the second sample can be derived from the undifferentiated cell after stimulation. The biological molecules (e.g., nucleic acids) from the first sample can be derived from a normal cell and the biological molecules (e.g., nucleic acids) from the second sample can be derived from the normal cell after an injury. The biological molecules (e.g., nucleic acids) from the first sample can be derived from a normal cell and the biological molecules (e.g., nucleic acids) from the second sample can be derived from the normal cell after an environmental stress. The environmental stress can comprise a high temperature, a low temperature or a change in temperature. The environmental stress can comprise an exposure to a chemical, such as a carcinogen, a drug or a medicine.
In alternative aspects, the nucleic acid comprises a DNA, including a genomic DNA, cDNA, expressed sequence tags (EST), analogs or mimetics thereof, synthetic DNA and the like. The nucleic acid can comprise an RNA (e.g., an mRNA, rRNA, and the like) or an analog or a mimetic thereof or a combination thereof. In one aspect, an immobilized nucleic acid segment comprises nucleic acid, e.g., genomic DNA, cloned in a construct comprising an artificial chromosome. The artificial chromosome can comprise a bacterial artificial chromosome (BAC), a human artificial chromosome (HAC), a yeast artificial chromosome (YAC), a transformation-competent artificial chromosome (TAC) or a bacteriophage P1-derived artificial chromosome (PAC). The array-immobilized nucleic acid segment can be cloned in a construct comprising a vector, such as a cosmid vector, a plasmid vector, a phage or a viral vector. The array-immobilized nucleic acid segment can be between about 50 kilobases (0.05 megabase) and about 500 kilobases (0.5 megabase) in length, between about 100 kilobases (0.1 megabase) and about 400 kilobases (0.4 megabase) in length, or about 300 kilobases (0.3 megabase) in length.
In alternative aspects, labeled biological molecules (e.g., nucleic acids) are derived from a body fluid sample, a cell sample or a tissue sample. The labeled biological molecules (e.g., nucleic acids) can be derived from a cancer cell or a tumor cell sample. In alternative aspects, a labeled biological molecule (e.g., nucleic acid) in one sample is derived from a biopsy sample, a blood sample, a urine sample, a saliva sample or a CSF sample.
In one aspect, the method further comprises a washing step. In the washing step biological molecules (e.g., nucleic acids) not specifically bound (e.g., hybridized) to array-immobilized biological molecules (e.g., nucleic acids) are removed before the identifying step (g). The washing step can comprise use of a solution comprising a salt concentration of about 0.02 molar at pH 7 at a temperature of at least about 50° C. The washing step can comprise use of a solution comprising a salt concentration of about 0.15 M at a temperature of at least about 72° C. for about 15 minutes. The washing step can comprise use of a solution comprising a salt concentration of about 0.2×SSC at a temperature of at least about 50° C. for at least about 15 minutes.
In one aspect, all sample biological molecules (e.g., nucleic acids) comprise the same label. For example, the first sample biological molecules (e.g., nucleic acids) and the second sample biological molecules (e.g., nucleic acids) can comprise the same label. Alternatively, first sample biological molecules (e.g., nucleic acids) and second sample (and third sample, etc.) biological molecules (e.g., nucleic acids) can comprise different labels. The sample biological molecules (e.g., nucleic acids) can be labeled with any detectable label, e.g., a fluorochrome, a chemiluminescent label, and equivalents. In alternative aspects, the detectable label comprises a fluorescent label, such as a Cy5™ or equivalent, a Cy3™ or equivalent; and, a rhodamine, a fluorescein or an aryl-substituted 4,4-difluoro-4-bora-3a,4a-diaza-s-indacene dye or equivalents.
The methods of the invention can further comprise providing a third (or fourth, or fifth, etc.) array comprising a plurality of biological molecules (e.g., nucleic acids), wherein each biological molecule (e.g., nucleic acid) is immobilized to a discrete and known spot on a third (or fourth, or fifth, etc.) substrate surface to form a third (or fourth, or fifth, etc.) array of biological molecules (e.g., nucleic acids), and the biological molecules (e.g., nucleic acids) immobilized on the third (or fourth, or fifth, etc.) array comprise substantially the same plurality of biological molecules (e.g., nucleic acids) arrayed in step (a), and a third (or fourth, or fifth, etc.) sample comprising a plurality of biological molecules (e.g., nucleic acids) comprising a detectable label; contacting the third sample with the third array under the same conditions as in step (e), thereby allowing the labeled biological molecules (e.g., nucleic acids) to specifically bind (e.g., hybridize) to a biological molecule (e.g., nucleic acid) immobilized on the third (or fourth, or fifth, etc.) array; identifying which spots on the first, second and third (or fourth, or fifth, etc.) substrate surfaces are specifically bound (e.g., hybridized) to a labeled biological molecule (e.g., nucleic acid) and measuring the amount of label on each spot; and, comparing the amount of labeled biological molecules (e.g., nucleic acids) bound by specific binding (e.g., hybridization) to the same biological molecule (e.g., nucleic acid) immobilized on the first array, the second array and the third array (or fourth, or fifth, etc. arrays), thereby determining the relative amount of a biological molecule (e.g., nucleic acid) in the first, second and third (or fourth, or fifth, etc.) samples. In one aspect, the third, fourth, fifth, etc. arrays are contacted with duplicates or triplicates, etc., of biological molecule (e.g., nucleic acid) samples.
The methods can further comprise the step of blocking the ability of repetitive nucleic acid sequences to hybridize (i.e., blocking “hybridization capacity”) in the immobilized nucleic acid segments. The methods can also further comprise the step of blocking the hybridization capacity of repetitive nucleic acid sequences in the sample nucleic acid sequences by mixing the sample nucleic acid sequences with unlabeled (or alternatively labeled) repetitive nucleic acid sequences. In one aspect, the sample nucleic acid sequences are first mixed with repetitive nucleic acid sequences before the step comprising contacting with the array-immobilized nucleic acid segments. The repetitive nucleic acid sequences can be unlabeled. The repetitive nucleic acid sequences can comprise Cot-1 DNA or equivalent, SST sequences or equivalent, or salmon sperm DNA or equivalent, or a combination thereof.
The invention provides an in silico, array-based method for performing comparative genomic hybridization (CGH), the method comprising: (a) providing a first array comprising a plurality of genomic nucleic acid segments, wherein each nucleic acid segment is immobilized to a discrete and known spot on a first substrate surface to form a first array of genomic nucleic acid segments and the plurality of genomic nucleic acid segments comprise a substantially complete genome or a known subset of a genome; (b) providing a second array comprising substantially the same plurality of genomic nucleic acid segments arrayed in step (a), wherein each nucleic acid segment is immobilized to a discrete and known spot on a second substrate surface to form a second array of genomic nucleic acid segments; (c) providing a first sample comprising a plurality of genomic nucleic acid sequences comprising a detectable label; (d) providing a second sample comprising a plurality of genomic nucleic acid sequences comprising a detectable label; (e) contacting the first sample of step (c) with the first array of step (a) under conditions wherein the labeled nucleic acid can specifically hybridize to the nucleic acid segments immobilized on the first array; (f) contacting the second sample of step (d) with the second array of step (b) under the same conditions as in step (e), thereby allowing the labeled nucleic acid to specifically hybridize to the nucleic acid segments immobilized on the second array; (g) identifying which spots on the first and the second substrate surfaces are specifically hybridized to a labeled nucleic acid segment and measuring the amount of label on each spot; and, (h) comparing the amount of labeled nucleic acid sequence bound by specific hybridization to an immobilized nucleic acid in the first array to the amount of labeled nucleic acid sequence bound by specific hybridization to the same (i.e., the equivalent) immobilized nucleic acid in the second array, thereby determining the relative amount of a nucleic acid sequence complementary to the nucleic acid in the first sample compared to the second sample and performing a comparative genomic hybridization.
The invention provides an in silico, array-based method of determining one or more variations in copy numbers of biological molecules in a first sample relative to copy numbers of substantially identical biological molecules in at least a second sample, the method comprising the steps of: (a) providing a first array and at least a second array, each comprising a plurality of immobilized biological molecules, wherein the biological molecules are immobilized to discrete and known spots on a substrate surface to form at least two arrays of biological molecules, and the second array comprises substantially the same plurality of biological molecules immobilized in the first array; (b) providing at least two samples comprising biological molecules and labeling biological molecules from each sample, wherein biological molecules in all samples comprise the same label or biological molecules in the first sample comprise a different label than biological molecules in the second sample; (c) contacting the first sample of labeled biological molecules to the first array and the second sample of labeled biological molecules to the second array under conditions wherein the labeled sample biological molecules can specifically bind to the immobilized biological molecules; and (d) detecting the amount of label associated with each spot and comparing the amount of label associated with an immobilized biological molecule in the first array to the amount of label associated with the same immobilized biological molecule in the second array, thereby determining the amount of immobilized biological molecule in the first sample relative to the second sample.
The invention provides an in silico, array-based method of determining one or more variations in copy numbers of unique nucleic acid sequences in a first sample relative to copy numbers of substantially identical sequences in at least a second sample, the method comprising the steps of: (a) providing a first array and at least a second array, each comprising a plurality of immobilized nucleic acids, wherein the nucleic acids are immobilized to discrete and known spots on a substrate surface to form at least two arrays of nucleic acid segments, and the second array comprises substantially the same plurality of nucleic acid segments immobilized in the first array; (b) providing at least two nucleic acid samples and labeling the nucleic acid from each sample, wherein nucleic acids in all samples comprise the same label or the nucleic acid in the first sample comprises a different label than the nucleic acid in the second sample; (c) contacting the first sample of labeled nucleic acid to the first array and the second sample of labeled nucleic acid to the second array under conditions wherein the labeled nucleic acids can specifically hybridize to the nucleic acids immobilized on the arrays; and, (d) detecting the amount of label associated with each spot and comparing the amount of label associated with a nucleic acid sequence in the first array to the amount of label associated with the same nucleic acid sequence in the second array, thereby determining one or more variations in copy numbers of unique nucleic acid sequences in a first sample relative to copy numbers of substantially identical (e.g., complementary) sequences in at least a second sample.
The method of the invention can further comprise determining the ratio of the amount of label associated with a biological molecule (e.g., a nucleic acid sequence) in the first and the second arrays, thereby determining a ratio of signal intensity.
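The ratio determination described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `signal_ratios`, the spot identifiers and the intensity values are hypothetical, and the log2 value is a common reporting convenience that is not taken from the text.

```python
import math


def signal_ratios(first_array, second_array):
    """Return {spot_id: (ratio, log2_ratio)} for spots measured on both arrays.

    Each argument maps a spot identifier to the amount of label measured on
    that spot (hypothetical data layout). Spots that are missing from either
    array, or that carry a non-positive signal, are skipped.
    """
    ratios = {}
    for spot_id, first_signal in first_array.items():
        second_signal = second_array.get(spot_id)
        if second_signal is None or first_signal <= 0 or second_signal <= 0:
            continue
        ratio = first_signal / second_signal
        ratios[spot_id] = (ratio, math.log2(ratio))
    return ratios


# Hypothetical label intensities for the same three spots on two arrays.
first = {"clone_A": 1200.0, "clone_B": 800.0, "clone_C": 450.0}
second = {"clone_A": 600.0, "clone_B": 800.0, "clone_C": 900.0}
result = signal_ratios(first, second)
```

In this sketch a ratio above 1 suggests more label bound on the first array than on the second for that spot, and a ratio below 1 suggests less.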
The methods of the invention can further comprise determining the amount of a calibration molecule, wherein a known amount of a calibration molecule is spotted on each array. In one aspect, the methods comprise determining the average copy number of a calibration sequence, wherein a known amount of calibration sequence is mixed with the first and the second samples, and the calibration sequence is substantially the same as a unique sequence in an immobilized nucleic acid sequence present in both arrays. A known amount of a calibration molecule-binding composition can be mixed with the first and the second samples. Each array can comprise a calibration spot, wherein the calibration spot comprises a biological molecule from each spot on an array.
The method can further comprise determining the average copy number of a calibration sequence, wherein a known amount of a calibration sequence is spotted on each array, and (i) a known amount of a calibration sequence is mixed with the first and the second samples and the calibration sequence is derived from a different source from that from which the sample nucleic acids were derived, or, (ii) the calibration sequences spotted on the array comprise at least one sequence of a nucleic acid from each of the array spots.
The method can further comprise determining whether the expected ratio of the known amount of calibration sequence is detected on the two arrays, and, if the expected ratio is not detected, determining a correction factor. The method can further comprise normalizing the ratio of the amount of label associated with the nucleic acid sequence in the first and the second array by adjusting the ratio by a figure representing the difference between the expected calibration sequence ratio and the detected calibration sequence ratio on the (at least) two arrays. The methods of the invention can further comprise determining (and, in a computer-implemented method, outputting or displaying) calibration, or normalization, curves based on binding to “calibration” or “control” spots, as discussed in detail below.
In one aspect, the calibration molecule is spotted in titrated concentrations on each of the arrays. The methods can further comprise determining whether the expected ratio of the known amount of calibration molecule is detected on the two arrays. The methods can further comprise normalizing the ratio of the amount of label associated with a biological molecule in the first and second arrays by adjusting the ratio by a figure representing the difference between the expected ratio of calibration molecules and the detected ratio of calibration molecules on the two arrays.
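One way to picture the normalization described above is the following Python sketch. It assumes, as one possible reading of "adjusting the ratio by a figure representing the difference," a multiplicative correction factor equal to the expected calibration ratio divided by the detected calibration ratio. The function names and all numeric values are hypothetical.

```python
def correction_factor(cal_first, cal_second, expected_ratio=1.0):
    """Compare the detected calibration ratio with the expected one.

    cal_first and cal_second are the label amounts read from the calibration
    spot on the first and second arrays. With the same known amount spotted
    on each array, the expected ratio is assumed to be 1.0.
    """
    detected_ratio = cal_first / cal_second
    return expected_ratio / detected_ratio


def normalize_ratio(raw_ratio, factor):
    """Adjust a test-spot ratio by the calibration-derived correction factor."""
    return raw_ratio * factor


# Hypothetical readings: the first array reads twice as bright on calibration.
factor = correction_factor(cal_first=2000.0, cal_second=1000.0)
normalized = normalize_ratio(raw_ratio=3.0, factor=factor)
```

Here the calibration spots indicate that the first array reads twice as bright as expected, so a raw test-spot ratio of 3.0 is scaled down by half.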
The invention provides a kit comprising the following components: (a) (i) at least two arrays, each comprising a plurality of biological molecules, wherein each biological molecule is immobilized to a discrete and known spot on a substrate surface to form an array, or (ii) a biochip comprising a first substrate surface comprising a first array and a second substrate surface comprising a second array, wherein the first and second arrays are separated by a hydrophobic barrier such that a first sample can be applied to the first array at the same time a second sample is applied to the second array without the two samples mixing together; and, (b) instructions for using the arrays comprising a method of the invention.
The invention provides a kit comprising the following components: (a) at least two arrays, each comprising a plurality of cloned genomic nucleic acid segments, wherein each genomic nucleic acid segment is immobilized to a discrete and known spot on a substrate surface to form an array and the cloned genomic nucleic acid segments comprise a substantially complete genome or a known subset of a genome; and, (b) instructions for using the array comprising a method of the invention. The kit can further comprise materials to prepare a sample comprising a nucleic acid (e.g., a genomic DNA) for application to the array. This can include, e.g., instructions and compositions to fragment/cut and/or label the nucleic acid. The kit can further comprise a sample of wild type, or normal, nucleic acid. The wild type, or normal, nucleic acid can comprise a label. The wild type, or normal, nucleic acid of the kit can comprise a human wild type genomic nucleic acid. The kit can include an array comprising a G-CHIP™, a mouse BAC array or a human BAC array.
The invention provides a computer program product in a computer readable medium for determining the relative amount of a biological molecule (e.g., a nucleic acid) in two or more samples comprising: a computer useable medium comprising a computer readable program code embodied therein, wherein the computer program product is capable of determining the relative amount of a biological molecule (e.g., a nucleic acid) in two or more samples by a process comprising the following steps: (a) collecting data comprising which spots on a first array substrate surface and at least a second array substrate surface are specifically bound (e.g., hybridized) to a labeled biological molecule (e.g., a nucleic acid) and the amount of labeled biological molecule (e.g., nucleic acid) on each spot, wherein the data is generated by a method of the invention; and, (b) comparing the amount of labeled biological molecule (e.g., nucleic acid) bound (e.g., by specific hybridization) to a biological molecule (e.g., nucleic acid) immobilized on the first array to the amount of labeled biological molecule (e.g., nucleic acid) bound (e.g., by specific hybridization) to the same biological molecule (e.g., nucleic acid) immobilized on the second array by comparing the data collected in step (a), thereby determining the relative amount of a biological molecule (e.g., nucleic acid) in the first sample compared to the second sample.
The invention provides a computer-implemented method for determining the relative amount of a biological molecule (e.g., nucleic acid) in two or more samples comprising the following steps: (a) identifying which spots on a first array substrate surface and at least a second array substrate surface are specifically bound (e.g. hybridized) to a labeled biological molecule (e.g., nucleic acid) and the amount of labeled biological molecule (e.g., nucleic acid) on each spot, wherein the data is generated by a method of the invention, and communicating this data to a computer program product; (b) comparing the amount of labeled biological molecule (e.g., nucleic acid) bound (e.g., by specific hybridization) to a biological molecule (e.g., nucleic acid) immobilized on the first array to the amount of labeled biological molecule (e.g., nucleic acid) bound (e.g., by specific hybridization) to the same biological molecule (e.g., nucleic acid) immobilized on the second array by comparing the data communicated in step (a) and using a computer program product of the invention, thereby determining the relative amount of a biological molecule (e.g., nucleic acid) in the first sample compared to the second sample.
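The two steps of the computer-implemented method, (a) identifying bound spots and their label amounts and (b) comparing equivalent spots across arrays, might be sketched in Python as follows. This is an assumption-laden illustration, not the implementation of the invention: the `SpotReading` record, the `min_signal` detection cutoff and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotReading:
    """One measurement: which array, which spot, and how much label was read."""
    array_id: str
    spot_id: str
    signal: float


def collect_bound_spots(readings, min_signal):
    """Step (a): keep spots whose signal exceeds min_signal, grouped by array.

    min_signal is a hypothetical cutoff standing in for whatever criterion
    decides that a spot is specifically bound.
    """
    by_array = {}
    for reading in readings:
        if reading.signal > min_signal:
            by_array.setdefault(reading.array_id, {})[reading.spot_id] = reading.signal
    return by_array


def compare_arrays(by_array, first_id, second_id):
    """Step (b): first/second signal ratio for spots bound on both arrays."""
    first = by_array.get(first_id, {})
    second = by_array.get(second_id, {})
    return {spot: first[spot] / second[spot] for spot in first if spot in second}


readings = [
    SpotReading("array_1", "spot_01", 900.0),
    SpotReading("array_2", "spot_01", 300.0),
    SpotReading("array_1", "spot_02", 40.0),   # below the cutoff, treated as unbound
    SpotReading("array_2", "spot_02", 500.0),
    SpotReading("array_1", "spot_03", 250.0),
    SpotReading("array_2", "spot_03", 500.0),
]
bound = collect_bound_spots(readings, min_signal=50.0)
relative = compare_arrays(bound, "array_1", "array_2")
```

Keeping collection and comparison as separate functions mirrors the separation between steps (a) and (b) in the text, so the same collected data could be compared across more than two arrays.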
The invention provides a computer system, comprising: (a) a processor; and, (b) a computer program product of the invention.
The invention provides an array for determining the relative amount of a biological molecule (e.g., a nucleic acid) in a sample, comprising a plurality of biological molecules immobilized to a plurality of discrete and known spots on a substrate surface to form an array of biological molecules, wherein the array of spots comprises a plurality of test spots (i.e., for binding, e.g., by hybridization, to molecules in a sample) and at least one calibration spot, and the calibration spot comprises at least one copy of a sequence from each test spot on the array. In one aspect, the calibration spot comprises an equimolar mixture of all the biological molecules spotted on the array. The array can further comprise at least a second calibration spot. The additional calibration spots can comprise at least one copy of a sequence from each test spot on the array. In one aspect, additional calibration spots comprise an equimolar dilution of (or increase in) the mixture of biological molecules spotted on a first calibration spot. In one aspect, the array comprises a plurality of calibration spots. Each calibration spot can represent a different equimolar dilution of the mixture of biological molecules spotted on the array. As discussed in detail below, the “control spots” or “calibration spots” are used for “normalization” of data generated in one or more arrays, e.g., in the in silico array-based methods of the invention. Control spots can provide a consistent result independent of the labeled sample bound, e.g., hybridized, to the array. The control spots can be used to generate a “normalization” or “calibration” curve to offset possible intensity errors between the two or more arrays.
In one aspect of the methods of the invention, “calibration” curves are generated using arrays comprising a plurality of “control spots.” The computer-implemented methods, computer program products and computer systems of the invention can calculate and display calibration/normalization curves from binding (e.g., hybridization) data read from control spots from two or more arrays.
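As a rough illustration of how a calibration curve could be computed from control spots on two arrays, the Python sketch below fits a straight line by ordinary least squares to paired control-spot readings and uses it to rescale a second-array signal onto the first array's intensity scale. The linear model, the function names and the readings are assumptions made for illustration; the text does not specify the form of the curve.

```python
def fit_calibration_line(second_signals, first_signals):
    """Least-squares fit of first_signal ~ slope * second_signal + intercept.

    The inputs are paired readings of the same calibration spots (for example
    a dilution series) taken from the second and first arrays.
    """
    n = len(second_signals)
    mean_x = sum(second_signals) / n
    mean_y = sum(first_signals) / n
    sxx = sum((x - mean_x) ** 2 for x in second_signals)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(second_signals, first_signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rescale(second_signal, slope, intercept):
    """Map a second-array signal onto the first array's intensity scale."""
    return slope * second_signal + intercept


# Hypothetical readings from four calibration spots in a dilution series.
cal_second = [100.0, 200.0, 400.0, 800.0]
cal_first = [150.0, 300.0, 600.0, 1200.0]
slope, intercept = fit_calibration_line(cal_second, cal_first)
rescaled = rescale(500.0, slope, intercept)
```

With these made-up readings the first array reads consistently brighter than the second, and the fitted line captures that offset so that test-spot signals can be compared on a common scale.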
In one aspect, the array of the invention comprises a first substrate surface comprising a first array and a second substrate surface comprising a second array, wherein the first and second arrays are separated by a hydrophobic barrier such that a first sample can be applied to the first array at the same time a second sample is applied to the second array without the two samples mixing together, and the first and the second arrays comprise the same calibration spots.
In one aspect, the biological molecule comprises a nucleic acid, such as a DNA (e.g., cDNA or a genomic DNA) or an RNA (e.g., mRNA). Alternatively, the biological molecule can comprise a polypeptide, a peptide, a lipid or a polysaccharide.
The invention provides a multiplexed system for performing comparative genomic hybridization (CGH) using an array comprising: (a) an array comprising (i) a plurality of biological molecules immobilized to a plurality of discrete and known spots on a substrate surface to form an array of biological molecules, wherein the array of spots comprises a plurality of test spots and at least one calibration spot, and the calibration spot comprises at least one copy of a sequence from each test spot on the array, or (ii) a first substrate surface comprising a first array and a second substrate surface comprising a second array, wherein the first and second arrays are separated by a hydrophobic barrier such that a first sample can be applied to the first array at the same time a second sample is applied to the second array without the two samples mixing together, and the first and the second arrays comprise the same calibration spots; and, (b) a device for detecting a detectable label, wherein the device can measure which detectable labels are on which spots on the substrate surface.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
All publications, GenBank Accession references (sequences), ATCC Deposits, patents and patent applications cited herein are hereby expressly incorporated by reference for all purposes.