The invention relates in general to the reproducible mass production of nucleic acid arrays.
Arrays of nucleic acid molecules are of enormous utility in facilitating methods aimed at genomic characterization (such as polymorphism analysis and high-throughput sequencing techniques), screening of clinical patients or entire pedigrees for the risk of genetic disease, elucidation of protein/DNA- or protein/protein interactions or the assay of candidate pharmaceutical compounds for efficacy; however, such arrays are both labor-intensive and costly to produce by conventional methods. Highly ordered arrays of nucleic acid fragments are known in the art (Fodor et al., U.S. Pat. No. 5,510,270; Lockhart et al., U.S. Pat. No. 5,556,752). Chetverin and Kramer (WO 93/17126) are said to disclose a highly ordered array which may be amplified.
U.S. Pat. No. 5,616,478 of Chetverin and Chetverina reportedly claims methods of nucleic acid amplification, in which pools of nucleic acid molecules are positioned on a support matrix to which they are not covalently linked. Utermohlen (U.S. Pat. No. 5,437,976) is said to disclose nucleic acid molecules randomly immobilized on a reusable matrix.
There is need in the art for improved methods of nucleic acid array design and production.
The invention provides a method of producing a plurality of a nucleic acid array, comprising, in order, the steps of amplifying in situ nucleic acid molecules of a first randomly-patterned, immobilized nucleic acid array comprising a heterogeneous pool of nucleic acid molecules affixed to a support, transferring at least a subset of the nucleic acid molecules produced by such amplifying to a second support, and affixing the subset so transferred to the second support to form a second randomly-patterned, immobilized nucleic acid array, wherein the nucleic acid molecules of the second array occupy positions that correspond to those of the nucleic acid molecules from which they were amplified on the first array, so that the first array serves as a template to produce a plurality.
As used herein in reference to nucleic acid arrays, the term “plurality” is defined as designating two or more such arrays, wherein a first (or “template”) array plus a second array made from it comprise a plurality. When such a plurality comprises more than two arrays, arrays beyond the second array may be produced using either the first array or any copy of it as a template.
As used herein, the terms “randomly-patterned” or “random” refer to a non-ordered, non-Cartesian distribution (in other words, not arranged at pre-determined points along the x- and y-axes of a grid or at defined ‘clock positions’, degrees or radii from the center of a radial pattern) of nucleic acid molecules over a support, that is not achieved through an intentional design (or program by which such a design may be achieved) or by placement of individual nucleic acid features. Such a “randomly-patterned” or “random” array of nucleic acids may be achieved by dropping, spraying, plating or spreading a solution, emulsion, aerosol, vapor or dry preparation comprising a pool of nucleic acid molecules onto a support and allowing the nucleic acid molecules to settle onto the support without intervention in any manner to direct them to specific sites thereon.
As used herein, the terms “immobilized” or “affixed” refer to covalent linkage between a nucleic acid molecule and a support matrix.
As used herein, the term “array” refers to a heterogeneous pool of nucleic acid molecules that is distributed over a support matrix; preferably, these molecules differing in sequence are spaced at a distance from one another sufficient to permit the identification of discrete features of the array.
As used herein, the term “heterogeneous” is defined to refer to a population or collection of nucleic acid molecules that comprises a plurality of different sequences; it is contemplated that a heterogeneous pool of nucleic acid molecules results from a preparation of RNA or DNA from a cell which may be unfractionated or partially-fractionated.
An “unfractionated” nucleic acid preparation is defined as that which has not undergone the selective removal of any sequences present in the complement of RNA or DNA, as the case may be, of the biological sample from which it was prepared. A nucleic acid preparation in which the average molecular weight has been lowered by cleaving the component nucleic acid molecules, but which still retains all sequences, is still “unfractionated” according to this definition, as it retains the diversity of sequences present in the biological sample from which it was prepared.
A “partially-fractionated” nucleic acid preparation may have undergone qualitative size-selection. In this case, uncleaved sequences, such as whole chromosomes or RNA molecules, are selectively retained or removed based upon size. In addition, a “partially-fractionated” preparation may comprise molecules that have undergone selection through hybridization to a sequence of interest; alternatively, a “partially-fractionated” preparation may have had undesirable sequences removed through hybridization. It is contemplated that a “partially-fractionated” pool of nucleic acid molecules will not comprise a single sequence that has been enriched after extraction from the biological sample to the point at which it is pure, or substantially pure.
In this context, “substantially pure” refers to a single nucleic acid sequence that is represented by a majority of nucleic acid molecules of the pool. Again, this refers to enrichment of a sequence in vitro; obviously, if a given sequence is heavily represented in the biological sample, a preparation containing it is not excluded from use according to the invention.
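The majority criterion in the definition above can be illustrated with a short sketch. It assumes, purely for illustration, that a pool is represented as a list of sequence strings and that "majority" means more than half of the molecules; the function name and the example pools are hypothetical and are not part of the disclosure.

```python
from collections import Counter


def is_substantially_pure(pool):
    """Return True if a single sequence accounts for more than half of the pool."""
    if not pool:
        return False
    # most_common(1) yields the most frequent sequence and its count
    _, top_count = Counter(pool).most_common(1)[0]
    return top_count > len(pool) / 2


# Hypothetical pools: one diverse, one dominated by a single sequence
mixed_pool = ["ACGT", "TTGA", "ACGT", "GGCC", "CATG"]
enriched_pool = ["ACGT", "ACGT", "ACGT", "TTGA"]

print(is_substantially_pure(mixed_pool))
print(is_substantially_pure(enriched_pool))
```

The sketch only counts molecules; it does not distinguish enrichment in vitro from natural abundance in the biological sample, which the definition treats differently.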
As used herein, the term “biological sample” refers to a whole organism or a subset of its tissues, cells or component parts (e.g. fluids). “Biological sample” further refers to a homogenate, lysate or extract prepared from a whole organism or a subset of its tissues, cells or component parts, or a fraction or portion thereof. Lastly, “biological sample” refers to a medium, such as a nutrient broth or gel in which an organism has been propagated, which contains cellular components, such as nucleic acid molecules.
As used herein, the term “organism” refers to all cellular life-forms, such as prokaryotes and eukaryotes, as well as non-cellular, nucleic acid-containing entities, such as bacteriophage and viruses.
As used herein, the term “feature” refers to each nucleic acid sequence occupying a discrete physical location on the array; if a given sequence is represented at more than one such site, each site is classified as a feature. In this context, the term “nucleic acid sequence” may refer either to a single nucleic acid molecule, whether double or single-stranded, to a “clone” of amplified copies of a nucleic acid molecule present at the same physical location on the array or to a replica, on a separate support, of such a clone.
As used herein, the term “amplifying” refers to production of copies of a nucleic acid molecule of the array via repeated rounds of primed enzymatic synthesis; “in situ amplification” indicates that such amplifying takes place with the template nucleic acid molecule positioned on a support according to the invention, rather than in solution.
As used herein, the term “support” refers to a matrix upon which nucleic acid molecules of a nucleic acid array are immobilized; preferably, a support is semi-solid.
As used herein, the term “semi-solid” refers to a compressible matrix with both a solid and a liquid component, wherein the liquid occupies pores, spaces or other interstices between the solid matrix elements.
As used herein in reference to the physical placement of nucleic acid molecules or features and/or their orientation relative to one another on an array of the invention, the terms “correspond” or “corresponding” refer to a molecule occupying a position on a second array that is either identical to, or a mirror image of, the position of a molecule from which it was amplified on a first array which served as a template for the production of the second array, or vice versa, such that the arrangement of features of the array relative to one another is conserved between arrays of a plurality.
As implied by the above statement, a first and second array of a plurality of nucleic acid arrays according to the invention may be of either like or opposite chirality, that is, the patterning of the nucleic acid arrays may be either identical or mirror-imaged.
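The notion of correspondence with like or opposite chirality can be made concrete with a minimal sketch. It assumes, for illustration only, that an array is modeled as a mapping from integer (x, y) coordinates to a sequence string, that the mirror image is a reflection across the vertical midline of a support of known width, and that a copy may hold only a subset of the template's features. All names and coordinates are hypothetical.

```python
def mirror(features, width):
    """Reflect every feature across the vertical midline of a support of the given width."""
    return {(width - x, y): seq for (x, y), seq in features.items()}


def is_subset_of(copy, template):
    """True if every feature of the copy sits at the same position as in the template."""
    return all(template.get(pos) == seq for pos, seq in copy.items())


def chirality(first, second, width):
    """Classify the second array as 'identical', 'mirror', or None relative to the first."""
    if is_subset_of(second, first):
        return "identical"
    if is_subset_of(second, mirror(first, width)):
        return "mirror"
    return None


# Hypothetical template array on a support 10 units wide
first_array = {(1, 2): "ACGT", (4, 7): "TTGA", (8, 3): "GGCC"}
same_copy = {(1, 2): "ACGT", (8, 3): "GGCC"}      # subset, like chirality
mirror_copy = {(9, 2): "ACGT", (6, 7): "TTGA"}    # subset, opposite chirality
unrelated = {(5, 5): "ACGT"}

print(chirality(first_array, same_copy, 10))
print(chirality(first_array, mirror_copy, 10))
print(chirality(first_array, unrelated, 10))
```

In either chirality the relative arrangement of features is conserved, which is the property the definition of “correspond” relies on.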
As used herein, the term “replica” refers to any nucleic acid array that is produced by a printing process according to the invention using as a template a first randomly-patterned immobilized nucleic acid array.
In a preferred embodiment, the method further comprises, after the step of transferring at least a subset of the nucleic acid molecules produced by amplifying the molecules of the first array to a second support, the step of repeating that transferring, such that another subset of the nucleic acid molecules produced by amplifying the molecules of the first array is transferred and affixed to an additional second support.
Preferably, after the step of transferring amplified nucleic acid molecules to a second support, the nucleic acid molecules remaining on the first support are amplified prior to repeating the transferring of amplified nucleic acid molecules to an additional second support.
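The cycle of amplifying, transferring, and re-amplifying the molecules remaining on the first support can be sketched as simple copy-number bookkeeping. This is an idealized model under stated assumptions, not a description of actual yields: each amplification cycle is assumed to double every feature's copy number, and each transfer is assumed to move a fixed fraction of the copies. The function names, cycle count, and transfer fraction are hypothetical.

```python
def amplify(copies, cycles):
    """Idealized amplification: each cycle doubles the copy number of every feature."""
    return {pos: n * 2 ** cycles for pos, n in copies.items()}


def transfer(copies, fraction):
    """Move a fraction of each feature's copies to a new support.

    Returns (remaining_on_template, transferred_to_replica).
    """
    transferred = {pos: int(n * fraction) for pos, n in copies.items()}
    remaining = {pos: n - transferred[pos] for pos, n in copies.items()}
    return remaining, transferred


def print_replicas(template, n_replicas, cycles, fraction):
    """Amplify the template before each transfer and collect the replicas produced."""
    replicas = []
    for _ in range(n_replicas):
        template = amplify(template, cycles)
        template, replica = transfer(template, fraction)
        replicas.append(replica)
    return template, replicas


# Hypothetical template: one molecule at each of two positions
template0 = {(1, 2): 1, (4, 7): 1}
final_template, replicas = print_replicas(template0, n_replicas=3, cycles=10, fraction=0.5)

for i, replica in enumerate(replicas, start=1):
    print(i, replica)
```

Because positions are dictionary keys that are never altered, every replica carries features at the same positions as the template, and re-amplification replenishes the template between transfers.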
In another preferred embodiment, the method further comprises, after the step of transferring at least a subset of the nucleic acid molecules produced by amplifying the molecules of the first array to a second support, the step of transferring and affixing at least a subset of the molecules transferred to the second support to a third support.
Preferably, the method further comprises the step of amplifying the nucleic acid molecules of the second array.
It is preferred that the pool of nucleic acid molecules is prepared from RNA or DNA.
It is additionally preferred that the pool of nucleic acid molecules comprises cDNA or genomic DNA.
Preferably, the pool of nucleic acid molecules is a library.
In a preferred embodiment, the pool of nucleic acid molecules is prepared by cloning genomic DNA or cDNA into a cloning site on a nucleic acid vector and subsequently cleaving the nucleic acid molecules from the vector, wherein the cloning site is flanked on either side by oligonucleotide sequences that will remain linked to the nucleic acid molecules after cleaving.
It is preferred that the oligonucleotide sequences comprise recognition sites for a restriction enzyme(s), and particularly preferred that subsequent cleavage of the nucleic acid molecules of the library to which the sites are linked with the enzyme(s) results in the release of pairs of oligonucleotide primers that comprise sequences unique to either end of each member of the library.
Preferably, the recognition sites are those of type IIS restriction enzymes.
As used herein, the term “type IIS” refers to a restriction enzyme that cuts at a site remote from its recognition sequence. Such enzymes are known to cut at distances from their recognition sites ranging from 0 to 20 base pairs.
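How a cut remote from the recognition sequence leaves sequence from the adjoining insert attached to the flanking oligonucleotide can be shown with a small sketch. It is a simplification under stated assumptions: only the top strand and one orientation of the site are considered, the staggered cut on the opposite strand is ignored, and the recognition site GGATG with a 9-base offset is an illustrative value (chosen to resemble FokI) rather than anything specified in the text. The function name and example sequence are hypothetical.

```python
def type_iis_cut_sites(sequence, recognition_site, offset):
    """Return top-strand cut indices for each occurrence of the recognition site.

    offset is the number of bases between the end of the recognition site
    and the cut; a cut that would fall beyond the sequence end is skipped.
    """
    cuts = []
    start = sequence.find(recognition_site)
    while start != -1:
        cut = start + len(recognition_site) + offset
        if cut <= len(sequence):
            cuts.append(cut)
        start = sequence.find(recognition_site, start + 1)
    return cuts


# Hypothetical molecule: flanking oligonucleotide with the site, followed by insert sequence
flank = "AAGGATG"
insert = "ACGTACGTACCCC"
molecule = flank + insert

cuts = type_iis_cut_sites(molecule, "GGATG", 9)
released = molecule[:cuts[0]]   # flank plus the first 9 bases of the insert
remainder = molecule[cuts[0]:]

print(cuts, released, remainder)
```

The released fragment contains the flanking sequence plus a short stretch of the insert, which is why such cleavage can yield primers carrying sequence unique to the end of each library member.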
It is preferred that the support is semi-solid.
Preferably, the semi-solid support is selected from the group that includes polyacrylamide, cellulose, polyamide (nylon) and cross-linked agarose, -dextran and -polyethylene glycol.
It is particularly preferred that amplifying of nucleic acid molecules is performed by polymerase chain reaction (PCR).
Preferably, affixing of nucleic acid molecules to the support is performed using a covalent linker that is selected from the group that includes oxidized 3-methyl uridine, an acrylyl group and hexaethylene glycol.
It is also contemplated that affixing of nucleic acid molecules to the support is performed via hybridization of the members of the pool to nucleic acid molecules that are covalently bound to the support.
Preferably, the nucleic acid molecules bound to the support are synthetic oligonucleotides.
As used herein in this context, the term synthetic oligonucleotide refers to a short (10 to 1,000 nucleotides in length), double- or single-stranded nucleic acid molecule that is chemically synthesized or is the product of a biological system such as a product of primed or unprimed enzymatic synthesis.
It is preferred that the transferring of nucleic acid molecules from the first array to the second support comprises contacting the first array with a support, such that at least a subset of the nucleic acid molecules produced by amplifying are transferred to the support.
In another preferred embodiment, the transferring comprises contacting the first array with a carrier selected from the group that includes a cylindrical roller, a stamping device, a membrane and a support, such that at least a subset of the nucleic acid molecules produced by amplifying are transferred to the carrier, and subsequently contacting the carrier with a support.
The present invention also encompasses a plurality of a nucleic acid array, wherein the plurality comprises a first template randomly-patterned, immobilized nucleic acid array comprising a pool of nucleic acid molecules randomly immobilized on a support, and a second randomly-patterned, immobilized nucleic acid array, wherein the nucleic acid molecules of the second array are nucleic acid amplification products of the pool and wherein the nucleic acid molecules of the second array occupy positions on the second array that correspond to those of the nucleic acid molecules from which they were amplified on the first array.
Another aspect of the present invention is a method for determining the sequential order of genetic elements of a chromosome, comprising providing an immobilized nucleic acid array, comprising the steps of providing an immobilized chromosome, amplifying the nucleic acid sequences of the chromosome, contacting the amplified sequences with a semi-solid support, such that a subset of nucleic acid molecules produced by amplifying are retained by the support, and covalently affixing the molecules so retained to the support to form a first immobilized nucleic acid array, wherein the positions of the members of the array correspond to the positions of the DNA sequences from which they were amplified on the chromosome, and determining the order of genetic elements on the chromosome, wherein ordering comprises identifying the features of the array, wherein the position of a first feature relative to that of a second feature on the array corresponds to the position of a first genetic element relative to that of a second genetic element on the chromosome.
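Once features have been identified, reading off the order of genetic elements reduces to sorting features by position. The following minimal sketch assumes, for illustration only, that the immobilized chromosome lay roughly along the x-axis of the support and that each identified feature is recorded with an (x, y) coordinate; the element names and coordinates are hypothetical.

```python
def order_elements(features):
    """Return element names sorted by x-coordinate.

    features maps an identified element name to its (x, y) position on the array;
    the x-axis is assumed to run along the length of the chromosome.
    """
    return [name for name, _pos in sorted(features.items(), key=lambda item: item[1][0])]


# Hypothetical identified features and their positions on the array
identified = {"geneB": (5.0, 1.1), "geneA": (1.2, 0.9), "geneC": (9.3, 1.0)}

print(order_elements(identified))
```

A chromosome that is bent or not aligned with an axis would require projecting feature positions onto its traced path first, which this sketch does not attempt.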
It is preferred that the amplifying is performed by PCR.
Preferably, identifying is performed using sequencing by hybridization (SBH), quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS) or stepwise ligation and cleavage.
In a preferred embodiment, the method further comprises, after contacting the amplified sequences of the chromosome with a support, the steps of amplifying the molecules of the first array by PCR and contacting the first array with a second support, such that at least a subset of the amplified nucleic acid molecules are transferred to the second support, and covalently affixing the nucleic acid molecules to the second support to form a second immobilized nucleic acid array, wherein the positions of the members of the second array correspond to their positions on the first array.
Another aspect of the present invention is a method for localizing RNA molecules within a cell or a tissue section, comprising providing an immobilized nucleic acid array, comprising the steps of providing an immobilized cell or a tissue section, reverse transcribing RNA molecules of the cell or tissue section to produce an array of features comprising reverse transcripts, contacting the array with a support, such that at least a subset of reverse transcripts are retained by the support, covalently affixing the reverse transcripts to the support to form an immobilized nucleic acid array, and localizing the RNA molecules, comprising identifying the features of the array, wherein the positions of features on the array correspond to the positions of the RNA molecules in the cell or tissue section.
It is preferred that the method further comprises the step of amplifying the reverse transcripts.
It is additionally preferred that the amplifying is performed by PCR.
Preferably, the identifying is performed using sequencing by hybridization (SBH), quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS) or stepwise ligation and cleavage.
In a preferred embodiment, the method further comprises, after contacting the reverse transcripts with a support, the steps of amplifying the molecules of the first array by PCR and contacting the first array with a second support, such that at least a subset of the amplified nucleic acid molecules are transferred to the second support, and covalently affixing the nucleic acid molecules to the second support to form a second immobilized nucleic acid array, wherein the positions of the members of the second array correspond to the positions of the molecules from which they were amplified on the first array.
The invention also encompasses a method of obtaining a plurality of immobilized nucleic acid arrays, wherein the arrays of the plurality are derived from different nucleic acid pools, comprising the steps of providing a first immobilized nucleic acid array comprising a first pool of nucleic acid molecules that have linked to both ends oligonucleotide sequences each comprising a restriction enzyme(s) recognition site, such that cleavage of the nucleic acid molecules of the pool with the enzyme(s) results in the release of pairs of oligonucleotide primers that comprise sequences unique to either end of each member of the pool, amplifying by PCR the nucleic acid molecules of the array, contacting the first immobilized nucleic acid array with a support, such that at least a subset of nucleic acid molecules produced by amplifying are transferred to the support, covalently affixing the nucleic acid molecules to the support to form a replica of the first immobilized nucleic acid array, wherein the positions of nucleic acid molecules on the replica correspond to the positions of the nucleic acid molecules of the first array from which they were amplified, cleaving the nucleic acid molecules of the replica with the restriction enzyme(s), thereby forming an array of immobilized oligonucleotide primers that comprise sequences unique to either end of each feature of the first nucleic acid array, washing from the oligonucleotide primers the nucleic acid fragments released from them by the cleaving, contacting the primers with a second pool of nucleic acid molecules under conditions that permit hybridization of the nucleic acid molecules that are complementary, such that hybridization occurs between the oligonucleotide primers and the nucleic acid molecules of the second nucleic acid pool, amplifying the nucleic acid molecules of the second pool so hybridized to the primers, wherein the immobilized oligonucleotide primers to which they are hybridized serve to prime the amplifying, thereby forming an immobilized array of nucleic acid molecules of the second pool.
Preferably, the amplifying is performed by PCR.
It is preferred that cycles of the steps of contacting the first immobilized nucleic acid array with a support, such that at least a subset of nucleic acid molecules produced by amplifying are transferred to the support, covalently affixing the nucleic acid molecules to the support to form a replica of the first immobilized nucleic acid array, wherein the positions of nucleic acid molecules on the replica correspond to the positions of the nucleic acid molecules of the first array from which they were amplified, cleaving the nucleic acid molecules of the replica with the restriction enzyme(s), thereby forming an array of immobilized oligonucleotide primers that comprise sequences unique to either end of each feature of the first nucleic acid array, washing from the oligonucleotide primers the nucleic acid fragments released from them by the cleaving, contacting primers with a second pool of nucleic acid molecules under conditions that permit hybridization of the nucleic acid molecules that are complementary, such that hybridization occurs between the oligonucleotide primers and the nucleic acid molecules of the second nucleic acid pool, amplifying the nucleic acid molecules of the second pool so hybridized to the primers, wherein the immobilized oligonucleotide primers to which they are hybridized serve to prime the amplifying, thereby forming an immobilized array of nucleic acid molecules of the second pool are repeated.
Preferably, the method further comprises the steps, between contacting the first immobilized nucleic acid array with a support, such that at least a subset of nucleic acid molecules produced by amplifying are transferred to the support, and covalently affixing the nucleic acid molecules to the support to form a replica of the first immobilized nucleic acid array, wherein the positions of nucleic acid molecules on the replica correspond to the positions of the nucleic acid molecules of the first array from which they were amplified, of amplifying in situ the nucleic acid molecules of the replica, and contacting the replica with a second support, such that at least a subset of the nucleic acid molecules produced by the amplifying are transferred to the second support, covalently affixing the termini of the nucleic acid molecules to the blank to form a second replica of the first immobilized nucleic acid array, wherein the positions of nucleic acid molecules on the second replica correspond to the positions of the nucleic acid molecules of the replica from which they were amplified.
It is preferred that the different nucleic acid pools are obtained from different tissues of an individual organism.
It is highly preferred that the different nucleic acid pools are obtained from different individual organisms of a single species.
It is also highly preferred that the different nucleic acid pools are obtained from organisms of different species.
Preferably, the first and second pools of nucleic acid molecules are libraries.