The present invention relates to methods and devices for producing a replicate or derivative from an array of molecules, such as biomolecules or chemically produced molecules, and, in particular, to such methods and devices that are suitable for producing a replicate or derivative of a microarray of said molecules and/or molecules derived therefrom, such as of a DNA microarray, RNA microarray or protein microarray, and to the application of the array for identifying DNA sequences associated with reactions involving primary sequences, copies thereof or derivatives thereof.
A microarray is understood to mean an arrangement of many different biomolecules on or in a surface in individual points. Said points are also referred to as spots and typically have a diameter ranging from 10 μm to about 1000 μm. One or several identical populations of biomolecules are present within a spot. Except for some intentional redundancies, the various spots, however, represent different biomolecules. The biomolecules may be deposited on the surface, may exist in a layer on the surface, may exist within a cavity, or may exist in an immobilized manner on or in a particle, it being possible for the particles to be arranged as an array.
Conventionally, there have been various techniques of producing microarrays. In accordance with one technique, the biomolecules are synthesized directly on the surface (in-situ synthesis), for example using light synthesis, chemical synthesis, spot synthesis, a printing process, and the like. Such a light-synthesis technique is employed, for example, by Affymetrix, whereas spot synthesis is performed by Agilent. Combimatrix produces DNA microarrays by means of virtual, electronically addressable reaction compartments. In accordance with a further technique, the (bio)molecules are at first synthesized and subsequently deposited on the surface as an arranged array, such a technique being employed, for example, by Agilent, Gesim and Biofluidix. Both techniques need a high level of technical expenditure. Said technical expenditure increases more than linearly as the number of different biomolecules increases and as the diameters of the deposition spots decrease. In addition, the amount of time involved as well as the costs increase significantly when such a microarray is to contain, e.g., twice as many substances, or if the size of the coated structures, i.e. spots, is to be reduced. A stamping technique for producing microarrays is described in [1].
By means of on-site synthesis on an array, it is possible to produce millions of different DNA sequences. However, in order to achieve a new layout or a different pattern size, the entire manufacturing process has to be reorganized. This will then need new instrument settings and, in the event of light-aided synthesis, even new photolithography masks or reprogramming of the digital mirror system, see [2], where utilization of digital mirrors for creating a microarray is described. Such changes are avoided as far as possible in order to keep costs down.
For lack of time alone, it is not possible to transfer more than a few tens of thousands of substances by means of synthesis in the laboratory and subsequent transfer to a microarray, for example by means of a nanoplotter. It would take weeks or months to produce an appreciable number of spots on a microarray with a million different biomolecules. During that time, the surface chemistry will change, and the microarray as a whole will no longer function.
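The order of magnitude of this time requirement may be illustrated by a simple calculation. The following is merely a sketch: the function name spotting_days and the value of 5 seconds per spot (intended to cover aspirating, dispensing and washing) are hypothetical assumptions chosen for illustration and are not taken from any specific instrument.

```python
def spotting_days(n_substances, seconds_per_spot, hours_per_day=24.0):
    """Days needed to deposit n_substances different spots one after another.

    seconds_per_spot is an assumed, illustrative value; real instruments
    differ and may deposit several spots in parallel.
    """
    total_seconds = n_substances * seconds_per_spot
    return total_seconds / (hours_per_day * 3600.0)


# Hypothetical example: one million different substances at 5 s per spot,
# with the instrument running around the clock.
days = spotting_days(1_000_000, 5.0)
print(f"{days:.1f} days")
```

Under these assumed values, serial deposition alone would occupy the instrument for about two months, which is consistent with the time scale of weeks or months stated above.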
Therefore, a method would be desirable by means of which it is possible to copy, in a simple and inexpensive manner, existing microarrays, that is, a regular arrangement of known biomolecules that are complicated and expensive to produce.
Some basic ideas on this issue have already been proposed in [3] and [4], which disclose a method of replicating an oligonucleotide array wherein one or more biotin-functionalized oligonucleotides are hybridized to one or more oligonucleotides and amplified on a first substrate. The biotin-functionalized and amplified oligonucleotides are then anchored to a second substrate with streptavidin. The biotin-functionalized oligonucleotides may be separated from the oligonucleotides by mechanical force so as to create a replicated array. However, such copying processes are costly, need an additional biochemical anchoring system and, in many cases, can produce only a negative copy of the original DNA microarray.
[5] also describes copying of a DNA array by using a streptavidin/biotin system. [6] describes how DNA can be copied into RNA.
For about 30 years, DNA has been amplified in the laboratory. Inter alia, polymerase chain reaction (PCR) has made its arrival in almost all laboratories as a standard technique, and it is the foundation of most genetic studies. However, there are also other techniques enabling DNA to be multiplied, e.g. NASBA, recombinase polymerase amplification, rolling-circle amplification, and various other isothermal amplification techniques.
Not only do said techniques generally enable DNA to be multiplied, but by specifically selecting the start points (primers), they also enable targeted multiplication of individual DNA areas or subsets of the DNA. Most DNA amplification processes take place in solution, and this is referred to as a liquid-phase reaction. However, in the last few years, several methods have come up which utilize an additional solid phase for DNA amplification and in the process enrich same on said solid phase, e.g. the primer extension reaction on a slide or solid phase [9,10]. In the following, the foundations of two of the most common methods will be described, namely bridge amplification of DNA and water-in-oil emulsion PCR.
Bridge amplification of DNA: for bridge amplification, the (partly unknown) DNA is initially extended, at both ends, with known, so-called adapter sequences. Said extensions serve as binding sites for complementary sequences on the surface. It is only after binding to the surface has taken place that, later on, amplification will occur. The DNA strand that has been copied and, thus, newly created is now fixedly (covalently) bound to the surface, and has a further binding site at its non-bound end. Said further binding site may now also bind to a suitable counterpart on the surface and start a further amplification, which in turn will create a new DNA strand bound at one end and having the original binding sequence at the other, free end. In this manner, more and more new strands are generated, in an exponential manner, which are fixedly bound at one end, and whose other end enables temporary binding to the surface. During the amplification, the original strand is fixedly (covalently) bound at one end, and loosely (non-covalently) bound at the other end, and thus generates a molecular bridge. In this respect, [11] generally describes bridge amplification, and [12] describes utilization of bridge amplification for sequencing.
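The exponential increase in surface-bound strands described above may be illustrated by a simplified numerical model. The following is merely a sketch under stated assumptions: in each cycle, every bound strand forms a bridge to a free surface primer and is copied with a given efficiency, and growth ends once all primer sites are occupied. The function name bridge_amplification and the parameters primer_sites, efficiency and seeds are hypothetical and serve illustration only.

```python
def bridge_amplification(cycles, primer_sites, efficiency=1.0, seeds=1):
    """Return the number of surface-bound strands after each cycle.

    Simplified model (assumption): each bound strand yields `efficiency`
    new bound strands per cycle, limited by the remaining free primer sites.
    """
    strands = float(seeds)
    history = [strands]
    for _ in range(cycles):
        free_sites = primer_sites - strands
        new_strands = min(strands * efficiency, free_sites)
        strands += new_strands
        history.append(strands)
    return history


# With ample primer sites the strand count doubles in every cycle.
print(bridge_amplification(5, primer_sites=1000))
# With only 10 primer sites the growth saturates.
print(bridge_amplification(5, primer_sites=10))
```

The model ignores strand loss and steric effects; it is only intended to show that growth is exponential at first and is then limited by the available binding sites on the surface.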
For a water-in-oil emulsion PCR, a type of bridge amplification is employed. This involves initially extending the DNA strands on both sides by means of adapter sequences, like for bridge amplification. Subsequently, the extended DNA is mixed together with an aqueous PCR mixture and solid-phase particles—also referred to as beads—and emulsified in oil, so that a water-and-particles-in-oil emulsion results. For this water-in-oil emulsion, the concentrations are selected such that ideally, precisely one DNA strand and precisely one particle will be trapped within each droplet of water. In accordance with bridge amplification, the surface of the particle contains sequences that enable a DNA copy to be covalently bound thereto. In this manner, the entire particle may be covered with copies of the original DNA by means of amplification. This technique is used mainly in sequencers. In this technique, only one single defined strand is amplified, at any one time, on the solid phase or liquid phase.
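How the concentrations influence the proportion of droplets containing precisely one DNA strand and precisely one particle may be estimated by a simple statistical model. The following is merely a sketch: it assumes that DNA strands and particles are distributed over the droplets independently and at random, so that the number of each per droplet follows a Poisson distribution. This assumption, as well as the function names and the parameters mean_dna and mean_beads, are not taken from the methods cited and serve illustration only.

```python
import math


def poisson_pmf(k, mean):
    """Probability of finding exactly k items in a droplet (Poisson model)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_one_strand_one_bead(mean_dna, mean_beads):
    """Fraction of droplets holding exactly one DNA strand and one particle.

    Assumes independent Poisson loading of strands and particles.
    """
    return poisson_pmf(1, mean_dna) * poisson_pmf(1, mean_beads)


# Compare a few assumed average loadings per droplet.
for mean in (0.1, 0.5, 1.0, 2.0):
    fraction = fraction_one_strand_one_bead(mean, mean)
    print(f"mean {mean}: {fraction:.3f}")
```

Within this model, the proportion of ideally loaded droplets is largest when, on average, one strand and one particle are present per droplet, and even then it covers only a minority of the droplets.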
In protein amplification, or protein synthesis, there is a DNA strand that may basically be transcribed initially into RNA and then translated into a protein by means of a suitable biochemical system. If the RNA is sufficiently stable, or if there are a sufficient number of DNA templates, a large number of proteins can be produced. This technique corresponds to the natural process occurring within a cell which involves creating proteins from DNA via RNA, and it is the foundation and central paradigm of biochemistry. Recently, simplified biochemical systems have become available which are capable of mastering this complex of tasks and thus enable producing, at least in principle, a protein from a DNA strand in the laboratory. In this respect, [7] describes a method of directly producing a protein microarray from a DNA microarray, and [8] describes a method of producing a protein microarray with cDNA anchors. Alternatively, protein amplification may also be performed using prokaryotic or eukaryotic cells which have protein-coding DNA introduced into them.
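The flow of information from DNA via RNA to protein may be illustrated by a short computational sketch. It is assumed here that the input is the coding strand of the DNA, that reading starts at the first base, and that the standard genetic code applies; the function names transcribe and translate and the example sequence are hypothetical and serve illustration only.

```python
# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna):
    """Return the mRNA corresponding to the coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base up to the first stop codon."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

The sketch does not search for a start codon and disregards all regulatory sequences; it merely shows that the protein sequence is fully determined by the DNA sequence.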
For decoding a DNA sequence, so-called sequencing methods are employed, an overview of relatively recent sequencing methods being provided in [13]. In addition, sequencing methods wherein DNA is bound to particles are described in [14].
The highly complex machines used for sequencing employ a multiplicity of reaction steps and techniques for initially trapping DNA that has been isolated, for multiplying it and for subsequently reading it out building block by building block. On the basis of the selected reaction chemistry and the sequencing method, expensive bioinformatics methods make it possible to reconstruct the DNA sequence as a whole, and to thus obtain the genome of the species studied.
Previous sequencing techniques comprised separating the DNA within a gel. This approach, which is not based on solid phases, is called Sanger sequencing. With the sequencers of the most recent generation, a water-in-oil emulsion PCR is used to generate millions of particles, e.g. beads, each of which carries many identical copies of one DNA fragment, the fragments differing from particle to particle. For reading out the sequence, the particles are arranged, e.g., in a so-called PicoTiterPlate™ having e.g. 1.3 to 3.4 million different microcavities, and are immobilized. This already represents a microarray as such. In this respect, see [15], where utilization of bridge amplification for sequencing is described.
Even if a regular arrangement of biomolecules has already been produced in this manner, it nevertheless cannot be used like a conventional microarray with known sequences, since the individual sequences of the biomolecules bound to the particles are not yet known per se. However, after sequencing, the sequence of the DNA fragment bound to a specific particle will be known per se.
Efforts have already been made to retrieve the individual particles and reuse them as an array, for example by Scineon together with the Max-Planck-Institut für Molekulare Genetik [Max Planck Institute for Molecular Genetics] in Berlin. However, this method is costly and enables producing only one specimen of such an array.
Soft lithography, or microcontact printing, is a stamping technique that enables molecules to be deposited on a surface and subsequently transferred to another surface. It also enables small cavities or microfluidics to be integrated, thus providing complex circuits for liquids. Said circuits enable treating surfaces in a specific manner and thus coating or modifying extremely small structures. The material used for this purpose is a silicone, polydimethylsiloxane (PDMS). By means of suitable surface modification of the PDMS, various biomolecules can be added to the surface, and thus be transferred later on. DNA and RNA as well as other biomolecules may be transferred.
These transfer properties may be exploited, inter alia, for a copying step, [16] being the first to describe how biomolecules may be transferred using soft lithography.
DNA arrays, or DNA microarrays, are mainly used for so-called expression analysis, for sequencing, for determining the amount of genes, or for SNP analysis.
In expression analysis, one wishes to study the level of activity of specific genes. mRNA is considered to be a marker for this. For this purpose, cells or living beings are stimulated, for example by administering a drug, by changing environmental factors, by putting them under stress, and the like. From the biological material, one initially collects the mRNA, transcribes it into a so-called complementary DNA (cDNA), and provides it with a dye. A reference sample is provided with a different dye. Mostly, green and red dyes are used. Equal proportions of the samples are mixed together and then applied to the microarray. If a specific DNA sequence is contained, in equal concentrations, in both original samples, complementary molecules from both samples will bind, in equal concentrations, to the respective spot of the microarray. Reading out this spot therefore results in a secondary color. In the case of green and red, yellow will result. If the same gene sequence is present in unequal proportions, the corresponding spot on the microarray will show a secondary color whose hue is shifted toward the color of the predominant gene product. Genes that are switched on or off completely will show only one hue or the other. The color pattern allows the amount of mRNA to be inferred and provides a clue as to how strongly specific genes have been activated or deactivated by the influences studied. [17] discloses application of expression profiling for genome-wide studies by means of highly parallel sequencings.
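The evaluation of such a two-color readout may be illustrated by a short computational sketch. It is assumed here, for illustration only, that the stimulated sample carries the red dye and the reference sample the green dye, and that a twofold difference in intensity (a log2 ratio of 1) is regarded as a clear shift; the function name classify_spot and the threshold are hypothetical.

```python
import math


def classify_spot(red, green, threshold=1.0):
    """Classify one spot from its red and green intensities.

    Assumptions: red = stimulated sample, green = reference sample,
    threshold = minimum absolute log2 ratio regarded as a shift.
    """
    if red == 0 and green == 0:
        return "no signal"
    if green == 0:
        return "red"       # sequence present in the sample only
    if red == 0:
        return "green"     # sequence present in the reference only
    log_ratio = math.log2(red / green)
    if log_ratio > threshold:
        return "reddish"   # sample predominates
    if log_ratio < -threshold:
        return "greenish"  # reference predominates
    return "yellow"        # roughly equal proportions


# Hypothetical intensities for three spots.
for red, green in [(1000, 1000), (4000, 1000), (0, 500)]:
    print(red, green, classify_spot(red, green))
```

In practice, background subtraction and normalization of the two dye channels would precede such a classification; these steps are omitted here.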
In SNP analyses, one investigates whether gene sequences comprise individual mutations, i.e. sequences that are identical except for one base pair (replacement of an individual base pair = single-nucleotide polymorphism, SNP). The precise location of the replaced base pair determines whether the replacement has no or only minor effects on the organism, or whether this is a lethal gene defect. In the case of several serious hereditary diseases such as Huntington's Chorea, Parkinson's disease or Alzheimer's disease, such severe SNPs are known. With many other SNPs, one may infer increased risks or susceptibilities to specific diseases such as, e.g., diabetes or rickets. For SNP analysis, the DNA is directly collected from the biological material and marked with a dye. For each SNP, there are four spots on the microarray, which differ, in the same position in each case, by one base pair. On the basis of the binding position, one can specify which base is in the relevant position in the unknown sample, see [17].
With protein arrays, production alone is considerably more difficult, since unlike DNA, proteins have an enormously broad spectrum of solubilities, reactivities and specificities. Therefore, it is not trivial to bind various proteins on a surface using the same chemical anchor. Typically, protein arrays contain several hundred to one thousand different proteins. Protein arrays are predominantly employed for binding studies. This comprises placing a marked molecule onto the protein array. Spots on the microarray that show coloring therefore indicate potential binding partners for the molecule studied. This technique is employed, inter alia, for epitope mapping in order to specifically find binding sites.
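The evaluation of the four spots belonging to one SNP may be illustrated by a short computational sketch. It is assumed here, for illustration only, that each spot is identified by the base its probe carries in the variable position, that the sample binds to the complementary probe, and that the strongest signal must exceed the second strongest by a factor of at least two; the function name call_snp_base and the parameter min_ratio are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_snp_base(probe_signals, min_ratio=2.0):
    """Infer the sample base at the SNP position from four spot intensities.

    probe_signals maps the probe base at the variable position to the
    measured intensity. Returns None if no spot clearly predominates.
    """
    ranked = sorted(probe_signals.items(), key=lambda item: item[1], reverse=True)
    (best_base, best), (_, second) = ranked[0], ranked[1]
    if best <= 0 or best < min_ratio * second:
        return None
    # The sample strand is complementary to the probe it binds to.
    return COMPLEMENT[best_base]


# Hypothetical intensities: the probe carrying C binds most strongly.
print(call_snp_base({"A": 50, "C": 900, "G": 40, "T": 60}))
# Two spots of similar intensity: no unambiguous call.
print(call_snp_base({"A": 500, "C": 520, "G": 10, "T": 10}))
```

A sample carrying two different alleles would give rise to two strong spots and is reported as ambiguous by this sketch; a complete evaluation would have to treat that case separately.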
The architecture of a microarray as such is described in [18].
The applications of microarrays are therefore far-reaching and manifold. However, due to the lack of the needed financial resources, or due to their cost-benefit ratio, they are restricted considerably in use.
DNA sequences contain biochemical information and may be multiplied by means of a biochemical replication system. Various approaches to copying DNA sequences have already been pursued, starting from copying the individual base pairs onto a surface, up to the approach described in [4], [5] and [6], which set forth how DNA can be copied, in principle, from one surface to another.
Also, an article [7] has recently been published about how a protein copy can be made from a DNA array.
The above explanations show that conventional technology offers two fundamental techniques of producing microarrays, namely directly producing substances on site, on the one hand, and transferring the substances, after a synthesis, by means of a microscopic dispensing or printing process, on the other hand. Each of these techniques is technically complex and involves high expenditure in terms of time and cost, the general rule being that as the number of substances doubles, the time and cost needed will also at least double. In addition, precise knowledge of the biochemical information (in the case of DNA arrays, of the sequence) prior to synthesis is essential. To obtain said information in the case of DNA, so-called sequencers are used, as was explained above. No direct production chain between the sequencing and the fabrication of microarrays has so far been known or established. This means that an unknown organism has to be sequenced first, whereupon the sequence is calculated from the data of the sequencer, and an array is subsequently produced. An expression pattern can then be studied immediately by means of this array. In addition, it is known to produce protein arrays from known sequences. This production chain is very protracted and costly.
If it were possible to generate protein arrays directly as a derivative of a DNA array by means of a simple method, the coupling between the phenotype and genotype would be retained, and it would be possible to perform reactions on the derivative in a spatially resolved manner (antigen-antibody binding, enzymatic reactions) and to associate them with the underlying DNA sequence.