The present invention relates to the application of data storage technology to molecular tracking and identification. In particular, combinations of matrix materials with programmable data storage or recording devices, herein referred to as memories, are provided. By virtue of this combination, molecules and biological particles, such as phage and viral particles and cells, that are in proximity to or in physical contact with the matrix combination can be electromagnetically tagged by programming the memory with data corresponding to identifying information. The molecules and biological particles can be identified by retrieving the stored data points. Combinations of matrix materials, memories, and linked or proximate molecules and biological materials are also provided. The combinations provided herein have a multiplicity of applications, including combinatorial chemistry, isolation and purification of target macromolecules, capture and detection of macromolecules for analytical purposes, high throughput screening, selective removal of contaminants, enzymatic catalysis, chemical modification and other uses. These combinations are particularly advantageous for use in multianalyte analyses.
There has been a convergence of progress in chemistry and biology. Among the important advances resulting from this convergence is the development of methods for generating molecular diversity and for detecting and quantifying small quantities of biological or chemical material. This advance has been facilitated by fundamental developments in chemistry, including the development of highly sensitive analytical methods, solid state chemical synthesis, and sensitive and specific biological assay systems.
Analyses of biological interactions and chemical reactions, however, require the use of labels or tags to track and identify the results of such analyses. Typically, biological reactions are monitored by radiolabels or direct or indirect enzyme labels. Chemical reactions are also monitored by direct or indirect means, such as by linking the reactions to a second reaction in which a colored, fluorescent, chemiluminescent or other such product results. These analytical methods, however, are often time consuming and tedious. There is, thus, a need to develop alternative methods for tracking and identifying analytes in biological interactions and the reactants and products of chemical reactions.
Hybridization Reactions
For example, it is often desirable to detect or quantify very small concentrations of nucleic acids in biological samples. Typically, to perform such measurements, the nucleic acid in the sample [i.e., the target nucleic acid] is hybridized to a detection oligonucleotide. In order to obtain a detectable signal proportional to the concentration of the target nucleic acid, either the target nucleic acid in the sample or the detection oligonucleotide is associated with a signal generating reporter element, such as a radioactive atom, a chromogenic or fluorogenic molecule, or an enzyme [such as alkaline phosphatase] that catalyzes a reaction that produces a detectable product. Numerous methods are available for detecting and quantifying the signal.
Following hybridization of a detection oligonucleotide with a target, the resulting signal-generating hybrid molecules must be separated from unreacted target and detection oligonucleotides. In order to do so, many of the commonly used assays immobilize the target nucleic acids or detection oligonucleotides on solid supports. Presently available solid supports to which oligonucleotides are linked include nitrocellulose or nylon membranes, activated agarose supports, diazotized cellulose supports and non-porous polystyrene latex solid microspheres. Linkage to a solid support permits fractionation and subsequent identification of the hybridized nucleic acids, since the target nucleic acid may be directly captured by oligonucleotides immobilized on solid supports. More frequently, so-called “sandwich” hybridization systems are used. These systems employ a capture oligonucleotide covalently or otherwise attached to a solid support for capturing detection oligonucleotide-target nucleic acid adducts formed in solution [see, e.g., EP 276,302 and Gingeras et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173]. Solid supports with linked oligonucleotides are also used in methods of affinity purification. Following hybridization or affinity purification, however, if identification of the linked molecule or biological material is required, the resulting complexes or hybrids or compounds must be subjected to analyses, such as sequencing.
Immunoassays
Immunoassays also detect or quantify very small concentrations of analytes in biological samples. Many immunoassays utilize solid supports in which antigen or antibody is covalently, non-covalently, or otherwise, such as via a linker, attached to a solid support matrix. The support-bound antigen or antibody is then used as an analyte in the assay. As with nucleic acid analysis, the resulting antibody-antigen complexes or other complexes, depending upon the format used, rely on radiolabels or enzyme labels to detect such complexes.
The use of antibodies to detect and/or quantitate reagents [“antigens”] in blood or other body fluids has been widely practiced for many years. Two methods have been most broadly adopted. The first such procedure is the competitive binding assay, in which conditions of limiting antibody are established such that only a fraction [usually 30-50%] of a labeled [e.g., radioisotope, fluorophore or enzyme] antigen can bind to the amount of antibody in the assay medium. Under those conditions, the addition of unlabeled antigen [e.g., in a serum sample to be tested] then competes with the labeled antigen for the limiting antibody binding sites and reduces the amount of labeled antigen that can bind. The degree to which the labeled antigen is able to bind is inversely proportional to the amount of unlabeled antigen present. By separating the antibody-bound from the unbound labeled antigen and then determining the amount of labeled reagent present, the amount of unlabeled antigen in the sample [e.g., serum] can be determined.
As an alternative to the competitive binding assay, in the labeled antibody, or “immunometric” assay [also known as “sandwich” assay], an antigen present in the assay fluid is specifically bound to a solid substrate and the amount of antigen bound is then detected by a labeled antibody [see, e.g., Miles et al. (1968) Nature 29:186-189; U.S. Pat. Nos. 3,867,517; 4,376,110]. Using monoclonal antibodies, two-site immunometric assays are available [see, e.g., U.S. Pat. No. 4,376,110]. The “sandwich” assay has been broadly adopted in clinical medicine. With increasing interest in “panels” of diagnostic tests, in which a number of different antigens in a fluid are measured, the need to carry out each immunoassay separately becomes a serious limitation of current quantitative assay technology.
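The arithmetic behind a competitive binding assay can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that labeled and unlabeled antigen bind with equal affinity and that the limiting antibody sites are fully occupied, so that the labeled antigen takes a share of the sites equal to its share of the total antigen. The function names and the numbers are hypothetical and are not taken from any assay described herein.

```python
def bound_labeled(sites, labeled, unlabeled):
    """Amount of labeled antigen bound under the simplified model.

    Assumption: equal affinity, and all antibody sites are occupied
    whenever total antigen exceeds the number of sites.
    """
    total = labeled + unlabeled
    if total <= sites:
        # Antibody is not limiting; every labeled molecule can bind.
        return labeled
    # Labeled antigen occupies sites in proportion to its share of the total.
    return sites * labeled / total


def estimate_unlabeled(sites, labeled, measured_bound):
    """Invert the model: recover unlabeled antigen from the bound label.

    Only valid in the antibody-limiting regime (total antigen > sites).
    """
    return labeled * (sites / measured_bound - 1)


# Hypothetical calibration: 100 units of labeled antigen, 40 antibody sites,
# so 40% of the label binds when no unlabeled antigen is present.
baseline = bound_labeled(sites=40, labeled=100, unlabeled=0)
with_sample = bound_labeled(sites=40, labeled=100, unlabeled=100)
recovered = estimate_unlabeled(sites=40, labeled=100, measured_bound=with_sample)
```

In this toy model, adding an amount of unlabeled antigen equal to the labeled antigen halves the bound label, and the measured value can be inverted to recover the amount added. Real assays are read against an empirical standard curve rather than a closed-form expression.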
Some semi-quantitative detection systems have been developed [see, e.g., Buechler et al. (1992) Clin. Chem. 38:1678-1684; and U.S. Pat. No. 5,089,391] for use with immunoassays, but no good technologies yet exist to carefully quantitate a large number of analytes simultaneously [see, e.g., Ekins et al. (1990) J. Clin. Immunoassay 13:169-181] or to rapidly and conveniently track, identify and quantitate detected analytes.
Combinatorial Libraries
Drug discovery relies on the ability to identify compounds that interact with a selected target, such as cells, an antibody, receptor, enzyme, transcription factor or the like. Traditional drug discovery involves screening natural products from various sources, or random screening of archived synthetic material. The current trend, however, is to identify such molecules by rational design and/or by screening combinatorial libraries of molecules.
Methods and strategies for generating diverse libraries, primarily peptide- and nucleotide-based oligomer libraries, have been developed using molecular biology methods and/or simultaneous chemical synthesis methodologies [see, e.g., Dower et al. (1991) Annu. Rep. Med. Chem. 26:271-280; Fodor et al. (1991) Science 251:767-773; Jung et al. (1992) Angew. Chem. Int. Ed. Engl. 31:367-383; Zuckerman et al. (1992) Proc. Natl. Acad. Sci. USA 89:4505-4509; Scott et al. (1990) Science 249:386-390; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; and Gallop et al. (1994) J. Medicinal Chemistry 37:1233-1251]. The resulting combinatorial libraries potentially contain millions of pharmaceutically relevant compounds and can be rapidly screened to identify compounds that exhibit a selected activity.
The libraries fall into roughly three categories: fusion-protein-displayed peptide libraries in which random peptides or proteins are presented on the surface of phage particles or proteins expressed from plasmids; support-bound synthetic chemical libraries in which individual compounds or mixtures of compounds are presented on insoluble matrices, such as resin beads [see, e.g., Lam et al. (1991) Nature 354:82-84] and cotton supports [see, e.g., Eichler et al. (1993) Biochemistry 32:11035-11041]; and methods in which the compounds are used in solution [see, e.g., Houghten et al. (1991) Nature 354:84-86, Houghten et al. (1992) BioTechniques 313:412-421; and Scott et al. (1994) Curr. Opin. Biotechnol. 5:40-48]. There are numerous examples of synthetic peptide and oligonucleotide combinatorial libraries. The present direction in this area is to produce combinatorial libraries that contain non-peptidic small organic molecules. Such libraries are based on either a basis set of monomers that can be combined to form mixtures of diverse organic molecules or that can be combined to form a library based upon a selected pharmacophore monomer.
There are three critical aspects in any combinatorial library: (i) the chemical units of which the library is composed; (ii) generation and categorization of the library; and (iii) identification of library members that interact with the target of interest, and keeping track of intermediary synthesis products and the multitude of molecules in a single vessel.
The generation of such libraries often relies on the use of solid phase synthesis methods, as well as solution phase methods, to produce combinatorial libraries containing tens of millions of compounds that can be screened in diagnostically or pharmacologically relevant in vitro assay systems. In generating large numbers of diverse molecules by stepwise synthesis, the resulting library is a complex mixture in which a particular compound is present at very low concentrations, so that it is difficult or impossible to determine its chemical structure. Various methods exist for ordered synthesis by sequential addition of particular moieties, or by identifying molecules based on spatial positioning on a chip. These methods are cumbersome and ultimately impossible to apply to highly diverse and large libraries.
Thus, an essential element of the combinatorial discovery process, as well as other areas in which molecules are identified and tracked, is the ability to extract the information made available during synthesis of the library or identification of the active components of intermediary structures. While there are several techniques for identification of intermediary products and final products, nanosequencing protocols that provide exact structures are only applicable on mass to naturally occurring linear oligomers such as peptides and amino acids. Mass spectrographic [MS] analysis is sufficiently sensitive to determine the exact mass and fragmentation patterns of individual synthesis steps, but complex analytical mass spectrographic strategies are neither readily automated nor conveniently performed. Also, mass spectrographic analysis provides at best simple connectivity information, but no stereoisomeric information, and generally cannot discriminate among isomeric monomers. Another problem with mass spectrographic analysis is that it requires pure compounds; structural determination on complex mixtures is either difficult or impossible. Finally, mass spectrographic analysis is tedious and time consuming. Thus, although there are a multitude of solutions to the generation of libraries, there are no ideal solutions to the problems of identification, tracking and categorization.
Similar problems arise in any screening or analytical process in which large numbers of molecules or biological entities are screened. In any system, once a desired molecule(s) has been isolated, it must be identified. Simple means for identification do not exist. Because of the problems inherent in any labeling procedure, it would be desirable to have alternative means for tracking and quantitating chemical and biological reactions during synthesis and/or screening processes.
Therefore, it is an object herein to provide methods for identification, tracking and categorization of the components of complex mixtures of diverse molecules.
Combinations of (i) a miniature recording device that contains one or more programmable data storage devices [memories] that can be remotely programmed and read; and (ii) a matrix, such as a particulate support used in chemical syntheses, are provided. The remote programming and reading is preferably effected using electromagnetic radiation.
The matrix materials [matrices] are any materials that are routinely used in chemical and biochemical synthesis. The matrix materials are typically polymeric materials that are compatible with chemical and biological syntheses and assays, and include glasses, silicates, celluloses, polystyrenes, polysaccharides, sand, and synthetic resins and polymers, including acrylamides, particularly cross-linked polymers, cotton, and other such materials. The matrices may be in the form of particles or may be continuous in design, such as a test tube or microtiter plate or the like.
The recording device is a miniature device, typically less than 10 mm³ in size, preferably smaller, that includes at least one data storage unit that includes a remotely programmable and remotely readable, preferably non-volatile, memory. This device with remotely programmable memory is in proximity with or in contact with the matrix. In particular, the recording device includes a memory device, preferably having non-volatile memory means, for storing a plurality of data points and means for receiving a transmitted signal that is received by the device and for causing a data point corresponding to the data signal to be permanently stored within the memory means; and, if needed, a shell that is non-reactive with and impervious to any processing steps or solutions in which the combination of matrix with recording device is placed, and that is transmissive of read or write signals transmitted to the memory. The device may also include at least one support matrix disposed on an outer surface of the shell for retaining molecules or biological particles.
The recording device [containing the memory] is typically coated with at least one layer of material, such as a protective polymer or a glass, including polystyrene, heavy metal-free glass, plastic, ceramic, and may be coated with more than one layer of this and other materials. For example, it may be coated with a ceramic or glass, which is then coated with or linked to the matrix material. Alternatively, the glass or ceramic or other coating may serve as the matrix.
The data storage device or memory is programmed with or encoded with information that identifies molecules or biological particles, either by their process of preparation, their identity, their batch number, category, physical or chemical properties, combinations of any of such information, or other such identifying information. The molecules or biological particles are in physical contact, direct or indirect, or in proximity with the matrix, which in turn is in physical contact or in the proximity of the recording device that contains the data storage memory. Typically, the matrix is on the surface of the recording device and the molecules and biological particles are in physical contact with the matrix material.
The matrix combinations, thus, contain a matrix material, typically in particulate form, in physical contact with a tiny device containing one or more remotely programmable data storage units [memories]. Contact can be effected by placing the recording device with memory on or in the matrix material or in a solution that is in contact with the matrix material or by linking the device, either by direct or indirect covalent or non-covalent interactions, chemical linkages or by other interactions, to the matrix.
For example, such contact is effected chemically, by chemically coupling the device with data storage unit to the matrix, or physically by coating the recording device with the matrix material or another material, by physically inserting or encasing the device in the matrix material, by placing the device onto the matrix or by any other means by which the device can be placed in contact with or in proximity to the matrix material.
Thus, combinations of a miniature recording device that contains or is a data storage unit linked to or in proximity with matrices or supports used in chemical and biochemical applications, such as combinatorial chemistry, peptide synthesis, nucleic acid synthesis, nucleic acid amplification methods, organic template chemistry, nucleic acid sequencing, screening for drugs, particularly high throughput screening, phage display screening, cell sorting, tracking of biological particles and other such methods, are provided. These combinations of matrix material with data storage unit [or recording device including the unit] are herein referred to as matrices with memories.
The matrices are either particulate of a size that is roughly 10 mm³ or smaller, typically 1 mm³ or smaller, or a continuous medium, such as a microtiter plate or well or plastic or other solid polymeric vial or glass vial. In instances in which the matrix is continuous, the data storage device [memory] may be placed in or on the matrix medium or may be embedded in the material of the matrix. More than one data storage device may be in proximity to or contact with a matrix particle. For example, microtiter plates with the recording device containing the data storage unit [remotely programmable memory] embedded in each well or vials [typically with a 1 ml or smaller capacity] with an embedded recording device, may be manufactured. In other embodiments, the memory device may be linked to or in proximity to more than one matrix particle.
The combination of matrix with memory is used by contacting it with, linking it to, or placing it in proximity with a molecule or biological particle, such as a virus or phage particle, a bacterium or a cell, to produce a second combination of a matrix with memory and a molecule or biological particle. In certain instances, such combinations of matrix with memory or combination of matrix with memory and molecule or biological particle may be prepared when used or may be prepared before use and packaged or stored as such for future use.
Since matrix materials have many known uses in conjunction with molecules and biological particles, there are a multitude of methods known to artisans of skill in this art for linking, joining or physically contacting the molecule or biological particle with the matrix material. In some embodiments, the recording device with data storage unit is placed in a solution or suspension of the molecule or biological particle of interest. In such instances, the container, such as the microtiter plate or test tube or other vial, is the matrix material. The recording device is placed in or on the matrix or can be embedded, encased or dipped in the matrix material.
The miniature recording device containing the data storage unit(s) with remotely programmable memory, includes, in addition to the remotely programmable memory, means for receiving information for storage in the memory and for retrieving information stored in the memory. Such means is typically an antenna, which also serves to provide power, that can be tuned to a desired electromagnetic frequency to program the memory. Preferred frequencies are any that do not substantially alter the molecular biological interactions of interest, such as those that are not substantially absorbed by the molecules or biological particles linked to the matrix or in proximity of the matrix, and that do not alter the support properties of the matrix. Radio frequencies are presently preferred, but other frequencies or optical lasers may be used, as long as the selected frequency or optical laser does not interfere with the interactions of the molecules or biological particles of interest. Thus, information in the form of data points corresponding to such information is stored in and retrieved from the data storage device by application of a selected electromagnetic radiation frequency.
The preferred miniature recording device for use in the combinations herein is a single substrate of a size preferably less than about 10 mm³, that includes a remotely programmable data storage unit(s) [memory], preferably a non-volatile memory, and an antenna for receiving or transmitting an electromagnetic signal, preferably a radio frequency signal; the antenna, memory and other components are preferably provided on a single substrate, thereby minimizing the size of the device. The device is preferably smaller than 10 mm³ in volume, more preferably less than 5 mm³, most preferably about 1 mm³ or smaller, and is rapidly programmable, preferably in less than 5 seconds, more preferably in about 1 second, and most preferably in about 1 millisecond or less. The preferred memory is non-volatile, permanent, and relies on antifuse circuitry.
Containers, such as vials, tubes, microtiter plates, and the like, which are in contact with a recording device that contains a data storage unit with programmable memory are also provided. The container is typically of a size used in immunoassays or hybridization reactions, generally a liter or less, typically less than 100 ml, and often less than about 10 ml in volume. Alternatively the container can be in the form of a plurality of wells, such as a microtiter plate, each well having about 1 ml or less in volume. The container is transmissive to the electromagnetic radiation, such as radio frequencies, infrared wavelengths, ultraviolet wavelengths, microwave frequencies, visible wavelengths, X-rays or laser light, used to program the recording device.
Methods for electromagnetically tagging molecules or biological particles are provided. Such tagging is effected by placing the molecules or biological particles of interest in proximity with the recording device or with the matrix with memory, and programming or encoding the identity of the molecule or synthetic history of the molecules or batch number or other identifying information into the memory. The identified molecule or biological particle is then used in the reaction or assay of interest and tracked by virtue of its linkage to the matrix with memory or its proximity to the matrix with memory, which can be queried to identify the molecule or biological particle.
In particular, methods for tagging constituent members of combinatorial libraries and other libraries or mixtures of diverse molecules and biological particles are provided. These methods involve electromagnetically tagging molecules, particularly constituent members of a library, by contacting the molecules or biological particles or bringing such molecules or particles into proximity with a matrix with memory and programming the memory with retrievable information from which the identity, synthesis history, batch number or other identifying information can be retrieved. The contact is preferably effected by coating, completely or in part, the recording device with memory with the matrix and then linking, directly or via linkers, the molecule or biological particle of interest to the matrix support. The memories can be coated with a protective coating, such as a glass or silicon, which can be readily derivatized for chemical linkage or coupling to the matrix material. In other embodiments, the memories can be coated with matrix, such as, for example, by dipping the memory into the polymer prior to polymerization, and allowing the polymer to polymerize on the surface of the memory.
If the matrices are used for the synthesis of the constituent molecules, the memory of each particle is addressed and the identity of the added component is encoded in the memory at [before, during, or preferably after] each step in the synthesis. At the end of the synthesis, the memory contains a retrievable record of all of the constituents of the resulting molecule, which can then be used, either linked to the support, or following cleavage from the support in an assay or for screening or other such application. If the molecule is cleaved from the support with memory, the memory must remain in proximity to the molecule or must in some manner be traceable to the molecule. Such synthetic steps may be automated.
In preferred embodiments, the matrices with memories with linked molecules [or biological particles] are mixed and reacted with a sample according to a screening or assay protocol, and those that react are isolated. The identity of reacted molecules can then be ascertained by remotely retrieving the information stored in the memory and decoding it to identify the linked molecules.
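The step-by-step encoding described in the preceding paragraph can be sketched in software as follows. This is a minimal illustration only: the class `MatrixWithMemory`, the function `split_and_pool`, the building-block names and the round-robin assignment of particles to vessels are all hypothetical choices made for the example, and the append-only list merely stands in for a permanent, non-volatile memory.

```python
import random
from dataclasses import dataclass, field


@dataclass
class MatrixWithMemory:
    """Hypothetical model of one matrix particle and its memory."""
    tag_id: int
    history: list = field(default_factory=list)

    def record_step(self, building_block: str) -> None:
        # Data points are only ever appended, mimicking a write-once memory.
        self.history.append(building_block)

    def read(self) -> tuple:
        # Retrieve the full synthesis record stored so far.
        return tuple(self.history)


def split_and_pool(particles, building_blocks, rounds, seed=0):
    """Simulate pooling, splitting into vessels, coupling and encoding."""
    rng = random.Random(seed)
    for _ in range(rounds):
        rng.shuffle(particles)  # pool and mix all particles
        for i, particle in enumerate(particles):
            # Split: vessel i % n receives building block i % n (an assumption).
            block = building_blocks[i % len(building_blocks)]
            # Encode the added component after the coupling step.
            particle.record_step(block)
    return particles


library = split_and_pool(
    [MatrixWithMemory(tag_id=i) for i in range(9)],
    building_blocks=["Ala", "Gly", "Val"],
    rounds=3,
)
records = {p.tag_id: p.read() for p in library}
```

After three rounds, each memory holds a three-entry record from which the constituents of the molecule on that particle can be read back, without any chemical analysis of the molecule itself.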
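The screen-then-decode sequence in the preceding paragraph can be illustrated with a short sketch. The stored records, the `reacts` predicate (an arbitrary stand-in for a real assay) and the function names are hypothetical and serve only to show that identification happens by reading the memory after isolation.

```python
# Hypothetical stored records: tag id -> building blocks written during synthesis.
stored = {
    1: ("Ala", "Gly", "Val"),
    2: ("Gly", "Gly", "Ala"),
    3: ("Val", "Ala", "Gly"),
}


def reacts(sequence):
    # Stand-in for the assay itself. That sequences beginning with "Gly"
    # react is an arbitrary assumption made for this example.
    return sequence[0] == "Gly"


def screen_and_decode(memories):
    # Step 1: the assay isolates reacting particles; at this point their
    # identities are not yet known to the experimenter.
    isolated = [tag for tag, seq in memories.items() if reacts(seq)]
    # Step 2: each isolated memory is read and decoded into an identity.
    return {tag: "-".join(memories[tag]) for tag in isolated}


hits = screen_and_decode(stored)
```

Only the isolated particles need to be queried, so the cost of identification scales with the number of hits rather than with the size of the library.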
Compositions containing combinations of matrices with memories and compositions of matrices with memories and molecules or biological particles are also provided. In particular, coded or electronically tagged libraries of oligonucleotides, peptides, proteins, non-peptide organic molecules, phage display, viruses and cells are provided. Particulate matrices, such as polystyrene beads, with attached memories, and continuous matrices, such as microtiter plates or slabs, with a plurality of embedded or attached memories are provided.
These combinations of matrix materials with memories and combinations of matrices with memories and molecules or biological particles may be used in any application in which support-bound molecules or biological particles are used. Such applications include, but are not limited to, diagnostics, such as immunoassays, drug screening assays, combinatorial chemistry protocols and other such uses. These matrices with memories can be used to tag cells for uses in cell sorting, to identify molecules in combinatorial syntheses, to label monoclonal antibodies, to tag constituent members of phage displays, in affinity separation procedures, to label DNA and RNA, in nucleic acid amplification reactions [see, e.g., U.S. Pat. Nos. 5,403,484; 5,386,024; 4,683,202 and, for example, International PCT Application WO 94/02634, which describes the use of solid supports in connection with nucleic acid amplification methods], to label known compounds, particularly mixtures of known compounds in multianalyte analyses, to thereby identify unknown compounds, or to label or track unknowns and thereby identify the unknown by virtue of reaction with a known. Thus, the matrices with memories are particularly suited for high throughput screening applications and for multianalyte analyses.
Systems and methods for recording and reading or retrieving the information in the data storage devices regarding the identity or synthesis of the molecules or biological particles are also provided. The systems for recording and reading data include: a host computer or other encoder/decoder instrument having a memory for storing data relating to the identity or synthesis of the molecules, and a transmitter means for receiving a data signal and generating a signal for transmitting a data signal; and a recording device that includes a remotely programmable, preferably non-volatile, memory and transmitter means for receiving a data signal and generating at least a transmitted signal and for providing a write signal to the memory in the recording device.
In particular, the systems include means for writing to and reading from the memory device to store and identify each of the indicators that identify or track the molecules and biological particles. The systems additionally include the matrix material in physical contact with or proximate to the recording device, and may also include a device for separating matrix particles with memory so that each particle or memory can be separately programmed.
Methods for tagging molecules and biological particles by contacting, either directly or indirectly, a molecule or biological particle with a recording device; transmitting from a host computer or decoder/encoder instrument to the device electromagnetic radiation representative of a data signal corresponding to an indicator that either specifies one of a series of synthetic steps or the identity or other information for identification of the molecule or biological particle, whereby the data point representing the indicator is written into the memory, are provided.
Methods for reading identifying information from recording devices linked to or in contact with or in proximity to an electromagnetically tagged molecule or electromagnetically tagged biological particles are provided. These methods include the step of exposing the recording device containing the memory in which the data is stored to electromagnetic radiation [EM]; and transmitting to a host computer or decoder/encoder instrument an indicator representative of the identity of a molecule or biological particle or identification of the molecule or biological particle linked to or in proximity to the recording device.
One, two, three and N-dimensional arrays of the matrices with memories are also provided. Each memory is programmed with its position in the array. Such arrays may be used for blotting, if each matrix particle is coated on at least one side with a suitable material, such as nitrocellulose. For blotting, each memory is coated on at least one side with the matrix material and arranged contiguously to adjacent memories to form a substantially continuous sheet. After blotting, the matrix particles may be separated and reacted with the analyte of interest, after which the physical position of the matrices to which analyte binds may be determined. The amount of bound analyte may also be quantified. Southern, Northern, Western and dot blot assays using such arrays are provided.
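The write and read exchanges between a host instrument and a recording device, as described in the two preceding paragraphs, might be modeled as follows. This is a sketch under stated assumptions: the classes `RecordingDevice` and `Host`, the integer codebook and the capacity limit are hypothetical, and the electromagnetic link is reduced to plain method calls.

```python
class RecordingDevice:
    """Hypothetical device: stores data points and returns them on a read."""

    def __init__(self, capacity=8):
        self._cells = []            # stored data points, in write order
        self._capacity = capacity   # assumed fixed number of memory cells

    def receive_write(self, data_point: int) -> None:
        # A received write signal permanently stores one data point.
        if len(self._cells) >= self._capacity:
            raise MemoryError("memory is full")
        self._cells.append(data_point)

    def respond_to_read(self) -> list:
        # A read signal causes the stored data points to be transmitted back.
        return list(self._cells)


class Host:
    """Hypothetical host computer or encoder/decoder instrument."""

    def __init__(self, codebook):
        self._encode = dict(codebook)                         # indicator -> data point
        self._decode = {v: k for k, v in codebook.items()}    # data point -> indicator

    def tag(self, device, indicator: str) -> None:
        # Encode the indicator and transmit it to the device for storage.
        device.receive_write(self._encode[indicator])

    def identify(self, device) -> list:
        # Query the device and decode the returned data points.
        return [self._decode[point] for point in device.respond_to_read()]


host = Host({"Ala": 1, "Gly": 2, "Val": 3})
device = RecordingDevice()
host.tag(device, "Ala")
host.tag(device, "Gly")
identity = host.identify(device)
```

Keeping the codebook on the host means the device stores only compact data points; the meaning of each point is recovered by the instrument that reads it.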
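The use of position-programmed memories in a blotting array can be sketched as below. The dictionary layout, the function names and the signal value are hypothetical; the point illustrated is only that the stored position allows the original layout to be rebuilt after the particles have been separated and their order lost.

```python
import random
from itertools import product


def build_array(rows, cols):
    # Each memory is programmed with its own (row, column) position.
    return [{"position": (r, c), "bound": 0.0}
            for r, c in product(range(rows), range(cols))]


def reconstruct(particles, rows, cols):
    # Rebuild the two-dimensional map from the positions read out of the memories.
    grid = [[0.0] * cols for _ in range(rows)]
    for particle in particles:
        r, c = particle["position"]
        grid[r][c] = particle["bound"]
    return grid


particles = build_array(2, 3)

# Hypothetical result: analyte binds only the particle that sat at (1, 2).
for particle in particles:
    if particle["position"] == (1, 2):
        particle["bound"] = 0.8

# Separating the particles scrambles their physical order.
random.Random(0).shuffle(particles)

grid = reconstruct(particles, 2, 3)
```

Because each particle carries its own coordinates, the quantified signal can be mapped back to the blot regardless of how the particles were handled after separation.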
Immunoassays, such as enzyme linked immunosorbent assays [ELISAs] in which at least one analyte is linked to a solid support matrix that is combined with a recording device containing a data storage unit with a programmable, preferably remotely programmable and non-volatile, memory are provided.
Molecular libraries, such as phage display libraries and DNA libraries, in which the constituent molecules are combined with a solid support matrix that is combined with a data storage unit with a programmable memory are provided.
Affinity purification protocols in which the affinity resin is combined with a recording device containing a data storage unit with a programmable memory are also provided.