Automated identification of articles using bar codes has become practical with the availability of integrated circuit technology and computing power at reasonable cost. Such codes are typically used to track and identify consumer goods and other articles of manufacture. One of the first scanners capable of reading a bar code was installed at a supermarket in 1974, and by 1980 more than 90% of all grocery items carried a bar code. By December 1985, more than 12,000 grocery stores were equipped with scanner checkout systems [see, e.g., Harmon et al. (1989) Reading Between the Lines-An Introduction to Bar Code Technology, Helmers Publishing, Inc.]. Bar codes have also been used in other applications, including other inventory control systems and the identification and characterization of responses to mass advertising efforts.
By electro-optically scanning the symbol on an item and generating a corresponding signal, it is possible, in an associated computer whose memory digitally stores the full range of items, to compare the signal derived from the scanned symbol with the stored information. When a match is found, the identity of the item and associated information, such as, in the instance of consumer goods, its price, are retrieved. Thus computer technology is exploited to facilitate identification procedures using machine-readable identifiers.
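The compare-and-match step described above can be sketched as a simple keyed lookup. This is only an illustrative sketch: the table contents, the field names and the function name `identify_item` are hypothetical, and a real checkout system would obtain the symbol string from the scanner hardware rather than from a literal.

```python
# Hypothetical stored item table keyed by the decoded symbol.
# The item names and prices are invented for illustration only.
ITEM_DATABASE = {
    "036000291452": {"name": "example item A", "price_cents": 199},
    "012345678905": {"name": "example item B", "price_cents": 349},
}


def identify_item(scanned_symbol, database=ITEM_DATABASE):
    """Compare the scanned symbol with the stored information.

    Returns the stored record when a match is found, otherwise None.
    """
    return database.get(scanned_symbol)


record = identify_item("036000291452")
if record is not None:
    print(record["name"], record["price_cents"])
else:
    print("no match found")
```

A dictionary is used here because the lookup is by exact match on the decoded symbol; any keyed store would serve the same purpose.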
Bar codes are typically read using lasers that scan from left to right, right to left, or in both directions (or other directions) across a field of alternating dark bars and reflective spaces of varying widths. Multiple scans are typically employed to minimize data errors. Because of the multiplicity of bars and spaces required for each alphanumeric character, bar codes generally require a relatively large space to convey a small amount of data. For instance, each character in the bar code system known as Code 39 requires five bars and four spaces. A high density Code 39 field corresponds to only 9.4 characters per inch. Universal Product Codes (UPCs) are another common bar code used primarily in the retail grocery trade and contain a relatively large number of bars and spaces which allow for error checking, parity checking and reduction of errors caused by manual scanning of articles in grocery stores. They accordingly require even larger space for conveyance of character information. The Codabar code, which has been developed by Pitney Bowes and is used in retail price labeling systems and by Federal Express, is a self-checking code. Each character is represented by a stand-alone group of four bars and three interleaving spaces. Federal Express uses an eleven digit Codabar symbol on each airbill to process more than 450,000 packages per night. Other codes use varying bar and space techniques to represent characters. Because of error checking requirements and for other reasons, however, the space required to place a bar code on an article is relatively large.
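As one concrete illustration of the error checking mentioned above, the twelfth digit of a UPC-A symbol is a modulo-10 check digit computed from the first eleven digits. The source text does not describe this calculation; the sketch below follows the standard UPC-A rule, and the function names are chosen here for illustration.

```python
DIGITS = "0123456789"


def upc_a_check_digit(first_eleven):
    """Compute the UPC-A check digit from the first eleven digits."""
    if len(first_eleven) != 11 or any(ch not in DIGITS for ch in first_eleven):
        raise ValueError("expected exactly 11 decimal digits")
    digits = [int(ch) for ch in first_eleven]
    odd_sum = sum(digits[0::2])   # digits in positions 1, 3, ..., 11
    even_sum = sum(digits[1::2])  # digits in positions 2, 4, ..., 10
    total = 3 * odd_sum + even_sum
    return (10 - total % 10) % 10


def is_valid_upc_a(symbol):
    """Return True when a 12-digit symbol carries a consistent check digit."""
    if len(symbol) != 12 or any(ch not in DIGITS for ch in symbol):
        return False
    return upc_a_check_digit(symbol[:11]) == int(symbol[11])


print(upc_a_check_digit("03600029145"))  # prints 2
print(is_valid_upc_a("036000291452"))    # prints True
```

A single misread digit changes the weighted sum and is therefore detected, which is part of why such symbols carry more bars and spaces than the data alone would require.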
In addition to the large surface area required for the series of bars and spaces that form a typical bar code symbol, the code must be placed on a background that has a high reflectance level. The high level of contrast, or reflectivity ratio, between the dark bars and the reflective spaces, allows the optical sensor in the reader to discern clearly and dependably the transitions between the bars and spaces in the symbol. Ideally, the printed bar should be observed as perfectly black and the spaces should be perfectly reflective. Because those ideal conditions are seldom possible, the industry typically requires that labeling media reflect at least 70% of incident light energy. Surface reflectivity and thus quality of the media on which the bar code is placed directly affects the successful use of the bar code on that media. Additionally, the media cannot be overly transparent or translucent, since those characteristics can attenuate reflected light. Accordingly, only limited types of highly reflective media may be used for placement of bar codes. Space requirements for bar codes further include a "quiet zone" that surrounds the field of bars and spaces. In many codes, this quiet zone constitutes a border around the code symbol, thus requiring even more space for the bar code.
Bar coding also requires very precise print methods. Assuming that the printing operation is capable of printing the required density to achieve the 70% reflectance level, careful attention must be paid to additional major factors that influence bar code effectiveness. Those include ink spread/shrinkage; ink voids/specks; ink smearing; non-uniformity of ink; bar/space width tolerances; edge roughness; and similar factors that must be closely controlled to ensure that the symbol will be easily scannable. In other words, the printer must pay careful attention to using paper or other media that display the correct absorption properties; properly inking the ribbon; carefully controlling hammer pressure; keeping the printhead and paper clean; properly wetting the paper and curing the ink; and maintaining proper adjustment of the printhead control mechanism. These printing details create additional problems and expenses, particularly for placement of bar code symbols on smaller items such as coupons and mail pieces.
"Bar codes" containing an array of marks of any desired size and shape that are arranged in a reference context or frame of one or more columns and one or more rows, together with a reference marker and a reference cue, have also been developed [see U.S. Pat. No. 5,128,528]. The number of rows corresponds to the number of characters contained in the symbology selected for the array. For example, an array that is capable of conveying all the letters of the English language and ten numeral symbols could use 36 rows. The number of columns in the matrix could correspond to the number of characters desired to be conveyed. The roles of the rows and columns in the reference frame may be reversed if desired. In the preferred embodiment, each column contains one or more dots corresponding to the character that is to be conveyed in that column. The reference marker and reference cue may be formed of one shape, of two marks, or according to any other desired arrangement that allows interpretation of the matrix at any desired attitude with respect to the imaging equipment. The reference cue may form a part of the reference marker, or an information dot, if desired.
Thus, numerous types of bar codes, other codes and methodologies for their use are available. Bar coding and other coding technology, however, remain to be fully exploited in areas outside the consumer products domain.
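The row-and-column array code described above (one row per character of the symbology, one column per character conveyed) can be sketched as follows. This is a minimal sketch under stated assumptions, not the encoding of the cited patent: the row order in `SYMBOLOGY` is assumed, exactly one dot per column is assumed, and the reference marker and reference cue that establish orientation are omitted.

```python
# Assumed row order: 26 letters followed by 10 numerals gives 36 rows.
SYMBOLOGY = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def encode(message):
    """Build a 36-row grid with one column per character of the message.

    A dot (True) is placed in the row that corresponds to each character.
    """
    rows = [[False] * len(message) for _ in SYMBOLOGY]
    for col, ch in enumerate(message):
        rows[SYMBOLOGY.index(ch)][col] = True  # ValueError if ch is unknown
    return rows


def decode(rows):
    """Read each column and return the character whose row holds the dot."""
    n_cols = len(rows[0]) if rows else 0
    chars = []
    for col in range(n_cols):
        marked = [r for r in range(len(rows)) if rows[r][col]]
        if len(marked) != 1:
            raise ValueError("column %d does not hold exactly one dot" % col)
        chars.append(SYMBOLOGY[marked[0]])
    return "".join(chars)


grid = encode("AB12")
print(len(grid), len(grid[0]))  # prints 36 4
print(decode(grid))             # prints AB12
```

Swapping the roles of rows and columns, as the text allows, would amount to transposing the grid before decoding.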
Drug Discovery
Drug discovery relies on the ability to identify compounds that interact with a selected target, such as cells, an antibody, receptor, enzyme, transcription factor or the like. Traditional drug discovery relied on collections or "libraries" obtained from proprietary databases of compounds accumulated over many years, natural products, fermentation broths, and rational drug design. Recent advances in molecular biology, chemistry and automation have resulted in the development of rapid, high throughput screening (HTS) protocols to screen these collections. In connection with HTS, methods for generating molecular diversity and for detecting, identifying and quantifying biological or chemical material have been developed. These advances have been facilitated by fundamental developments in chemistry, including the development of highly sensitive analytical methods, solid state chemical synthesis, and sensitive and specific biological assay systems.
Analyses of biological interactions and chemical reactions, however, require the use of labels or tags to track and identify the results of such analyses. Typically biological reactions, such as binding, catalytic, hybridization and signaling reactions, are monitored by labels, such as radioactive, fluorescent, photoabsorptive, luminescent and other such labels, or by direct or indirect enzyme labels. Chemical reactions are also monitored by direct or indirect means, such as by linking the reactions to a second reaction in which a colored, fluorescent, chemiluminescent or other such product results. These analytical methods, however, are often time consuming, tedious and, when practiced in vivo, invasive. In addition, each reaction is typically measured individually, in a separate assay. There is, thus, a need to develop alternative and convenient methods for tracking and identifying analytes in biological interactions and the reactants and products of chemical reactions.
Combinatorial Libraries
The provision and maintenance of compounds to support HTS have become critical. New and innovative methods for lead generation and lead optimization have emerged to address this need for diversity. Among these methods is combinatorial chemistry, which has become a powerful tool in drug discovery and materials science. Methods and strategies for generating diverse libraries, primarily peptide- and nucleotide-based oligomer libraries, have been developed using molecular biology methods and/or simultaneous chemical synthesis methodologies [see, e.g., Dower et al. (1991) Annu. Rep. Med. Chem. 26:271-280; Fodor et al. (1991) Science 251:767-773; Jung et al. (1992) Angew. Chem. Int. Ed. Engl. 31:367-383; Zuckerman et al. (1992) Proc. Natl. Acad. Sci. USA 89:4505-4509; Scott et al. (1990) Science 249:386-390; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; and Gallop et al. (1994) J. Medicinal Chemistry 37:1233-1251]. The resulting combinatorial libraries potentially contain millions of pharmaceutically relevant compounds that can be screened to identify compounds that exhibit a selected activity.
The libraries fall into roughly three categories: fusion-protein-displayed peptide libraries in which random peptides or proteins are presented on the surface of phage particles or proteins expressed from plasmids; support-bound synthetic chemical libraries in which individual compounds or mixtures of compounds are presented on insoluble matrices, such as resin beads [see, e.g., Lam et al. (1991) Nature 354:82-84] and cotton supports [see, e.g., Eichler et al. (1993) Biochemistry 32:11035-11041]; and methods in which the compounds are used in solution [see, e.g., Houghten et al. (1991) Nature 354:84-86; Houghten et al. (1992) BioTechniques 313:412-421; and Scott et al. (1994) Curr. Opin. Biotechnol. 5:40-48]. There are numerous examples of synthetic peptide and oligonucleotide combinatorial libraries. The present direction in this area is to produce combinatorial libraries that contain non-peptidic small organic molecules. Such libraries are based on a basis set of monomers that can be combined either to form mixtures of diverse organic molecules or to form a library based upon a selected pharmacophore monomer.
There are three critical aspects in any combinatorial library: (i) the chemical units of which the library is composed; (ii) generation and categorization of the library; and (iii) identification of library members that interact with the target of interest, together with tracking of intermediary synthesis products and of the multitude of molecules in a single vessel.
The generation of such libraries often relies on the use of solid phase synthesis methods, as well as solution phase methods, to produce collections containing tens of millions of compounds that can be screened in diagnostically or pharmacologically relevant in vitro assay systems. In generating large numbers of diverse molecules by stepwise synthesis, the resulting library is a complex mixture in which a particular compound is present at very low concentrations, so that it is difficult or impossible to determine its chemical structure. Various methods exist for ordered synthesis by sequential addition of particular moieties, or by identifying molecules based on special positioning on a chip. These methods are cumbersome and ultimately impossible to apply to highly diverse and large libraries. Identification of library members that interact with a target of interest, and tracking of intermediary synthesis products and of the multitude of molecules in a single vessel, also remain problems.
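The scale involved can be made concrete with a short calculation. The figures used below (20 monomers available at each step and 6 coupling steps) are assumptions chosen for illustration and do not come from the text; the point is only that the number of distinct sequences grows as the number of monomers raised to the number of steps, so each individual compound makes up a very small share of the mixture.

```python
def library_size(monomers_per_step, steps):
    """Number of distinct linear sequences from a stepwise synthesis."""
    return monomers_per_step ** steps


def fraction_per_compound(monomers_per_step, steps):
    """Share of the mixture held by one compound, assuming equal amounts."""
    return 1 / library_size(monomers_per_step, steps)


# Assumed example: 20 monomers per step, 6 steps.
size = library_size(20, 6)
print(size)                          # prints 64000000
print(fraction_per_compound(20, 6))  # roughly 1.6e-08 of the mixture
```

Under these assumed numbers the library already falls in the "tens of millions" range mentioned above, which is why determining the structure of any single member of the mixture becomes difficult.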
High Throughput Screening
In addition, exploitation of this diversity requires development of methods for rapidly screening compounds. Advances in instrumentation, molecular biology and protein chemistry, and the adaptation of biochemical activity screens into microplate formats, have made it possible to screen large numbers of compounds. Also, because compound screening has been successful in areas of significance for the pharmaceutical industry, high throughput screening (HTS) protocols have assumed importance. Presently, there are hundreds of HTS systems operating throughout the world, which are used not only for compound screening for drug discovery, but also for immunoassays, cell-based assays and receptor-binding assays.
An essential element of high throughput screening for the drug discovery process, and for other areas in which molecules are identified and tracked, is the ability to extract the information made available during synthesis and screening of a library, identification of the active components of intermediary structures, and the reactants and products of assays. While there are several techniques for identification of intermediary products and final products, nanosequencing protocols that provide exact structures are applicable en masse only to naturally occurring linear oligomers such as peptides and amino acids. Mass spectrographic [MS] analysis is sufficiently sensitive to determine the exact mass and fragmentation patterns of individual synthesis steps, but complex analytical mass spectrographic strategies are neither readily automated nor conveniently performed. Also, mass spectrographic analysis provides at best simple connectivity information, but no stereoisomeric information, and generally cannot discriminate among isomeric monomers. Another problem with mass spectrographic analysis is that it requires pure compounds; structural determinations on complex mixtures are either difficult or impossible. Finally, mass spectrographic analysis is tedious and time consuming. Thus, although there are a multitude of solutions to the generation of libraries and to screening protocols, there are no ideal solutions to the problems of identification, tracking and categorization.
These problems arise in any screening or analytical process in which large numbers of molecules or biological entities are screened. In any system, once a desired molecule(s) has been isolated, it must be identified. Simple means for identification do not exist. Because of the problems inherent in any labeling procedure, it would be desirable to have alternative means for tracking and quantitating chemical and biological reactions during synthesis and/or screening processes, and for automating such tracking and quantitating.
Therefore, it is an object herein to provide methods for identification, tracking and categorization of the components of complex mixtures of diverse molecules. It is also an object herein to provide products for such identification, tracking and categorization and to provide assays, diagnostics and screening protocols that use such products. It is of particular interest herein to provide means to track and identify compounds and to perform HTS protocols.