Array assays between surface-bound binding agents or probes and target molecules in solution may be used to detect the presence of particular biopolymeric analytes in the solution. The surface-bound probes may be oligonucleotides, peptides, polypeptides, proteins, antibodies or other molecules capable of binding with target biomolecules in the solution. Such binding interactions are the basis for many of the methods and devices used in a variety of different fields, e.g., genomics (in sequencing by hybridization, SNP detection, differential gene expression analysis, identification of novel genes, gene mapping, fingerprinting, etc.) and proteomics.
One typical array assay method involves biopolymeric probes immobilized in an array on a substrate such as a glass substrate or the like. A solution containing target molecules (“targets”) that bind with the attached probes is placed in contact with the bound probes under conditions sufficient to promote binding of targets in the solution to the complementary probes on the substrate to form a binding complex that is bound to the surface of the substrate. The binding of target molecules to probe features or spots on the substrate produces a pattern, i.e., a binding complex pattern, on the surface of the substrate, which is then detected. This detection of binding complexes provides desired information about the target biomolecules in the solution.
The binding complexes may be detected by reading or scanning the array with, for example, optical means, although other methods may also be used, as appropriate for the particular assay. For example, laser light may be used to excite fluorescent labels attached to the targets, generating a signal only in those spots on the array that have a labeled target molecule bound to a probe molecule. This pattern may then be digitally scanned for computer analysis. Such patterns can be used to generate data for biological assays such as the identification of drug targets, single-nucleotide polymorphism mapping, monitoring samples from patients to track their response to treatment, assessing the efficacy of new treatments, etc.
Biopolymer arrays can be fabricated using either deposition of the previously obtained biopolymers or in situ synthesis methods. The deposition methods basically involve depositing biopolymers at predetermined locations on a substrate which are suitably activated such that the biopolymers can link thereto. Biopolymers of different sequence may be deposited at different regions on the substrate to yield the completed array. Typical procedures known in the art for deposition of previously obtained polynucleotides, particularly DNA, such as whole oligomers or cDNA, are to load a small volume of DNA in solution in one or more drop dispensers, such as the tip of a pin or an open capillary, and touch the pin or capillary to the surface of the substrate. Such a procedure is described in U.S. Pat. No. 5,807,522. When the fluid touches the surface, some of the fluid is transferred. The pin or capillary must be washed prior to picking up the next type of DNA for spotting onto the array. This process is repeated for many different sequences and, eventually, the desired array is formed. Alternatively, the DNA can be loaded into a drop dispenser in the form of a pulse jet head and fired onto the substrate. Such a technique has been described in WO 95/25116 and WO 98/41531, and elsewhere.
The in situ synthesis methods include those described in U.S. Pat. No. 5,449,754 for synthesizing peptide arrays, as well as WO 98/41531 and the references cited therein for synthesizing polynucleotides (specifically, DNA) using phosphoramidite or other chemistry. Additional patents describing in situ nucleic acid array synthesis protocols and devices include U.S. Pat. Nos. 6,451,998; 6,446,682; 6,440,669; 6,420,180; 6,372,483; 6,323,043; and 6,242,266; the disclosures of which patents are herein incorporated by reference.
Such in situ synthesis methods can be basically regarded as iterating the sequence of: (a) depositing droplets of a protected monomer onto predetermined locations on a substrate to link with either a suitably activated substrate surface or a previously deposited, deprotected monomer; (b) deprotecting the deposited monomer so that it can react with a subsequently deposited protected monomer; and (c) depositing another protected monomer for linking. Different monomers may be deposited at different regions on the substrate during any one cycle so that the different regions of the completed array will carry the different biopolymer sequences as desired in the completed array. One or more intermediate further steps may be required in each iteration, such as oxidation and washing steps.
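The deposit/deprotect iteration described above can be illustrated with a minimal Python sketch. The `Feature` class, its `reactive` flag, and the `synthesize` function are hypothetical names introduced only for illustration; oxidation and washing steps are omitted, and no real chemistry is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One array location. `reactive` models an activated surface or a
    deprotected terminal monomer (an illustrative assumption)."""
    chain: list = field(default_factory=list)
    reactive: bool = True  # the activated substrate accepts the first monomer

    def deposit(self, monomer: str) -> None:
        # A protected monomer can only link to a reactive (deprotected) site.
        if not self.reactive:
            raise RuntimeError("cannot couple onto a protected monomer")
        self.chain.append(monomer)
        self.reactive = False  # the newly linked monomer is still protected

    def deprotect(self) -> None:
        # Step (b): make the terminal monomer available for the next coupling.
        self.reactive = True


def synthesize(targets: dict) -> dict:
    """Build each feature's sequence one layer per cycle.

    `targets` maps a feature name to its desired sequence, written in
    synthesis order (first monomer deposited comes first).
    """
    features = {name: Feature() for name in targets}
    n_cycles = max((len(seq) for seq in targets.values()), default=0)
    for cycle in range(n_cycles):
        # Steps (a)/(c): different monomers may go to different features
        # during the same cycle.
        for name, seq in targets.items():
            if cycle < len(seq):
                features[name].deposit(seq[cycle])
        # Step (b): deprotection is applied to every feature.
        for feature in features.values():
            feature.deprotect()
    return {name: "".join(f.chain) for name, f in features.items()}
```

Calling `synthesize({"f1": "ACGT", "f2": "TTG"})` grows both features in parallel, one layer per cycle, while attempting two deposits on the same feature without an intervening `deprotect()` raises an error.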
With respect to in situ preparation of nucleic acid arrays, in many currently employed protocols successive layers are built up, 3′ to 5′, by pulse-jet depositing an appropriate nucleotide phosphoramidite and an activator to each array feature location of a substrate surface, e.g., a glass wafer surface. The substrate is then removed to a flow cell, and the other phosphoramidite cycle steps (oxidation and deprotection of the 5′-hydroxyl group) are performed in parallel. The substrate is then re-registered, and the next layer is printed.
One exemplary in situ nucleic acid array synthesis device and method for its use is shown in FIG. 1. The printing process is shown diagrammatically in FIGS. 1A and 1B. A detailed view of the printhead is shown in FIG. 2. FIG. 1A shows the relationship between the printing head and the wafer (in this case, a 6″ square, capable of yielding twelve 1″×3″ slides). In what follows, the stage X direction is referred to as “columns” and the direction orthogonal to the stage X direction is referred to as “rows”. During printing, the inkjet is stationary, and a stepping stage moves the wafer over the head in the X direction. As the wafer passes over the head, the head prints the appropriate phosphoramidite (dG, dT, dC or dA; see FIG. 2) to each feature. Since there are 20 nozzles dispensing each chemical component (numbered 1 through 20 in FIG. 2), the array is printed 20 columns at a time, and each column can be mapped back to a particular nozzle in each well (see FIG. 2). Thus, each column on the final array is associated with a nozzle group with 4 members. The nozzle groups have a periodicity of 20, i.e., columns 1, 21, 41, 61, etc. are all written by the same nozzle group.
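The column-to-nozzle-group relationship can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that columns are numbered from 1 and that column 1 is written by nozzle 1; the text above states only that the groups repeat with a period of 20.

```python
NOZZLES_PER_WELL = 20                  # nozzles per chemical component
BASES = ("dG", "dT", "dC", "dA")       # one well per phosphoramidite


def nozzle_index(column: int) -> int:
    """Return the nozzle number (1..20) that writes a 1-based column.

    Assumes column 1 is aligned with nozzle 1.
    """
    if column < 1:
        raise ValueError("columns are numbered from 1")
    return (column - 1) % NOZZLES_PER_WELL + 1


def nozzle_group(column: int) -> list:
    """Return the 4-member nozzle group for a column: the same nozzle
    number in each of the four phosphoramidite wells."""
    n = nozzle_index(column)
    return [(base, n) for base in BASES]


def columns_for_nozzle(nozzle: int, n_columns: int) -> list:
    """List every column, up to n_columns, written by a given nozzle group."""
    return list(range(nozzle, n_columns + 1, NOZZLES_PER_WELL))
```

For example, `columns_for_nozzle(1, 61)` returns `[1, 21, 41, 61]`, matching the periodicity stated above, so a defect in any one nozzle of that group would be confined to those columns.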
Nucleic acid microarrays present a unique challenge to Quality Control (QC) professionals. Microarrays conduct thousands to tens of thousands of quantitative measurements in parallel. However, absolute standard samples exist for few or none of these measurements, and there is as of yet no organization that is seriously attempting to develop such standards. To date, microarray manufacturers have approached this problem in two ways:
1. Representative QC: Manufacturing campaigns include a certain percentage of arrays of a fixed design that are specifically used for QC purposes only. Following hybridization to a specified sample, via a specified protocol, this Representative QC array must then produce data that meet certain pre-determined quantitative standards.
2. Embedded QC: Every array, regardless of design, includes a specific subset of probes that sample the array landscape (i.e. QC probes are “embedded” in each design). These probes must produce data that meet certain quantitative standards, after hybridization to a specified sample, via a specified protocol. Furthermore, in order to provide metrics of data quality during customer use, QC sample(s) targeted to these probes may potentially be included in customer samples.
Representative QC can be used with either custom array designs (i.e. manufacturing campaigns may contain arrays with a variety of relatively uncharacterized designs) or catalog array designs (i.e. manufacturing campaigns contain arrays of only one well-characterized design). Since the Representative QC design is independent of the custom or catalog array designs, it can be used to assess the quality of production runs of either kind, and it can be hybridized independently under optimized conditions.
Embedded QC, however, works best with catalog arrays (unless custom array designs are required to contain the embedded grid as well). The advantage of using Embedded QC is that it is more efficient and robust, since any array can be sampled for QC purposes. Also, Embedded QC data can be collected at the point of use or from a pre-release sample, whereas Representative QC is always performed before release to the end user. Therefore, Embedded QC results will be more representative of the quality of the product, since any degradation that occurred after the product was released to the end user (shipping, storage, handling, etc.) will be detectable. However, the disadvantage compared to Representative QC is that Embedded QC is limited to a fixed budget of features on the array that can be employed for QC purposes. Therefore, it is incumbent upon the Embedded QC designer to be fastidious and, where possible, to assign multiple Embedded QC tasks to a fixed number of probes.
An important function of QC is to sample the performance of each nozzle group, in a relatively uniform manner. One approach to realizing this goal has included the use of “gridline control probes”, which are constructed by synthesizing a set of control 25-mer oligonucleotide probes, either directly on the array surface, on top of (i.e. 5′ of) a 20-mer oligonucleotide tether, or on top of (i.e. 5′ of) a 35-mer oligonucleotide tether. These probes, when exposed to their common complementary labeled 25-mer target in solution, sample all layers of oligonucleotide synthesis on a 60-mer in situ oligonucleotide microarray. Layers 1-20, 26-35 and 46-60 are sampled once, while layers 21-25 and 36-45 may be sampled twice, due to probe overlaps, if the overlapping probes utilize different bases at that layer.
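The layer-sampling arithmetic above can be checked with a short Python sketch. The names below are hypothetical; the sketch assumes that layer 1 is the first layer synthesized on the surface, so that a 25-mer placed on a tether of length t occupies layers t+1 through t+25.

```python
from collections import Counter

PROBE_LENGTH = 25               # control probe length
TETHER_LENGTHS = (0, 20, 35)    # no tether, 20-mer tether, 35-mer tether
ARRAY_LAYERS = 60               # 60-mer in situ array


def layer_coverage(tethers=TETHER_LENGTHS, probe_length=PROBE_LENGTH):
    """Count how many gridline probes occupy each synthesis layer."""
    counts = Counter()
    for tether in tethers:
        # The control probe sits directly on top of its tether.
        for layer in range(tether + 1, tether + probe_length + 1):
            counts[layer] += 1
    return counts


def layers_with_count(counts, n, total_layers=ARRAY_LAYERS):
    """Return the layers (1-based) occupied by exactly n probes."""
    return [layer for layer in range(1, total_layers + 1) if counts[layer] == n]
```

With the default values, `layers_with_count(layer_coverage(), 2)` yields layers 21-25 and 36-45, and no layer is left unoccupied. Whether a doubly occupied layer actually samples two different bases depends on the probe sequence, which this sketch does not model.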
Such gridline probes have found extensive use in Representative QC array fabrication protocols. However, these probes have two disadvantages:
1. In order to sample all of the inkjet nozzles involved in printing an array, the present gridline probes are laid out in continuous rows on the Representative QC array. In order to obtain good sampling of the array surface, many such rows are laid out. The net result is that the gridline probes use up over 60% of the available array features.
2. The single 25-mer control probe sequence shared by the gridline probes samples only 1-2 bases per layer (1 base in regions of no overlap, 2 bases when a layer is sampled by 2 probes that overlap at that layer and utilize a different base at that layer). If printing for an unsampled base fails, the failure is invisible to the gridline probes.
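The second disadvantage can be made concrete with a small Python sketch that lists, for each layer, which bases a shared control sequence samples and which it leaves unsampled. The 25-mer below is a made-up placeholder written in synthesis order, not the actual control probe sequence, and the function names are hypothetical.

```python
BASES = frozenset("ACGT")
TETHER_LENGTHS = (0, 20, 35)   # no tether, 20-mer tether, 35-mer tether

# Hypothetical 25-mer, written in synthesis order (first-printed base first).
# This is an illustrative placeholder, NOT the real gridline control sequence.
CONTROL_PROBE = "ACGTTGCAAGCTTAGCCATGGATCA"


def sampled_bases(probe=CONTROL_PROBE, tethers=TETHER_LENGTHS):
    """Map each layer (1-based) to the set of bases printed there by
    some copy of the control probe."""
    sampled = {}
    for tether in tethers:
        for offset, base in enumerate(probe, start=1):
            sampled.setdefault(tether + offset, set()).add(base)
    return sampled


def unsampled_bases(probe=CONTROL_PROBE, tethers=TETHER_LENGTHS):
    """Map each layer to the bases whose printing failures would go
    undetected by the gridline probes."""
    return {layer: BASES - bases
            for layer, bases in sampled_bases(probe, tethers).items()}
```

Because every layer is sampled for at most two of the four bases, `unsampled_bases()` reports at least two undetectable bases per layer regardless of which control sequence is chosen.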
In in situ nucleic acid array synthesis protocols, there is an interest in the development of an Embedded QC probe design approach and layout scheme that enables highly efficient sampling of the work product of several aspects of the in situ nucleic acid array printing process. Of particular interest would be the development of such an approach that enabled the efficient detection of several known array manufacturing error modes via a relatively small number of probes. The present invention satisfies this need.