In general, the invention relates to diagnostic methods involving multiplex analysis.
A variety of methods exist to detect multiple species in a biological sample. These include ELISA-based immunoabsorbent assays, protein biochips, and the like. Each of these methods suffers from limitations in detection sensitivity or selectivity, due, for example, to kinetics of binding or sensitivity of detection reagents. In addition, these techniques are limited in the number of molecules that can be rapidly detected.
The present invention involves a novel multiplex diagnostic approach for the ultra-sensitive detection of molecules in biological samples.
In general, in a first aspect, the invention features a method for detecting multiple compounds in a sample, the method involving: (a) contacting the sample with a mixture of binding reagents, the binding reagents being nucleic acid-protein fusions, each having (i) a protein portion which is known to specifically bind to one of the compounds and (ii) a nucleic acid portion which encodes the protein portion and which includes a unique identification tag; (b) allowing the protein portions of the binding reagents and the compounds to form complexes; (c) capturing the binding reagent-compound complexes; (d) amplifying the nucleic acid portions of the complexed binding reagents; and (e) detecting the unique identification tag of each of the amplified nucleic acids, thereby detecting the corresponding compounds in the sample.
In preferred embodiments, the sample is a biological sample; the nucleic acid-protein fusion is an RNA-protein fusion; the nucleic acid-protein fusion is covalently bound; the nucleic acid-protein fusion is covalently bound through a peptide acceptor; the peptide acceptor is puromycin; the binding reagents do not bind the compounds through compound-specific antibody domains; each of the binding reagents includes a scaffold domain; each of the binding reagents includes a fibronectin scaffold domain; the fibronectin scaffold domain is the 10th domain of fibronectin type III; each of the binding reagents includes an antibody scaffold domain; the binding reagents bind the compounds with equilibrium constants of less than about 500 nM; the unique identification tags are detected using a solid support to which are immobilized nucleic acids specific for the unique identification tags and the detection is accomplished by hybridization of the unique identification tags to the immobilized nucleic acids; the amplifying step (d) is carried out using quantitative PCR; the compounds are proteins; the mixture of binding reagents includes at least 5 different nucleic acid-protein fusions, each specifically binding to a different compound; the mixture of binding reagents includes at least 100 different nucleic acid-protein fusions, each specifically binding to a different compound; the mixture of binding reagents includes at least 40,000 different nucleic acid-protein fusions, each specifically binding to a different compound; and/or the mixture of binding reagents includes at least 500,000 different nucleic acid-protein fusions, each specifically binding to a different compound.
In a second aspect, the invention features a method for detecting a compound in a sample, the method involving: (a) contacting the sample with a binding reagent, the binding reagent being a nucleic acid-protein fusion having (i) a protein portion which is known to specifically bind to the compound and (ii) a nucleic acid portion which encodes the protein portion and which includes a unique identification tag; (b) allowing the protein portion of the binding reagent and the compound to form a complex; (c) capturing the binding reagent-compound complex; (d) amplifying the nucleic acid portion of the complexed binding reagent; and (e) detecting the unique identification tag of the amplified nucleic acid, thereby detecting the corresponding compound in the sample.
In a related aspect, the invention features a kit for carrying out compound detection, the kit including: (a) a nucleic acid-protein fusion, wherein the protein portion of the fusion specifically binds the compound and the nucleic acid portion of the fusion encodes the protein portion and includes a unique identification tag; (b) a PCR primer pair, wherein the first of the primers hybridizes to the nucleic acid portion of the fusion 5′ to the unique identification tag and the second of the primers hybridizes to the nucleic acid portion of the fusion 3′ to the unique identification tag and hybridization of the primers to the nucleic acid fusion permits amplification of the unique identification tag; and (c) a solid support including a nucleic acid which can hybridize to the unique identification tag.
In preferred embodiments, the kit further includes Taq polymerase; the nucleic acid-protein fusion is an RNA-protein fusion; the nucleic acid-protein fusion is covalently bound; the nucleic acid-protein fusion is covalently bound through a peptide acceptor; the peptide acceptor is puromycin; the nucleic acid-protein fusion does not bind the compound through a compound-specific antibody domain; the nucleic acid-protein fusion includes a scaffold domain; the nucleic acid-protein fusion includes a fibronectin scaffold domain; the fibronectin scaffold domain is the 10th domain of fibronectin type III; the nucleic acid-protein fusion includes an antibody scaffold domain; the nucleic acid-protein fusion binds the compound with an equilibrium constant of less than about 500 nM; the solid support is a chip; the solid support includes an ordered array of single-stranded nucleic acids on its surface, each of the single-stranded nucleic acids being capable of hybridizing to a different unique identification tag; the compound is a protein; the kit includes at least 5 different nucleic acid-protein fusions, each specifically binding to a different compound; the kit includes at least 100 different nucleic acid-protein fusions, each specifically binding to a different compound; the kit includes at least 40,000 different nucleic acid-protein fusions, each specifically binding to a different compound; and/or the kit includes at least 500,000 different nucleic acid-protein fusions, each specifically binding to a different compound.
According to this approach, one begins with a set of uniquely defined high affinity binding reagents (typically protein binding reagents). Each of these reagents binds to a different target in a sample, facilitating the detection of several targets simultaneously. The targets of the binding reagents are frequently proteins, but they may be any moiety capable of specific binding, including, for example, nucleic acids or sugar moieties. Such binding reagents may represent naturally-occurring or partially or completely synthetic amino acid sequences. Examples of naturally-occurring binding reagents include, without limitation, members of the following binding pairs: antigen/antibody pairs, protein/inhibitor pairs, receptor/ligand pairs (for example cell surface receptor/ligand pairs, such as hormone receptor/peptide hormone pairs), enzyme/substrate pairs (for example, kinase/substrate pairs), lectin/carbohydrate pairs, oligomeric or heterooligomeric protein aggregates, DNA binding protein/DNA binding site pairs, RNA/protein pairs, and nucleic acid duplexes, heteroduplexes, or ligated strands, as well as any molecule which is capable of forming one or more covalent or non-covalent bonds (for example, disulfide bonds) with any portion of a nucleic acid-protein fusion. In addition to naturally-occurring binding partner members, binding reagents may be derived by any technique, for example, by directed evolution approaches using a desired protein as the binding target.
Whether naturally-occurring or synthetic, when mixtures of binding reagents are utilized in a single diagnostic reaction mixture, they are preferably similar in composition and amino acid length. In a particularly preferred approach, one starts with a common amino acid scaffold or structural motif that displays the binding domain on one face of the molecule, as is the case for an antibody scaffold that displays CDR regions as a binding region of the molecule. A particularly useful binding scaffold is the 10th domain of type III fibronectin (see, for example, Lipovsek et al., Protein Scaffolds for Antibody Mimics and Other Binding Proteins, U.S. Ser. No. 09/456,693; U.S. Ser. No. 09/515,260; U.S. Ser. No. 09/688,566; WO 00/34784).
The compound to be detected by the present approach may be any substance to which a protein may bind, and is preferably itself a protein. Such target compounds may be present in any sample, for example, any biological sample. Typical biological samples include, without limitation, any fluid or tissue derived from an organism, for example, a plant or a mammal such as a human.
As a central feature of the invention, each of the binding domains is covalently attached to a nucleic acid that encodes the binding domain. Such nucleic acid-protein fusion molecules can be produced by any method, for example, by the method of Roberts and Szostak (Szostak et al., U.S. Ser. No. 09/007,005, now U.S. Pat. No. 6,258,558 B1, and U.S. Ser. No. 09/247,190, now U.S. Pat. No. 6,261,804 B1; Szostak et al., WO 98/31700; Roberts and Szostak, Proc. Natl. Acad. Sci. USA (1997) vol. 94, p. 12297-12302) using a peptide acceptor, such as puromycin, as a covalent linking agent. As used herein, by a “peptide acceptor” is meant any molecule capable of being added to the C-terminus of a growing protein chain by the catalytic activity of the ribosomal peptidyl transferase function. Typically, such molecules contain (i) a nucleotide or nucleotide-like moiety (for example, adenosine or an adenosine analog (di-methylation at the N-6 amino position is acceptable)), (ii) an amino acid or amino acid-like moiety (for example, any of the 20 D- or L-amino acids or any amino acid analog thereof (for example, O-methyl tyrosine or any of the analogs described by Ellman et al., Meth. Enzymol. 202:301, 1991)), and (iii) a linkage between the two (for example, an ester, amide, or ketone linkage at the 3′ position or, less preferably, the 2′ position); preferably, this linkage does not significantly perturb the pucker of the ring from the natural ribonucleotide conformation. Peptide acceptors may also possess a nucleophile, which may be, without limitation, an amino group, a hydroxyl group, or a sulfhydryl group. In addition, peptide acceptors may be composed of nucleotide mimetics, amino acid mimetics, or mimetics of the combined nucleotide-amino acid structure. As noted above, puromycin represents a preferred peptide acceptor for use in the present method.
In addition to covalently bonded RNA-protein fusions, any other unique, PCR-amplifiable nucleic acid (for example, RNA, DNA, PNA, or any other nucleic acid which includes two or more covalently bonded, naturally-occurring or modified ribonucleotides or deoxyribonucleotides) can be coupled covalently or non-covalently to each individual binding domain. The protein portions of the fusions are typically composed of naturally-occurring amino acid residues, but may also include amino acid analogs or derivatives, joined by peptide or peptoid bond(s).
Of particular importance is that each binding domain is associated with (and can therefore be identified by) a unique, amplifiable nucleic acid tag, and that each tag in a multiplex reaction is of identical (or essentially identical) length to avoid amplification (for example, PCR) biases. Such unique identification tags are nucleic acid sequences that differ sufficiently in sequence from other tags in a given population or reaction mixture that significant cross-hybridization does not occur under the conditions employed. These unique identification tags may be present in the protein encoding portion of the fusion (for example, the tag can be a randomized portion of the protein scaffold, such as a randomized loop of the 10th domain of fibronectin type III). Alternatively, the unique identification tag can be added to the nucleic acid portion of the fusion molecule and be positioned outside of the nucleic acid sequence which encodes the compound binding domain or, if present, its associated scaffold region. In the latter case, unique identification tags may be chosen which most effectively, most selectively, or most conveniently identify the fusion molecule. For example, if binding reagents are deconvoluted on a DNA chip, tag(s) may be chosen which best hybridize to immobilized chip nucleic acid(s) or which are compatible with commercially available chip arrays. Although DNA chips represent a preferred solid support according to the invention, deconvolution may also be carried out on other solid substrates including, without limitation, any other type of chip (for example, silica-based, glass, or gold chip), glass slide, membrane, bead, solid particle (for example, agarose, sepharose, or magnetic bead), column (or column material), membrane (for example, the membrane of a liposome or vesicle), test tube, or microtiter dish.
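The two tag requirements stated above (identical length, and enough sequence difference to avoid cross-hybridization) can be screened computationally before a tag set is used. The following is a minimal sketch only, not part of the described method. It assumes that a simple count of mismatched positions (Hamming distance) is an acceptable first-pass proxy for cross-hybridization; actual cross-hybridization depends on melting temperature and the hybridization conditions employed. The function names and the `min_mismatches` threshold are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def check_tag_set(tags, min_mismatches=3):
    """Return a list of problems found in a candidate tag set.

    An empty list means every tag has the same length and every pair of
    tags differs at no fewer than `min_mismatches` positions.
    `min_mismatches` is an illustrative assumption, not a value taken
    from the text.
    """
    problems = []
    lengths = {len(t) for t in tags}
    if len(lengths) > 1:
        # Unequal lengths could bias amplification; report and stop.
        problems.append(f"tags differ in length: {sorted(lengths)}")
        return problems
    for a, b in combinations(tags, 2):
        d = hamming(a, b)
        if d < min_mismatches:
            problems.append(f"{a} and {b} differ at only {d} position(s)")
    return problems


if __name__ == "__main__":
    candidate_tags = ["ACGTACGT", "TGCATGCA", "ACGTACGA"]
    for problem in check_tag_set(candidate_tags):
        print(problem)
```

A tag set that passes this screen would still need to be verified experimentally under the hybridization conditions actually used.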
Using the affinity binding reagents described above, the present method may be carried out, in one preferred embodiment, as follows. The high affinity binding reagents, each containing a unique affinity binding domain and being present in a mixture of anywhere from 1 to 500,000 (each with equilibrium constants of less than 500 nM), are combined with a sample (for example, a biological sample), under conditions which allow each affinity binding domain to reproducibly recognize a binding partner(s). Following complex formation, the complex is captured. This can be accomplished through any standard procedure, for example, by biotinylation of the biological sample, followed by capture of biotinylated complexes using immobilized streptavidin (for example, streptavidin immobilized on magnetic beads or a column). Alternatively, the initial protein sample may be preabsorbed onto a membrane and the binding domains mixed with the membrane. Complexes remain bound, while unbound binding reagents are washed away.
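To give a sense of what an equilibrium constant of less than 500 nM implies for complex formation, the sketch below evaluates the standard single-site binding relation, fraction bound = [R] / ([R] + Kd). It rests on assumptions not stated in the text: that the equilibrium constant referred to is a dissociation constant (Kd), that binding is simple 1:1, and that the binding reagent is in excess so its free concentration approximates its total concentration. The function name is hypothetical.

```python
def fraction_bound(reagent_nM: float, kd_nM: float) -> float:
    """Fraction of target occupied at equilibrium for 1:1 binding.

    Assumes the reagent is in excess over the target, so that the free
    reagent concentration is approximately the total concentration.
    """
    if reagent_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return reagent_nM / (reagent_nM + kd_nM)


if __name__ == "__main__":
    # Compare a reagent at the 500 nM limit with tighter binders,
    # all applied at the same (illustrative) 50 nM concentration.
    for kd in (500.0, 50.0, 5.0):
        print(f"Kd = {kd} nM: fraction bound = {fraction_bound(50.0, kd):.3f}")
```

Under these assumptions, a reagent applied at a concentration equal to its Kd occupies half of the available target, and occupancy rises as Kd falls.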
Following capture of bound complexes, binding domains that have bound their target(s) in the biological sample are detected simply by performing a PCR reaction using primers which hybridize to the nucleic acid portion of the fusion molecule. Preferably, the PCR reaction is carried out using standard quantitative methods (for example, using TaqMan by Perkin-Elmer).
If multiple complexes are isolated, the isolated pool is then deconvoluted and individual members identified. The identification step may be accomplished through direct sequencing. Alternatively, in a preferred feature of the invention, the isolated pool is deconvoluted and bound analytes identified using DNA chip array detection. In one preferred method, the PCR reaction is stopped after predefined numbers of cycles, and aliquots are extracted. In this way, DNA array detection is performed on each aliquot, allowing for quantitative analysis of amounts of each species present in the pool. Again, a critical feature of the PCR step is that the unique identification tag is amplified, and that each amplified segment is the result of using identical primers that generate a DNA product of identical (or essentially identical) size.
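The aliquot-based quantitation described above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the procedure of the invention: it assumes each tag's array signal is read at each sampled cycle, that a tag is scored at the first sampled cycle where its signal reaches a chosen threshold, and that amplification proceeds with the same per-cycle efficiency for every tag (ideal doubling by default), which the identical primers and identical product length are intended to support. All names, the threshold, and the example signal values are hypothetical.

```python
def first_cycle_above(signals_by_cycle, threshold):
    """Return the earliest sampled cycle whose signal reaches threshold."""
    for cycle in sorted(signals_by_cycle):
        if signals_by_cycle[cycle] >= threshold:
            return cycle
    return None  # never detected in the sampled aliquots


def relative_abundance(aliquots, threshold, efficiency=2.0):
    """Estimate starting amounts relative to the most abundant tag.

    `aliquots` maps tag name -> {cycle number: array signal}.
    A tag crossing the threshold n cycles later than the earliest tag is
    estimated to have started at efficiency**(-n) of its amount.
    Resolution is limited by the spacing of the sampled cycles.
    """
    crossing = {
        tag: first_cycle_above(signals, threshold)
        for tag, signals in aliquots.items()
    }
    detected = {t: c for t, c in crossing.items() if c is not None}
    if not detected:
        return {}
    earliest = min(detected.values())
    return {t: efficiency ** (earliest - c) for t, c in detected.items()}


if __name__ == "__main__":
    # Made-up signals for three tags sampled at cycles 10, 15, and 20.
    example = {
        "tagA": {10: 5, 15: 120, 20: 900},
        "tagB": {10: 1, 15: 8, 20: 150},
        "tagC": {10: 0, 15: 1, 20: 3},
    }
    print(relative_abundance(example, threshold=100))
```

Tags that never reach the threshold are omitted from the result rather than reported as zero, since absence of signal in the sampled aliquots only bounds the starting amount from above.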