Most proteins or small molecules are not known to specifically bind to nucleic acids. The known protein exceptions are those regulatory proteins such as repressors, polymerases, activators and the like which function in a living cell to bring about the transfer of genetic information encoded in the nucleic acids into cellular structures and the replication of the genetic material. Furthermore, small molecules such as GTP bind to some intron RNAs.
Living matter has evolved to limit the function of nucleic acids to a largely informational role. The Central Dogma, as postulated by Crick, both originally and in expanded form, proposes that nucleic acids (either RNA or DNA) can serve as templates for the synthesis of other nucleic acids through replicative processes that "read" the information in a template nucleic acid and thus yield complementary nucleic acids. All of the experimental paradigms for genetics and gene expression depend on these properties of nucleic acids: in essence, double-stranded nucleic acids are informationally redundant because of the chemical concept of base pairs and because replicative processes are able to use that base pairing in a relatively error-free manner.
The individual components of proteins, the twenty natural amino acids, possess sufficient chemical differences and activities to provide an enormous breadth of activities for both binding and catalysis. Nucleic acids, however, are thought to have narrower chemical possibilities than proteins, but to have an informational role that allows genetic information to be passed from virus to virus, cell to cell, and organism to organism. In this context nucleic acid components, the nucleotides, possess only pairs of surfaces that allow informational redundancy within a Watson-Crick base pair. Nucleic acid components need not possess chemical differences and activities sufficient for either a wide range of binding or catalysis.
However, some nucleic acids found in nature do participate in binding to certain target molecules and even a few instances of catalysis have been reported. The range of activities of this kind is narrow compared to proteins and more specifically antibodies. For example, where nucleic acids are known to bind to some protein targets with high affinity and specificity, the binding depends on the exact sequences of nucleotides that comprise the DNA or RNA ligand. Thus, short double-stranded DNA sequences are known to bind to target proteins that repress or activate transcription in both prokaryotes and eukaryotes. Other short double-stranded DNA sequences are known to bind to restriction endonucleases, protein targets that can be selected with high affinity and specificity. Other short DNA sequences serve as centromeres and telomeres on chromosomes, presumably by creating ligands for the binding of specific proteins that participate in chromosome mechanics. Thus, double-stranded DNA has a well-known capacity to bind within the nooks and crannies of target proteins whose functions are directed to DNA binding. Single-stranded DNA can also bind to some proteins with high affinity and specificity, although the number of examples is rather smaller. From the known examples of double-stranded DNA binding proteins, it has become possible to describe some of the binding interactions as involving various protein motifs projecting amino acid side chains into the major groove of B form double-stranded DNA, providing the sequence inspection that allows specificity.
Double-stranded RNA occasionally serves as a ligand for certain proteins, for example, the endonuclease RNase III from E. coli. There are more known instances of target proteins that bind to single-stranded RNA ligands, although in these cases the single-stranded RNA often forms a complex three-dimensional shape that includes local regions of intramolecular double-strandedness. The amino-acyl tRNA synthetases bind tightly to tRNA molecules with high specificity. A short region within the genomes of RNA viruses binds tightly and with high specificity to the viral coat proteins. A short sequence of RNA binds to the bacteriophage T4-encoded DNA polymerase, again with high affinity and specificity. Thus, it is possible to find RNA and DNA ligands, either double- or single-stranded, serving as binding partners for specific protein targets. Most known DNA binding proteins bind specifically to double-stranded DNA, while most RNA binding proteins recognize single-stranded RNA. This statistical bias in the literature no doubt reflects the present biosphere's statistical predisposition to use DNA as a double-stranded genome and RNA as a single-stranded entity in the roles RNA plays beyond serving as a genome. Chemically there is no strong reason to dismiss single-stranded DNA as a fully able partner for specific protein interactions.
RNA and DNA have also been found to bind to smaller target molecules. Double-stranded DNA binds to various antibiotics, such as actinomycin D. A specific single-stranded RNA binds to the antibiotic thiostrepton; specific RNA sequences and structures probably bind to certain other antibiotics, especially those whose function is to inactivate ribosomes in a target organism. A family of evolutionarily related RNAs binds with specificity and decent affinity to nucleotides and nucleosides (Bass, B. and Cech, T. (1984) Nature 308:820-826) as well as to one of the twenty amino acids (Yarus, M. (1988) Science 240:1751-1758). Catalytic RNAs are now known as well, although these molecules perform over a narrow range of chemical possibilities, which are thus far related largely to phosphodiester transfer reactions and hydrolysis of nucleic acids.
Despite these known instances, the great majority of proteins and other cellular components are thought not to bind to nucleic acids under physiological conditions and such binding as may be observed is non-specific. Either the capacity of nucleic acids to bind other compounds is limited to the relatively few instances enumerated supra, or the chemical repertoire of the nucleic acids for specific binding is avoided (selected against) in the structures that occur naturally. The present invention is premised on the inventors' fundamental insight that nucleic acids as chemical compounds can form a virtually limitless array of shapes, sizes and configurations, and are capable of a far broader repertoire of binding and catalytic functions than those displayed in biological systems.
The chemical interactions have been explored in cases of certain known instances of protein-nucleic acid binding. For example, the size and sequence of the RNA site of bacteriophage R17 coat protein binding has been identified by Uhlenbeck and coworkers. The minimal natural RNA binding site (21 bases long) for the R17 coat protein was determined by subjecting variable-sized labeled fragments of the mRNA to nitrocellulose filter binding assays in which protein-RNA fragment complexes remain bound to the filter (Carey et al. (1983) Biochemistry 22:2601). A number of sequence variants of the minimal R17 coat protein binding site were created in vitro in order to determine the contributions of individual nucleotides to protein binding (Uhlenbeck et al. (1983) J. Biomol. Structure Dynamics 1:539 and Romaniuk et al. (1987) Biochemistry 26:1563). It was found that the maintenance of the hairpin loop structure of the binding site was essential for protein binding but, in addition, that nucleotide substitutions at most of the single-stranded residues in the binding site, including a bulged nucleotide in the hairpin stem, significantly affected binding. In similar studies, the binding of bacteriophage Qβ coat protein to its translational operator was examined (Witherell and Uhlenbeck (1989) Biochemistry 28:71). The Qβ coat protein RNA binding site was found to be similar to that of R17 in size, and in predicted secondary structure, in that it comprised about 20 bases with an 8 base pair hairpin structure which included a bulged nucleotide and a 3 base loop. In contrast to the R17 coat protein binding site, only one of the single-stranded residues of the loop is essential for binding and the presence of the bulged nucleotide is not required. The protein-RNA binding interactions involved in translational regulation display significant specificity.
Nucleic acids are known to form secondary and tertiary structures in solution. The double-stranded forms of DNA include the so-called B double-helical form, Z-DNA and superhelical twists (Rich, A. et al. (1984) Ann. Rev. Biochem. 53:791-846). Single-stranded RNA forms localized regions of secondary structure such as hairpin loops and pseudoknot structures (Schimmel, P. (1989) Cell 58:9-12). However, little is known concerning the effects of unpaired loop nucleotides on stability of loop structure, kinetics of formation and denaturation, thermodynamics, and almost nothing is known of tertiary structures and three dimensional shape, nor of the kinetics and thermodynamics of tertiary folding in nucleic acids (Tuerk, C. et al. (1988) Proc. Natl. Acad. Sci. USA 85:1364-1368).
A type of in vitro evolution was reported in replication of the RNA bacteriophage Qβ (Mills, D. R. et al. (1967) Proc. Natl. Acad. Sci. USA 58:217-224; Levisohn, R. and Spiegelman, S. (1968) Proc. Natl. Acad. Sci. USA 60:866-872; Levisohn, R. and Spiegelman, S. (1969) Proc. Natl. Acad. Sci. USA 63:805-811; Saffhill, R. et al. (1970) J. Mol. Biol. 51:531-539; Kacian, D. L. et al. (1972) Proc. Natl. Acad. Sci. USA 69:3038-3042; Mills, D. R. et al. (1973) Science 180:916-927). The phage RNA serves as a poly-cistronic messenger RNA directing translation of phage-specific proteins and also as a template for its own replication catalyzed by Qβ RNA replicase. This RNA replicase was shown to be highly specific for its own RNA templates. During the course of cycles of replication in vitro, small variant RNAs were isolated which were also replicated by Qβ replicase. Minor alterations in the conditions under which cycles of replication were performed were found to result in the accumulation of different RNAs, presumably because their replication was favored under the altered conditions. In these experiments, the selected RNA had to be bound efficiently by the replicase to initiate replication and had to serve as a kinetically favored template during elongation of RNA. Kramer et al. (1974) J. Mol. Biol. 89:719 reported the isolation of a mutant RNA template of Qβ replicase, the replication of which was more resistant to inhibition by ethidium bromide than the natural template. It was suggested that this mutant was not present in the initial RNA population but was generated by sequential mutation during cycles of in vitro replication with Qβ replicase. The only source of variation during selection was the intrinsic error rate during elongation by Qβ replicase. In these studies what was termed "selection" occurred by preferential amplification of one or more of a limited number of spontaneous variants of an initially homogeneous RNA sequence. There was no selection of a desired result, only that which was intrinsic to the mode of action of Qβ replicase.
Joyce and Robertson (Joyce (1989) in RNA: Catalysis, Splicing, Evolution, Belfort and Shub (eds.), Elsevier, Amsterdam pp. 83-87; and Robertson and Joyce (1990) Nature 344:467) reported a method for identifying RNAs which specifically cleave single-stranded DNA. The selection for catalytic activity was based on the ability of the ribozyme to catalyze the cleavage of a substrate ssRNA or DNA at a specific position and transfer the 3'-end of the substrate to the 3'-end of the ribozyme. The product of the desired reaction was selected by using a deoxyoligonucleotide primer which could bind only to the completed product across the junction formed by the catalytic reaction and allowed selective reverse transcription of the ribozyme sequence. The selected catalytic sequences were amplified by attachment of the promoter of T7 RNA polymerase to the 3'-end of the cDNA, followed by transcription to RNA. The method was employed to identify from a small number of ribozyme variants the variant that was most reactive for cleavage of a selected substrate.
The prior art has not taught or suggested more than a limited range of chemical functions for nucleic acids in their interactions with other substances: as targets for proteins that had evolved to bind certain specific oligonucleotide sequences; and more recently, as catalysts with a limited range of activities. Prior "selection" experiments have been limited to a narrow range of variants of a previously described function. Now, for the first time, it will be understood that the nucleic acids are capable of a vastly broad range of functions and the methodology for realizing that capability is disclosed herein.
U.S. patent application Ser. No. 07/536,428 filed Jun. 11, 1990, of Gold and Tuerk, entitled Systematic Evolution of Ligands by Exponential Enrichment, and U.S. patent application Ser. No. 07/714,131 filed Jun. 10, 1991, of Gold and Tuerk, entitled Nucleic Acid Ligands (see also PCT/US91/04078), describe a fundamentally novel method for making a nucleic acid ligand for any desired target. Each of these applications, collectively referred to herein as the SELEX Patent Applications, is specifically incorporated herein by reference.
The method of the SELEX Patent Applications is based on the unique insight that nucleic acids have sufficient capacity for forming a variety of two- and three-dimensional structures and sufficient chemical versatility available within their monomers to act as ligands (form specific binding pairs) with virtually any chemical compound, whether large or small in size.
The method involves selection from a mixture of candidates and step-wise iterations of structural improvement, using the same general selection theme, to achieve virtually any desired criterion of binding affinity and selectivity. Starting from a mixture of nucleic acids, preferably comprising a segment of randomized sequence, the method, termed SELEX herein, includes steps of contacting the mixture with the target under conditions favorable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound to target molecules, dissociating the nucleic acid-target pairs, amplifying the nucleic acids dissociated from the nucleic acid-target pairs to yield a ligand-enriched mixture of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired.
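The cycle of contacting, partitioning, dissociating and amplifying can be illustrated with a toy numerical model. The following is only a sketch under stated assumptions, not the procedure of the SELEX Patent Applications: it assumes simple 1:1 equilibrium binding with target in excess, a perfect partition, and unbiased amplification back to the starting pool size. The function names, the three-species pool and the dissociation constants are all hypothetical and chosen purely for illustration.

```python
def fraction_bound(kd, target_conc):
    """Equilibrium fraction of a ligand species bound to target.

    Assumes 1:1 binding with target in excess; kd and target_conc
    must be in the same concentration units.
    """
    return target_conc / (target_conc + kd)


def selex_round(pool, kds, target_conc):
    """One toy round: contact, partition, dissociate, amplify.

    pool maps a sequence label to its copy number; kds maps the same
    labels to hypothetical dissociation constants.
    """
    total = sum(pool.values())
    # Contact and partition: keep only the bound share of each species.
    bound = {seq: n * fraction_bound(kds[seq], target_conc)
             for seq, n in pool.items()}
    recovered = sum(bound.values())
    # Dissociate and amplify: scale recovered ligands back to the
    # starting pool size (assumes amplification has no sequence bias).
    return {seq: n * total / recovered for seq, n in bound.items()}


def run_selex(pool, kds, target_conc, rounds):
    """Reiterate the round as many times as desired."""
    for _ in range(rounds):
        pool = selex_round(pool, kds, target_conc)
    return pool


# Hypothetical pool: one rare tight binder among many weaker ones.
start = {"tight": 1.0, "medium": 1e3, "weak": 1e6}
kds = {"tight": 1e-9, "medium": 1e-7, "weak": 1e-5}
final = run_selex(start, kds, target_conc=1e-9, rounds=5)
tight_share = final["tight"] / sum(final.values())
print(f"tight binder share after 5 rounds: {tight_share:.6f}")
```

In this model each round multiplies the relative abundance of a species by its bound fraction, so a species that starts as one copy in a million can come to dominate the mixture after a handful of rounds. Real selections are noisier, since partitioning is imperfect and amplification can favor some sequences over others.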
While not bound by a theory of preparation, SELEX is based on the inventors' insight that within a nucleic acid mixture containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target. A nucleic acid mixture comprising, for example, a 20 nucleotide randomized segment can have 4^20 candidate possibilities. Those which have the higher affinity constants for the target are most likely to bind. After partitioning, dissociation and amplification, a second nucleic acid mixture is generated, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favor the best ligands until the resulting nucleic acid mixture is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands.
Cycles of selection and amplification are repeated until a desired goal is achieved. In the most general case, selection/amplification is continued until no significant improvement in binding strength is achieved on repetition of the cycle. The method may be used to sample as many as about 10^18 different nucleic acid species. The nucleic acids of the test mixture preferably include a randomized sequence portion as well as conserved sequences necessary for efficient amplification. Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomized nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids. The variable sequence portion may contain fully or partially random sequence; it may also contain subportions of conserved sequence incorporated with randomized sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations.
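The sequence-space figures quoted above (four possible bases at each of 20 randomized positions, and a sampling limit of about 10^18 species) can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the coverage figure is an upper bound that assumes every molecule in the mixture carries a distinct sequence, which random synthesis does not guarantee.

```python
def library_size(random_length, alphabet=4):
    """Number of distinct sequences in a fully randomized segment."""
    return alphabet ** random_length


def longest_enumerable_segment(capacity, alphabet=4):
    """Longest randomized length whose sequence space fits within capacity."""
    n = 0
    while alphabet ** (n + 1) <= capacity:
        n += 1
    return n


def max_coverage(random_length, capacity):
    """Upper bound on the fraction of sequence space a mixture can hold."""
    return min(1.0, capacity / library_size(random_length))


SAMPLING_LIMIT = 10 ** 18  # "about 10^18" species, taken from the text

print(library_size(20))                            # size of a 20-nt random library
print(longest_enumerable_segment(SAMPLING_LIMIT))  # longest segment within the limit
print(max_coverage(30, SAMPLING_LIMIT))            # best-case coverage at 30 nt
```

On these numbers a 20 nucleotide randomized segment lies well within the sampling limit, while segments of about 30 nucleotides or longer can be sampled only in part, which is one reason mutagenesis during the iterations can be useful for longer variable regions.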
In one embodiment of the method of the SELEX Patent Applications, the selection process is so efficient at isolating those nucleic acid ligands that bind most strongly to the selected target, that only one cycle of selection and amplification is required. Such an efficient selection may occur, for example, in a chromatographic-type process wherein the ability of nucleic acids to associate with targets bound on a column operates in such a manner that the column is sufficiently able to allow separation and isolation of the highest affinity nucleic acid ligands.
In many cases, it is not necessarily desirable to perform the iterative steps of SELEX until a single nucleic acid ligand is identified. The target-specific nucleic acid ligand solution may include a family of nucleic acid structures or motifs that have a number of conserved sequences and a number of sequences which can be substituted or added without significantly affecting the affinity of the nucleic acid ligands to the target. By terminating the SELEX process prior to completion, it is possible to determine the sequence of a number of members of the nucleic acid ligand solution family.
A variety of nucleic acid primary, secondary and tertiary structures are known to exist. The structures or motifs that have been shown most commonly to be involved in non-Watson-Crick type interactions are referred to as hairpin loops, symmetric and asymmetric bulges, pseudoknots and myriad combinations of the same. Almost all known cases of such motifs suggest that they can be formed in a nucleic acid sequence of no more than 30 nucleotides. For this reason, it is often preferred that SELEX procedures with contiguous randomized segments be initiated with nucleic acid sequences containing a randomized segment of between about 20-50 nucleotides.
The SELEX Patent Applications also describe methods for obtaining nucleic acid ligands that bind to more than one site on the target molecule, and to nucleic acid ligands that include non-nucleic acid species that bind to specific sites on the target. The SELEX method provides means for isolating and identifying nucleic acid ligands which bind to any envisionable target. However, in preferred embodiments the SELEX method is applied to situations where the target is a protein, including both nucleic acid-binding proteins and proteins not known to bind nucleic acids as part of their biological function.
Thrombin is a multifunctional serine protease that has important procoagulant and anticoagulant activities. As a procoagulant enzyme thrombin clots fibrinogen, activates clotting factors V, VIII, and XIII, and activates platelets. The specific cleavage of fibrinogen by thrombin initiates the polymerization of fibrin monomers, a primary event in blood clot formation. The central event in the formation of platelet thrombi is the activation of platelets from the "nonbinding" to the "binding" mode and thrombin is the most potent physiologic activator of platelet aggregation (Berndt and Phillips (1981) in Platelets in Biology and Pathology, J. L. Gordon, ed. (Amsterdam:Elsevier/North Holland Biomedical Press), pp. 43-74; Hansen and Harker (1988) Proc. Natl. Acad. Sci. USA 85:3184-3188; Eidt et al. (1989) J. Clin. Invest. 84:18-27). Thus, as a procoagulant, thrombin plays a key role in the arrest of bleeding (physiologic hemostasis) and formation of vasoocclusive thrombi (pathologic thrombosis).
As an anticoagulant, thrombin binds to thrombomodulin (TM), a glycoprotein expressed on the surface of vascular endothelial cells. TM alters substrate specificity from fibrinogen and platelets to protein C through a combination of an allosteric change in the active site conformation and an overlap of the TM and fibrinogen binding sites on thrombin. Activated protein C, in the presence of a phospholipid surface, Ca2+, and a second vitamin K-dependent protein cofactor, protein S, inhibits coagulation by proteolytically degrading factors Va and VIIIa. Thus the formation of the thrombin-TM complex converts thrombin from a procoagulant to an anticoagulant enzyme, and the normal balance between these opposing activities is critical to the regulation of hemostasis.
Thrombin is also involved in biological responses that are far removed from the clotting system (reviewed in Zimmerman et al. (1986) Ann. N. Y. Acad. Sci. 485:349-368; Marx (1992) Science 256:1278-1280). Thrombin is chemotactic for monocytes (Bar-Shavit et al. (1983) Science 220:728-730), mitogenic for lymphocytes (Chen et al. (1976) Exp. Cell Res. 101:41-46), mesenchymal cells (Chen and Buchanan (1975) Proc. Natl. Acad. Sci. USA 72:131-135), and fibroblasts (Marx (1992) supra). Thrombin activates endothelial cells to express the neutrophil adhesive protein GMP-140 (PADGEM) (Hattori et al. (1989) J. Biol. Chem. 264:7768-7771) and produce platelet-derived growth factor (Daniel et al. (1986) J. Biol. Chem. 261:9579-9582). Recently it has been shown that thrombin causes cultured nerve cells to retract their neurites (reviewed in Marx (1992) supra).
The mechanism by which thrombin activates platelets and endothelial cells is through a functional thrombin receptor found on these cells. A putative thrombin cleavage site (LDR/S) in the receptor suggests that the thrombin receptor is activated by proteolytic cleavage of the receptor. This cleavage event "unmasks" an N-terminal domain which then acts as the ligand, activating the receptor (Vu et al. (1991) Cell 64:1057-1068).
Vascular injury and thrombus formation represent the key events in the pathogenesis of various vascular diseases, including atherosclerosis. The pathogenic processes of the activation of platelets and/or the clotting system leading to thrombosis in various disease states and in various sites, such as the coronary arteries, cardiac chambers, and prosthetic heart valves, appear to be different. Therefore, the use of a platelet inhibitor, an anticoagulant, or a combination of both may be required in conjunction with thrombolytics to open closed vessels and prevent reocclusion.
Controlled proteolysis by compounds of the coagulation cascade is critical for hemostasis. As a result, a variety of complex regulatory systems exist that are based, in part, on a series of highly specific protease inhibitors. In a pathological situation functional inhibitory activity can be interrupted by excessive production of active protease or inactivation of inhibitory activity. Perpetuation of inflammation in response to multiple trauma (tissue damage) or infection (sepsis) depends on proteolytic enzymes, both of plasma cascade systems, including thrombin, and lysosomal origin. Multiple organ failure (MOF) in these cases is enhanced by the concurrently arising imbalance between proteases and their inhibitory regulators. An imbalance of thrombin activity in the brain may lead to neurodegenerative diseases.
Thrombin is naturally inhibited in hemostasis by binding to antithrombin III (ATIII), in a heparin-dependent reaction. Heparin exerts its effect through its ability to accelerate the action of ATIII. In the brain, protease nexin (PN-1) may be the natural inhibitor of thrombin to regulate neurite outgrowth.
Heparin is a glycosaminoglycan composed of chains of alternating residues of D-glucosamine and uronic acid. Heparin is currently used extensively as an anticoagulant in the treatment of unstable angina, pulmonary embolism, atherosclerosis, thrombosis, and following myocardial infarction. Its anticoagulant effect is mediated through its interaction with ATIII. When heparin binds ATIII, the conformation of ATIII is altered, and it becomes a significantly enhanced inhibitor of thrombin. Although heparin is generally considered to be effective for certain indications, it is believed that the physical size of the ATIII-heparin complex prevents access to much of the biologically active thrombin in the body, thus diminishing its ability to inhibit clot formation. Side effects of heparin include bleeding, thrombocytopenia, osteoporosis, skin necrosis, alopecia, hypersensitivity and hypoaldosteronism.
Hirudin is a potent peptide inhibitor of thrombin derived from the European medicinal leech Hirudo medicinalis. Hirudin inhibits all known functions of α-thrombin, and has been shown kinetically to bind thrombin at two separate sites: a high affinity site at or near the catalytic site for serine protease activity and a second anionic exosite. The anionic exosite also binds fibrinogen, heparin, TM and probably the receptor involved in mediating the activation of platelets and endothelial cells. A C-terminal hirudin peptide, which has been shown by co-crystallization with thrombin to bind in the anionic exosite, has inhibitory effects on fibrin formation, platelet and endothelial cell activation, and Protein C activation via TM binding, presumably by competing for binding at this site. This peptide does not inhibit proteolytic activity towards tripeptide chromogenic substrates, Factor V or X.
The structure of thrombin makes it a particularly desirable target for nucleic acid binding, due to the anionic exosite. Site-directed mutagenesis within this site has shown that fibrinogen-clotting and TM binding activities are separable. Conceivably, an RNA ligand could be selected that has procoagulatory and/or anticoagulatory effects depending on how it interacts with thrombin, i.e., which substrate it mimics.
A single-stranded DNA ligand to thrombin has been prepared according to a procedure identical to SELEX. See Bock et al. (1992) Nature 355:564-565. A consensus ligand was identified after relatively few rounds of SELEX were performed, and was shown to have some ability to prevent clot formation in vitro. The ligand is the 15mer DNA 5'-GGTTGGTGTGGTTGG-3', referred to herein as G15D (SEQ ID NO:1). The symmetrical nature of the primary sequence suggests that G15D has a regular fixed tertiary structure. The Kd of G15D to thrombin is about 2 × 10^-7. For effective thrombin inhibition as an anticoagulant, the stronger the affinity of the ligand to thrombin the better.
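Two points in the preceding paragraph lend themselves to a quick numerical check: the symmetry of the G15D primary sequence, and why a lower Kd is preferable for an inhibitor. The sketch below is illustrative only. It assumes simple 1:1 equilibrium binding with ligand in excess over thrombin, and it assumes that the Kd of about 2 × 10^-7 is expressed in molar units, which the text does not state. The function names are hypothetical.

```python
G15D = "GGTTGGTGTGGTTGG"  # SEQ ID NO:1, written 5' to 3'
KD_G15D = 2e-7            # Kd from the text; molar units are an assumption


def is_mirror_symmetric(seq):
    """True if the sequence reads the same 5'->3' and 3'->5'."""
    return seq == seq[::-1]


def fraction_target_bound(ligand_conc, kd):
    """Fraction of target occupied at equilibrium (1:1, ligand in excess)."""
    return ligand_conc / (ligand_conc + kd)


def ligand_conc_for_occupancy(occupancy, kd):
    """Free ligand concentration needed to occupy a given fraction of target."""
    return kd * occupancy / (1.0 - occupancy)


print(is_mirror_symmetric(G15D))
print(fraction_target_bound(KD_G15D, KD_G15D))
print(ligand_conc_for_occupancy(0.9, KD_G15D))
print(ligand_conc_for_occupancy(0.9, KD_G15D / 10))
```

Under this model the ligand concentration needed for a given occupancy scales linearly with Kd, so a ligand binding tenfold more tightly would reach the same level of thrombin occupancy at one tenth the dose. Note that the symmetry tested here is a mirror symmetry of the base sequence itself, not the reverse-complement symmetry of a double-stranded palindrome.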