The present invention is directed to methods for the use of mass spectrometry for the determination of the structure of a biomolecule, especially a nucleic acid target, the site(s) of interaction between ligands and the target, the relative binding affinity of ligands for the target, and other useful information. The present invention also provides methods for the use of mass spectrometry for screening chemical mixtures or libraries, especially combinatorial libraries, for individual compounds that bind to a selected target and can be used in pharmaceuticals, veterinary drugs, agricultural chemicals, industrial chemicals, and otherwise. The present invention is further directed to methods for screening multiple targets simultaneously against, e.g., a combinatorial library of compounds.
A further aspect of the invention provides methods for determining the interaction between one or a plurality of molecular species, especially "small" molecules, and a molecular interaction site on a nucleic acid, especially an RNA.
The process of drug discovery is changing at a fast pace because of the rapid progress and evolution of a number of technologies that impact this process. Drug discovery has evolved from what was, several decades ago, essentially random screening of natural products, into a scientific process that not only includes the rational and combinatorial design of large numbers of synthetic molecules as potential bioactive agents, such as ligands, agonists, antagonists, and inhibitors, but also the identification, and mechanistic and structural characterization of their biological targets, which may be polypeptides, proteins, or nucleic acids. These key areas of drug design and structural biology are of tremendous importance to the understanding and treatment of disease. However, significant hurdles need to be overcome when trying to identify or develop high affinity ligands for a particular biological target. These include the difficulty surrounding the task of elucidating the structure of targets and targets to which other molecules may be bound or associated, the large numbers of compounds that need to be screened in order to generate new leads or to optimize existing leads, the need to dissect structural similarities and dissimilarities between these large numbers of compounds, correlating structural features to activity and binding affinity, and the fact that small structural changes can lead to large effects on biological activities of compounds.
Traditionally, drug discovery and optimization have involved the expensive, time-consuming, and therefore slow process of synthesizing and evaluating single compounds bearing incremental structural changes. When using natural products, the individual components of extracts had to be painstakingly separated into pure constituent compounds prior to biological evaluation. Further, all compounds had to be carefully analyzed and characterized prior to in vitro screening. These screens typically included evaluation of candidate compounds for binding affinity to their target, competition for the ligand binding site, or efficacy at the target as determined via inhibition, cell proliferation, activation, or antagonism end points. Because all these facets of drug design and screening slow the process of drug discovery, those involved in discovery efforts have implemented a number of approaches to alleviate or remedy these matters.
One way in which the drug discovery process is being accelerated is by the generation of large collections, libraries, or arrays of compounds. The strategy of discovery has moved from selection of drug leads from among compounds that are individually synthesized and tested to the screening of large collections of compounds. These collections may be from natural sources (Sternberg et al., Proc. Natl. Acad. Sci. USA, 1995, 92, 1609-1613) or generated by synthetic methods such as combinatorial chemistry (Ecker and Crooke, Bio/Technology, 1995, 13, 351-360 and U.S. Pat. No. 5,571,902, incorporated herein by reference). These collections of compounds may be generated as libraries of individual, well-characterized compounds synthesized, e.g. via high throughput, parallel synthesis or as a mixture or a pool of up to several hundred or even several thousand molecules synthesized by split-mix or other combinatorial methods. Screening of such combinatorial libraries has usually involved a binding assay to determine the extent of ligand-receptor interaction (Chu et al., J. Am. Chem. Soc., 1996, 118, 7827-35). Often the ligand or the target receptor is immobilized onto a surface such as a polymer bead or plate. Following detection of a binding event, the ligand is released and identified. However, solid phase screening assays can be rendered difficult by non-specific interactions.
Whether screening of combinatorial libraries is performed via solid-phase, solution methods or otherwise, it can be a challenge to identify those components of the library that bind to the target in a rapid and effective manner and which, hence, are of greatest interest. This is a process that needs to be improved to achieve ease and effectiveness in combinatorial and other drug discovery processes. Several approaches to facilitating the understanding of the structure of biopolymeric and other therapeutic targets have also been developed so as to accelerate the process of drug discovery and development. These include the sequencing of proteins and nucleic acids (Smith, in Protein Sequencing Protocols, Humana Press, Totowa, N.J., 1997; Findlay and Geisow, in Protein Sequencing: A Practical Approach, IRL Press, Oxford, 1989; Brown, in DNA Sequencing, IRL Oxford University Press, Oxford, 1994; Adams, Fields and Venter, in Automated DNA Sequencing and Analysis, Academic Press, San Diego, 1994). These also include elucidating the secondary and tertiary structures of such biopolymers via NMR (Jefson, Ann. Rep. in Med. Chem., 1988, 23, 275; Erikson and Fesik, Ann. Rep. in Med. Chem., 1992, 27, 271-289), X-ray crystallography (Erikson and Fesik, Ann. Rep. in Med. Chem., 1992, 27, 271-289) and the use of computer algorithms to attempt the prediction of protein folding (Copeland, in Methods of Protein Analysis: A Practical Guide to Laboratory Protocols, Chapman and Hall, New York, 1994; Creighton, in Protein Folding, W. H. Freeman and Co., 1992). Experiments such as ELISA (Kemeny and Challacombe, in ELISA and other Solid Phase Immunoassays: Theoretical and Practical Aspects; Wiley, New York, 1988) and radioligand binding assays (Berson and Yalow, Clin. Chim. Acta, 1968, 22, 51-60; Chard, in "An Introduction to Radioimmunoassay and Related Techniques," Elsevier press, Amsterdam/New York, 1982), the use of surface-plasmon resonance (Karlsson, Michaelsson and Mattson, J. Immunol. Methods, 1991, 145, 229; Jonsson et al., Biotechniques, 1991, 11, 620), and scintillation proximity assays (Udenfriend, Gerber and Nelson, Anal. Biochem., 1987, 161, 494-500) are being used to understand the nature of the receptor-ligand interaction.
All of the foregoing paradigms and techniques are now available to persons of ordinary skill in the art and their understanding and mastery is assumed herein.
Likewise, advances have occurred in the chemical synthesis of compounds for high-throughput biological screening. Combinatorial chemistry, computational chemistry, and the synthesis of large collections of mixtures of compounds or of individual compounds have all facilitated the rapid synthesis of large numbers of compounds for in vitro screening. Despite these advances, the process of drug discovery and optimization entails a sequence of difficult steps. This process can also be an expensive one because of the costs involved at each stage and the need to screen large numbers of individual compounds. Moreover, the structural features of target receptors can be elusive.
One step in the identification of bioactive compounds involves the determination of binding affinity of test compounds for a desired biopolymeric or other receptor, such as a specific protein or nucleic acid or combination thereof. For combinatorial chemistry, with its ability to synthesize, or isolate from natural sources, large numbers of compounds for in vitro biological screening, this challenge is magnified. Since combinatorial chemistry generates large numbers of compounds or natural products, often isolated as mixtures, there is a need for methods which allow rapid determination of those members of the library or mixture that are most active or which bind with the highest affinity to a receptor target.
From a related perspective, there are available to the drug discovery scientist a number of tools and techniques for the structural elucidation of biologically interesting targets, for the determination of the strength and stoichiometry of target-ligand interactions, and for the determination of active components of combinatorial mixtures.
Techniques and instrumentation are available for the sequencing of biological targets such as proteins and nucleic acids (e.g. Smith, in Protein Sequencing Protocols, 1997 and Findlay and Geisow, in Protein Sequencing: A Practical Approach, 1989) cited previously. While these techniques are useful, there are some classes and structures of biopolymeric target that are not susceptible to such sequencing efforts, and, in any event, greater convenience and economy have been sought. Another drawback of present sequencing techniques is their inability to reveal anything more than the primary structure, or sequence, of the target.
While X-ray crystallography is a very powerful technique that can allow for the determination of some secondary and tertiary structure of biopolymeric targets (Erikson and Fesik, Ann. Rep. in Med. Chem., 1992, 27, 271-289), this technique can be an expensive procedure and very difficult to accomplish. Crystallization of biopolymers is extremely challenging, difficult to perform at adequate resolution, and is often considered to be as much an art as a science. Further confounding the utility of X-ray crystal structures in the drug discovery process is the inability of crystallography to reveal insights into the solution-phase, and therefore the biologically relevant, structures of the targets of interest.
Some analysis of the nature and strength of interaction between a ligand (agonist, antagonist, or inhibitor) and its target can be performed by ELISA (Kemeny and Challacombe, in ELISA and other Solid Phase Immunoassays, 1988), radioligand binding assays (Berson and Yalow, Clin. Chim. Acta, 1968; Chard, in "An Introduction to Radioimmunoassay and Related Techniques," 1982), surface-plasmon resonance (Karlsson, Michaelsson and Mattson, 1991; Jonsson et al., Biotechniques, 1991), or scintillation proximity assays (Udenfriend, Gerber and Nelson, Anal. Biochem., 1987), all cited previously. The radioligand binding assays are typically useful only for assessing whether the unknown competes with the radioligand for the binding site, and they also require the use of radioactivity. The surface-plasmon resonance technique is more straightforward to use, but is also quite costly. Conventional biochemical assays of binding kinetics, and of dissociation and association constants, are also helpful in elucidating the nature of the target-ligand interactions.
When screening combinatorial mixtures of compounds, the drug discovery scientist will conventionally identify an active pool, deconvolute it into its individual members via resynthesis, and identify the active members via analysis of the discrete compounds. Current techniques and protocols for the study of combinatorial libraries against a variety of biologically relevant targets have many shortcomings. Many of the above-mentioned screening technologies are tedious, costly, multi-step, and of low sensitivity. Further, available techniques do not always afford the most relevant structural information, such as the structure of a target in solution. Instead they provide insights into target structures that may only exist in the solid phase. Also, the need for customized reagents and experiments for specific tasks is a challenge for the practice of current drug discovery and screening technologies. Current methods also fail to provide a convenient solution to the need for deconvolution and identification of active members of libraries without having to perform tedious re-syntheses and re-analyses of discrete members of pools or mixtures.
Therefore, methods for the screening and identification of complex chemical libraries, especially combinatorial libraries, are greatly needed such that one or more of the structures of both the target and ligand, the site of interaction between the target and ligand, and the strength of the target-ligand interaction can be determined. Further, in order to accelerate drug discovery, new methods of screening combinatorial libraries are needed to provide ways for the direct identification of the bioactive members from a mixture and to allow for the screening of multiple biomolecular targets in a single procedure. Straightforward methods that allow selective and controlled cleavage of biopolymers, while also analyzing the various fragments to provide structural information, would be of significant value to those involved in biochemistry and drug discovery and have long been desired. Also, it is preferred that the methods not be restricted to one type of biomolecular target, but instead be applicable to a variety of targets such as nucleic acids, peptides, proteins and oligosaccharides.
A principal object of the present invention is to provide novel methods for the determination of the structure of biomolecular targets and ligands that interact with them and to ascertain the nature and sites of such interactions.
A further object of the invention is to determine the structural features of biomolecular targets such as peptides, proteins, oligonucleotides, and nucleic acids such as the primary sequence, the secondary and folded structures of biopolymers, and higher order tertiary and quaternary structures of biomolecules that result from intramolecular and intermolecular interactions.
Yet another object of the invention is to determine the site(s) and nature of interaction between a biomolecular target and a binding ligand or ligands. The binding ligand may be a "small" molecule, a biomolecule such as a peptide, oligonucleotide or oligosaccharide, a natural product, or a member of a combinatorial library.
A further object of the invention is to determine the relative binding affinity or dissociation constant of ligands that bind to biopolymer targets. Preferably, this gives rise to a determination of relative binding affinities between a biopolymer such as an RNA/DNA target and ligands, e.g., members of combinatorially synthesized libraries.
A still further object of the present invention is to provide a general method for the screening of combinatorial libraries comprising individual compounds or mixtures of compounds against a biomolecular target such as a nucleic acid, so as to determine which components of the library bind to the target.
An additional object of the present invention is to provide methods for the determination of the molecular weight and structure of those members of a combinatorial library that bind to a biomolecular target.
Yet another object of the invention is to provide methods for screening multiple targets such as nucleic acids, proteins, and other biomolecules and oligomers simultaneously against a combinatorial library of compounds.
A still further object of the invention is to ascertain the specificity and affinity of compounds, especially "small" organic molecules, to bind to or interact with molecular interaction sites of biological molecules, especially nucleic acids such as RNA. Such molecules may, and preferably do, form ranked hierarchies of ligands and potential ligands for the molecular interaction sites, ranked in accordance with predicted or calculated likelihood of interaction with such sites.
Another object of the present invention is to alleviate the problem of peak overlap in mass spectra generated from the analysis of mixtures of screening targets and combinatorial or other mixtures of compounds. In a preferred embodiment, the invention provides methods to solve the problems of mass redundancy in combinatorial or other mixtures of compounds, and also provides methods to solve the problem of mass redundancy in the mixture of targets being screened.
A further object of the invention is to provide methods for determining the binding specificity of a ligand for a target in comparison to a control. The present invention facilitates the determination of selectivity, the identification of non-specific effects and the elimination of non-specific ligands from further consideration for drug discovery efforts.
The present invention provides, inter alia, a series of new methods and applications for the determination of the structure and nature of binding of ligands to a wide variety of biomolecular targets. This new approach provides structural information for screening combinatorial libraries for drug lead discovery.
One aspect of the invention is a method to determine the structure of biomolecular targets such as nucleic acids using mass spectrometry. The method provides not only the primary, sequence structure of nucleic acid targets, but also information about the secondary and tertiary structure of nucleic acids, RNA and DNA, including mismatched base pairs, loops, bulges, kinks, and stem structures. This can be accomplished in accordance with one embodiment by incorporating deoxynucleotide residues or other modified residues into an oligoribonucleotide at specific sites followed by selective cleavage of these hybrid RNA/DNA nucleic acids in a mass spectrometer. It has now been found that electrospray ionization of the nucleic acid, cleavage of the nucleic acid, and subsequent tandem MSn spectrometry affords a pattern of fragments that is indicative of the nucleic acid sequence and structure. Cleavage is dependent on the sites of incorporation of the deoxynucleotide or other foreign residues and the secondary structure of the nucleic acid. This method therefore provides mass spectral data that identifies the sites and types of secondary structure present in the sequence of nucleic acids.
When the present methods are performed on a mixture of the biomolecular target and a ligand or molecule that binds to the target, it is possible to ascertain both the extent of interaction and the location of this interaction between ligand and biomolecule. The binding of the ligand to the biomolecule protects the binding site on the biomolecule from facile cleavage during mass spectrometry. Therefore, comparison of ESI-MSn mass spectra generated, using this method, for RNA/DNA in the presence and the absence of a binding ligand or drug reveals the location of binding. This altered cleavage pattern is clearly discerned in the mass spectrum and can be correlated to the sequence and structure of the nucleic acid. Comparison of the abundance of the nucleic acid-ligand noncovalent complex ion to the abundance of a similar complex ion generated from a standard compound (such as paromomycin for the 16S RNA A site) whose binding affinity is known allows for the determination of the relative binding affinity of the test ligand.
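The comparison described above can be illustrated with a minimal Python sketch. It is not the method of the invention as practiced; the function names, the data layout (a mapping from cleavage position to normalized fragment ion abundance), and the 0.5 protection cutoff are all hypothetical choices made for illustration. The relative affinity estimate further assumes that the test ligand and the standard are present at equal concentration and that their complexes ionize with similar efficiency.

```python
from typing import Dict, List


def protected_sites(free: Dict[int, float], bound: Dict[int, float],
                    threshold: float = 0.5) -> List[int]:
    """Return cleavage positions whose fragment abundance drops on ligand binding.

    `free` and `bound` map a cleavage position to the normalized abundance of
    the fragment ion arising from cleavage at that position, measured without
    and with the ligand. The 0.5 threshold is an arbitrary illustrative cutoff.
    """
    sites = []
    for pos, a_free in sorted(free.items()):
        if a_free <= 0:
            continue  # no cleavage seen without ligand, so nothing to compare
        ratio = bound.get(pos, 0.0) / a_free
        if ratio < threshold:
            sites.append(pos)
    return sites


def relative_affinity(complex_test: float, complex_standard: float) -> float:
    """Ratio of the test-ligand complex ion abundance to that of the standard."""
    if complex_standard <= 0:
        raise ValueError("standard complex abundance must be positive")
    return complex_test / complex_standard


# Hypothetical abundances for three cleavage positions.
free_spectrum = {5: 1.0, 9: 0.8, 14: 0.6}
bound_spectrum = {5: 0.9, 9: 0.2, 14: 0.1}
sites = protected_sites(free_spectrum, bound_spectrum)
```

In this made-up example, positions 9 and 14 lose most of their fragment signal when the ligand is present, which is the signature of a protected binding region; position 5 is essentially unchanged.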
The methods of this invention can be used for the rapid screening of large collections of compounds. It is also possible to screen mixtures of large numbers of compounds that are generated via combinatorial or other means. When a large mixture of compounds is exposed to a biomolecular target, such as a nucleic acid, a small fraction of ligands may exhibit some binding affinity to the nucleic acid. The actual number of ligands that may be detected as binders is based on the concentration of the nucleic acid target, the relative concentrations of the components of the combinatorial mixture, and the relative binding affinities of these components. The method is capable of separating different noncovalent complexes, using techniques such as selective ion trapping or accumulation, and analyzing each complex for the structure and identity of the bound ligand using collisionally activated dissociation or MSn experiments. The methods of this invention, therefore, can not only serve as methods to screen combinatorial libraries for molecules that bind to biomolecular targets, but can also provide, in a straightforward manner, the structural identity of the bound ligands. In this manner, any mass redundancy in the combinatorial library does not pose a problem, as the methods can provide high resolution molecular masses and are also able to discern differences between the different structures of ligands of identical molecular mass using tandem methods.
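As an illustration of what mass redundancy means in practice, the following Python sketch groups library members whose masses cannot be told apart within a given tolerance. Such groups are the cases where an accurate mass alone is insufficient and tandem fragmentation would be needed to distinguish the ligands. The function name, the compound names, the masses, and the 5 ppm tolerance are hypothetical and serve only as an example.

```python
from typing import Dict, List


def isobaric_groups(library: Dict[str, float],
                    tol_ppm: float = 5.0) -> List[List[str]]:
    """Group compounds whose neighbouring masses differ by at most tol_ppm.

    `library` maps a compound name to its monoisotopic mass in Da. Only groups
    with more than one member are returned, since a compound with a unique
    mass can be identified from its mass alone.
    """
    items = sorted(library.items(), key=lambda kv: kv[1])
    groups: List[List[str]] = []
    current: List[str] = []
    last_mass = 0.0
    for name, mass in items:
        if current and (mass - last_mass) / mass * 1e6 <= tol_ppm:
            current.append(name)
        else:
            if len(current) > 1:
                groups.append(current)
            current = [name]
        last_mass = mass
    if len(current) > 1:
        groups.append(current)
    return groups


# Hypothetical library: three members share (nearly) the same mass.
example_library = {"a": 500.2000, "b": 500.2001, "c": 612.3000, "d": 500.2000}
redundant = isobaric_groups(example_library)
```

Here compounds a, b, and d fall into one group and would need MSn fragmentation to be told apart, while c is resolved by mass alone.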
In accordance with preferred embodiments, a target biomolecule such as an RNA having a molecular interaction site is presented with one or more ligands or suspected ligands for the interaction site under conditions such that interaction or binding of the ligand to the molecular interaction site can occur. The resulting mixture, which may contain one or even hundreds of individual complexes of ligands with the RNA or other biomolecule, is then subjected to mass spectrometric evaluation in accordance with the invention. "Preparative" mass spectrometry can isolate individual complexes, which can then be fragmented under controlled conditions within the mass spectrometric environment for subsequent analysis. In this way, the nature and degree of binding of the ligands to the molecular interaction site can be ascertained. Specific, strongly binding ligands can be identified and selected for use as therapeutics or as agricultural, industrial, or other chemicals, or used as lead compounds for subsequent modification into improved forms for such uses.
A further application of the present invention is the use of mass spectrometric methods for the simultaneous screening of multiple biomolecular targets against combinatorial libraries or mixtures of compounds. This rather complex screening procedure is made possible by the combined power of the mass spectrometric methods used and the way in which the screening is performed. When screening multiple target nucleic acids, for example, mass redundancy is a concern, especially if two or more targets are of similar sequence composition or mass. This problem is alleviated by the present invention, by using special mass modifying, molecular weight tags on the different nucleic acid targets being studied. These mass modifying tags are typically large molecular weight, non-ionic polymers including but not limited to, polyethylene glycols, polyacrylamides and dextrans, that are available in many different sizes and weights, and which may be attached at one or more of many different possible sites on nucleic acids. Thus similar nucleic acid targets may be differentially tagged and now be readily differentiated, in the mass spectrum, from one another by their distinctly different mass to charge ratios (m/z signals). Using the methods of this invention, screening efforts can be significantly accelerated because multiple targets can now be screened simultaneously against mixtures of large numbers of compounds.
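The effect of a mass modifying tag on the observed signals can be sketched with a short Python calculation. This is an illustrative simplification: it treats each target and tag as a single neutral mass (real polymeric tags are generally distributions of chain lengths), it assumes negative-ion detection of the deprotonated species [M - zH]z-, and the target masses, the 2000 Da tag mass, and the charge state are hypothetical.

```python
from typing import Dict

PROTON_MASS = 1.007276  # Da


def mz_negative(neutral_mass: float, charge: int) -> float:
    """m/z of the deprotonated ion [M - zH]z- for a given neutral mass."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass - charge * PROTON_MASS) / charge


def min_separation(masses: Dict[str, float], charge: int) -> float:
    """Smallest m/z gap between any two targets at the same charge state."""
    if len(masses) < 2:
        raise ValueError("need at least two targets to compare")
    values = sorted(mz_negative(m, charge) for m in masses.values())
    return min(b - a for a, b in zip(values, values[1:]))


# Two hypothetical targets differing by only 1 Da.
untagged = {"target_A": 8500.0, "target_B": 8501.0}
# The same targets after attaching a hypothetical 2000 Da tag to target_B.
tagged = {"target_A": 8500.0, "target_B": 8501.0 + 2000.0}

gap_untagged = min_separation(untagged, charge=5)
gap_tagged = min_separation(tagged, charge=5)
```

At a charge state of 5 the untagged targets are separated by about 0.2 m/z units, whereas tagging one of them moves the signals roughly 400 m/z units apart, which is the differentiation the tagging approach relies on.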
Another related advantage of the methods of this invention is the ability to determine the specificity of binding interactions between a new ligand and a biomolecular target. By simultaneously screening a target nucleic acid, for example, and one or more control nucleic acids against a combinatorial library or a specific ligand, it is possible to ascertain, using the methods of this invention, whether the ligand binds specifically to only the target nucleic acids, or whether the binding observed with the target is reproduced with control nucleic acids and is therefore non-specific.
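The specificity decision described above can be expressed as a simple rule, sketched here in Python. The function name, the use of complex ion abundance (normalized to the free target) as the signal, and the numeric thresholds are assumptions introduced for illustration; appropriate cutoffs would depend on the instrument and the experiment.

```python
from typing import Iterable


def classify_binding(target_signal: float,
                     control_signals: Iterable[float],
                     min_signal: float = 0.05,
                     fold: float = 3.0) -> str:
    """Classify a ligand as 'specific', 'non-specific', or 'non-binder'.

    `target_signal` is the abundance of the ligand-target complex and
    `control_signals` are the abundances of the ligand-control complexes,
    all measured in the same experiment. A ligand counts as specific only if
    its target signal is at least `fold` times every control signal.
    """
    if target_signal < min_signal:
        return "non-binder"
    if any(target_signal < fold * c for c in control_signals):
        return "non-specific"
    return "specific"


# Hypothetical signals: strong binding to the target, weak to two controls.
label = classify_binding(0.4, [0.02, 0.05])
```

A ligand that binds the control nucleic acids nearly as well as the target, for example `classify_binding(0.4, [0.3])`, is reported as non-specific and can be dropped from further consideration.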
The methods of the invention are applicable to the study of a wide variety of biomolecular targets that include, but are not limited to, peptides, proteins, receptors, antibodies, oligonucleotides, RNA, DNA, RNA/DNA hybrids, nucleic acids, oligosaccharides, carbohydrates, and glycopeptides. The molecules that may be screened by using the methods of this invention include, but are not limited to, organic or inorganic, small to large molecular weight individual compounds, mixtures and combinatorial libraries of ligands, inhibitors, agonists, antagonists, substrates, and biopolymers, such as peptides, nucleic acids or oligonucleotides. The mass spectrometric techniques which can be used in the methods of the invention include, but are not limited to, MSn, collisionally activated dissociation (CAD), collisionally induced dissociation (CID), and infrared multiphoton dissociation (IRMPD). A variety of ionization techniques may be used including, but not limited to, electrospray, MALDI and FAB. The mass detectors used in the methods of this invention include, but are not limited to, FTICR, ion trap, quadrupole, magnetic sector, time of flight (TOF), Q-TOF, and triple quadrupole. The methods of this invention may also use "hyphenated" techniques such as, but not limited to, LC/MS and CE/MS, all as described more fully hereinafter.