Over the past decade, research in the field of proteomics has expanded tremendously due to its potential to revolutionize biological and medical research, particularly in the development of new drugs and therapies. The term proteome denotes an entire set of proteins that is encoded by a genome. The study of the proteome, called proteomics, is a complex interdisciplinary field of research directed at developing a functional description of gene and cellular activity in terms of the biological activities, interactions, localization, expression and modifications of proteins and protein complexes in cells and tissue.
The complexity of the field of proteomics is underscored by the very large number of proteins and protein complexes which correspond to a typical genome. The human proteome, for example, is considered to comprise between about 400,000 and about 1,000,000 proteins that interact to form an enormous number of protein complexes, many of which are believed to play fundamental roles in the regulation of cellular activity and the onset of disease. This complexity is further compounded by the importance of a wide range of post-translational modification processes that profoundly affect protein tertiary structure, biological activity and cellular function.
As a result of the extraordinarily large number of proteins and protein complexes that participate in gene and cellular bioactivity, a variety of high-throughput methods of probing protein interactions under conditions representative of cellular conditions have emerged over the last several years. Most of these techniques, including genetic readout experiments such as the yeast two-hybrid assay, micro-array and chip experiments, mass spectrometry methods, and fluorescence-based assay methods, take advantage of comprehensive genome and proteome databases that have become available in recent years. The available high-throughput screening techniques provide complementary approaches to identifying and characterizing protein-protein interactions, protein-DNA interactions, protein-lipid interactions and post-translational modifications that may be important in cellular activity. A fundamental goal of this aspect of proteomic research is to characterize cellular processes and the progression of disease in terms of networks of specific, identified protein interactions.
Networks of protein interactions developed from proteomic studies are particularly useful for identifying proteins and peptides that provide promising target molecules for the development of new drugs and therapies. Target proteins and peptides are molecules or components of molecules that are believed to participate in biochemical pathways associated with cellular development, regulation and/or the onset of disease. The composition, structure and reactivity of such target molecules are of great interest because such targets may serve as the basis of a drug therapy. For example, in vivo manipulation of the composition, conformation and/or biological activity of a protein involved in a select biochemical pathway, for example by administering a compound that inhibits, reduces or enhances its biological activity, may provide a mechanism for affecting cellular activity in a concerted manner. Target identification is therefore typically followed by target validation studies that confirm that manipulation of the selected target molecule has a desirable impact on cellular activity, such as the prevention of the progression of disease.
The recent availability of libraries of target proteins and peptides generated via proteomic studies has stimulated considerable interest in developing new methods of identifying compounds that interact with target proteins or peptides and which may serve as the basis of new and improved drugs and therapies. For example, substantial research efforts have been directed at concurrently developing rational, structure-based approaches and high-throughput screening approaches for developing drugs and drug therapies from the abundance of available target protein and peptide data. Although these systematic methods employ fundamentally different approaches to identifying and refining therapeutic candidates, rational, structure-based methods and high-throughput methods are highly complementary techniques and are often used in combination, wherein promising lead compounds are identified via high-throughput methods and serve as the basis of improved compounds developed by rational, structure-based drug design methods.
In a rational, structure-based drug design program, a small set of candidate molecules is developed based on known structural motifs of a selected target molecule or one of its natural ligands. The design of candidate molecules typically involves computer-based structure modeling of potential binding regions of the target molecule using databases of structural information. Eventually these techniques are used to derive a small subset of candidate molecules that are synthesized and evaluated to determine their reactivity with the target molecule and their effect on its biological activity.
High-throughput screening approaches, in contrast, screen thousands of structurally diverse chemical compounds for binding activity with a selected target molecule in order to identify promising lead compounds that have potential as drugs. Most high-throughput screening methods utilize an iterative process of screening a very large set of candidate compounds for activity, analyzing the results of the screen, and selecting a new set of compounds for additional screening based on properties elucidated from previous screens. Selection of compounds for additional screening is often driven by identifying structure-activity relationships (SARs) within the library of screened compounds and using those relationships to further refine selection.
A large number of experimental strategies have evolved for target-directed drug discovery via rational, structure-based drug design, high-throughput screening methods and the combination of these methods. A central component of each of these strategies, however, is a sensitive method of detecting and characterizing interactions between candidate molecules and target molecules. Such methods must be capable of detecting a wide range of protein-ligand interactions, particularly weak protein-ligand interactions. Moreover, useful methods are also capable of characterizing protein-ligand interactions in terms of fundamental parameters important for evaluating the potential of a candidate compound to serve as the basis of a drug therapy, such as the target binding affinity, inhibitory potential, region(s) of interaction, forward and reverse reaction rates and binding equilibrium constant. Methods that have been used to identify and evaluate lead candidate molecules include fluorescence assay methods, mass spectrometry techniques, nuclear magnetic resonance (NMR) techniques, competitive binding assays, surface plasmon resonance methods and microarray functional assays.
Two-dimensional NMR (2D NMR) methods are currently a preferred technique for structurally characterizing proteins and probing interactions between target molecules and potential therapeutic candidates. 2D NMR methods differ from conventional one-dimensional NMR methods in that more than one radio frequency pulse is applied to the sample, and the signal is measured as a function of direct and indirect time-delays. Fourier transformation with respect to both the direct and indirect periods yields the two-dimensional spectrum in frequency space having (i) diagonal peaks resulting from contributions of the magnetization that has not been changed by application of the additional radio frequency pulses and (ii) cross peaks originating from nuclei that exchanged magnetization during the mixing time subsequent to the application of the second radio frequency pulse. The intensity and position of cross peaks present in two-dimensional NMR spectra indicate an interaction of two nuclei that exchanged magnetization and, therefore, contain additional, valuable information relating to structure. Other multidimensional NMR techniques, such as 3D NMR, have also been developed wherein a plurality of radio frequency pulses are delivered to a sample having pulse widths and time-delays selected to enhance the structure-related information extracted from these measurements.
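The relationship between the two time-delays and the diagonal and cross peaks described above can be illustrated with a small numerical sketch. The example below is not a simulation of any real pulse sequence: it assumes an idealized, undamped time-domain signal on a hypothetical 32 x 32 grid, with two made-up frequencies (bins 5 and 11) and an arbitrary cross-peak amplitude of 0.3. The function names are illustrative only. It shows that a Fourier transform over both time axes places diagonal peaks where the two frequencies are equal and cross peaks where they differ.

```python
import numpy as np

def simulate_2d_signal(n, peaks):
    """Build an idealized 2D time-domain signal.

    peaks: list of (f1, f2, amplitude), where f1 is the frequency bin along
    the indirect time axis and f2 the bin along the direct time axis.
    """
    t1 = np.arange(n).reshape(-1, 1)   # indirect time-delay index
    t2 = np.arange(n).reshape(1, -1)   # direct time-delay index
    signal = np.zeros((n, n), dtype=complex)
    for f1, f2, amp in peaks:
        signal += amp * np.exp(2j * np.pi * (f1 * t1 + f2 * t2) / n)
    return signal

def find_peaks(signal, rel_threshold=0.1):
    """Fourier transform both time axes and list bins above a threshold."""
    spectrum = np.abs(np.fft.fft2(signal))
    hits = np.argwhere(spectrum > rel_threshold * spectrum.max())
    return [(int(a), int(b)) for a, b in hits]

# Hypothetical two-spin example: magnetization that stays on the same
# nucleus gives diagonal peaks; magnetization exchanged between the two
# nuclei during mixing gives cross peaks.
N = 32
PEAKS = [
    (5, 5, 1.0), (11, 11, 1.0),   # diagonal peaks
    (5, 11, 0.3), (11, 5, 0.3),   # cross peaks
]
peak_bins = find_peaks(simulate_2d_signal(N, PEAKS))
print(peak_bins)
```

Real spectra additionally involve relaxation decay, phase-sensitive detection and window functions, all of which are omitted here for clarity.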
The introduction of an additional spectral dimension in 2D NMR spectroscopy results in spectra having additional structure-related information. In addition, useful structural information may be more easily extracted from 2D spectra than from corresponding one-dimensional NMR spectra of proteins, which are often extremely congested with many overlapping peaks. As a result of these advantages, a number of 2D homonuclear spectroscopic techniques have evolved for probing the structure of proteins, including 2D COSY, 2D TOCSY and 2D NOESY techniques, which primarily differ in the pulses used during the mixing time. In addition, heteronuclear methods using 15N and 13C nuclei have also been developed as useful tools in elucidating protein structure. As the natural abundance of 15N and 13C is significantly lower than that of protons, these techniques often rely on isotope enrichment and on enhancement of signal-to-noise ratios by use of inverse NMR methods wherein magnetization is transferred from protons to heteronuclei.
In high-throughput and rational, structure-based drug screening applications, 2D NMR spectra are generated corresponding to the target molecule or a labeled analog thereof in the absence of a therapeutic candidate molecule. In many applications, 2D 15N/1H heteronuclear single quantum correlation spectra are acquired because the 15N/1H signals corresponding to individual backbone amides of target proteins are often resolvable. Next, 2D NMR spectra are generated corresponding to the target molecule or a labeled analog thereof in the presence of a therapeutic candidate molecule, and compared to the spectra corresponding to the target molecule or a labeled analog thereof in the absence of a therapeutic candidate molecule. If measurable differences exist between the spectrum corresponding to the absence of the therapeutic candidate and the spectrum corresponding to the presence of the therapeutic candidate, a binding interaction may be inferred from the data. Furthermore, because shift values of 15N/1H signals in the 2D NMR spectra correspond to ascertainable locations within the target protein, quantitative analysis of the difference spectrum corresponding to the presence and absence of the therapeutic candidate may provide a means of identifying specific binding regions involved in the interaction. In some instances, a plurality of difference spectra corresponding to different concentrations of the candidate molecule may be analyzed to provide a measurement of the binding affinity and/or dissociation constant between a therapeutic candidate and the target protein to which it binds.
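The concentration-dependent analysis described above can be sketched numerically. The example below is an illustration under stated assumptions rather than the analysis used in any particular study: it assumes a simple 1:1 binding model with the candidate in large excess over the protein, so that the observed shift change follows d = d_max * [L] / (Kd + [L]); it combines the 1H and 15N shift changes with a hypothetical 15N weighting factor of 0.14; and it fits the dissociation constant by a plain grid search over synthetic, noise-free data. All function names and numerical values are made up for illustration.

```python
import math

def combined_shift(d_h_ppm, d_n_ppm, n_weight=0.14):
    """Combine 1H and 15N shift changes into one perturbation value (ppm).

    n_weight is an assumed scaling factor that down-weights the 15N axis.
    """
    return math.hypot(d_h_ppm, n_weight * d_n_ppm)

def fit_kd(ligand_um, shifts_ppm, kd_lo=1.0, kd_hi=1000.0, steps=3000):
    """Least-squares fit of d = d_max * L / (Kd + L) on a log grid of Kd.

    For each trial Kd the best d_max has a closed form, so only Kd is scanned.
    Returns (kd_um, d_max_ppm).
    """
    best = None
    for i in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / steps)
        occupancy = [l / (kd + l) for l in ligand_um]
        d_max = (sum(x * y for x, y in zip(occupancy, shifts_ppm))
                 / sum(x * x for x in occupancy))
        sse = sum((y - d_max * x) ** 2 for x, y in zip(occupancy, shifts_ppm))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]

# Synthetic titration for one backbone amide (hypothetical values).
ligand_um = [10, 25, 50, 100, 200, 400, 800]
true_kd_um, true_d_max_ppm = 50.0, 0.20
shifts_ppm = [true_d_max_ppm * l / (true_kd_um + l) for l in ligand_um]

kd_fit, d_max_fit = fit_kd(ligand_um, shifts_ppm)
print(f"Kd ~ {kd_fit:.1f} uM, d_max ~ {d_max_fit:.3f} ppm")
```

When the protein concentration is comparable to the dissociation constant, the excess-ligand approximation no longer holds and a model that accounts for ligand depletion would be needed instead.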
Although 2D NMR spectroscopy has been demonstrated to provide a useful screening method for identifying therapeutic candidate molecules that bind to proteins, these techniques are susceptible to certain drawbacks. First, 2D NMR spectra corresponding to proteins and protein mixtures take on the order of minutes (e.g. 10 minutes) to acquire. Furthermore, analysis of 2D NMR spectra requires operation of complex numerical simulation and fitting algorithms, which also take on the order of minutes to complete. As a result of these limitations, high-throughput screening of candidate molecule libraries comprising thousands of compounds can take on the order of months to complete. Second, the time resolution provided by 2D NMR spectroscopy is on the order of milliseconds and, thus, these methods are not capable of effectively detecting or characterizing transient binding interactions occurring on microsecond, picosecond and femtosecond timescales. Third, the pulsed radio-frequency beams used in NMR apparatus have longer wavelengths than the radiation used in other spectroscopic methods such as infrared spectroscopy, and so optical methods that simplify signal detection, such as phase matching, cannot be applied to NMR techniques. Finally, NMR is also a relatively insensitive technique and, thus, requires relatively large amounts of sample (about 0.1 to about 1 milliliters) as compared to optical spectroscopy techniques that are limited by the spot size (50 nanoliters). Such sample requirement considerations are compounded in NMR flow-through experimental designs, wherein the relatively long sampling intervals needed to acquire a useful 2D NMR spectrum result in large sample volume requirements.
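The scale of the first and last drawbacks can be checked with back-of-the-envelope arithmetic. The sketch below uses the 10 minute acquisition time and the sample volumes quoted above; the library size of 10,000 compounds and the 5 minute analysis time per spectrum are hypothetical values chosen only for illustration, as is the assumption that spectra are acquired and analyzed one after another without pause.

```python
def screening_days(n_compounds, acquire_min, analyze_min):
    """Total serial screening time in days for one spectrum per compound."""
    total_minutes = n_compounds * (acquire_min + analyze_min)
    return total_minutes / (60 * 24)

def volume_ratio(nmr_volume_nl, optical_volume_nl):
    """How many times more sample NMR needs than a spot-size-limited method."""
    return nmr_volume_nl / optical_volume_nl

# 10 min acquisition is taken from the text; the library size (10,000) and
# the 5 min analysis time are assumed values.
days = screening_days(n_compounds=10_000, acquire_min=10, analyze_min=5)

# 0.1 mL = 100,000 nL and 1 mL = 1,000,000 nL, compared with 50 nL.
ratio_low = volume_ratio(100_000, 50)
ratio_high = volume_ratio(1_000_000, 50)

print(f"{days:.0f} days of continuous instrument time")
print(f"NMR sample volume is {ratio_low:.0f}x to {ratio_high:.0f}x larger")
```

Under these assumptions a single screen occupies an instrument for more than three months, and each measurement consumes thousands of times more sample than a spot-size-limited optical measurement.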
It will be appreciated from the foregoing that there is currently a need in the art for methods and devices for probing interactions involving biomolecules, such as proteins, peptides and DNA molecules, that may serve as the basis of new drug therapies. In particular, methods of screening interactions between proteins or components of proteins and potential therapeutic candidates are needed that are complementary to existing 2D NMR techniques. Screening methods capable of probing protein interactions occurring on sub-millisecond time scales are currently needed. In addition, improved methods of probing protein interactions requiring smaller sample volumes and shorter sampling intervals are needed to enable more efficient high-throughput screening applications.