The present invention relates to methods for high information content (HIC) analysis or screening of complex biological systems using Fourier transform mass spectrometry (FTMS). The present methods are useful for analyzing complex biological mixtures containing both high molecular weight molecules (e.g., polynucleotides, proteins, polysaccharides) and low molecular weight molecules (e.g., oligonucleotides, peptides, lipids, oligosaccharides, steroid hormones, catabolic and metabolic intermediates), permit the elucidation of molecular differences between complex biological samples, and permit the identification of biologically active molecules (e.g., therapeutically active drugs).
Mass spectrometry is an analytical technique measuring an atom's or a molecule's mass (referred to as atomic and molecular mass, respectively). Since molecular mass is the stoichiometric sum of the atomic masses for each element in the molecule, a characteristic measure is provided for each analyte having a different empirical formula.
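The stoichiometric sum described above can be illustrated with a minimal Python sketch. The function name `molecular_mass`, the dictionary representation of a formula, and the choice of monoisotopic (most abundant isotope) atomic masses are assumptions made for illustration only; they are not part of the methods described herein.

```python
# Monoisotopic atomic masses in Daltons for a few common elements.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def molecular_mass(formula: dict) -> float:
    """Sum atomic masses weighted by the number of atoms of each element."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in formula.items())


# Hypothetical example analytes, written as element -> atom count.
glucose = {"C": 6, "H": 12, "O": 6}
ethanol = {"C": 2, "H": 6, "O": 1}

glucose_mass = molecular_mass(glucose)
ethanol_mass = molecular_mass(ethanol)
print(round(glucose_mass, 4), round(ethanol_mass, 4))
```

Two analytes with different empirical formulas give different sums, which is why the measured mass is characteristic; isomers sharing one formula would give the same value in this sketch.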
The instrument used to measure molecular mass is known as a mass spectrometer. Typically, mass spectrometry is performed by volatilizing an analyte (bringing it into the gas phase), ionizing it, and detecting the resulting signals. For most types of mass spectrometers, the detector consists of a type of electron multiplier. Ions impinging on such a detector create secondary electrons that register as a measurable current. In this respect, the FTMS instrument is different in that it measures ions indirectly and non-destructively by measuring an image current. The data ultimately generated, i.e., a mass spectrum, have two coordinates: the mass-to-charge ratio scale (x-axis) and the intensity scale (y-axis).
The molecular masses of gas-phase ions, which are formed from both neutral and charged molecules, are determined based on their mass-to-charge (m/z) ratios. If further fragmentation of the gas-phase ions is desired, this can be achieved by having them collide with gas molecules, so-called "collision-induced dissociation" (CID). The subfragments that are generated are then also separated by mass.
In recent years, mass spectrometry has been exploited in a variety of biological contexts, including nucleic acid sequencing, peptide sequencing and identification (Keen and Findlay, "Protein Sequencing Techniques," in Molecular Biology and Biotechnology, Robert A. Meyers, ed., VCH Publishers, Inc. 1995, p. 771; Carr and Annan, "Overview of Peptide and Protein Analysis by Mass Spectrometry," in Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley and Sons, Inc., 1997, 10.21); detection of in vitro and in vivo protein post-translational modification and expression (Rowley et al., 2000, Methods 20:383-397); elucidation of protein tertiary structure (Last and Robinson, 1999, Curr. Opin. Chem. Biol. 3:564-570); study of labile, non-covalently associated biomolecules (Budnik et al., 2000, Rapid Commun. Mass Spectrom. 14:578-584); disease diagnosis (Bartlett and Pourfarzam, 1999, J. Inherit. Metab. Dis. 22:568-571); surveillance of environmental contamination (Scribner et al., 2000, Sci. Total Environ. 248:157-167); agricultural screening (Hau et al., 2000, J. Chromatogr. 878:77-86); and forensic applications (Hollenbeck et al., 1999, J. Forensic Sci. 44:783-788; Gaillard and Pepin, 1999, J. Chromatogr. B. Biomed. Sci. Appl. 733:181-229).
Mass spectrometry, which provides femtomolar sensitivity and accuracy better than 0.01%, has emerged as an attractive alternative to chemical methods for peptide sequencing and identification. Sensitivity of mass spectrometry has been improved by using isotopically labeled peptides and combining a nanoelectrospray ion source with a quadrupole time-of-flight tandem mass spectrometer. This approach exploits an intrinsic feature of the quadrupole time-of-flight device, affording higher sensitivity and resolution than other types of mass spectrometers (Shevchenko et al., 1997, Rapid Comm. Mass Spectrom. 11:1015-1024). Isotopic labeling of C-terminal peptide fragments, e.g., by enzymatic digestion of a protein in 1:1 16O/18O water, provides a characteristic isotopic distribution for these fragments that can be readily identified (Schnolzer et al., 1996, Electrophoresis 17:945-953), thereby revealing the amino acid sequence.
Mass spectrometry can also be used to study a protein's structure. This technology can provide accurate molecular masses for minute quantities of proteins of interest with masses up to 500,000 Daltons ("Da"). The resulting spectra also can help determine protein folding, protein self-association and other conformational changes and tertiary structure (Nguyen et al., 1995, J. Chromatogr. A 705:213-45). In addition, co- and post-translational modifications of proteins can be identified and mapped. This method is preferable to using chemical methods such as C-terminal sequencing, which requires relatively harsh sample treatment that can alter or destroy such protein modifications. Post-translational modifications that can be identified using mass spectrometry include phosphorylation, glycosylation, deamidation, isoaspartyl formation, and disulfide-bond formation.
Mass spectrometry has also found important applications in the study of protein-protein interactions. Target proteins can be followed in vivo to document their conformational changes, active site usage, ligand recognition, assembly into multimeric complexes (e.g., holoenzymes), and trafficking among organelles.
Fourier transform mass spectrometry (FTMS) is also known as Fourier transform ion cyclotron resonance (FTICR). The principle of molecular mass determination used in FTMS is based on an inverse relationship between an ion's mass-to-charge ratio and its cyclotron frequency. In a uniform magnetic field, an ion will precess about the center of the magnetic field in a periodic, circular motion known as cyclotron motion. An ensemble of ions having a particular mass-to-charge ratio (m/z) can be made to undergo cyclotron motion in-phase, producing an image current. The image current is detected between a pair of receive electrodes, producing a sine-wave signal. The Fourier transform is a mathematical deconvolution method used to separate the superimposed signals from many different m/z ensembles into a frequency spectrum, which is then converted into a mass spectrum. Unlike other forms of mass spectrometry, FTMS is non-destructive. For a general review of FTMS, see Hendrickson and Emmett, 1999, Ann. Rev. Phys. Chem. 50:517-536. The application of FTMS to biological sciences is generally similar to other mass spectrometry applications. See, e.g., Smith et al., 1996, "The Role of Fourier Transform Ion Cyclotron Resonance Mass Spectrometry in Biological research: New Developments and Applications" in Mass Spectrometry in the Biological Sciences, eds. A. L. Burlingame and S. A. Carr, Humana Press, Totowa, N.J.; McLafferty, 1994, Acc. Chem. Res. 27:379-386.
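The relationship between cyclotron frequency and m/z can be sketched numerically. For an ion of mass m and charge q in a uniform field B, the unperturbed cyclotron frequency is f = qB/(2πm), so frequency falls as m/z rises. The sketch below is illustrative only: the function names and the 7 tesla field strength are assumptions, and effects of the trapping electric field on the observed frequency are ignored.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
DALTON_KG = 1.66053906660e-27           # kilograms per Dalton


def cyclotron_frequency_hz(mz: float, b_tesla: float) -> float:
    """Unperturbed cyclotron frequency for an ion of given m/z (Da per charge)."""
    return ELEMENTARY_CHARGE_C * b_tesla / (2 * math.pi * mz * DALTON_KG)


def mz_from_frequency(freq_hz: float, b_tesla: float) -> float:
    """Invert the relation: recover m/z from a measured frequency."""
    return ELEMENTARY_CHARGE_C * b_tesla / (2 * math.pi * freq_hz * DALTON_KG)


FIELD_T = 7.0  # assumed magnet strength, for illustration only

f_1000 = cyclotron_frequency_hz(1000.0, FIELD_T)
f_2000 = cyclotron_frequency_hz(2000.0, FIELD_T)
print(f"m/z 1000: {f_1000:.0f} Hz, m/z 2000: {f_2000:.0f} Hz")
```

Because each m/z ensemble contributes a sine wave at its own frequency, peaks located in the Fourier-transformed signal can be mapped back to m/z with a relation of the form used in `mz_from_frequency`.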
A number of researchers have started evaluating the use of FTMS in the analysis of biological samples; see Jensen et al., Electrophoresis 2000 21:1372-1380; Jensen et al., Anal. Chem. 1999 71:2076-2084; Palblad et al., Rapid Comm. Mass Spec 2000, 14:1029-1034; WO 95/25281; WO 00/29987; WO00/03240; WO99/58727; WO99/57318; WO99/46047; Li et al., Anal. Chem. 1999 71:4397-4402; Penn et al., Anal. Chem. 1997; 669:2471-2477; and U.S. Pat. Nos. 6,017,093 and 4,224,031.
Analytical methods useful in drug discovery are primarily based on individual end-point observations. The targeting of specific biological interactions (e.g., receptor-ligand, substrate-enzyme) for xenobiotic intervention has been a common paradigm for mining chemical libraries. The traditional approach of choice for drug discovery by pharmaceutical, biotechnology and genomics companies is classical high throughput screening (HTS), which entails parallel screening of large chemical libraries on single targets using generally cell-free assays. Chemical libraries used in HTS are most often generated using combinatorial chemistry. Collections of chemicals obtained from natural sources or generated using "conventional" chemistry are used to a lesser extent in HTS.
The HTS approach is premised on validated targets, usually proteins (e.g., enzymes, receptors, transfer proteins) or nucleic acids (genes, mRNAs). Therefore, the target protein or nucleic acid used in screening by HTS generally is known and thought to play a role in the diseased state. Only then are modulators of the target protein sought as lead compounds for drug development. Workers have conducted HTS on targets only to find later that the target protein was irrelevant to the disease. For example, because receptors can exist in the form of different subtypes, only one of which may be critically essential, a knockout mouse targeting the wrong receptor subtype would likely fail to show a relevant phenotype. It is becoming clear that many biological functions are supported by redundant biochemical pathways. When a pathway fails, redundant mechanisms take over. Many drugs developed on the basis of a defined target show little to no therapeutic activity in vivo because redundant biochemical pathways bypass the pathway in which the target is involved.
For HTS to be successful, the targets usually require an appropriate cellular environment or biological context. For example, a membrane receptor should be in a membrane similar to that in which the receptor is normally found; otherwise, the receptor's properties may be artificially affected. A suitable membrane setting may require reconstituting the membrane with the appropriate lipids. Reconstitution of the suitable membrane environment is the most challenging task in such situations, because of a lack of sufficiently detailed knowledge of the components of such an environment, or because of the complexity of the natural membrane setting.
Additionally, successful classical HTS requires knowledge of the mechanism of the disease or disorder of interest, allowing the selection of a suitable target for validation and, eventually, screening. In the absence of such detailed knowledge, classical HTS cannot be performed.
Another limitation of the technique is that HTS based on a validated target uncovers modulators only of that target. Ultimately, the costly and laborious screening procedure can, at best, provide a small subset of potential test compounds.
Therefore, a method that allows unbiased, simultaneous screening for modulators of multiple, unvalidated targets in their natural environments would greatly improve the pace of drug discovery, while reducing costs. In particular, the identification of small molecules that are present in abnormal amounts in specific states (disease states, development states, differentiation states, etc.) should facilitate the design of analogs, agonists or antagonists of these molecules, leading to the rapid identification of high specificity drugs including but not limited to pharmaceutical drugs, drugs for veterinary applications, drugs for agricultural applications (weed killers, parasite/insect killers, phytohormone agonists, etc.) and drugs for environmental uses (bacterial killers, bacterial proliferators for oil spill cleaning, mussel proliferation controllers, algae proliferation controllers, etc.).
"Bioinformatics" generally refers to the management of biological data using computational means, including data storage (registration of data in a way that makes information easy to retrieve) and data analysis using computer-intensive mathematical calculations (statistical analysis, non-linear analysis, etc.). Bioinformatics is used intensively to determine structure-activity relationships from the large amount of data generated by high throughput screening and combinatorial chemistry, in order to design more effective biologically active molecules. Bioinformatics arose from the need to organize and make accessible the glut of gene sequence information that has become available in the past two decades. While initially used to catalog normal gene sequences, bioinformatics is expanding to encompass the identification of protein structures based on pattern recognition in primary sequences and gene expression data obtained using microarrays (see, e.g., http://www.ebi.ac.uk).
Methods for gene-expression profiling useful to identify gene products that are differentially expressed or regulated in different cell types (e.g., as a means to identify their function) include differential display, serial analysis of gene expression (SAGE), nucleic acid array technology, subtractive hybridization, proteome analysis, and mass spectrometry of two-dimensional protein gels. Methods for gene-expression profiling are exemplified by the following references, which describe differential display (Liang and Pardee, 1992, Science 257:967-971), proteome analysis (Humphery-Smith et al., 1997, Electrophoresis 18:1217-1242; Dainese et al., 1997, Electrophoresis 18:432-442), SAGE (Velculescu et al., 1995, Science 270:484-487), subtractive hybridization (Wang and Brown, 1991, Proc. Natl. Acad. Sci. U.S.A. 88:11505-11509), and hybridization-based methods of using nucleic acid arrays (Heller et al., 1997, Proc. Natl. Acad. Sci. U.S.A. 94:2150-2155; Lashkari et al., 1997, Proc. Natl. Acad. Sci. U.S.A. 94:13057-13062; Wodicka et al., 1997, Nature Biotechnol. 15:1259-1267).
Genome sequencing projects, such as the Human Genome Project, have created large databases of gene sequences. Biological function, however, cannot be determined solely from nucleotide sequence data, but rather must ultimately be established by studying the gene products at the level of the protein. Only by studying the protein itself can its expression pattern, association with other molecules in vivo, and its role in normal and diseased tissue be recognized. Recognizing these shortcomings of genomics, scientists have adopted the "proteomics" approach. The field of proteomics has advanced by utilizing two-dimensional polyacrylamide gel electrophoresis (2-D PAGE), which is capable of resolving thousands of proteins according to their charge and mass. The resulting protein patterns are then compared, and attempts are made to assign unique patterns to particular cell types or disease states. However, 2-D PAGE can fail to resolve the large number of proteins present in complex samples, and the technique is time consuming, labor intensive and expensive. In addition, 2-D PAGE may fail to detect low abundance proteins, and it has a relatively low dynamic range, particularly as compared to FTMS.
Citation or discussion of a reference herein shall not be construed as an admission that such is prior art to the present invention.
In accordance with the objects outlined above, the present invention provides methods comprising comparing an FTMS peak profile of a first biological sample derived from cells that have not been exposed to a candidate bioactive agent to an FTMS peak profile of a second biological sample derived from cells that have been exposed to the candidate bioactive agent.
In a further aspect, the present invention provides methods comprising contacting a first population of cells with a first candidate bioactive agent and subjecting the first population of cells to FTMS analysis to obtain a first peak profile. The first profile is compared to a reference profile from the first population of cells in the absence of the first agent.
In an additional aspect, the present invention provides methods comprising subjecting a first population of cells to FTMS analysis to obtain a first peak profile comprising a plurality of peaks, wherein at least two peaks correspond to different types of biomolecules.
In a further aspect, the present invention provides methods comprising providing a population of cells comprising at least a first and a second subpopulation of cells and contacting the first subpopulation of cells with a first candidate bioactive agent. The second subpopulation of cells is contacted with a second candidate bioactive agent, and the first and the second subpopulations of cells are subjected to FTMS analysis to obtain a first and a second peak profile, respectively. The first and second peak profiles are compared to a reference profile from the population of cells in the absence of the agents.
In an additional aspect, the present invention provides methods comprising contacting a first population of cells with a drug and subjecting the population of cells to FTMS analysis to obtain a peak profile. The profile is compared to a reference profile from said population of cells in the absence of said drug.
In a further aspect, the present invention provides methods comprising providing a population of cells comprising at least a first and a second subpopulation and contacting the first subpopulation of cells with a drug at a first concentration and contacting the second subpopulation of cells with a drug at a second concentration. The first and second subpopulations of cells are subjected to FTMS analysis to obtain a first and a second peak profile, respectively. The first and second peak profiles are compared to identify at least one peak that differs in intensity, which peak does not correspond to the drug.
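The comparison of two peak profiles described above can be sketched in Python as follows. This is a hypothetical illustration, not the claimed method: the representation of a profile as a list of (m/z, intensity) pairs, the function names, the 5 ppm matching tolerance, the two-fold intensity threshold, and all peak values are assumptions chosen for the example. Intensities are assumed to be positive.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def within_ppm(mz_a: float, mz_b: float, ppm: float) -> bool:
    """True if two m/z values agree within a parts-per-million tolerance."""
    return abs(mz_a - mz_b) <= mz_a * ppm * 1e-6


def differing_peaks(profile_a: List[Peak], profile_b: List[Peak],
                    drug_mz: float, ppm: float = 5.0,
                    min_fold: float = 2.0) -> List[Tuple[float, float, float]]:
    """Return matched peaks whose intensity differs by at least min_fold,
    skipping any peak that corresponds to the drug itself."""
    hits = []
    for mz_a, int_a in profile_a:
        if within_ppm(drug_mz, mz_a, ppm):
            continue  # peak corresponds to the drug, not a cellular response
        for mz_b, int_b in profile_b:
            if within_ppm(mz_a, mz_b, ppm):
                fold = max(int_a, int_b) / min(int_a, int_b)
                if fold >= min_fold:
                    hits.append((mz_a, int_a, int_b))
                break  # use the first matching peak only
    return hits


# Invented example profiles for two drug concentrations.
low_dose = [(180.0634, 1000.0), (255.2330, 500.0), (300.1500, 2000.0)]
high_dose = [(180.0635, 1050.0), (255.2331, 1500.0), (300.1501, 8000.0)]

hits = differing_peaks(low_dose, high_dose, drug_mz=300.1500)
print(hits)
```

In this invented data set the peak near m/z 300.15 changes most, but it is excluded because it matches the drug; the peak near m/z 255.23 is reported as the concentration-dependent cellular response.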
In an additional aspect, the present invention provides methods comprising subjecting a first population of cells to FTMS analysis to obtain a first peak profile and subjecting a second population of cells to FTMS analysis to obtain a second peak profile, wherein said first and second populations are of different cell types. The first and second peak profiles are compared to identify at least one peak that differs in intensity.
In a further aspect, the present invention provides methods to use SAR (structure-activity relationship) software in combination with FTMS analysis to generate chemical hypotheses and create new biologically active molecules.
In an additional aspect, the present invention provides methods of determining the components and pathways of disease states. The methods comprise subjecting a population of cells from an organism with a disease state to FTMS analysis to obtain a first peak profile. The peak profile is then compared to a reference profile from cells from an organism without the disease state, or to cells from the same organism from a non-disease tissue. The comparison results in the identification of at least one peak that either differs in intensity or is present in one profile but not the other. The cellular component that gives rise to this peak is then identified. This information can be used in a variety of ways. Databases can be searched for the binding partners of the cellular component to elucidate the cellular pathways of the disease state. The cellular component or its binding partners can be used in screens for drug candidates.
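Identifying a peak that is present in one profile but not the other can be sketched as below. This is an illustrative sketch under stated assumptions: profiles are lists of (m/z, intensity) pairs, the function name `peaks_only_in` and the 5 ppm tolerance are hypothetical, and the peak values are invented.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def peaks_only_in(profile: List[Peak], reference: List[Peak],
                  ppm: float = 5.0) -> List[Peak]:
    """Return peaks of `profile` that have no m/z match in `reference`."""
    unmatched = []
    for mz, intensity in profile:
        matched = any(abs(mz - ref_mz) <= mz * ppm * 1e-6
                      for ref_mz, _ in reference)
        if not matched:
            unmatched.append((mz, intensity))
    return unmatched


# Invented profiles for diseased cells and a disease-free reference.
diseased = [(180.0634, 900.0), (422.0820, 300.0)]
healthy = [(180.0635, 950.0), (512.3010, 120.0)]

gained = peaks_only_in(diseased, healthy)  # present only in disease state
lost = peaks_only_in(healthy, diseased)    # present only in reference
print(gained, lost)
```

Peaks returned by either call would be candidates for the subsequent identification of the cellular component that gives rise to them.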
In a further aspect, the invention provides screening methods for the discovery of new drugs. The methods comprise the use of any number of prescreening methods comprising adding candidate agents to cells and screening for altered phenotypes. Cells exhibiting altered phenotypes are then subjected to FTMS analysis, and relevant peaks are identified. Alternatively, once peak profiles of desirable effects are generated, candidate drugs, such as those generated in structure-activity relationship (SAR) studies, can be screened for their ability to mimic these desirable peak profiles.
In an additional aspect, the present invention provides methods for de novo drug design. The methods include generating a plurality of FTMS analyses on a limited set or library of candidate compounds. The resulting peak profiles are then compared to desirable peak profiles (e.g., those that have been generated using known drugs or disease-free cells) to identify "shapes", "pharmacophores" or "active sites" that are relevant. The results can then be screened against virtual chemical libraries to identify additional compounds for testing in traditional and FTMS screening.