The invention relates generally to peptide molecules and to methods of designing peptides or peptide-like molecules. More particularly, the invention relates to novel, short peptides or peptide-like molecules which have a high probability of binding to and/or otherwise modulating the function of polypeptides or proteins, and to methods for designing such peptides or peptide-like molecules.
All protein sequences, whether peptides, polypeptides, or proteins, are composed of a linear sequence of amino acids joined by peptide bonds. There are twenty naturally occurring amino acids, each bearing a chemically unique side chain. Determinants of polypeptide interactions, such as those between peptide segments in protein folding or between protein monomers, are encoded in the one-dimensional sequence of these twenty amino acid side chains. For purposes of this application, “peptides” are generally considered to be amino acid polymers of not more than 25 amino acids in length; “polypeptides” are generally considered to be polymers of between 25 and 50 amino acids; and “proteins” are generally considered to be polymers containing more than 50 amino acids. One of ordinary skill in the art would appreciate that some overlap among these ranges is expected, and that minor deviations from these ranges do not in any way diminish the scope of the invention. The “naturally occurring amino acids” are those that are encoded in the genetic code, and which are generally considered to be those found in all living species to date.
Net differences in the cumulative energetic contributions of several types of weak bonding mechanisms, totaling as little as ΔG=5-10 kcal/mol, determine selection and stabilization among conformations observed in protein folding, protein-protein interactions and the initial phases of substrate-enzyme and ligand-membrane receptor association. In particular, the minimization of ΔG through the formation of four general types of weak bonding mechanisms between amino acid side chains, in the range of ΔG≅2-7 kcal/mol, determines the arrangement of protein sequences in three-dimensional space, as well as the relative orientations of protein chain aggregates, in aqueous environments and at physiological temperatures. The thermal instability of the conformations supported by these low ΔG, reversible, weak-bonding mechanisms permits uncatalyzed, fast searches of configuration space for functionally optimal cooperative arrangements within and between polypeptide and protein monomers. The variety of weak bond capacities afforded by amino acid side chains determines the range of the amino acid sequences' physicochemical property transformations listed in this invention.
The weak bonds ordering polypeptides and proteins in three-dimensional space include hydrogen bonds, such as those between the main chain carbonyl and imino groups, which configure the right-handed α-helices and the parallel and antiparallel β-sheets. They also include the hydrogen and ionic bonds between amino acid side chains, such as the hydroxyl groups of serine and threonine, the acidic carboxyl groups of aspartate and glutamate, and the basic groups of lysine and arginine. In addition to being distinct with respect to the chemical group, these weak hydrogen and ionic bonding influences are also directionally specific, with bonding angles greater than 30° reducing their influence to negligible levels.
A third but nondirectional type of weak bonding interaction, induced by fluctuating charges within a distance of 1-3 Å, is termed the van der Waals force. These interactions vary with the size and the extent of mutual geometric fit, but are in the range of 1-2 kcal/mol. These forces are barely greater than those due to the heat of molecular motion at room temperature (ΔG≅0.6-1.0 kcal/mol). However, in the specific cases of some antibody/antigen interactions and MHC protein/peptide interactions, which involve water-releasing tight fits between corresponding moieties in suitably shaped binding pockets, the ΔG values associated with van der Waals interactions have been estimated to be as high as 30 kcal/mol.
A fourth weak bonding mechanism, and the most energetically dominant force on three-dimensional polypeptide structure and protein-protein interactions, is termed the hydrophobic effect. The hydrophobic effect arises from the much stronger attraction that water molecules have for each other than for hydrocarbon groups or molecules. Each tetrahedrally-coordinated water molecule participates in strong, hydrogen-bonded, dipole/dipole interactions with other water molecules that are manifested in the properties of water such as its high surface tension, high latent heat and high boiling point. These physicochemical features of water molecules afford a large variety of possible atomic arrangements of water (as seen in the large number of different ice types) that in turn permit maximizing the entropy and minimizing the free energy of the aqueous solution. Spatially distributed (nondirectional) deformations in these hydrogen-bonded arrangements of water result from the intrusion of nonpolar, hydrophobic solutes. The introduction of such molecules into an aqueous solution results in the formation of volume-expanding hydration shells composed of hydrogen-bonded cages of multiple molecular layers of water (“clathrate structures”) around these molecules, in a process called “hydrophobic hydration”. In aqueous solutions, such deformations in water structure are energetically unfavored. For example, the side chains of alanine, valine, leucine and isoleucine are without effective dipole moments, and therefore cannot participate in charge-mediated or hydrogen-bonding interactions with water. As a result, these side chains intrude into the aqueous solvent and disrupt the ordered structure of the aqueous solvent, resulting in an increase in the overall ΔG.
Amino acids with polar but uncharged side chains, such as serine and threonine, may hydrogen bond with a molecule of water, but otherwise undergo the same kind of hydrophobic hydration as the non-polar side chains. In the case of amino acids with side chains containing charged groups, such as glutamate or lysine, the electrostatic fields associated with these side groups are screened by water molecules, such that in an aqueous solution hydrophobic hydration is still a prominent characteristic of these amino acids as well. The nonlocal, cooperative interactions of the hydrogen bonds of the aqueous solvent surrounding these amino acids drive the in-line, surface-minimizing attraction between the coherent hydrophobic-phase patches of amino acid side chains, thereby maximizing the entropy, and minimizing the free energy, of the overall aqueous solution.
The importance of the sequential arrangements of amino acid side chain hydrophobicities in the determination of peptide and protein secondary structures has been established knowledge in protein biology for many decades. The ready availability of water for compensatory weak bonding implies that relatively small changes in ΔG occur when internal peptide backbone-related, carbonyl-imino hydrogen bonding or side chain polar groups are not satisfied. This contrasts with the much greater alteration in ΔG associated with loss of internal hydrophobic bonding, which cannot be compensated by the hydrophobically disrupted, aqueous environment. Minimization of hydrophobic free energy, ΔGhp, by water interface-reducing aggregation of nonpolar, hydrophobic amino acid side chain groups adds to the ΔG of binding that can, collectively, be orders of magnitude larger than that predicted by van der Waals theory. Mutually attractive forces mediated by hydrophobic surface minimization have been measured by atomic force spectroscopy to extend to as great a distance as 60 Å, the length scale of synaptic gaps. These attractive forces decay less than exponentially with distance. The contribution to the energy of stabilization of the three-dimensional, tertiary structure of protein by ΔGhp minimization due to aggregation of hydrophobic amino acid side chains has been estimated to be approximately 70%.
Complete substitution of hydrophobically equivalent amino acids in peptides maintains and sometimes increases their peptide-receptor mediated physiological potency. Additionally, proteins which are dominated by helical secondary structures of specific turn lengths can be designed using sequences of amino acids of high and low hydrophobicities, independent of the specific amino acids chosen within each hydrophobicity class. In contrast, regions of amino acids characterized by interactions dominated by hydrogen bonds, ionic bonds, and van der Waals interactions are often exquisitely sensitive to any substitution, even those deemed to be conservative replacements. This difference between the effects on ΔG of hydrophobic interactions versus those of hydrogen bonding, ionic binding or van der Waals interactions, along with the more stringent geometric requirements of the latter compared with hydrophobic weak bonds, makes sequential patterns of ΔGhp in polypeptide sequences of primary importance in determining peptide-peptide or peptide-protein interactions.
Previously, the role of the hydrophobic interactions of amino acids in peptide ligands with amino acids in their associated membrane proteins has been considered in structure-function analyses in two ways. First, the local roles of amino acids have been evaluated. In these studies, ligand-receptor binding is changed by point mutations in specifically positioned amino acids, producing alterations in the hydrophobic characteristics of “binding pockets” involving neighboring but nonsequential juxtapositions of residues brought together in the protein's cooperative tertiary structure. Second, the global effects of amino acids have been examined. These effects are often studied using chimeric exchanges, with respect to the number, lengths, and locations of transmembrane segments of receptors, transporters, and/or channels, and exploit the sequential juxtapositions of amino acid hydrophobicities, using n-point window moving averages to generate what are commonly known as “hydropathy plots”. The largest, longest positive variations in these smoothed hydrophobic amplitude graphs across sequence-indexed location of membrane proteins are interpreted as the lipophilic, hydrophobic transmembrane segments of the membrane protein. The best-studied example of this approach is the finding of seven sequential hydrophobic maxima of approximately 25 residues each in the hydropathy plots of bacteriorhodopsin, assumed to be the evolutionary prototype of the G-protein gene superfamily of transmembrane receptors. This common transmembrane receptor protein motif comprises copolymers of seven transmembrane domains that snake back and forth across the lipid bilayers of membranes, anchored by lipophilic transmembrane (“TM”) segments.
In this motif, three separate extracellular loops (“ELs”) are defined by the TMs: the first extracellular loop, EL-I, between TM2 and TM3; the second extracellular loop, EL-II, between TM4 and TM5; and the third extracellular loop, EL-III, between TM6 and TM7.
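The n-point moving-average smoothing behind hydropathy plots can be sketched as follows. This is a minimal illustration rather than the procedure of any study referred to above: the function name `moving_average`, the window length and the sample values are hypothetical, and the input is assumed to be a sequence that has already been converted to per-residue hydrophobicity values.

```python
def moving_average(series, window):
    """Return the n-point moving average of a numeric series.

    Each output value is the mean of `window` consecutive inputs, so the
    result has len(series) - window + 1 points.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


# Hypothetical per-residue values, for illustration only.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = moving_average(example, 3)
```

In a plot of this kind, long runs of high smoothed values would be read as candidate transmembrane segments. Variation shorter than the window is averaged out by construction.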
Secondary structures with matching wavenumbers, such as the β-strands of interleukin-1β, have been shown to bind together and initiate protein folding in a process called the “hydrophobic zipper”. We define “wavenumbers” as the inverse spatial variational frequencies of a physicochemically transformed series. They are reported here in sequential distance units of amino acids. Two long, helical secondary structures with congruent hydrophobic wavenumbers bind to create the central “hydrophobic knot” that stabilizes the structure of phospholipase A2. Recent studies of the binding of extracellular domains of growth hormone receptor by polyclonal antibodies to ovine growth hormone have shown that functional binding occurs between the epitope sequences and the extracellular segments of the growth hormone transmembrane receptor. This binding, analogous to that between peptide ligands and their receptors, is more related to common helical, loop and/or disordered secondary structures than to specific amino acid sequences or their local three-dimensional geometry.
Estimates of the relative contributions by the ΔGhp of each of the twenty amino acids to these weak bond-mediated reactions can be approximated as the free energy of transfer from aqueous to organic phases of each of the amino acids in a binary solution. Values for the free energy of transfer are measured as the relative equilibrium partitions Keq = exp(−ΔGhp/RT), with ΔGhp expressed in kcal/mol, in these aqueous-organic binary solvents. The transformation of individual amino acids into their ΔGhp values enables the conversion of polypeptide and protein sequences into real number series available for analyses with respect to matches in sequential patterns. These have been predictive of differentially selective hydrophobic attraction and aggregation between peptide ligands and relevant extracellular receptor loops following their search via “snake upon snake” sliding diffusion, or “reptation”.
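The two steps in this paragraph, relating a transfer free energy to an equilibrium partition and converting a sequence into a real number series, can be sketched as below. This is an illustration under stated assumptions, not the scale or procedure used by the inventors: the Kyte-Doolittle hydropathy index is substituted as a widely published stand-in for the ΔGhp values, and the names `partition_coefficient` and `to_property_series` and the default temperature are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy index: an ASSUMED stand-in scale, not the
# transfer free energies (ΔGhp) referred to in the text.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)


def partition_coefficient(delta_g, temperature=298.15):
    """Keq = exp(-ΔG/RT) for ΔG in kcal/mol and temperature in kelvin."""
    return math.exp(-delta_g / (GAS_CONSTANT * temperature))


def to_property_series(sequence, scale=KYTE_DOOLITTLE):
    """Map a one-letter amino acid sequence to a list of property values."""
    return [scale[residue] for residue in sequence.upper()]
```

A negative transfer free energy gives Keq greater than 1, that is, a preference for the organic phase. Any other per-residue scale can be passed as `scale` without changing the rest of the sketch.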
A topologically one-dimensional polypeptide sequence manifests secondary structures, which are organized into supersecondary structures and further into tertiary structures. For example, spiral rotations of ≈3.6 amino acids are the elementary component of a helical barrel comprised of 12-16 amino acids. These helical barrels may be joined by short loops into four-barrel bundles comprised of 60-70 amino acids, which may in turn be part of a protein domain containing several hundred amino acids and forming sequentially segregated or alternating barrels, bundles, xcex2-sheets and coils and loops of varying lengths. Therefore, hydrophobic sequences of a range of lengths may underlie the conformational components of different sizes and complexity that comprise the compact intermediate states of proteins.
Transformations of polypeptide sequences into ΔGhp values have been found useful in predicting polypeptide chain turns composing secondary structures, such as α-helices and β-strands. These predictions have been confirmed by x-ray crystallographic studies. Generic α-helices have a pitch of ≈5.4 angstroms per rotation, with 3.6 amino acids per rotation resulting in ≈1.5 angstrom linear distance per residue. Generic β-strands have 2.1 amino acids per turn with ≈3.3 angstroms linear distance per residue.
Sliding window ΔGhp averages were shown to be able to locate the lipophilic, hydrophobic transmembrane segments of membrane proteins, and these results were confirmed using low- and high-resolution crystallographic studies of bacteriorhodopsin as a model seven-transmembrane receptor protein. It is generally accepted that representation of polypeptide sequences as a series of amino acid aqueous volumes, partial specific volumes or ΔGhp, followed by n-block averaging, statistical predilection, hydrophobic moments, Fourier transformation, helical wheel plots or wavelet transformations can predict the size and locations of secondary and transmembrane structures in soluble and membrane proteins 60-80% of the time. These approaches have also been found useful in predicting supersecondary structures, such as the four-helix barrels and the supercoiling of α-helical structures about each other in fibrous proteins, such as the keratins and myosin tails. However, one drawback of these methods is that coexisting sequential variations in hydrophobic free energy wavelengths (mode or modes) other than that of transmembrane segments are lost in the generation of hydropathy plots by smoothing. Moreover, conventional Fourier transformation of the protein's hydrophobicities results in poor mode definition, because of end effects and intrinsic multimodality. In addition, these conventional techniques have thus far provided no solution to what is called the “inverse problem”: even if the conventional methods were able to define one or more given signatory and relevant modes, how does one construct a de novo peptide using these modes? The present invention overcomes the deficiencies of the prior art, and describes successful solutions to the inverse problem.
When the amino acid sequences of neuropeptides and peptide hormones were transformed into their individual ΔGhp values, functionally related peptides demonstrated similarities in hydrophobic free energy power spectral mode or modes. Functionally related peptide family members share the same statistically significant dominant power spectral wavelengths (wavenumbers expressed as inverse spatial frequencies), though differing in their ordered amino acid content by as much as 60%. The power spectral wavelengths are expressed in units of amino acid residues as h(ω). For example, glucagon, vasoactive intestinal peptide, secretin, oxyntomodulin, helodermin and growth hormone releasing factor, which share several (but not all) physiological actions and which have differing relative potencies, share an h(ω)=4.0. The range of peptide hydrophobic modes found by the power spectral transformation of amino acid sequences as hydrophobic free energies includes the well known h(ω)=3.6 and h(ω)=2.0 of the α-helix and the β-strand, respectively, but many others as well, ranging from the h(ω)=13.10 amino acid residues of acid fibroblast growth factor to the h(ω)=2.18 which dominates the hydrophobic free energy power spectrum of corticotropin releasing factor.
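The idea of a dominant power spectral wavelength, reported in residues, can be illustrated with a short sketch. Note the substitution: a plain discrete Fourier periodogram is used here for brevity, not the maximum entropy (all poles) spectrum referred to elsewhere in this document, and conventional Fourier analysis is subject to the end effects already noted. The function name `dominant_wavelength` and the synthetic test series are hypothetical.

```python
import cmath
import math


def dominant_wavelength(series):
    """Return the wavelength (in residues) of the strongest periodogram peak.

    The series is mean-centered, then the power at each nonzero discrete
    frequency k/n is computed; the wavelength of the peak is n / k.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coefficient = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                          for t, x in enumerate(centered))
        power = abs(coefficient) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic series that repeats every four positions (illustration only).
synthetic = [1.0, 0.0, -1.0, 0.0] * 8
period = dominant_wavelength(synthetic)
```

Because the periodogram only resolves wavelengths of the form n/k, a short peptide gives a coarse grid of candidate wavelengths, which is one reason parametric spectral estimators are often preferred for short series.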
The HIV coat protein manifests a waxing and waning of h(ω)=7 to 9 (observed by sliding a 50-residue windowed Fourier transform along its sequence), which appears to be conserved across many of its mutations. Fibroblast growth factor (“FGF”) was predicted and confirmed to have a regulatory influence on the enzyme ribonuclease A, with which it was found to share a dominant hydrophobic mode. This mode match led to experiments that demonstrated an increased half-life of messenger RNA in the presence of FGF in a neuroendocrine cell line.
The specific amino acid sequences of the calcitonins, the peptide hormone family that regulates the rate of enzymatic bone catabolism, vary by approximately 60% across species, but all are dominated by an h(ω)=3.6. The most potent calcitonin (from salmon) expresses this mode with a significantly lower hydrophobicity per residue (due to the presence of a higher number of charged groups) than those of nine other species examined. The same h(ω) can be expressed across differing average hydrophobicities of the amino acid sequences of peptides and receptors.
Using a variety of techniques involving linear decomposition and transformation of the ΔGhp sequences, we have obtained diagnostic graphical patterns of known and novel proteins with weak or unknown homology, polyproteins which have multiple functional segments following post-translational processing, and discriminable subtypes in membrane pore, channel and transporter proteins. These methods, which decompose ΔGhp series into their hierarchical levels of organization to yield secondary and supersecondary patterns at multiple wavelengths and/or length scales, include a variety of wavelet transformations, eigenvalue decomposition of autocovariance matrices and all poles, maximum entropy power spectra. Using ΔGhp sequences as input, these methods elucidated primary and secondary wavenumbers and the sequential order of these multiple hydrophobic modes which, when taken together, can contribute to the preliminary classification of unknown proteins into families or provide clues to their function.
Using these techniques, we have located peptide-receptor mode matches in the ELs of seven-transmembrane proteins, in the vicinity of neurotransmitter and pharmacological binding domains suggested by studies of point mutations and chimeric exchanges. The ligands designed for mode-matched hydrophobic aggregation at these sites are postulated to have modulatory (e.g. allosteric and/or direct) influences on the physiological activities induced by the corresponding membrane protein's native ligands. In addition, mode matches were found between the α-estrogen receptor and a known peptide antagonist; between a nuclear membrane docking site on a nuclear factor of activated T-cells and the known ligand calcineurin; and between the protein chaperonin GroEL and β-lactamase, which is known to be bound by GroEL.
Eigenfunctions of autocovariance matrices of lagged ΔGhp sequence data matrices, maximum entropy power spectra and wavelet transformations were used as linear decompositions to remove the longer ΔGhp sequence wavelengths of various receptor TMs, leaving the shorter wavelength hydrophobic modes for analyses. Matches as statistical patterns in ΔGhp modes were found between peptide ligands and their membrane receptors, including kappa, mu, delta and orphan opiate receptors, corticotropin releasing factor receptor, cholecystokinin receptor, neuropeptide Y receptor, somatostatin receptor, bombesin receptor, and neurotensin receptor. Functionally significant mode matches also occur between peptides and non-peptide receptors and other proteins. For example, ΔGhp mode matches, such as those found between the dopamine co-localized neuropeptide neurotensin and the D2 dopamine membrane receptor, D2DA, and those found between the gastrointestinal and brain peptide cholecystokinin and the dopamine membrane transporter, DAT, predicted the differential binding of the pharmacologically active ligands to their respective responsive dopamine membrane receptors and, correspondingly, their lack of binding to the opposing, pharmacologically unresponsive dopamine membrane receptors.
We have proposed that functional interactions of peptides and biogenic amines may occur via selective hydrophobic aggregation of these peptides with mode-matched ELs on a target membrane protein. These interactions may result in heterosteric modification of the global kinetic conformations of the target membrane protein, and thereby produce responses to native or pharmacological ligands, distant from intramembranous ion- or charge-mediated active sites. We have modeled the joint actions on a single membrane protein as the shifting of the critical hydrophilic-hydrophobic partition between extra- and intramembranous portions of the TMs of receptors by peptide-receptor loop hydrophobic weak bond binding. This would facilitate (or retard) the first-order phase transition of native ligand induced-receptor membrane internalization, where low dielectric constant, unscreened ionic and/or charge-mediated tight binding most likely occurs. This theory contrasts with another suggesting that receptor-mediated interactions between co-localized biogenic amines and neuropeptides, such as dopamine and cholecystokinin, result from convergent intramembranous signaling through two receptors, one for each ligand, via the cooperative interactions between their membrane receptor proteins which result in G-protein mediated second messenger cascades.
Peptides are known to mediate a variety of physiological responses in many organisms, including man. Among these bioactive peptides are the peptide hormones, such as glucagon and insulin, which regulate glucose levels in the blood; gastrin and secretin, which control digestive processes; and follicle-stimulating hormone (FSH) and luteinizing hormone, which regulate reproductive processes. Other bioactive peptides act as growth factors, including somatotropin (growth hormone), erythropoietin, and NGF (nerve growth factor).
Because of the powerful and specific effects of these peptides, they have long held great interest as drug candidates. For example, insulin is widely used to combat diabetes, and erythropoietin stimulates red blood cell formation. However, peptides have numerous drawbacks as potential therapeutics. Peptides are very unstable and sensitive to changes in their environments, which can create alterations in their structures and reduce or eliminate their physiological effects. Furthermore, peptides are susceptible to proteolysis, which complicates the problem of delivery to the desired site in the body and limits the available routes of administration. The available routes of administration are further limited by the relatively large sizes of many peptides, which make transdermal or inhalation administration methods impractical. Because peptides typically interact with other peptides or proteins to produce their biological effects, and the in vivo interactions between even a simple peptide and another protein are extraordinarily difficult to understand, enormous effort is required to determine the interactions between such molecules, or even to predict if such interactions will occur. Finally, relatively few bioactive peptides are known, in comparison to the number of potential polypeptide targets that mediate biological effects. As a result, there is great interest in finding methods to predict sequences of peptides that will interact with a polypeptide/protein target, and produce a desired physiological response. The present inventors have made the revolutionary discovery that peptides, in interaction with solvent-accessible proteins, also influence the behavior of proteins (as above) that are not specific peptide receptors.
The difficulties associated with predicting the structure of peptides that would produce a given effect in the body have led to the adoption of various combinatorial approaches. These methods produce large numbers of peptides having randomly generated sequences. The peptides are then subjected to various high-throughput screening methods to detect those peptides that may warrant further study. However, without prior knowledge of a relevant sequence pattern, often called a peptide pharmacophore, and without proven methods of pattern-conserving design, finding physiologically active lead compounds in applications involving peptide-protein interactions using purely random combinatorial searches is generally a low probability event. Depending on the candidate peptide length, the statistical expectations with respect to hits in at least micromolar concentrations using high throughput screening of ≥300,000-400,000 component peptide libraries generated by parallel synthesis and combinatorial strategies can be less than 2-4 per 100,000 peptides. Detection of these candidate peptides requires costly and time-consuming high-throughput methods both for peptide synthesis and for screening of the peptides. As a result, there is a great need for a method that can produce peptides or peptide-like drugs having a high probability of binding, modulating the activity of, activating or inhibiting a target polypeptide and/or protein.
The present invention relates to entirely new methods of designing peptides or peptide analogue molecules capable of binding to and/or otherwise modulating the function of protein targets having known amino acid sequences. The methods employ three kinds of templates, derived from analyses of the target protein sequences, in addition to relevant distributions of amino acids, for weighted and constrained random assignments to the templates to produce the peptides. Protein targets suitable for use in the present invention include cell membrane receptors, nuclear membrane receptors, circulating peptide and non-peptide receptors, membrane and circulating transporters, enzymes, chaperonins and chaperonin-like proteins, antibodies, surface proteins of infectious agents, and more generally, any protein involved in peptide-protein and/or protein-protein interactions. The peptides are designed to bind to and/or otherwise modulate, activate and/or inhibit the function of the target protein. The kinetic influence of the algorithmically designed peptides on target protein function may be direct, competitive, uncompetitive, noncompetitive and/or allosteric in character. The templates are derived from at least one of the following: 1) eigenvectors of the autocovariance matrices of the physicochemically transformed amino acid sequence of the target protein; 2) wavelet subsequence templates derived from a variety of wavelet transformations of the physicochemically transformed amino acid sequence of the target protein; and 3) redundant subsequence templates computed from the physicochemically transformed amino acid sequence of the target protein. In the methods of the present invention, the constituent amino acids employed in synthesis of the peptide are partitioned into a finite number of groups, based on similarities in values of a physicochemical property.
Thereafter, the amino acids are randomly assigned to the peptide, based on matching the physicochemical mode of the template derived from the target protein amino acid sequence. Partitioned amino acid distributions for random assignments to the similarly partitioned templates may be weighted by, for example, consideration of amino acid distribution in a variety of extra- and/or intracellular physiologically relevant pools or, alternatively, such distributions in regions in the target protein sequence relevant to the construction of the templates. The physicochemical transformations of each of the amino acids in the target protein sequence may be based on, for example, hydrophobic free energy, relative vapor pressure, relative free energy of amino acid transfer into bulk phases, aqueous molar volume, aqueous surface area, aqueous cavity surface area, partial specific volume, relative charge, relative mass (in daltons), volume, pKa, relative diffusivity, relative frictional coefficient, relative chromatographic mobility, relative electrophoretic mobility, and/or memberships in categorical amino acid families such as polar, uncharged, polar charged, basic-positively charged, acidic-negatively charged and sulfur containing. Sequential pattern (“mode”) matches between candidate algorithmic peptides and their target proteins are designed such that when examined by maximum entropy, all poles, power spectral transformations and/or wavelet transformations, they yield peaks with wavenumbers that differ by 10% or less of the larger wavenumber value. As noted above, wavenumbers are the inverse spatial variational frequencies of a physicochemically transformed data series, expressed in sequential distance units of amino acids. These peptides are then selected for physiological testing on the target protein system.
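The mode-match criterion stated above, wavenumbers differing by 10% or less of the larger value, reduces to a one-line comparison. The following is a minimal sketch; the function name `is_mode_match` and the `tolerance` parameter are hypothetical, and the wavenumbers are assumed to have been obtained beforehand from whichever spectral or wavelet method is in use.

```python
def is_mode_match(target_wavenumber, peptide_wavenumber, tolerance=0.10):
    """True when two wavenumbers differ by at most `tolerance` times the larger.

    Both wavenumbers are assumed positive and expressed in the same units
    (amino acid residues).
    """
    larger = max(target_wavenumber, peptide_wavenumber)
    return abs(target_wavenumber - peptide_wavenumber) <= tolerance * larger


# A peptide peak at 3.4 residues against a target peak at 3.6 residues
# differs by 0.2, within the allowed 0.36, so it counts as a match.
matched = is_mode_match(3.6, 3.4)
unmatched = is_mode_match(3.6, 2.0)
```

Scaling the tolerance by the larger of the two values makes the test symmetric: the result does not depend on which sequence is treated as the target.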
The peptide design methods and an associated mechanistic rationale are illustrated for the methods of the present invention, using an eigenvector template derived from the hydrophobic free energy-transformed sequence of several different receptors and random assignment of amino acids to the eigenvector templates based on probability-weighted amino acid pool distributions. The peptides generated in this manner demonstrate physiological activity in receptor-transfected cell systems, as shown by direct action and/or pretreatment potentiation or inhibition of extracellular acidification rates. In addition, peptides generated by the methods of the present invention also bound to and otherwise interacted with and altered the activities of the seven-transmembrane cholinergic M1 receptor (“muscarinic M1 receptor”) and the nerve growth factor (NGF) receptor, which has one transmembrane segment. As another example of the range of applicability of these methods, hydrophobic free energy mode matches between the peptide fibroblast growth factor and ribonuclease successfully predicted their functional interaction in neuroendocrine cell culture. These results illustrate the broad applicability of the methods of the present invention to the design of peptides for binding to or otherwise modulating a wide variety of different kinds of target polypeptides and proteins.
One of the three mode-matched peptide design methods of the invention involves the construction of such peptides using random assignment of peptide constituents, such as amino acids, as dictated by an eigenvector template containing polypeptide-matching physicochemical property binding/modulating modes. This method is herein exemplified by one of many possible physicochemical properties usable in the method, namely, hydrophobic free energy. The template eigenvector is obtained by linear decomposition of an autocovariance matrix formed by transformation of the polypeptide's amino acid sequence into a physicochemical sequence, in this case a hydrophobic free energy data series. The leading eigenvalue-associated eigenvectors are convolved with the original hydrophobic free energy data series to construct eigenfunctions. These eigenfunctions may then be further analyzed using wavelet transformations and all poles, maximum entropy power spectral transformations. The wavelet transformations may be discrete or continuous, and further may be one-dimensional wavelet packets or multiple convolved wavelet transformations. This approach yields clean representations of the polypeptide hydrophobic free energy modes as leading and secondary eigenfunctions. Most of the information found in the secondary eigenfunctions would be lost in the conventional smoothing of hydropathy plots, or contaminated by end effects and multimodality in conventional Fourier transformations. The eigenvectors associated with these eigenfunctions are used as templates for the formation of mode-matched peptides that can be tested for their ability to bind to or otherwise modulate the receptor. A mode match is attained when the maximum entropy power spectral or wavelet transformations of the polypeptide and the peptide or peptide-like molecule yield wavenumbers that differ by 10% or less of the larger wavenumber value.
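The eigenvector step described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the window length `m`, the number of retained eigenvectors `k`, the function names, and the synthetic input series are all hypothetical, and NumPy is used only as a convenient linear algebra library.

```python
import numpy as np


def lagged_matrix(series, m):
    """Stack every length-m window of the series as one row
    (a sequentially lagged data matrix)."""
    x = np.asarray(series, dtype=float)
    return np.array([x[i:i + m] for i in range(len(x) - m + 1)])


def leading_eigenvectors(series, m, k=2):
    """Eigen-decompose the m x m autocovariance matrix of the lagged data
    and return the k largest eigenvalues with their eigenvectors."""
    X = lagged_matrix(series, m)
    X = X - X.mean(axis=0)              # remove the mean of each lag column
    C = X.T @ X / X.shape[0]            # autocovariance matrix
    vals, vecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]      # reorder from largest to smallest
    return vals[order][:k], vecs[:, order][:, :k]


def eigenfunction(series, vec):
    """Convolve the data series with one eigenvector. Reversing the vector
    makes each output value the projection of one window onto it."""
    x = np.asarray(series, dtype=float)
    return np.convolve(x - x.mean(), vec[::-1], mode="valid")


if __name__ == "__main__":
    # Synthetic stand-in for a hydrophobic free energy series (period 8).
    series = np.sin(2 * np.pi * np.arange(64) / 8)
    vals, vecs = leading_eigenvectors(series, m=16)
    print(vals, eigenfunction(series, vecs[:, 0]).shape)
```

A real application would replace the synthetic series with the physicochemically transformed amino acid sequence of the target polypeptide; the choice of window length is an assumption that the text above does not fix.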
The amino acids intended for use in producing the candidate peptide are grouped into a number of groups, based on their assigned values of a physicochemical property (e.g., hydrophobic free energy). The eigenvector associated with the eigenfunction (or, alternately, the eigenvectors-based vector) is graphed, where the x-axis shows ordered position in the eigenvector and the y-axis shows the numerical values of the physicochemical property. The y-axis is partitioned into as many intervals as there are amino acid groups (e.g., four equal intervals), converting the eigenvector (or eigenvectors-based vector) into an eigenvector template. Amino acids corresponding to the value of the physicochemical property on the y-axis of the eigenvector template are randomly assigned to positions in the template, forming peptides or peptide-like molecules. The amino acid assignments may also be weighted or otherwise altered in accordance with a specific amino acid pool distribution or in accordance with known effects of substitutions of individual amino acids or amino acid segments, if desired.
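The grouping, partitioning and random assignment just described can be sketched as below. Several assumptions are made for illustration only: the Kyte-Doolittle hydropathy index is used as a stand-in scale (it is not the hydrophobic free energy scale of the invention), the amino acids are split into equally sized rank-ordered groups, the template vector is hypothetical, and the optional `weights` mapping stands in for an amino acid pool distribution.

```python
import random

# Kyte-Doolittle hydropathy index, used here ONLY as a stand-in scale.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def group_constituents(scale, n_groups):
    """Order the constituents by property value and split them into
    n_groups rank-ordered groups (lowest values in group 0)."""
    ordered = sorted(scale, key=scale.get)
    groups = [[] for _ in range(n_groups)]
    for rank, aa in enumerate(ordered):
        groups[rank * n_groups // len(ordered)].append(aa)
    return groups


def template_from_vector(vec, n_groups):
    """Partition the y-axis range of the vector into n_groups equal
    intervals and return the interval index at each position."""
    lo, hi = min(vec), max(vec)
    width = (hi - lo) / n_groups or 1.0   # guard against a flat vector
    return [min(int((v - lo) / width), n_groups - 1) for v in vec]


def assign_peptide(template, groups, rng, weights=None):
    """Randomly draw one constituent per position from the group that
    the template selects, optionally weighted by a pool distribution."""
    peptide = []
    for g in template:
        members = groups[g]
        w = [weights.get(aa, 1.0) for aa in members] if weights else None
        peptide.append(rng.choices(members, weights=w, k=1)[0])
    return "".join(peptide)


if __name__ == "__main__":
    groups = group_constituents(KD, 4)
    template = template_from_vector([-1.0, -0.2, 0.3, 1.0, 0.4, -0.9], 4)
    print(template, assign_peptide(template, groups, random.Random(0)))
```

Repeated calls with different random seeds yield different peptides that share the same template, which is the sense in which the assignment is random within each property group.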
The second method involves the construction of mode-matched peptides through the generation of wavelet subsequence templates derived from a variety of wavelet transformations of the physicochemically-transformed amino acid sequence of the target protein. The wavelet transformation method is particularly well suited for the study of localized coherent structures that appear across a target protein sequence, such as the patterns of alternating helices, loops and strands that make up larger supersecondary structures, such as helical barrels and sheets. A number of mother wavelet families are available for use in wavelet transformations.
The third method produces redundant target polypeptide or protein subsequence templates from the physicochemically-transformed amino acid sequence of the target polypeptide or protein. Redundant subsequence templates are prepared by converting the amino acid sequence of the target polypeptide or protein into a template through symbolic representations of each amino acid, e.g., one-letter amino acid codes or, more preferably, values representing each amino acid's membership in a particular physicochemical property grouping. The transformed target polypeptide or protein sequence is then scanned to find all possible redundant nonoverlapping subsequences. The redundant subsequences detected are used as templates to create mode-matching peptides.
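A simple scan for redundant nonoverlapping subsequences, as described above, might look like the following sketch. The function name, the minimum length parameter and the example symbol string are hypothetical; the input may be any symbolic representation of the sequence, such as one-letter codes or group-membership symbols.

```python
def redundant_subsequences(symbols, min_len=3):
    """Return {subsequence: [start positions]} for every subsequence of at
    least min_len symbols that occurs two or more times without overlap."""
    n = len(symbols)
    found = {}
    for length in range(min_len, n // 2 + 1):
        positions = {}
        for start in range(n - length + 1):
            sub = symbols[start:start + length]
            starts = positions.setdefault(sub, [])
            # Keep this occurrence only if it does not overlap the last one kept.
            if not starts or start >= starts[-1] + length:
                starts.append(start)
        for sub, starts in positions.items():
            if len(starts) >= 2:
                found[sub] = starts
    return found


if __name__ == "__main__":
    # Hypothetical group-membership string (one symbol per residue).
    print(redundant_subsequences("ABCDABCD"))
```

The scan is quadratic in sequence length, which is adequate for a sketch; a suffix-based index would be a natural replacement for long sequences.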
It is therefore an object of the present invention to provide a method for synthesizing a peptide or a peptide-like molecule based on matching a physicochemical mode of a target polypeptide or protein to the same physicochemical mode of the peptide or peptide-like molecule, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents which includes all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains at least one member of the set of peptide constituents, each peptide constituent group encompasses a range of the numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the target polypeptide or protein with the numerical value of the orderable physicochemical property corresponding to each amino acid in the amino acid sequence, calculating one or more polypeptide eigenvalues and a corresponding polypeptide eigenvector associated with each of the polypeptide eigenvalues by linear decomposition of an autocovariance matrix formed from a sequentially lagged data matrix of the polypeptide physicochemical data series, ordering the polypeptide eigenvalues and the corresponding polypeptide eigenvectors from largest to smallest, selecting one or more of the polypeptide eigenvectors, transforming the selected polypeptide eigenvectors into an eigenvector template, forming a graph of the eigenvector template, wherein the numerical values of the physicochemical property are graphed along the y-axis of the graph and ordered position in the eigenvector template is graphed along 
the x-axis of the graph, partitioning the graph along the y-axis according to the ranges of the numerical values of the physicochemical property defining the peptide constituent groups to form a plurality of y-axis ranges, assigning a member of the peptide constituent group to each position in the peptide or peptide-like molecule by using the graph as a template, wherein at each ordered position in the eigenvector template along the x-axis of the graph, the member of the peptide constituent group assigned to the ordered position has a value of the orderable physicochemical property that is within the y-axis range of the ordered point, and synthesizing the peptide or peptide-like molecule.
It is another object of the present invention to provide a method for matching a physicochemical mode of a peptide or a peptide-like molecule to the same physicochemical mode of a target polypeptide or protein to determine if the peptide will bind to and/or otherwise modulate the target polypeptide or protein, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents which includes all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains at least one member of the set of peptide constituents, each peptide constituent group encompasses a range of the numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the target polypeptide or protein with the numerical value of the orderable physicochemical property corresponding to each amino acid in the amino acid sequence, calculating one or more polypeptide eigenvalues and a corresponding polypeptide eigenvector associated with each of the polypeptide eigenvalues by linear decomposition of an autocovariance matrix formed from a sequentially lagged data matrix of the polypeptide physicochemical data series, ordering the polypeptide eigenvalues and the corresponding polypeptide eigenvectors from largest to smallest, transforming the polypeptide physicochemical data series into one or more polypeptide eigenfunctions, using the ordered polypeptide eigenvectors as multiplicative weights, transforming the polypeptide eigenfunctions into dominant wavenumbers, using all poles maximum entropy power spectra, to produce 
polypeptide spectral power peaks, identifying the polypeptide power spectral peaks, creating a peptide physicochemical data series by replacing each peptide constituent in a peptide sequence of the peptide or a peptide-like molecule with the numerical value of the orderable physicochemical property corresponding to the peptide constituent in the peptide sequence, calculating one or more peptide eigenvalues and a corresponding peptide eigenvector associated with each of the peptide eigenvalues by linear decomposition of an autocovariance matrix formed from the peptide physicochemical data series, ordering the peptide eigenvalues and the corresponding eigenvectors from largest to smallest, transforming the peptide physicochemical data series into one or more peptide eigenfunctions, using the ordered peptide eigenvectors as multiplicative weights, transforming the peptide eigenfunctions into dominant wavenumbers, using all poles maximum entropy power spectra, to produce peptide spectral power peaks, identifying the peptide power spectral peaks, and comparing the polypeptide spectral power peaks to the peptide spectral power peaks to determine if the polypeptide spectral power peaks match the peptide spectral power peaks, wherein a match between the polypeptide spectral power peaks and the peptide spectral power peaks indicates the peptide or peptide-like molecule may bind to and/or otherwise modulate the target polypeptide or protein.
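For illustration, an all-poles maximum entropy power spectrum can be estimated with Burg's method, and the dominant peak can be reported as a wavelength in residues. The sketch below is one conventional way to do this and is not asserted to be the procedure used by the inventors; the model order, the frequency grid, the function names and the synthetic test series are all assumptions.

```python
import numpy as np


def burg_ar(x, order):
    """Fit an autoregressive (all-poles) model with Burg's method.
    Returns the coefficients [1, a1, ..., ap] and the residual power."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    a = np.array([1.0])
    err = np.dot(x, x) / len(x)
    f, b = x[1:].copy(), x[:-1].copy()   # forward / backward prediction errors
    for _ in range(order):
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        a = np.append(a, 0.0)
        a = a + k * a[::-1]              # Levinson-style coefficient update
        f, b = (f + k * b)[1:], (b + k * f)[:-1]
        err *= 1.0 - k * k
    return a, err


def dominant_wavelength(x, order=4, n_freqs=2048):
    """Locate the largest peak of the maximum entropy spectrum and
    return its wavelength (1 / frequency) in sequence positions."""
    a, err = burg_ar(x, order)
    freqs = np.linspace(1.0 / len(x), 0.5, n_freqs)
    basis = np.exp(-2j * np.pi * np.outer(freqs, np.arange(len(a))))
    power = err / np.abs(basis @ a) ** 2
    return 1.0 / freqs[np.argmax(power)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = np.arange(64)
    # Synthetic stand-in for an eigenfunction with an 8-residue mode.
    series = np.sin(2 * np.pi * n / 8) + 0.1 * rng.standard_normal(64)
    print(dominant_wavelength(series))
```

Running the same function on the polypeptide and peptide eigenfunctions gives two peak values that can then be compared under the 10% criterion.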
It is another object of the present invention to provide a method for matching a peptide or a peptide-like molecule to a target polypeptide or protein to determine if the peptide will bind to and/or otherwise modulate the target polypeptide or protein, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents, the set of peptide constituents including all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains at least one member of the set of peptide constituents, each peptide constituent group encompasses a range of the numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the target polypeptide or protein with the numerical value corresponding to the amino acid in the amino acid sequence, decomposing the polypeptide physicochemical data series into translated and scaled versions of a mother wavelet, w, as $W_R(a,b) = (1/\sqrt{a}) \int_{0}^{i} H(i)\, w\left(\frac{i-b}{a}\right) di$
wherein w denotes the chosen mother wavelet function, separating $W_R(a,b)$ into polypeptide modulus and polypeptide phase parts, graphing the polypeptide phase parts on a polypeptide phase graph, wherein the x-axis of the polypeptide phase graph indexes sequence position and the y-axis of the polypeptide phase graph is numbered in units of one of dilate divisions (dd) and wavelet wavelengths ($\bar{\omega}$), graphing the polypeptide modulus parts on a polypeptide modulus graph, wherein the x-axis of the polypeptide modulus graph indexes sequence position and the y-axis of the polypeptide modulus graph is numbered in units of one of dilate divisions (dd) and wavelet wavelengths ($\bar{\omega}$), identifying a plurality of polypeptide maximal phase amplitudes and a plurality of polypeptide moduli in the polypeptide phase graph and the polypeptide modulus graph, respectively, creating a peptide physicochemical data series by replacing each peptide constituent in a peptide sequence of the peptide or a peptide-like molecule with the numerical value of the orderable physicochemical property corresponding to each peptide constituent in the peptide sequence, decomposing the peptide physicochemical data series into translated and scaled versions of a mother wavelet, w, as $W_L(a,b) = (1/\sqrt{a}) \int_{0}^{i} H(i)\, w\left(\frac{i-b}{a}\right) di$
wherein w denotes the chosen mother wavelet function, separating $W_L(a,b)$ into peptide modulus and peptide phase parts, graphing the peptide phase parts on a peptide phase graph, wherein the x-axis of the peptide phase graph indexes sequence position and the y-axis of the peptide phase graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths ($\bar{\omega}$), graphing the peptide modulus parts on a peptide modulus graph, wherein the x-axis of the peptide modulus graph indexes sequence position and the y-axis of the peptide modulus graph is numbered in units of one of dilate divisions (dd) and wavelet wavelengths ($\bar{\omega}$), identifying a plurality of peptide maximal phase amplitudes and a plurality of peptide moduli in the peptide phase graph and the peptide modulus graph, respectively, comparing the plurality of polypeptide maximal phase amplitudes in the polypeptide phase graph to the plurality of peptide maximal phase amplitudes in the peptide phase graph to determine if the plurality of polypeptide maximal phase amplitudes match the plurality of peptide maximal phase amplitudes, comparing the plurality of polypeptide moduli in the polypeptide modulus graph to the plurality of peptide moduli in the peptide modulus graph to determine if the plurality of polypeptide moduli match the plurality of peptide moduli, wherein a match between the plurality of polypeptide maximal phase amplitudes and the plurality of peptide maximal phase amplitudes, and a match between the plurality of polypeptide moduli and the plurality of peptide moduli, indicates the peptide or peptide-like molecule may bind to and/or otherwise modulate the polypeptide.
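The continuous wavelet decomposition and its separation into modulus and phase parts may be sketched numerically as follows. The sketch assumes a complex Morlet mother wavelet with centre frequency 6, the conventional 1/sqrt(a) normalisation, and a direct summation over sequence positions in place of the integral; these choices, the function names and the synthetic series are illustrative assumptions rather than features recited above.

```python
import numpy as np


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (an assumed, commonly used choice)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)


def cwt(series, scales, wavelet=morlet):
    """W(a, b) = (1 / sqrt(a)) * sum_i H(i) * w((i - b) / a),
    evaluated for every scale a and every sequence position b."""
    h = np.asarray(series, dtype=float)
    h = h - h.mean()
    i = np.arange(len(h))
    out = np.empty((len(scales), len(h)), dtype=complex)
    for row, a in enumerate(scales):
        for b in range(len(h)):
            out[row, b] = (h * wavelet((i - b) / a)).sum() / np.sqrt(a)
    return out


def modulus_and_phase(coeffs):
    """Split complex wavelet coefficients into modulus and phase parts."""
    return np.abs(coeffs), np.angle(coeffs)


if __name__ == "__main__":
    # Synthetic stand-in for a physicochemical data series (period 8).
    series = np.sin(2 * np.pi * np.arange(128) / 8)
    scales = np.arange(2, 17)
    modulus, phase = modulus_and_phase(cwt(series, scales))
    print(scales[np.argmax(modulus[:, 64])])
```

Plotting `modulus` and `phase` with sequence position on the x-axis and scale on the y-axis gives the two graphs referred to above; maxima can then be read off and compared between polypeptide and peptide.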
It is another object of the present invention to provide a method for matching a peptide or a peptide-like molecule to a target polypeptide or protein to determine if the peptide will bind to and/or otherwise modulate the target polypeptide or protein, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents, the set of peptide constituents including all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains at least one member of the set of peptide constituents, each group encompasses a range of the numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the target polypeptide or protein with the numerical value corresponding to the amino acid in the amino acid sequence, decomposing the polypeptide physicochemical data series with a family of functions $W_{j,n,k}(x) = 2^{-j/2}\, W_n(2^{-j}x - k)$, which when j,n are positive integers and k has an integer value, are organized in one or more tree structures, each of the tree structures being composed of a plurality of nodes, each of the nodes being in the form of: 
wherein $W_{j,n,k}(x)$ is computed for a mother wavelet function, computing and frequency ordering best level and best tree representations of a physicochemical polypeptide series based on Stein's Unbiased Risk Estimate (SURE) and Shannon entropy criteria, graphing the best level representation on a polypeptide best level graph, wherein the x-axis of the polypeptide best level graph indexes sequence position and the y-axis of the polypeptide best level graph is numbered in units of wavelet wavelengths, $\bar{\omega}$, graphing the best tree representation on a polypeptide best tree graph, wherein the x-axis of the polypeptide best tree graph indexes sequence position and the y-axis of the polypeptide best tree graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths, $\bar{\omega}$, identifying a plurality of polypeptide maximal coefficient amplitudes, each of the plurality of polypeptide maximal coefficient amplitudes being derived from the polypeptide best level graph and the polypeptide best tree graph, creating a peptide physicochemical data series by replacing each peptide constituent in a peptide sequence of the peptide or a peptide-like molecule with the numerical value of the orderable physicochemical property corresponding to the peptide constituent in the peptide sequence, decomposing the peptide physicochemical data series with the family of functions $W_{j,n,k}(x) = 2^{-j/2}\, W_n(2^{-j}x - k)$, which when j,n are positive integers and k has an integer value, are organized in one or more tree structures, each of the tree structures being comprised of a plurality of nodes, each of the nodes being in the form of 
wherein $W_{j,n,k}(x)$ is computed for a mother wavelet function, computing and frequency ordering best level and best tree representations of a physicochemical peptide series based on SURE and Shannon entropy criteria, graphing the best level representation on a peptide best level graph, wherein the x-axis of the peptide best level graph indexes sequence position and the y-axis of the peptide best level graph is numbered in units of wavelet wavelengths, $\bar{\omega}$, graphing the best tree representation on a peptide best tree graph, wherein the x-axis of the peptide best tree graph indexes sequence position and the y-axis of the peptide best tree graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths, $\bar{\omega}$, identifying a plurality of peptide maximal coefficient amplitudes, each of the plurality of peptide maximal coefficient amplitudes being derived from the peptide best level graph and the peptide best tree graph, comparing the plurality of polypeptide maximal coefficient amplitudes to the plurality of peptide maximal coefficient amplitudes to determine if the plurality of polypeptide maximal coefficient amplitudes match the plurality of peptide maximal coefficient amplitudes, wherein a match between the plurality of polypeptide maximal coefficient amplitudes and the plurality of peptide maximal coefficient amplitudes indicates the peptide or peptide-like molecule may bind to and/or otherwise modulate the target polypeptide or protein.
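A best tree selection over a wavelet packet decomposition can be illustrated with the simplest mother wavelet. The sketch below assumes the Haar wavelet, normalises the series to unit energy, and uses only the Shannon entropy cost; the SURE criterion and the frequency ordering step are omitted. The function names and the node labelling ("a" for the averaging branch, "d" for the differencing branch) are hypothetical.

```python
import math


def haar_split(x):
    """Split one node into its averaging ("a") and differencing ("d") children."""
    s = math.sqrt(2.0)
    pairs = range(0, len(x) - 1, 2)
    approx = [(x[i] + x[i + 1]) / s for i in pairs]
    detail = [(x[i] - x[i + 1]) / s for i in pairs]
    return approx, detail


def shannon_cost(coeffs):
    """Shannon entropy cost: -sum(c^2 * log(c^2)), with 0 * log(0) taken as 0."""
    return -sum(c * c * math.log(c * c) for c in coeffs if c != 0.0)


def best_tree(x, path=""):
    """Return (cost, leaves): keep a node whole unless splitting it into its
    best children gives a strictly lower total entropy cost."""
    own = shannon_cost(x)
    if len(x) < 2 or len(x) % 2:
        return own, {path: list(x)}
    a, d = haar_split(x)
    cost_a, leaves_a = best_tree(a, path + "a")
    cost_d, leaves_d = best_tree(d, path + "d")
    if cost_a + cost_d < own:
        return cost_a + cost_d, {**leaves_a, **leaves_d}
    return own, {path: list(x)}


def best_basis(series):
    """Normalise the series to unit energy, then select the best tree."""
    norm = math.sqrt(sum(v * v for v in series)) or 1.0
    return best_tree([v / norm for v in series])


if __name__ == "__main__":
    # Synthetic alternating series; its energy concentrates in one packet node.
    cost, leaves = best_basis([1.0, -1.0] * 4)
    print(cost, sorted(leaves))
```

For the alternating test series the selected tree places essentially all of the energy in a single coefficient at node "daa", which is the kind of maximal coefficient amplitude that would be compared between polypeptide and peptide.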
It is another object to provide a method for modifying a non-peptide-responsive target polypeptide or protein to bind to and/or otherwise modulate a peptide or peptide-like molecule by modifying the sequence of the non-peptide-responsive target polypeptide or protein to match a physicochemical mode of the peptide or peptide-like molecule, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents, the set of peptide constituents including all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains at least one member of the set of peptide constituents, each group encompasses a range of the numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the non-peptide-responsive target polypeptide or protein with the numerical value corresponding to the amino acid in the amino acid sequence, calculating one or more polypeptide eigenvalues and a corresponding polypeptide eigenvector associated with each of the polypeptide eigenvalues by linear decomposition of an autocovariance matrix formed from the polypeptide physicochemical data series, ordering the polypeptide eigenvalues and the corresponding polypeptide eigenvectors from largest to smallest, transforming the polypeptide physicochemical data series into polypeptide eigenfunctions, using the ordered polypeptide eigenvectors as multiplicative weights, transforming the polypeptide eigenfunctions into dominant wavenumbers, using all poles maximum entropy power spectra to produce polypeptide spectral power peaks, 
identifying the polypeptide power spectral peaks, creating a peptide physicochemical data series by replacing each peptide constituent in a peptide sequence of the peptide or peptide-like molecule with a numerical value of the orderable physicochemical property corresponding to the peptide or peptide-like molecule constituent in the peptide sequence, calculating one or more peptide eigenvalues and a corresponding peptide eigenvector associated with each of the peptide eigenvalues by linear decomposition of an autocovariance matrix formed from the peptide physicochemical data series, ordering the peptide eigenvalues and the corresponding peptide eigenvectors from largest to smallest, transforming the peptide physicochemical data series into peptide eigenfunctions, using the peptide eigenvectors as multiplicative weights, transforming the peptide eigenfunctions into dominant wavenumbers, using all poles maximum entropy power spectra, to produce peptide spectral power peaks, identifying the peptide power spectral peaks, comparing the polypeptide spectral power peaks to the peptide spectral power peaks to determine if the polypeptide spectral power peaks match the peptide spectral power peaks, wherein a match between the polypeptide spectral power peaks and the peptide spectral power peaks indicates the peptide or peptide-like molecule may bind to and/or otherwise modulate the non-peptide-responsive target polypeptide or protein, and if the polypeptide spectral power peaks do not match the peptide spectral power peaks, modifying the amino acid sequence of the non-peptide-responsive target polypeptide or protein to form a match between the polypeptide spectral power peaks and the peptide spectral power peaks.
It is a further object to provide a method for modifying a non-peptide-responsive target polypeptide or protein to bind to and/or otherwise modulate a peptide or peptide-like molecule by modifying the sequence of the non-peptide-binding/modulating target polypeptide to match a physicochemical mode of the peptide or peptide-like molecule, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents, the set of peptide constituents including all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains one or more members of the set of peptide constituents, each group encompasses a range of said numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the non-peptide-binding and/or modulating target polypeptide or protein with a numerical value corresponding to each amino acid in the amino acid sequence, decomposing the polypeptide physicochemical data series into translated and scaled versions of a mother wavelet, w, as $W_R(a,b) = (1/\sqrt{a}) \int_{0}^{i} H(i)\, w\left(\frac{i-b}{a}\right) di$
wherein w denotes the chosen mother wavelet function, separating $W_R(a,b)$ into polypeptide modulus and polypeptide phase parts, graphing the polypeptide phase parts on a polypeptide phase graph, wherein the x-axis of the polypeptide phase graph indexes sequence position and the y-axis of the polypeptide phase graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths ($\bar{\omega}$), graphing the polypeptide modulus parts on a polypeptide modulus graph, wherein the x-axis of the polypeptide modulus graph indexes sequence position and the y-axis of the polypeptide modulus graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths ($\bar{\omega}$), identifying a plurality of polypeptide maximal phase amplitudes and a plurality of polypeptide moduli in the polypeptide phase graph and the polypeptide modulus graph, respectively, creating a peptide physicochemical data series by replacing each peptide constituent in a peptide sequence of a peptide or a peptide-like molecule with the numerical value corresponding to each peptide constituent in the peptide sequence, decomposing the peptide physicochemical data series into translated and scaled versions of a mother wavelet, w, as $W_L(a,b) = (1/\sqrt{a}) \int_{0}^{i} H(i)\, w\left(\frac{i-b}{a}\right) di$
wherein w denotes the chosen mother wavelet function, separating $W_L(a,b)$ into peptide modulus and peptide phase parts, graphing the peptide phase parts on a peptide phase graph, wherein the x-axis of the peptide phase graph indexes sequence position and the y-axis of the peptide phase graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths ($\bar{\omega}$), graphing the peptide modulus parts on a peptide modulus graph, wherein the x-axis of the peptide modulus graph indexes sequence position and the y-axis of the peptide modulus graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths ($\bar{\omega}$), identifying a plurality of peptide maximal phase amplitudes and a plurality of peptide moduli in the peptide phase graph and the peptide modulus graph, respectively, comparing the plurality of polypeptide maximal phase amplitudes in the polypeptide phase graph to the plurality of peptide maximal phase amplitudes in the peptide phase graph to determine if the plurality of polypeptide maximal phase amplitudes match the plurality of peptide maximal phase amplitudes, comparing the plurality of polypeptide moduli in the polypeptide modulus graph to the plurality of peptide moduli in the peptide modulus graph to determine if the plurality of polypeptide moduli match the plurality of peptide moduli, wherein a match between the plurality of polypeptide maximal phase amplitudes and the plurality of peptide maximal phase amplitudes, and a match between the plurality of polypeptide moduli and the plurality of peptide moduli indicates the peptide or peptide-like molecule may bind to and/or otherwise modulate the non-peptide-binding and/or modulating target polypeptide or protein, and if the plurality of polypeptide maximal phase amplitudes do not match the plurality of peptide maximal phase amplitudes, or if the plurality of polypeptide moduli do not match the plurality of peptide moduli, 
modifying the amino acid sequence of the non-peptide-binding and/or modulating target polypeptide or protein to form a match between the plurality of polypeptide maximal phase amplitudes and the plurality of peptide maximal phase amplitudes, and between the polypeptide moduli and the peptide moduli.
It is a further object to provide a method for modifying a non-peptide-responsive target polypeptide or protein to bind to and/or otherwise modulate a peptide or peptide-like molecule by modifying the sequence of the non-peptide-responsive target polypeptide or protein to match a physicochemical mode of the peptide or peptide-like molecule, comprising the steps of assigning a numerical value of an orderable physicochemical property to each member of a set of peptide constituents, the set of peptide constituents including all the members of the set of naturally-occurring amino acids, arranging the peptide constituents in order of the numerical values of the orderable physicochemical property, partitioning the set of peptide constituents into a plurality of peptide constituent groups, whereby each of the peptide constituent groups contains one or more members of the set of peptide constituents, each group encompassing a range of said numerical values, each member of the set of peptide constituents belongs to only one peptide constituent group, creating a polypeptide physicochemical data series by replacing each amino acid in an amino acid sequence of the non-peptide-binding and/or modulating target polypeptide or protein with the numerical value of the orderable physicochemical property corresponding to the amino acid in the amino acid sequence, decomposing the polypeptide physicochemical data series with a family of functions $W_{j,n,k}(x) = 2^{-j/2}\, W_n(2^{-j}x - k)$, which when j,n are positive integers and k has an integer value, are organized in one or more tree structures, each of the tree structures being comprised of a plurality of nodes, each of the nodes being in the form of: 
wherein the W_{j,n,k}(x) is computed for a mother wavelet function; computing and frequency ordering best level and best tree representations of the physicochemical polypeptide series based on SURE and Shannon entropy criteria; graphing the best level representation on a polypeptide best level graph, wherein the x-axis of the polypeptide best level graph indexes sequence position and the y-axis of the polypeptide best level graph is numbered in units of wavelet wavelengths, ω̄; graphing the best tree representation on a polypeptide best tree graph, wherein the x-axis of the polypeptide best tree graph indexes sequence position and the y-axis of the polypeptide best tree graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths, ω̄; identifying a plurality of polypeptide maximal coefficient amplitudes, each of the plurality of polypeptide maximal coefficient amplitudes being derived from the polypeptide best level and best tree graphs; decomposing the peptide physicochemical data series with a family of functions W_{j,n,k}(x) = 2^{-j/2} W_n(2^{-j}x - k), which, when j and n are positive integers and k has an integer value, are organized in one or more tree structures, each of the tree structures being comprised of a plurality of nodes, each of the nodes being in the form of: 
wherein the W_{j,n,k}(x) is computed for a mother wavelet function; computing and frequency ordering best level and best tree representations of the physicochemical peptide series based on SURE and Shannon entropy criteria; graphing the best level representation on a peptide best level graph, wherein the x-axis of the peptide best level graph indexes sequence position and the y-axis of the peptide best level graph is numbered in units of wavelet wavelengths, ω̄; graphing the best tree representation on a peptide best tree graph, wherein the x-axis of the peptide best tree graph indexes sequence position and the y-axis of the peptide best tree graph is numbered in units of one of relative dilation (dd) and wavelet wavelengths, ω̄; identifying a plurality of peptide maximal coefficient amplitudes, each of the plurality of peptide maximal coefficient amplitudes being derived from the peptide best level and best tree graphs; comparing the plurality of polypeptide moduli in the polypeptide modulus graph to the plurality of peptide moduli in the peptide modulus graph to determine if the plurality of polypeptide moduli match the plurality of peptide moduli, wherein a match between the plurality of polypeptide maximal phase amplitudes and the plurality of peptide maximal phase amplitudes, and a match between the plurality of polypeptide moduli and the plurality of peptide moduli, indicates the peptide or peptide-like molecule may bind to and/or otherwise modulate the non-peptide-binding and/or modulating target polypeptide or protein; and, if the plurality of polypeptide maximal phase amplitudes do not match the plurality of peptide maximal phase amplitudes, or if the plurality of polypeptide moduli do not match the plurality of peptide moduli, modifying the amino acid sequence of the non-peptide-binding and/or modulating target polypeptide or protein to form a match between the plurality of polypeptide maximal phase amplitudes and the plurality of peptide maximal phase amplitudes, and between the polypeptide moduli and the peptide moduli.
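The first steps recited above (assigning property values, partitioning the constituents, building the data series, decomposing it into a wavelet packet tree, and choosing a best level by an entropy criterion) can be illustrated with a minimal sketch. Several choices here are assumptions made only for illustration and are not specified by the text: the Kyte-Doolittle hydropathy index stands in for the orderable physicochemical property, the Haar wavelet stands in for the mother wavelet, groups are formed from equal-width value ranges, and all function names are hypothetical. The SURE criterion, frequency ordering of nodes, and best tree selection are not implemented.

```python
import math

# Kyte-Doolittle hydropathy index. ASSUMPTION: used only as an example of an
# orderable physicochemical property; the text does not name this scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def partition_constituents(scale, n_groups):
    """Order constituents by value and split them into equal-width value ranges.

    Each constituent lands in exactly one group (hypothetical grouping rule).
    """
    lo, hi = min(scale.values()), max(scale.values())
    width = (hi - lo) / n_groups
    groups = [[] for _ in range(n_groups)]
    for aa in sorted(scale, key=scale.get):
        idx = min(int((scale[aa] - lo) / width), n_groups - 1)
        groups[idx].append(aa)
    return groups

def to_series(seq, scale=KD):
    """Replace each residue with its property value."""
    return [scale[aa] for aa in seq]

def haar_split(x):
    """One orthonormal Haar analysis step: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def packet_levels(x, depth):
    """Full wavelet packet tree; levels[j] holds the 2**j nodes at level j.

    Nodes are kept in natural (not frequency) order. len(x) should be a
    multiple of 2**depth.
    """
    levels = [[list(x)]]
    for _ in range(depth):
        nxt = []
        for node in levels[-1]:
            nxt.extend(haar_split(node))  # children: approximation, then detail
        levels.append(nxt)
    return levels

def shannon_cost(coeffs, total_energy):
    """Additive Shannon entropy cost: -sum(p * log p), p = c**2 / energy."""
    cost = 0.0
    for c in coeffs:
        p = c * c / total_energy
        if p > 0.0:
            cost -= p * math.log(p)
    return cost

def best_level(levels):
    """Return (index of the level with the lowest total entropy cost, all costs)."""
    energy = sum(v * v for v in levels[0][0])
    costs = [sum(shannon_cost(n, energy) for n in lvl) for lvl in levels]
    return min(range(len(costs)), key=costs.__getitem__), costs

# Arbitrary 16-residue string for illustration; it is not a real protein.
seq = "ACDEFGHIKLMNPQRS"
series = to_series(seq)
levels = packet_levels(series, depth=3)
best, costs = best_level(levels)
print("entropy cost per level:", [round(c, 3) for c in costs])
print("best level:", best)
```

Because the Haar step is orthonormal, the total energy of the coefficients is the same at every level, which is what allows the entropy costs of different levels to be compared directly. A different mother wavelet or property scale would change the coefficients but not the overall structure of the sketch.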
The present invention also provides a method of detecting a cancerous cell or tissue, comprising contacting all or a portion of the cancerous cell or tissue with an effective amount of a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a target polypeptide or protein found on the cancerous cell or tissue.
The present invention also provides a method of detecting a tumor in a patient, comprising administering to the patient an effective amount of a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a polypeptide or protein found on the tumor, and detecting binding and/or modulating of the peptide or peptide-like molecule to the polypeptide or protein.
The present invention also provides a pharmaceutical composition for treatment of a tumor, comprising a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a polypeptide or protein found on the tumor, and a pharmaceutically acceptable carrier.
The present invention also provides a diagnostic kit for use in detecting a polypeptide or protein, comprising a container having a peptide or peptide-like molecule, the peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of the polypeptide or protein.
The present invention also provides a method for screening for a disease condition, comprising contacting a sample obtained from a patient with an effective amount of a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a polypeptide or protein found in the sample, wherein the presence, absence or abnormality in the polypeptide or protein is diagnostic of the presence of the disease condition.
The present invention also provides a method for screening a member selected from the group consisting of water, food, and soil for the presence of a contaminant, comprising contacting the member with a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a polypeptide or protein found in the member, wherein the presence, absence, or abnormality in the polypeptide or protein is diagnostic of the presence of the contaminant.
The present invention also provides a method for treating a disease condition, comprising administering to a patient in need of such treatment a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a polypeptide or protein found in the sample, wherein the peptide or peptide-like molecule is capable of effecting a direct action and/or modulation of an activity of the polypeptide or protein, and the direct action and/or modulation effected by the peptide or peptide-like molecule is associated with a change in the disease condition.
The present invention also provides a method for detecting an interaction between a peptide and a target polypeptide or protein, comprising incubating a peptide prepared by at least one of the methods of the present invention with the target polypeptide or protein under conditions that promote the interaction of the peptide with the target polypeptide or protein, and detecting the interaction of the peptide with the target polypeptide or protein.
The present invention also provides a pharmaceutical composition for treatment of a disease condition, comprising a peptide or peptide-like molecule having a physicochemical mode that matches a physicochemical mode of a polypeptide or protein found in the sample, the peptide or peptide-like molecule being capable of effecting a direct action and/or modulation of an activity of the polypeptide or protein, and the direct action and/or modulation effected by the peptide or peptide-like molecule is associated with a change in the disease condition, and a pharmaceutically acceptable carrier.
The above and other objects, features and advantages of the present invention will become apparent from the following description read in conjunction with the accompanying drawings.
FIG. 1 is a flowchart which summarizes the methods of the present invention.
FIG. 2A (left) is a graph of the hydrophobic free energy series, H_i, of the human D2DA receptor and (right) its broad band, multimodal all poles, maximum entropy power spectral transformation S(ω).
FIG. 2B (left) is a graph of the human D2DA receptor's dominant eigenfunction, Ψ_1, demonstrating the ≈7 peaks characteristic of the leading receptor eigenfunction of members of the seven-transmembrane receptor superfamily and (right) the associated long wavelength peak (>50 residues) in the S(ω).
FIG. 2C (left) is a graph of the human D2DA receptor's secondary eigenfunction, Ψ_2, and (right) its associated peaks in the S(ω) at wavelengths of 8.12 and 2.61 residues.
FIG. 2D (left) is a graph of the human D2DA receptor's secondary eigenvector, X_2, used in the design of new peptides, and (right) its associated peaks in the S(ω) at wavelengths of 8.16 and 2.67 residues.
FIG. 3A is a graph of the wavelet subspace transformation of the H_i of the D2DA receptor, wherein ω̄ = f(dd) ≅ 2.3 residues. Sequence position is graphed along the x-axis and phase amplitude along the y-axis.
FIG. 3B is a graph of the wavelet subspace transformation of the H_i of the D2DA receptor, wherein ω̄ = f(dd) ≅ 8.1 residues. Sequence position is graphed along the x-axis and phase amplitude along the y-axis.
FIG. 4A is a graph showing the effects of the SHQR peptide (SEQ ID NO:1) on the EAR responses of the human D2DA-transfected mouse LtK cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 4B is a graph showing the effects of the THQA peptide (SEQ ID NO:2) on the EAR responses of the human D2DA-transfected mouse LtK cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 4C is a graph showing the effects of the SHQR peptide (SEQ ID NO:1) on the EAR responses of the human D2DA-transfected CHO cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 4D is a graph showing the effects of the THQA peptide (SEQ ID NO:2) on the EAR responses of the human D2DA-transfected CHO cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 5A is a graph showing the effects of the E . . . PL (SEQ ID NO:3) peptide on the EAR responses of the human D2DA-transfected mouse LtK cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 5B is a graph showing the effects of the E . . . PY (SEQ ID NO:4) peptide on the EAR responses of the human D2DA-transfected mouse LtK cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 5C is a graph showing the effects of the E . . . PL (SEQ ID NO:3) peptide on the EAR responses of the human D2DA-transfected CHO cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 5D is a graph showing the effects of the E . . . PY (SEQ ID NO:4) peptide on the EAR responses of the human D2DA-transfected CHO cell system to dopamine infusion. DA=control with dopamine alone.
FIG. 6A is a graph showing the effects of the M1 receptor-derived peptide ITFT (SEQ ID NO:9) on the EAR responses of the human M1 receptor-transfected CHO cell system to carbachol infusion: left, control with carbachol alone; right, carbachol plus ITFT peptide.
FIG. 6B is a graph showing the effects of the M1 receptor-derived peptide FSFQ (SEQ ID NO:7) on the EAR responses of the human M1 receptor-transfected CHO cell system to carbachol infusion: left, control with carbachol alone; right, carbachol plus FSFQ peptide.