The present invention relates to locating protein epitopes and more particularly to novel methods for identifying immunobiologically active amino acid sequences and determining their location and optimal length.
Epitopes or antigenic determinants of a protein antigen represent the sites that are recognized as binding sites by certain immune components such as antibodies or immunocompetent cells. While epitopes are defined only in a functional sense i.e. by their ability to bind to antibodies or immunocompetent cells, it is usually accepted that there is a structural basis for their immunological reactivity.
Epitopes are classified as either continuous or discontinuous (Atassi and Smith, 1978, Immunochemistry, vol 15 p. 609). Discontinuous epitopes are composed of sequences of amino acids located throughout an antigen and rely on the tertiary structure or folding of the protein to bring the sequences together and form the epitope. In contrast, continuous epitopes are linear peptide fragments of the antigen that are able to bind to antibodies raised against the intact antigen.
Many antigens have been studied as possible serum markers for different types of cancer because the serum concentration of the specific antigen may be an indication of the cancer stage in an untreated person. As such, it would be very advantageous to develop immunological reagents that react with the antigen, and more specifically, with the epitopes of the protein antigen.
To date, methods using physical-chemical scales have attempted to determine the location of probable peptide epitopes by examining the primary structure (the amino acid sequence), the secondary structure (such as turns and helices), and even the folding of the protein in the tertiary structure. Continuous epitopes are structurally less complicated and therefore may be easier to locate; however, the ability to predict the location, length and potency of the site is limited.
Various methods have been used to identify and predict the location of continuous epitopes in proteins by analyzing certain features of their primary structure. For example, parameters such as hydrophilicity, accessibility, and mobility of short segments of polypeptide chains have been correlated with the location of epitopes (see Pellequer et al. 1991, Methods in Enzymology, vol 203, p. 176-201).
Hydrophilicity has been used as the basis for determining protein epitopes by analyzing an amino acid sequence to find the point of greatest local hydrophilicity, as disclosed in U.S. Pat. No. 4,554,101. Hopp and Woods (see Proc. Natl. Acad. Sci. USA, vol. 78, No. 6, pp. 3824-3828, Jun. 1981) have shown that, by assigning each amino acid a relative hydrophilicity value and then averaging these values locally along the sequence, the locations of the highest local average hydrophilicity values represent the locations of the continuous epitopes. However, this method does not provide any information as to the optimal length of the continuous epitope.
Likewise, the hydropathy scale of Kyte and Doolittle (Kyte and Doolittle, 1982, J. Mol. Biol. vol. 157, p. 105) is commonly used to evaluate the hydrophilic and hydrophobic tendencies of polypeptide chains. Each amino acid in the polypeptide chain is assigned a value reflecting its relative hydrophilicity and hydrophobicity, and these values are averaged across a moving section of the sequence. This method offers a graphic visualization of the hydropathic character of the amino acid chain. It is theorized that by using the hydropathic character of the sequence, interior sequence regions, which are usually composed of hydrophobic amino acids, can be distinguished from hydrophilic exterior sequence regions. This information offers the ability to evaluate the possible secondary structure. However, this model does not predict the optimal length of the epitope or indicate whether the effective size of epitopes is unique for each protein molecule.
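By way of illustration only, the moving-section averaging described above may be sketched as follows. The hydropathy values are those of the published Kyte-Doolittle scale; the function name hydropathy_profile, the default window of 7 residues, and the example sequence are assumptions made for this sketch and are not taken from the specification.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every full window, moving one residue per step.

    The window size is an assumed parameter; the text does not fix it.
    """
    values = [KD[aa] for aa in sequence.upper()]
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical 18-residue sequence, used only to show the call.
profile = hydropathy_profile("MKTLLVAGDEKRSSIVLA", window=7)
```

Plotting profile against the starting residue number gives the kind of hydropathy curve referred to in the text; as noted, such a curve alone does not indicate an optimal epitope length.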
Accordingly, what is needed is a simple method to identify immunobiologically-active peptide epitopes and to determine their optimal length and their locations within a polypeptide.
In accordance with this invention there are provided methods for identifying immunobiologically-active linear peptide epitopes of a protein antigen and determining the optimal length of amino acid residues of the epitope.
TERMS
For purposes of this invention, the terms and expressions below, appearing in the specification and claims, are intended to have the following meanings:
"Window" as used herein means the number of amino acid residues in a curve segment.
"Lagging" as used herein means to move across the entire amino acid residues sequence increasing by one (1) in each step.
"Period number" as used herein means the number of amino acids assigned as the period between -180° to +180° in the negative cosine function plot.
"Fit-Correlation Value" as used herein means a numerical value which is indicative of the fit between the hydropathy plot curve and a negative cosine function wherein the value may be positive or negative depending on the fit. The better the fit the more positive the value.
"Epitope" as used herein means the portion of an antigen that binds specifically with the binding site of an antibody or a receptor on a lymphocyte.
"Potential Ho-Hi-Ho epitope" as used herein means an epitope wherein the curve segment of the hydrophilicity plot correlates with the negative cosine function giving a fit-correlation value.
"Potential Ho-Hi-Ho epitope set" as used herein means a set of epitopes having a positive fit-correlation value for a specific period assigned to the negative cosine curve.
"Ho-Hi-Ho theoretical epitopes" as used herein means the epitopes in the potential epitope set that have ranking values that exhibit the most oscillating behavior about an equilibrium position and either converge towards or diverge away from this equilibrium position and are deemed the most immunobiologically-active linear peptides.
"Number Range" as used herein means the numerated amino acid sequence number region of the amino acid sequence having a length equal to a period number, i.e. if the period is 10, then the sequence number ranges could be 1-10, 2-11, 3-12 and so on until the range beginning at residue (n-(m-1)), where n is equal to the number of amino acid residues in the entire polypeptide and m is the period number.
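The definitions of "Number Range" and "Period number" above may be illustrated by the following sketch. It enumerates the number ranges for a polypeptide of n residues and a period number m, and samples a negative cosine function over -180° to +180° at m points. The function names, and the choice to place the first and last sample exactly on -180° and +180°, are assumptions of this sketch.

```python
import math


def number_ranges(n, m):
    """1-based inclusive (start, end) ranges of length m over n residues.

    The last range begins at residue n - (m - 1).
    """
    return [(start, start + m - 1) for start in range(1, n - (m - 1) + 1)]


def negative_cosine(m):
    """Sample -cos(theta) at m evenly spaced angles from -180 to +180 degrees.

    Assumption: both end points are included, so m must be at least 2.
    """
    if m < 2:
        raise ValueError("period number must be at least 2")
    step = 2 * math.pi / (m - 1)
    return [-math.cos(-math.pi + k * step) for k in range(m)]


ranges = number_ranges(12, 10)   # [(1, 10), (2, 11), (3, 12)]
template = negative_cosine(5)    # approximately [1, 0, -1, 0, 1]
```

On a hydropathy scale where positive values are hydrophobic, this sampled curve is high at both ends and low in the middle, which corresponds to the hydrophobic-hydrophilic-hydrophobic pattern.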
Immune responses arise as a result of exposure to foreign stimuli. The compound that evokes the response is referred to as an antigen or as an immunogen. An immunogen is any agent capable of inducing an immune response. In contrast, an antigen is any agent capable of binding specifically to components of the immune response, such as lymphocytes and antibodies. The smallest unit of an antigen that is capable of binding with various immune components, either cells, such as T and B lymphocytes, or antibodies, is called an epitope. Compounds may have one or more epitopes capable of reacting with immune components. The methods of the present invention provide an in silico methodology for determining the antigen-binding site of an antibody or a receptor on a lymphocyte that has a unique structure that allows a complementary "fit" to some structural aspect of the specific antigen.
Thus understood, a primary object of the present invention is to provide a method for determining immunobiologically-active linear peptide epitopes and their optimal length.
Another object of the present invention is to identify immunobiologically-active linear peptide epitopes without the need for time consuming and expensive testing regimes to determine immunogenic activity, such as in vivo animal testing and/or in vitro assay testing.
A further object of this invention is to determine the immunopotency of an epitope and provide a ranking system delineating between dominant and subdominant epitopes.
A still further object of the invention is to provide monoclonal and polyclonal antibodies highly specific for the peptide epitopes of the present invention which may be utilized in detecting procedures to determine the presence of an antigen in a sample.
Yet another object of the present invention is to provide for synthetic peptides from a protein having the specific amino acid sequence and length determined by the methods herein that may be used in an immunization regime wherein the synthetic peptides are recognized by the body's immune system and induce production of immune components such as antibodies and/or immunocompetent cells, i.e. B and T cells that will react with the peptide or the entire protein.
Another object of the present invention is to provide a method to determine the optimal length of a peptide that binds to antibodies and/or immunocompetent cells.
Still another object is to provide for nucleic acid molecules encoding for the immunobiologically-active linear peptide epitopes having an optimal length found by the methods disclosed herein.
The foregoing objects are achieved by fitting a hydrophilicity and/or hydrophobicity plot generated for the amino acid linear sequence of a polypeptide to a mathematically generated continuous curve which has at least a maximum positive value thereby generating potential epitope sets which include ranked potential epitopes which contain a specific number of amino acid residues. These sets of ranked potential epitopes may be used to determine immunobiologically-active linear peptides by comparison methods, such as a comparison between the sets to determine the set exhibiting the greatest amount of oscillating behavior about an equilibrium position; comparing the ranked potential epitopes with other epitopes generated by propensity scales; comparing with a previously generated plot such as hydrophilicity, accessibility, hydrophobicity and the like; and/or combinations thereof. Preferably, the set of potential epitopes that exhibit the most alternating positioning about an equilibrium position when juxtaposed on the hydrophilicity and/or hydrophobicity plot are deemed the immunobiologically-active epitopes. Their optimal length corresponds to the specific number of amino acid residues in the set of ranked potential epitopes.
This invention relates to an improved method for determining the optimal length of an immunobiologically active epitope that does not require either in vivo animal testing or in vitro immunoassay testing regimes. Unexpectedly it has been discovered by this inventor that an alternating rhythmic pattern in the ranked potential epitopes provides the necessary information to determine the optimal length.
The method for determining the optimal length of an immunobiologically-active linear peptide epitope comprises the following steps:
a) providing a curve characterizing the hydrophilicity and/or hydrophobicity of the linear sequence of amino acid residues of a polypeptide;
b) generating at least one potential epitope set comprising at least one potential epitope by fitting a window of the curve of step (a) to a mathematically generated continuous curve, the continuous curve having repeating values at regular intervals with at least a maximum positive value, the window containing a specific number of amino acid residues and being lagged through the curve of step (a);
c) increasing the number of residues in the window after each lagging;
d) determining and ranking potential epitopes for each set by selecting potential epitopes having a positive fit-correlation value determined by fitting curves in step (b) thereby providing a set of ranked potential epitopes for each window of residues used in step (b), the potential epitope having the most positive fit-correlation value being ranked first in each potential epitope set;
e) examining the positioning of at least the highest ranked potential epitopes of each set relative to the plot of step (a) to determine at least one set of potential epitopes that exhibit alternating positioning about an equilibrium position wherein the ranking values of the potential epitopes converge towards or diverge away from the equilibrium position; and
f) designating the potential epitopes of the set having the most alternating ranking values that converge or diverge as the immunologically active epitopes which have an optimal length equating to numeric value of amino acid residues in the potential epitopes.
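Steps (e) and (f) above may be approximated computationally in more than one way. The following sketch shows one possible reading only: the sequence positions of the ranked potential epitopes of a set are taken in rank order, the equilibrium position is assumed to be the median of those positions, alternation is counted as the number of consecutive ranks lying on opposite sides of that median, and convergence or divergence is judged from the distances to the median. The function names, the use of the median, and the example positions are all assumptions of this sketch and are not specified in the method.

```python
from statistics import median


def alternation_score(positions):
    """Count consecutive ranked epitopes lying on opposite sides of the median.

    positions: sequence positions (for example, window midpoints) listed in
    rank order, highest ranked first. The median as equilibrium position is
    an assumption of this sketch.
    """
    centre = median(positions)
    offsets = [p - centre for p in positions]
    return sum(1 for a, b in zip(offsets, offsets[1:]) if a * b < 0)


def trend(positions):
    """Classify the distances to the median as converging, diverging or neither."""
    centre = median(positions)
    d = [abs(p - centre) for p in positions]
    pairs = list(zip(d, d[1:]))
    if all(a >= b for a, b in pairs) and d[0] > d[-1]:
        return "converging"
    if all(a <= b for a, b in pairs) and d[0] < d[-1]:
        return "diverging"
    return "neither"


# Hypothetical ranked positions for two potential epitope sets.
set_a = [10, 90, 20, 80, 30, 70]   # swings from side to side, moving inwards
set_b = [10, 20, 30, 70, 80, 90]   # crosses the median only once
best = max([set_a, set_b], key=alternation_score)
```

Under these assumptions, the set with the highest alternation score and a converging or diverging trend would be the candidate designated in step (f), and its window size would be taken as the optimal length.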
Preferably, the potential epitopes are generated by fitting a hydrophilicity curve generated by plotting hydropathy values according to the prediction method of Kyte-Doolittle and correlating this curve to a negative cosine function thereby generating Ho-Hi-Ho theoretical epitopes.
The method of the present invention may be used to determine the length of a contiguous amino acid sequence of a polypeptide characterized by a hydrophobic-hydrophilic-hydrophobic motif, the method comprising the steps of:
a) assigning an average hydropathy value to each amino acid of the polypeptide;
b) generating a hydrophilicity plot using the average hydropathy value of each amino acid;
c) fitting a curve segment of the hydrophilicity plot to a negative cosine function, wherein a specific period number value of the negative cosine function equates to the number of amino acids in the curve segment, the period number increasing within a predetermined chosen period number range after each sequential lagging through the hydrophilicity plot thereby providing fit-correlation values for each curve segment across the linear sequence when using the specific period number value;
d) generating a potential Ho-Hi-Ho epitope set for each specific period number value within the chosen period number range, wherein each potential Ho-Hi-Ho epitope set contains potential Ho-Hi-Ho epitopes that have a fit-correlation value;
e) ranking each potential Ho-Hi-Ho epitope in the potential Ho-Hi-Ho epitope set according to positive fit-correlation values wherein the epitope having highest positive fit correlation value is ranked number one thereby providing ranked Ho-Hi-Ho potential epitopes for each specific period number value;
f) examining the positioning of at least the highest ranked Ho-Hi-Ho potential epitopes of each set relative to the linear sequence of the plot generated in step (b) to determine at least one set of Ho-Hi-Ho potential epitopes that exhibits alternating positioning about an equilibrium position wherein the ranking values of the Ho-Hi-Ho potential epitopes converge towards or diverge away from the equilibrium position; and
g) designating the Ho-Hi-Ho potential epitopes of the set having the most alternating ranking values that converge or diverge as the immunologically active epitopes which have an optimal length equating to numeric value of amino acid residues in the potential epitopes.
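Steps (c) through (e) above may be sketched as follows. The text does not state how the fit-correlation value is computed; this sketch assumes a Pearson correlation coefficient between each curve segment and a negative cosine sampled over -180° to +180°, which is positive for a good fit and negative for a poor one. The function names and the example values are hypothetical, and each period number is assumed to be at least 3.

```python
import math


def neg_cos(m):
    """-cos(theta) at m evenly spaced angles from -180 to +180 degrees (m >= 3)."""
    step = 2 * math.pi / (m - 1)
    return [-math.cos(-math.pi + k * step) for k in range(m)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is flat."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def ranked_epitope_sets(values, periods):
    """For each period number, lag a window through the hydropathy values,
    keep the segments with a positive fit and rank them, best fit first.

    Returns {period: [(first_residue, last_residue, fit), ...]} using
    1-based residue numbers.
    """
    sets = {}
    for m in periods:
        template = neg_cos(m)
        fits = []
        for start in range(len(values) - m + 1):
            r = pearson(values[start:start + m], template)
            if r > 0:
                fits.append((start + 1, start + m, r))
        fits.sort(key=lambda item: item[2], reverse=True)
        sets[m] = fits
    return sets


# Hypothetical per-residue hydropathy values, for illustration only.
example = [2.0, 0.0, -2.0, 0.0, 2.0, 0.0, 0.0, 0.0]
sets = ranked_epitope_sets(example, periods=range(3, 6))
top = sets[5][0]   # residues 1-5 match the template most closely
```

Each entry of sets corresponds to one potential Ho-Hi-Ho epitope set; the examination of alternating positioning in steps (f) and (g) is not covered by this sketch.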
The present invention further provides for a Ho-Hi-Ho epitope of contiguous amino acid residues from a polypeptide wherein the Ho-Hi-Ho epitope is defined by a motif of two hydrophobic and one hydrophilic regions arranged in the following manner
hydrophobic-hydrophilic-hydrophobic
and characterized by an approximated -180° to +180° negative cosine hydrophilicity pattern wherein said Ho-Hi-Ho epitope peptide has an optimal length of amino acid residues from about 3 to about 250. The optimal length of amino acid residues is determined by the methods of the present invention.
Also provided is an antiserum specific for a Ho-Hi-Ho epitope of contiguous amino acid residues from a polypeptide wherein the Ho-Hi-Ho epitope is characterized by a hydrophobic-hydrophilic-hydrophobic motif and an approximated -180° to +180° negative cosine hydrophilicity pattern having an optimal length of amino acid residues from about 3 to about 250. Additionally, the optimal length may be determined by the method disclosed in the present invention.
There is also provided an antigenic composition comprising a Ho-Hi-Ho epitope of contiguous amino acid residues from a polypeptide wherein the Ho-Hi-Ho epitope is characterized by a hydrophobic-hydrophilic-hydrophobic motif and an approximated -180° to +180° negative cosine hydrophilicity pattern having an optimal length of amino acid residues from about 3 to about 250.
Additionally, the optimal length may be determined by the method disclosed in the present invention.
Still further provided is a diagnostic testing method comprising the steps of:
(i) providing a sample;
(ii) contacting the sample with antisera specific for a Ho-Hi-Ho epitope of contiguous amino acid residues from a polypeptide wherein the Ho-Hi-Ho epitope is characterized by a hydrophobic-hydrophilic-hydrophobic motif having an optimal length of amino acid residues from about 3 to about 250 determined by the methods of the present invention; and
(iii) detecting binding of the antisera to a polypeptide in the sample.
Also provided is a diagnostic testing method comprising the steps of:
(i) providing an antisera sample;
(ii) contacting said antisera sample with at least one Ho-Hi-Ho epitope having an optimal length determined by the present methods; and
(iii) detecting the binding of said Ho-Hi-Ho epitope to said antisera sample.
Alternatively, the above diagnostic testing method may include a tissue sample which may be contacted with at least one Ho-Hi-Ho epitope.
The present invention also provides for isolated nucleic acid molecules that encode the Ho-Hi-Ho immunobiologically active epitope having an optimal length determined by the methods of the present invention. The nucleic acid molecule may include a cDNA molecule comprising the nucleotide sequence of the coding region of the epitope, or an isolated DNA or RNA molecule, or a genetic variant thereof, which encodes the immunobiologically active epitope.