The present invention provides novel polynucleotides and proteins encoded by such polynucleotides, along with uses for these polynucleotides and proteins, for example in therapeutic, diagnostic and research methods. In particular, the invention relates to a novel human stem cell growth factor-like protein.
Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past decade. The now routine hybridization cloning and expression cloning techniques clone novel polynucleotides “directly” in the sense that they rely on information directly related to the discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of hybridization cloning; activity of the protein in the case of expression cloning). More recent “indirect” cloning techniques such as signal sequence cloning, which isolates DNA sequences based on the presence of a now well-recognized secretory leader sequence motif, as well as various PCR-based or low stringency hybridization-based cloning techniques, have advanced the state of the art by making available large numbers of DNA/amino acid sequences for proteins that are known to have biological activity, for example, by virtue of their secreted nature in the case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based techniques, or by virtue of structural similarity to other genes of known biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications in, for example, diagnostics, forensics, gene mapping, identification of mutations responsible for genetic disorders or other traits, assessment of biodiversity, and production of many other types of data and products dependent on DNA and amino acid sequences.
The compositions of the present invention include novel isolated polypeptides, novel isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more epitopes present on such polypeptides, as well as hybridomas producing such antibodies. Specifically, the polynucleotides of the present invention are based on polynucleotides isolated from cDNA libraries prepared from human fetal liver spleen (Hyseq clone identification number 6118092), ovary (Hyseq clone identification number 8375786), adult brain (Hyseq clone identification numbers 701734, 15327638, 15741682, 15954941, 15955015), lung tumor (Hyseq clone identification numbers 11047146 and 10280328), spinal cord (Hyseq clone identification number 10102150), cervix (Hyseq clone identification numbers 10022437 and 14029194), ovary (Hyseq clone identification number 8319153), endothelial cells (Hyseq clone identification number 13815744), umbilical cord (Hyseq clone identification number 18568149), lymphocyte (Hyseq clone identification number 10257378), lung fibroblast (Hyseq clone identification number 17116257), fetal brain (Hyseq clone identification number 15266959), and testis.
Using Hyseq's sequencing by hybridization signature analysis, very closely related polynucleotides are expected to be isolated from human fetal liver-spleen (Hyseq clone identification numbers 6118092, 6118141, 324694, 139790, 388618), stomach (Hyseq clone identification number 11423449), endothelial cells (Hyseq clone identification numbers 13773559, 13815744, 13841093), adult brain (Hyseq clone identification numbers 737767, 701734, 16127344, 15198141, 15208858, 15554838, 15946615, 15296366, 15321434, 15741682, 15841267, 15855073, 15726537, 15955015, 15327638, 15954941, 16344372), bone marrow (Hyseq clone identification numbers 114762120625288, 20798194, 16463779), adult kidney (Hyseq clone identification numbers 2405528 and 2305428), adult spleen (Hyseq clone identification numbers 2972973, 2956887, 14377989, 14476605, 14417776, 14541649), ovary (Hyseq clone identification numbers 7634122, 8319153, 8494602, 8265358, 8375786), lung tumor (Hyseq clone identification numbers 11047146, 7760706, 7774431, 9236436, 10280328, 11000820), leukocytes (Hyseq clone identification numbers 2251685 and 2357232), adult lung (Hyseq clone identification number 3394875), adrenal gland (Hyseq clone identification number 14066103), fetal lung (Hyseq clone identification numbers 15521916 and 11902971), thyroid gland (Hyseq clone identification number 10080227), fetal skin (Hyseq clone identification numbers 17941214, 18028270, 18060622, 18189205, 20576265), small intestine (Hyseq clone identification numbers 18431269 and 18356960), fetal muscle (Hyseq clone identification number 20887519), fetal kidney (Hyseq clone identification number 21990692), spinal cord (Hyseq clone identification numbers 9923443 and 10102150), thymus (Hyseq clone identification number 14992102), fetal brain (Hyseq clone identification number 15266959), cervix (Hyseq clone identification numbers 14029194, 14244274, 10022437), fetal heart (Hyseq clone identification number 21913716), umbilical cord (Hyseq clone identification number 18568149), lymphocyte (Hyseq clone identification number 10257378), and lung fibroblast (Hyseq clone identification number 17116257).
The compositions of the present invention additionally include vectors, including expression vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such polynucleotides and cells genetically engineered to express such polynucleotides.
The isolated polynucleotides of the invention include, but are not limited to, a polynucleotide comprising any one of the nucleotide sequences set forth in the SEQ ID NO: 1-22 and SEQ ID NO: 24; a polynucleotide comprising any of the full length protein coding sequences of the SEQ ID NO: 1-22 and SEQ ID NO: 24; and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding sequences of the SEQ ID NO: 1-22 and SEQ ID NO: 24. The polynucleotides of the present invention also include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set forth in the SEQ ID NO: 1-22 and SEQ ID NO: 24; (b) a nucleotide sequence encoding any one of SEQ ID NO: 23 or 25 or the amino acid sequences set forth in Table A; a polynucleotide which is an allelic variant of any polynucleotides recited above; a polynucleotide which encodes a species homolog (e.g. orthologs) of any of the proteins recited above; or a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any of the polypeptides comprising SEQ ID NO: 23 or 25 or set forth in Table A.
The nucleic acid sequences of the present invention also include the sequence information from the nucleic acid sequences of SEQ ID NO: 1-22 and SEQ ID NO: 24. The sequence information can be a segment of any one of SEQ ID NO: 1-22 and SEQ ID NO: 24 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-22 and SEQ ID NO: 24. One such segment can be a twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300. In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these segments are used in arrays for expression studies, fifteen-mer segments can be used. The probability that a fifteen-mer is fully matched in the expressed sequences is also approximately one in five because expressed sequences in one tissue comprise approximately 5% of the entire genome sequence.
Similarly, when using sequence information for detecting a single mismatch, a segment can be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome with a single mismatch is calculated by multiplying the probability for a full match (1÷4^25) times the increased probability for mismatch at each nucleotide position (3×25). The probability that an eighteen mer with a single mismatch can be detected in an array for expression studies is approximately one in five. The probability that a twenty-mer with a single mismatch can be detected in a human genome is approximately one in five.
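The arithmetic in the two preceding paragraphs can be reproduced with a short calculation. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it assumes, as the text does, a random-sequence model with three billion base pairs per set of chromosomes and expressed sequences making up about 5% of the genome. Under those assumptions the results land in the same range as the rounded "1 in 300" and "one in five" figures given above.

```python
# Illustrative sketch; names are hypothetical and not part of the specification.
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # expressed sequences in one tissue, ~5% of the genome (from the text)


def expected_full_matches(k, target_bp=GENOME_BP):
    """Expected chance full matches of one k-mer among target_bp positions.

    Assumes each of the 4**k possible k-mers is equally likely.
    """
    return target_bp / 4 ** k


def expected_single_mismatch_matches(k, target_bp=GENOME_BP):
    """Expected chance matches that differ from the k-mer at exactly one position.

    There are 3 * k single-base variants of a k-mer (3 alternatives at each of
    k positions), which is the (3 x k) factor applied to 1 / 4**k in the text.
    """
    return 3 * k * target_bp / 4 ** k


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    print("20-mer, genome, full match:     1 in %.1f" % (1 / expected_full_matches(20)))
    print("17-mer, genome, full match:     1 in %.1f" % (1 / expected_full_matches(17)))
    print("15-mer, expressed, full match:  1 in %.1f" % (1 / expected_full_matches(15, expressed_bp)))
    print("20-mer, genome, one mismatch:   1 in %.1f" % (1 / expected_single_mismatch_matches(20)))
    print("18-mer, expressed, one mismatch: 1 in %.1f" % (1 / expected_single_mismatch_matches(18, expressed_bp)))
```

The values are order-of-magnitude estimates; real genomes are not random sequences, so repeats and base composition will shift the actual match frequencies.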
A collection as used in this application can be a collection of only one polynucleotide. The collection of sequence information or unique identifying information of each sequence can be provided on a nucleic acid array. In one embodiment, segments of sequence information are provided on a nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed to detect full-match or mismatch to the polynucleotide that contains the segment. The collection can also be provided in a computer-readable format.
This invention also includes the reverse or direct complement of any of the nucleic acid sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and host cells or organisms transformed with these expression vectors.
One stem cell growth factor-like polypeptide (SEQ ID NO: 23) is a protein of approximately 392 amino acids with a predicted molecular mass of approximately 44 kDa unglycosylated. SEQ ID NO: 23 is encoded by SEQ ID NO: 24. FIG. 1 shows the alignment of polynucleotide SEQ ID NO: 24 and EST sequences SEQ ID NO: 1-21. SEQ ID NO: 25 is also expected to have a transmembrane portion at approximately LHAGLIVGILILVLFVATAILVTVYMYH (amino acid residues 315 to 342 of SEQ ID NO: 25 or SEQ ID NO: 23). The sequences of the present invention (SEQ ID NO: 1-25 and as set forth in Table A) are expected to have stem cell growth factor activity, including hematopoietic stem cell growth factor activity, as described herein. Other uses of the polypeptides and polynucleotides of the present invention are also contemplated and are fully described below.
SEQ ID NO: 24 is a complement of SEQ ID NO: 22. The polypeptides of the present invention also include the six frame translation of SEQ ID NO: 24 as set forth below in Table A, where A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, *=stop codon.
We prefer forward read Frame 2, and most prefer SEQ ID NO: 23 and 25.
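As an illustration of what a six frame translation involves, the following Python sketch translates a DNA string in the three forward and three reverse-complement reading frames using the standard genetic code, with `*` marking stop codons as in the legend above. It is a minimal sketch, not the procedure used to generate Table A: the function names are hypothetical, the frame numbering (frame 1 starting at the first base, frame 2 at the second) is an assumption, and the short input in the usage note is a made-up example rather than any sequence of the invention.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string containing A, C, G, T."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the first base; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three forward frames and three reverse-complement frames.

    Frame numbering is an assumption of this sketch: frame N starts at base N.
    """
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames["forward %d" % (offset + 1)] = translate(dna[offset:])
        frames["reverse %d" % (offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

For the made-up input `ATGGCCTAA`, forward frame 1 reads `MA*` (Met, Ala, stop); trailing bases that do not fill a complete codon are dropped.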
Stem cell growth factor-like protein and/or its fragments or derivatives are expected to have activity similar to that of stem cell growth factors and anabolic growth factors and receptors.
The isolated polypeptides of the invention include, but are not limited to, a polypeptide comprising SEQ ID NO: 23 and 25 or those set forth in Table A; or the corresponding full length or mature protein. Polypeptides of the invention also include polypeptides with biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in the SEQ ID NO: 1-22 and SEQ ID NO: 24; or (b) polynucleotides that hybridize to the complement of the polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically active variants of any of the protein sequences listed as SEQ ID NO: 23 and 25 and in Table A, and “substantial equivalents” thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological activity are also contemplated. The polypeptides of the invention may be wholly or partially chemically synthesized but are preferably produced by recombinant means using the genetically engineered cells (e.g. host cells) of the invention.
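The percent amino acid sequence identity thresholds mentioned above can be illustrated with a simple calculation over two sequences that have already been aligned. The Python sketch below is an assumption-laden illustration, not a method defined in this specification: the function name is hypothetical, gaps are assumed to be written as `-`, the denominator is taken as the number of aligned columns, and the input strings are made up. Other conventions for the denominator exist and give different percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumes both strings come from the same alignment (equal length, gaps
    written as `gap`). Columns where both sequences have a gap are ignored;
    a column counts as a match only when both residues are identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Made-up five-residue example: four of five aligned columns are identical.
    print(percent_identity("MKTAY", "MKSAY"))
```

Under this convention, a variant would meet an "at least about 80%" threshold if at least four of every five aligned columns carry identical residues.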
The invention also provides compositions comprising a polypeptide of the invention. Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The invention also provides host cells transformed or transfected with a polynucleotide of the invention.
The invention also relates to methods for producing a polypeptide of the invention comprising growing a culture of the host cells of the invention in a suitable culture medium under conditions permitting expression of the desired polypeptide, and purifying the protein from the culture or from the host cells. Preferred embodiments include those in which the protein produced by such process is a mature form of the protein.
Polynucleotides according to the invention have numerous applications in a variety of techniques known to those skilled in the art of molecular biology. These techniques include use as hybridization probes, use as oligomers, or primers, for PCR, use in an array, use in computer-readable media, use for chromosome and gene mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.
In other exemplary embodiments, the polynucleotides are used in diagnostics as expressed sequence tags for identifying expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human genome.
A polynucleotide according to the invention can be joined to any of a variety of other nucleotide sequences by well-established recombinant DNA techniques (see Sambrook, J., et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y.). Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the art. Accordingly, the invention also provides a vector including a polynucleotide of the invention and a host cell containing the polynucleotide. In general, the vector contains an origin of replication functional in at least one organism, convenient restriction endonuclease sites, and a selectable marker for the host cell. Vectors according to the invention include expression vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a multicellular organism.
The polypeptides according to the invention can be used in a variety of conventional procedures and methods that are currently applied to other proteins. For example, a polypeptide of the invention can be used to generate an antibody that specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight markers, and as a food supplement.
Methods are also provided for preventing, treating, or ameliorating a medical condition which comprises the step of administering to a mammalian subject a therapeutically effective amount of a composition comprising a protein of the present invention and a pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be utilized, for example, as part of methods for the prevention and/or treatment of disorders involving aberrant protein expression or biological activity.
The invention also provides methods for the treatment of disorders as recited herein which may involve the administration of the polynucleotides or polypeptides of the invention to individuals exhibiting symptoms or tendencies related to disorders as recited herein. In addition, the invention encompasses methods for treating diseases or disorders as recited herein comprising the step of administering compounds and other substances that modulate the overall activity of the target gene products. Compounds and other substances can effect such modulation either on the level of target gene/protein expression or target protein activity. Specifically, methods are provided for preventing, treating or ameliorating a medical condition, including neurological diseases, which comprises administering to a mammalian subject, including but not limited to humans, a therapeutically effective amount of a composition comprising a polypeptide of the invention or a therapeutically effective amount of a composition comprising a binding partner of (e.g., antibody specifically reactive for) stem cell growth factor-like polypeptides of the invention. The mechanics of the particular condition or pathology will dictate whether the polypeptides of the invention or binding partners (or inhibitors) of these would be beneficial to the individual in need of treatment.
The invention also provides a method of promoting wound healing comprising administering a stem cell growth factor-like polypeptide of the present invention to the site of a wound or injury. The invention provides a method of promoting cell growth and morphogenesis comprising administering a stem cell growth factor-like polypeptide of the present invention to a medium of nerve cells. According to this method, polypeptides of the invention can be administered to produce an in vitro or in vivo promotion of cellular function. A polypeptide of the invention can be administered in vivo as a stem cell growth factor alone or as an adjunct to other therapies.
The invention further provides methods for manufacturing medicaments useful in the above described methods.
The present invention further relates to methods for detecting the presence of the polynucleotides or polypeptides of the invention in a sample (e.g., tissue or sample). Such methods can, for example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the identification of subjects exhibiting a predisposition to such conditions. The invention also provides kits comprising polynucleotide probes and/or monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs, and monitoring the progress of patients, involved in clinical trials for the treatment of disorders as recited above.
The invention also provides methods for the identification of compounds that modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides of the invention. Such methods can be utilized, for example, for the identification of compounds that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are not limited to, assays for identifying compounds and other substances that interact with (e.g., bind to) the polypeptides of the invention.