This invention relates to the field of neurobiology. In particular this invention pertains to the identification of a number of novel glutamate transporters.
Excitatory neurotransmission involves the exocytotic release of synaptic vesicles filled with glutamate. Glutamate is synthesized in the cytoplasm and undergoes transport into synaptic vesicles for quantal release. Like the uptake of other classical transmitters, vesicular glutamate transport depends on a proton electrochemical gradient (ΔμH+) generated by the vacuolar H+-ATPase (Disbrow et al. (1982) Biochemical and Biophysical Res. Commun., 108: 1221-1227; Naito and Ueda (1983) J. Biol. Chem. 258: 696-699). However, unlike the uptake of monoamines and acetylcholine, vesicular glutamate transport relies predominantly on the electrical component of this gradient (ΔΨ) rather than the chemical component (ΔpH) (Carlson et al. (1989) J. Biol. Chem. 264: 7369-7376; Maycox et al. (1988) J. Biol. Chem. 263: 15423-15428). Consistent with this different mechanism, the two protein families responsible for vesicular uptake of monoamines, ACh and γ-aminobutyric acid (GABA) (Liu and Edwards (1997) Ann. Rev. Neurosci. 20: 125-156; Reimer et al. (1998) Curr. Opin. Neurobiol. 8: 405-412; Schuldiner et al. (1995) Physiol. Rev. 75: 369-392; Varoqui et al. (1994) FEBS Lett. 342: 97-102) have not been found to include a glutamate transporter.
This invention pertains to the identification of a family of novel glutamate transporters. In particular, certain brain-specific Na+-dependent phosphate transporters are shown to be glutamate transporters. Designated herein as VGLUT glutamate transporters, members of this family include, but are not limited to, VGLUT1 (formerly BNPI), VGLUT2 (formerly DNPI), and VGLUT3.
The VGLUT transporters of this invention provide good targets to screen for agents that modulate (e.g. upregulate or downregulate) glutamate uptake by a cell (e.g. by a neuron). Thus, in one embodiment, this invention provides a method of screening for an agent that modulates the uptake of glutamate into a cell (e.g. into a synaptic vesicle). The method preferably involves contacting a cell comprising a nucleic acid selected from the group consisting of VGLUT1, VGLUT2, and VGLUT3 with a test agent; and detecting expression or activity of VGLUT1, VGLUT2, or VGLUT3, where an increase or decrease in the expression or activity of VGLUT1, VGLUT2, or VGLUT3 as compared to a control indicates that the test agent modulates the uptake of glutamate into a cell. The control can be a positive or a negative control. In certain embodiments, the control is a negative control comprising contacting a cell with the test agent at a lower concentration or in the absence of the test agent. Preferred cells include somatic cells or oocytes. Particularly preferred cells include vertebrate cells, more preferably mammalian (e.g. human, rabbit, mouse, goat, equine, porcine, feline, canine, etc.) cells.
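The comparison against a negative control described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_agent`, the `ScreenResult` record, and the 1.5-fold cutoff are assumptions introduced here and are not part of the disclosure, which does not specify how large an increase or decrease must be.

```python
from dataclasses import dataclass


@dataclass
class ScreenResult:
    agent: str
    fold_change: float
    call: str


def classify_agent(agent: str, treated: float, control: float,
                   threshold: float = 1.5) -> ScreenResult:
    """Compare a VGLUT expression or activity signal from treated cells
    with the signal from a negative control (no test agent).

    `threshold` is a hypothetical fold-change cutoff chosen for illustration.
    """
    if control <= 0:
        raise ValueError("control signal must be positive")
    if threshold <= 1:
        raise ValueError("threshold must be greater than 1")
    fold = treated / control
    if fold >= threshold:
        call = "upregulates"
    elif fold <= 1 / threshold:
        call = "downregulates"
    else:
        call = "no effect detected"
    return ScreenResult(agent, fold, call)


# Example with made-up signal values (arbitrary units).
for name, signal in [("agent-A", 300.0), ("agent-B", 40.0), ("agent-C", 110.0)]:
    print(classify_agent(name, signal, control=100.0))
```

In practice the cutoff would be set from replicate variability rather than fixed in advance.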
In certain preferred embodiments, the detecting comprises detecting a VGLUT (e.g. VGLUT1 and/or VGLUT2, and/or VGLUT3, etc.) nucleic acid and/or a VGLUT polypeptide (e.g. VGLUT1 polypeptide and/or VGLUT2 polypeptide, and/or VGLUT3 polypeptide, etc.). In certain embodiments, the VGLUT nucleic acid is detected via a nucleic acid hybridization (e.g., a Northern blot, a Southern blot using DNA derived from the VGLUT1, VGLUT2, or VGLUT3 mRNA, an array hybridization, an affinity chromatography, an in situ hybridization, etc.) and/or a nucleic acid amplification (e.g. PCR, LCR, etc.).
In preferred embodiments, the VGLUT polypeptide is detected via a method such as capillary electrophoresis, Western blot, mass spectroscopy, ELISA, immunochromatography, immunohistochemistry, thin layer chromatography (TLC), and the like. In preferred embodiments, detecting VGLUT polypeptide activity involves detecting glutamate transport in a cell expressing an endogenous or a heterologous VGLUT polypeptide (e.g., VGLUT1, VGLUT2, VGLUT3, etc.). In certain embodiments, the test agent is not one or more of the following: an antibody, a nucleic acid, a protein, and an agent that alters ΔpH or ΔΨ. In particular embodiments the test agent is a small organic molecule. In certain embodiments, the methods further comprise comparing the level of expression or activity of VGLUT1 with the level of expression or activity of VGLUT2 and/or VGLUT3.
In another embodiment, this invention provides a method of prescreening for a potential modulator of glutamate transporter activity (e.g. glutamate uptake into a synaptic vesicle). The method preferably involves contacting a VGLUT glutamate transporter polypeptide (e.g. VGLUT1, VGLUT2, VGLUT3, etc.) or a nucleic acid encoding a VGLUT glutamate transporter polypeptide with a test agent; and detecting binding (e.g. specific binding) of the test agent to the VGLUT glutamate transporter polypeptide or to the nucleic acid encoding a VGLUT glutamate transporter polypeptide, where specific binding of said test agent to the VGLUT glutamate transporter polypeptide or VGLUT nucleic acid indicates that the test agent is a potential modulator of glutamate transporter activity. The method can, optionally, further involve recording test agents that specifically bind to the VGLUT glutamate transporter polypeptide or to the nucleic acid encoding a VGLUT glutamate transporter polypeptide in a database of candidate modulators of glutamate transporter activity. In certain embodiments, the test agent is not one or more of the following: an antibody, a nucleic acid, a protein, and an agent that alters ΔpH or ΔΨ. In particular embodiments the test agent is a small organic molecule. The detecting can involve detecting specific binding of the test agent to the VGLUT nucleic acid (e.g. via Northern blot, a Southern blot using DNA derived from the VGLUT mRNA, array hybridization, affinity chromatography, in situ hybridization, etc.). The detecting can also involve detecting specific binding of the test agent to the VGLUT glutamate transporter polypeptide (e.g. via capillary electrophoresis, Western blot, mass spectroscopy, ELISA, immunochromatography, thin layer chromatography, and immunohistochemistry).
In certain embodiments, the test agent is contacted directly to the VGLUT glutamate transporter polypeptide or to the nucleic acid encoding a VGLUT glutamate transporter polypeptide. In certain embodiments, the test agent is contacted to a cell containing the VGLUT glutamate transporter polypeptide or said nucleic acid encoding a VGLUT glutamate transporter polypeptide. The cell can be a cell cultured ex vivo.
In still another embodiment, this invention provides a cell comprising a heterologous nucleic acid encoding a glutamate transporter wherein said glutamate transporter is selected from the group consisting of VGLUT1, VGLUT2, and VGLUT3. Preferred cells include somatic cells (e.g. nerve cells), or oocytes. Particularly preferred cells include vertebrate cells, more preferably mammalian (e.g. human, rabbit, mouse, goat, equine, porcine, feline, canine, etc.) cells. In a particularly preferred embodiment, the cell transports glutamate via the heterologous VGLUT glutamate transporter. In one embodiment, the cell is a pheochromocytoma PC12 cell.
In yet another embodiment, this invention provides a method of increasing glutamate transport by a mammalian cell. The method can involve transfecting the cell with a nucleic acid encoding a VGLUT polypeptide selected from the group consisting of VGLUT1, VGLUT2, and VGLUT3. The VGLUT nucleic acid is preferably operably linked to a constitutive, tissue-specific or inducible promoter.
This invention also provides a method of decreasing glutamate uptake into a cell. The method involves downregulating expression or activity of a VGLUT polypeptide in the cell. In certain embodiments, the downregulating comprises a method selected from the group consisting of contacting a VGLUT nucleic acid with a ribozyme that specifically cleaves said VGLUT nucleic acid, contacting a VGLUT nucleic acid with a catalytic DNA that specifically cleaves said VGLUT nucleic acid, transfecting a cell comprising a VGLUT gene with a nucleic acid that inactivates the VGLUT gene by homologous recombination with the VGLUT gene, transfecting a cell with a nucleic acid encoding an intrabody that specifically binds a VGLUT polypeptide, and transfecting the cell with a VGLUT antisense molecule.
This invention also provides a kit for screening for compounds that modulate glutamate transport. Preferred kits include a cell that expresses a VGLUT glutamate transporter selected from the group consisting of VGLUT1, VGLUT2, and VGLUT3; and a detection moiety selected from the group consisting of an antibody that specifically binds to the VGLUT glutamate transporter, a nucleic acid that specifically binds to a nucleic acid encoding the VGLUT glutamate transporter, a primer that specifically amplifies a nucleic acid encoding said VGLUT glutamate transporter or a fragment thereof, and a labeled glutamate. The cell is preferably a cell comprising a heterologous nucleic acid encoding the glutamate transporter. The kit can also include instructional materials providing protocols for screening for modulators of a VGLUT glutamate transporter and teaching that such modulators alter glutamate transport.
This invention also provides VGLUT knockout animals. Preferred knockouts include a mammal (e.g., an equine, a bovine, a rodent, a porcine, a lagomorph, a feline, a canine, a murine, a caprine, an ovine, a non-human primate, etc.) comprising a disruption in an endogenous glutamate transporter gene selected from the group consisting of VGLUT1, VGLUT2, and VGLUT3, where the disruption results in the knockout mammal exhibiting decreased expression of a VGLUT glutamate transporter as compared to a wild-type animal. In certain embodiments, the disruption is an insertion, a deletion, a frameshift mutation, a substitution, or a stop codon. In certain embodiments, the disruption comprises an insertion of an expression cassette into said endogenous glutamate transporter gene. The expression cassette can comprise a selectable marker. The expression cassette can comprise a neomycin phosphotransferase gene operably linked to at least one regulatory element. The disruption can be in a somatic cell and/or in a germ cell. The mammal can be heterozygous or homozygous for the disrupted glutamate transporter gene.
This invention also provides a method of inhibiting glutamate uptake into a cell. The method can comprise contacting a cell comprising a synaptic vesicle with an agent that inhibits expression or activity of a VGLUT polypeptide. In certain embodiments, the agent is not one or more of the following: an antibody, a nucleic acid, a protein, and an agent that alters ΔpH or ΔΨ. In particular embodiments the agent is a small organic molecule, a VGLUT antisense molecule, a VGLUT ribozyme, a VGLUT catalytic DNA, an anti-VGLUT antibody, or a nucleic acid that disrupts a VGLUT gene by homologous recombination.
Also provided is a method of increasing glutamate uptake into a cell, where the method comprises contacting the cell with an agent that increases VGLUT glutamate transporter expression or activity (e.g. a vector encoding a heterologous VGLUT glutamate transporter).
The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms also include variants on the traditional peptide linkage joining the amino acids making up the polypeptide.
The terms "nucleic acid" or "oligonucleotide" or grammatical equivalents herein refer to at least two nucleotides covalently linked together. A nucleic acid of the present invention is preferably single-stranded or double-stranded and will generally contain phosphodiester bonds, although in some cases, as outlined below, nucleic acid analogs are included that may have alternate backbones, comprising, for example, phosphoramide (Beaucage et al. (1993) Tetrahedron 49(10):1925 and references therein; Letsinger (1970) J. Org. Chem. 35:3800; Sprinzl et al. (1977) Eur. J. Biochem. 81: 579; Letsinger et al. (1986) Nucl. Acids Res. 14: 3487; Sawai et al. (1984) Chem. Lett. 805; Letsinger et al. (1988) J. Am. Chem. Soc. 110: 4470; and Pauwels et al. (1986) Chemica Scripta 26: 1419), phosphorothioate (Mag et al. (1991) Nucleic Acids Res. 19:1437; and U.S. Pat. No. 5,644,048), phosphorodithioate (Briu et al. (1989) J. Am. Chem. Soc. 111:2321), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and linkages (see Egholm (1992) J. Am. Chem. Soc. 114:1895; Meier et al. (1992) Chem. Int. Ed. Engl. 31: 1008; Nielsen (1993) Nature, 365: 566; Carlsson et al. (1996) Nature 380: 207). Other analog nucleic acids include those with positive backbones (Denpcy et al. (1995) Proc. Natl. Acad. Sci. USA 92: 6097); non-ionic backbones (U.S. Pat. Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Angew (1991) Chem. Intl. Ed. English 30: 423; Letsinger et al. (1988) J. Am. Chem. Soc. 110:4470; Letsinger et al. (1994) Nucleoside and Nucleotide 13:1597; Chapters 2 and 3, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Sanghui and Cook; Mesmaeker et al. (1994), Bioorganic and Medicinal Chem. Lett. 4: 395; Jeffs et al. (1994) J. Biomolecular NMR 34:17; Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Sanghui and Cook. Nucleic acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids (see Jenkins et al. (1995), Chem. Soc. Rev. pp. 169-176). Several nucleic acid analogs are described in Rawls, C and E News Jun. 2, 1997 page 35. These modifications of the ribose-phosphate backbone may be done to facilitate the addition of additional moieties such as labels, or to increase the stability and half-life of such molecules in physiological environments.
A "VGLUT transporter" refers to a member of a glutamate transporter family characterized by VGLUT1, VGLUT2, and VGLUT3. The VGLUT glutamate transporters belong to a larger family known as the type I phosphate transporters. However, particularly in view of the teachings provided herein, it is demonstrated that members of this family transport organic anions (such as sialic acid and glutamate) rather than inorganic phosphate. Within this family, the VGLUTs show much stronger sequence similarity to each other (greater than 50% amino acid identity from C. elegans to mammals and greater than 80% within mammals) than to other type I phosphate transporters such as sialin and NaPi-1 (35-45% amino acid identity). Thus, preferred VGLUT glutamate transporters of this invention show 50% or greater amino acid sequence identity, preferably 65% or greater amino acid sequence identity, more preferably 80% or greater amino acid sequence identity, still more preferably 90% or greater amino acid sequence identity, and most preferably 95% or greater amino acid sequence identity, to VGLUT1 and/or to VGLUT2 and/or to VGLUT3.
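The tiered identity thresholds in this definition can be expressed as a simple lookup. The following Python sketch is illustrative only; the function name `highest_identity_threshold` is an assumption, and the input is taken to be a percent amino acid identity already computed against VGLUT1, VGLUT2, or VGLUT3.

```python
from typing import Optional

# Thresholds named in the definition, from most to least stringent.
IDENTITY_THRESHOLDS = (95, 90, 80, 65, 50)


def highest_identity_threshold(percent_identity: float) -> Optional[int]:
    """Return the most stringent threshold met, or None if below 50%."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    for cutoff in IDENTITY_THRESHOLDS:
        if percent_identity >= cutoff:
            return cutoff
    return None


print(highest_identity_threshold(82.5))  # meets the 80% tier
print(highest_identity_threshold(40.0))  # in the 35-45% range cited for sialin and NaPi-1
```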
The term "VGLUT nucleic acid" refers to a nucleic acid encoding a VGLUT polypeptide (glutamate transporter) or to a nucleic acid derived therefrom. Thus, VGLUT nucleic acids include, but are not limited to, various VGLUT genes (e.g. VGLUT1, VGLUT2, and VGLUT3), a VGLUT RNA (e.g. VGLUT1 RNA, VGLUT2 RNA, and VGLUT3 RNA), a VGLUT cDNA, a VGLUT cRNA, and the like.
A "VGLUT1 nucleic acid" is a nucleic acid that encodes a polypeptide encoded by VGLUT1 (GenBank Accession No: AB032436) and homologs and orthologues thereof, or a nucleic acid derived therefrom. Thus, VGLUT1 nucleic acids include, but are not limited to, a VGLUT1 gene, a VGLUT1 cDNA, a VGLUT1 RNA, a VGLUT1 cRNA, an amplification product produced from a VGLUT1 nucleic acid template, and the like. Similarly, a "VGLUT2 nucleic acid" is a nucleic acid that encodes a polypeptide encoded by VGLUT2 (GenBank Accession Nos: rat VGLUT2: AF271235; human VGLUT2: AB032435) and homologs and orthologues thereof, or a nucleic acid derived therefrom. A "VGLUT3 nucleic acid" is a nucleic acid that encodes a polypeptide encoded by VGLUT3 (GenBank Accession No: AL157942) and homologs and orthologues thereof, or a nucleic acid derived therefrom.
A "VGLUT protein or polypeptide" is a glutamate transporter protein encoded by a VGLUT nucleic acid. Similarly, a "VGLUT1, VGLUT2, or VGLUT3 protein or polypeptide" is a glutamate transporter protein encoded by a VGLUT1, VGLUT2, or VGLUT3 nucleic acid, respectively.
"BNPI" refers to a brain-specific inorganic phosphate transporter (see, e.g., Rosteck et al. (1994) Proc. Natl. Acad. Sci., USA, 91: 5607-5611; Glinn and Paul (1995) J. Neurochem., 65: 2358-2365; and Glinn et al. (1998) J. Neurochem., 70: 1850-1858). See also GenBank accession number AB032436. BNPI is used herein synonymously with VGLUT1.
The phrase "detecting expression or activity of VGLUT" refers to detecting expression of a VGLUT nucleic acid (e.g. VGLUT1, and/or VGLUT2, and/or VGLUT3), detecting expression of a VGLUT protein (e.g. a VGLUT1 polypeptide, and/or a VGLUT2 polypeptide, and/or a VGLUT3 polypeptide), or detecting activity of a VGLUT polypeptide.
The term "inhibit expression" when used with reference to inhibition of VGLUT (e.g. VGLUT1 and/or VGLUT2 and/or VGLUT3) refers to a reduction or blocking of VGLUT transcription, and/or translation, and/or formation or availability or activity of a VGLUT protein (e.g. VGLUT1 and/or VGLUT2 and/or VGLUT3).
The term "detecting a VGLUT mRNA or cDNA" refers to detecting and/or quantifying a VGLUT nucleic acid, or a nucleic acid derived therefrom, the quantification of which provides an indication of the expression level of the VGLUT nucleic acid. The term thus includes, but is not limited to, detection of VGLUT mRNA, cDNA, VGLUT amplification products, and fragments of any of these.
The terms "binding partner", or "capture agent", or a member of a "binding pair" refer to molecules that specifically bind other molecules to form a binding complex such as antibody-antigen, lectin-carbohydrate, nucleic acid-nucleic acid, biotin-avidin, etc.
The term "specifically binds", as used herein, when referring to a biomolecule (e.g., protein, nucleic acid, antibody, etc.), refers to a binding reaction which is determinative of the presence of the biomolecule in a heterogeneous population of molecules (e.g., proteins and other biologics). Thus, under designated conditions (e.g. immunoassay conditions in the case of an antibody or stringent hybridization conditions in the case of a nucleic acid), the specified ligand or antibody binds to its particular "target" molecule and does not bind in a significant amount to other molecules present in the sample.
The phrase "transport of glutamate into a cell" refers to the uptake of glutamate into a synaptic vesicle (e.g. of a nerve cell), or the uptake of glutamate into other kinds of cells as well. Thus, for example, transport of glutamate into a cell can refer to the transport of glutamate into an oocyte (e.g. an oocyte expressing a heterologous VGLUT transporter), in which case uptake is across the plasma membrane. In certain preferred embodiments, uptake is uptake by a mammalian cell.
The terms "hybridizing specifically to" and "specific hybridization" and "selectively hybridize to," as used herein refer to the binding, duplexing, or hybridizing of a nucleic acid molecule preferentially to a particular nucleotide sequence under stringent conditions. The term "stringent conditions" refers to conditions under which a probe will hybridize preferentially to its target subsequence, and to a lesser extent to, or not at all to, other sequences. Stringent hybridization and stringent hybridization wash conditions in the context of nucleic acid hybridization are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, part I, chapt. 2, Overview of principles of hybridization and the strategy of nucleic acid probe assays, Elsevier, N.Y. (Tijssen). Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on an array or on a filter in a Southern or Northern blot is 42° C. using standard hybridization solutions (see, e.g., Sambrook (1989) Molecular Cloning: A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, and detailed discussion, below), with the hybridization being carried out overnight.
An example of highly stringent wash conditions is 0.15 M NaCl at 72° C. for about 15 minutes. An example of stringent wash conditions is a 0.2× SSC wash at 65° C. for 15 minutes (see, e.g., Sambrook, supra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1× SSC at 45° C. for 15 minutes. An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4× to 6× SSC at 40° C. for 15 minutes.
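For bench planning, the example conditions above can be collected into a small lookup table. The Python sketch below is only an illustration: the names `Wash`, `WASHES`, `hybridization_temp`, and `wash_schedule` are assumptions, the Tm must be supplied by the user (no Tm formula is given in the text), and the 5° C. offset reflects the "about 5° C. lower than the Tm" guideline.

```python
from typing import List, NamedTuple


class Wash(NamedTuple):
    buffer: str
    temp_c: int
    minutes: int


# Example wash conditions taken from the text above.
WASHES = {
    "highly stringent": Wash("0.15 M NaCl", 72, 15),
    "stringent": Wash("0.2x SSC", 65, 15),
    "medium": Wash("1x SSC", 45, 15),
    "low": Wash("4x to 6x SSC", 40, 15),
}


def hybridization_temp(tm_c: float, offset_c: float = 5.0) -> float:
    """Highly stringent conditions: about 5 degrees C below the Tm."""
    return tm_c - offset_c


def wash_schedule(final: str = "stringent") -> List[Wash]:
    """Return a low stringency wash followed by the chosen final wash."""
    if final not in WASHES:
        raise KeyError(f"unknown stringency: {final}")
    if final == "low":
        return [WASHES["low"]]
    return [WASHES["low"], WASHES[final]]


for step in wash_schedule("highly stringent"):
    print(f"{step.buffer} at {step.temp_c} C for {step.minutes} min")
```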
The term "test agent" refers to an agent that is to be screened in one or more of the assays described herein. The agent can be virtually any chemical compound. It can exist as a single isolated compound or can be a member of a chemical (e.g. combinatorial) library. In a particularly preferred embodiment, the test agent will be a small organic molecule.
The term "small organic molecule" refers to a molecule of a size comparable to those organic molecules generally used in pharmaceuticals. The term excludes biological macromolecules (e.g., proteins, nucleic acids, etc.). Preferred small organic molecules range in size up to about 5000 Da, more preferably up to about 2000 Da, and most preferably up to about 1000 Da.
The term "database" refers to a means for recording and retrieving information. In preferred embodiments the database also provides means for sorting and/or searching the stored information. The database can comprise any convenient media including, but not limited to, paper systems, card systems, mechanical systems, electronic systems, optical systems, magnetic systems or combinations thereof. Preferred databases include electronic (e.g. computer-based) databases. Computer systems for use in storage and manipulation of databases are well known to those of skill in the art and include, but are not limited to, "personal computer systems", mainframe systems, distributed nodes on an inter- or intranet, data or databases stored in specialized hardware (e.g. in microchips), and the like.
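As one illustration of an electronic database that records, sorts, and searches information, the following Python sketch stores test agents found to bind a VGLUT target, using the standard-library `sqlite3` module. The table layout, the function names, and the compound identifiers are hypothetical and are not taken from the disclosure.

```python
import sqlite3


def open_candidate_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a database of candidate modulators."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS candidates ("
        "agent TEXT NOT NULL, target TEXT NOT NULL, "
        "UNIQUE (agent, target))"
    )
    return conn


def record_binder(conn: sqlite3.Connection, agent: str, target: str) -> None:
    """Record that a test agent specifically bound the given target."""
    conn.execute(
        "INSERT OR IGNORE INTO candidates (agent, target) VALUES (?, ?)",
        (agent, target),
    )
    conn.commit()


def find_candidates(conn: sqlite3.Connection, target: str) -> list:
    """Return the recorded agents for a target, sorted by name."""
    rows = conn.execute(
        "SELECT agent FROM candidates WHERE target = ? ORDER BY agent",
        (target,),
    )
    return [row[0] for row in rows]


conn = open_candidate_db()
record_binder(conn, "compound-B", "VGLUT1")
record_binder(conn, "compound-A", "VGLUT1")
record_binder(conn, "compound-C", "VGLUT2")
print(find_candidates(conn, "VGLUT1"))
```

The uniqueness constraint keeps a repeated hit from being recorded twice; any other storage medium named in the definition would serve equally well.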
The term "heterologous" as it relates to nucleic acid sequences such as coding sequences and control sequences, denotes sequences that are not normally associated with a region of a recombinant construct, and/or are not normally associated with a particular cell. Thus, a "heterologous" region of a nucleic acid construct is an identifiable segment of nucleic acid within or attached to another nucleic acid molecule that is not found in association with the other molecule in nature. For example, a heterologous region of a construct could include a coding sequence flanked by sequences not found in association with the coding sequence in nature. Another example of a heterologous coding sequence is a construct where the coding sequence itself is not found in nature (e.g., synthetic sequences having codons different from the native gene). Similarly, a host cell transformed with a construct which is not normally present in the host cell would be considered heterologous for purposes of this invention.
The term "recombinant" or "recombinantly expressed" when used with reference to a cell indicates that the cell replicates or expresses a nucleic acid, or expresses a peptide or protein encoded by a nucleic acid whose origin is exogenous to the cell. Recombinant cells can express genes that are not found within the native (non-recombinant) form of the cell. Recombinant cells can also express genes found in the native form of the cell wherein the genes are re-introduced into the cell by artificial means, for example under the control of a heterologous promoter.
The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. With respect to the peptides of this invention, sequence identity is determined over the full length of the peptide.
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
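The final step of such a comparison, computing percent identity for a test sequence relative to a reference once the two have been aligned, can be sketched as follows. This Python example is illustrative and rests on stated assumptions: the sequences are already aligned and padded with `-` for gaps, and the denominator is the number of alignment columns (excluding columns that are gaps in both sequences). Alignment programs differ in how they choose the denominator, so results may not match a given tool exactly.

```python
def percent_identity(ref_aligned: str, test_aligned: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(ref_aligned) != len(test_aligned):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (r, t) for r, t in zip(ref_aligned, test_aligned)
        if not (r == gap and t == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for r, t in columns if r == t and r != gap)
    return 100.0 * matches / len(columns)


# Toy alignment (made-up sequences): 5 identical columns out of 7.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```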
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally Ausubel et al., supra).
One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle (1987) J. Mol. Evol. 35:351-360. The method used is similar to the method described by Higgins and Sharp (1989) CABIOS 5: 151-153. The program can align up to 300 sequences, each of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster is then aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences are aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is achieved by a series of progressive, pairwise alignments. The program is run by designating specific sequences and their amino acid or nucleotide coordinates for regions of sequence comparison and by designating the program parameters. For example, a reference sequence can be compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps.
Another example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always greater than 0) and N (penalty score for mismatching residues; always less than 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
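The seed-and-extend idea described above can be illustrated for nucleotide sequences with a deliberately simplified Python sketch. It is not the BLAST implementation: seeds are exact word matches only (no neighborhood words or threshold T), extension is ungapped and to the right only, and the function names and the drop-off value `x_drop=20` are assumptions made for illustration. M=5 and N=-4 follow the BLASTN defaults quoted in the text; the toy example uses a short word length so that the short sequences produce a hit.

```python
from typing import List, Tuple


def find_seeds(query: str, subject: str, word: int) -> List[Tuple[int, int]]:
    """Return (query_pos, subject_pos) for every exact word match."""
    index = {}
    for j in range(len(subject) - word + 1):
        index.setdefault(subject[j:j + word], []).append(j)
    seeds = []
    for i in range(len(query) - word + 1):
        for j in index.get(query[i:i + word], []):
            seeds.append((i, j))
    return seeds


def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20) -> Tuple[int, int]:
    """Extend an exact seed to the right without gaps.

    Returns (best cumulative score, length of the best-scoring segment).
    Extension halts when the score falls x_drop below its maximum,
    when it reaches zero or below, or at the end of either sequence.
    """
    score = match * word          # an exact seed scores M per position
    best, best_len = score, word
    i, j = q_start + word, s_start + word
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        if score > best:
            best, best_len = score, i - q_start + 1
        if best - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best, best_len


query, subject = "ACGTACGTTT", "ACGTACGAAA"
print(find_seeds("ACGTTT", "GGACGT", 4))
print(extend_right(query, subject, 0, 0, word=4))
```

A fuller version would also extend to the left and, for protein queries, score columns with a substitution matrix such as BLOSUM62.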
In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA, 90: 5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
The term "operably linked" as used herein refers to linkage of a promoter to a nucleic acid sequence such that the promoter mediates/controls transcription of the nucleic acid sequence.
The term "induce" expression refers to an increase in the transcription and/or translation of a gene or cDNA.