The invention relates to methods for identifying genes encoding signal sequences.
The demonstrated clinical utility of certain growth factors and cytokines (for example, insulin, erythropoietin, granulocyte-colony stimulating factor, granulocyte-macrophage colony stimulating factor, human growth hormone, interferon-beta, and interleukin-2) in the treatment of human disease has generated considerable interest in identifying novel proteins of this class.
Since growth factors and cytokines are secreted proteins, they often possess "signal sequences" at their amino terminal end. The signal sequence directs a secreted or membrane protein to a sub-cellular membrane compartment, the endoplasmic reticulum, from which the protein is dispatched for secretion from the cell or presentation on the cell surface. Techniques that detect signal sequences or nucleic acid sequences encoding a signal sequence have been employed as tools in the discovery of novel cytokines and growth factors.
Among the methods that have been used to identify secreted proteins are methods that rely on the homology between some secreted proteins. For example, DNA probes or PCR oligonucleotides that recognize sequence motifs present in genes encoding known secreted proteins have been used in screening assays to identify novel secreted proteins. In a related approach, homology-directed sequence searching of Expressed Sequence Tag (EST) sequences generated by high-throughput sequencing of specific cDNA libraries has been used to identify genes encoding secreted proteins. Both of these approaches can identify a signal sequence when there is a high degree of similarity between the DNA sequence used as a probe and the putative signal sequence.
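As a rough illustration of the motif-based screening described above, the following Python sketch scans a set of EST-like sequences for a degenerate nucleotide motif. The motif, the sequence names, and the sequences are hypothetical toy values chosen for illustration and are not taken from any real gene family or library. Only sequences that closely match the motif are recovered, which reflects the dependence of these approaches on sequence similarity.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Hypothetical degenerate motif (two adjacent TGY codons); an assumption
# made for this example, not a motif from a known secreted protein.
HYPOTHETICAL_MOTIF = "TGYTGY"

# Toy EST-like sequences; names and sequences are invented.
TOY_ESTS = {
    "est_a": "ATGAAATGTTGCGGA",
    "est_b": "ATGAAAGGGCCCTTT",
    "est_c": "cctgctgtaa",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a degenerate IUPAC motif into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def screen_ests(ests: dict[str, str], motif: str) -> list[str]:
    """Return the names of sequences that contain the motif."""
    pattern = motif_to_regex(motif)
    return [name for name, seq in ests.items() if pattern.search(seq.upper())]


if __name__ == "__main__":
    print(screen_ests(TOY_ESTS, HYPOTHETICAL_MOTIF))
```

A practical screen would use alignment-based searching rather than an exact degenerate match; the regex here is only a stand-in for the idea of a similarity-dependent probe.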
"Signal peptide trapping" has also been used to identify secreted proteins (Tashiro et al., 1993, Science 261:600-603; Honjo et al., 1996, U.S. Pat. No. 5,525,486; and U.S. Pat. No. 5,536,637). In general, this technique involves the ligation of cDNA, prepared from various mRNA sources, to a reporter gene lacking a signal sequence. The resulting chimeric constructs are introduced into an appropriate host cell. Depending upon the nature of the reporter gene, host cells are scored either for the presence of the reporter protein at the cell surface or for secretion of the reporter protein from the cells. In both cases, a positive score indicates that the cell harbors a chimeric construct having a cDNA encoding a signal sequence which directs the export of the reporter protein to the cell surface or into the extracellular medium.
In a related method (Klein et al., 1996, Proc. Natl. Acad. Sci. USA 93:7108-7113; Jacobs, 1996, U.S. Pat. No. 5,536,637), the Saccharomyces cerevisiae gene SUC2, which encodes a secreted invertase protein, is used as a reporter. Invertase catalyzes the hydrolysis of sucrose into glucose and fructose, sugars which, unlike sucrose, can be readily utilized by S. cerevisiae as a carbon source. Strains of S. cerevisiae that cannot secrete SUC2 protein are unable to grow on media with sucrose as the sole carbon source. Thus, a mutant SUC2 gene which does not encode a signal peptide can be used as a reporter in signal sequence trapping. Chimeric constructs composed of random cDNAs fused to DNA encoding SUC2 lacking a signal sequence are transformed into S. cerevisiae, and transformants secreting chimeric SUC2 are selected by growing the transformants under conditions where sucrose is the sole carbon source. This method offers a genetic selection for cDNAs encoding signal peptides.
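The selection logic of the invertase-based method can be pictured with the following minimal Python sketch. It is a toy model, not a description of any laboratory protocol: the clone names and library contents are invented, and the requirement that the cDNA insert be fused in frame with the reporter is a simplifying assumption added for the example rather than a statement from the cited references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """A hypothetical cDNA-SUC2 chimeric construct in one transformant."""
    name: str
    has_signal_peptide: bool  # cDNA insert encodes a signal peptide
    in_frame: bool            # assumption: insert is in frame with SUC2


def secretes_invertase(construct: Construct) -> bool:
    """The signal-less SUC2 reporter is exported only if the fused cDNA
    supplies a signal peptide (and, in this model, is in frame)."""
    return construct.has_signal_peptide and construct.in_frame


def select_on_sucrose(library: list[Construct]) -> list[str]:
    """Return the transformants expected to grow when sucrose is the sole
    carbon source, i.e. those that secrete the chimeric invertase."""
    return [c.name for c in library if secretes_invertase(c)]


# Invented toy library for illustration only.
TOY_LIBRARY = [
    Construct("clone_1", has_signal_peptide=True, in_frame=True),
    Construct("clone_2", has_signal_peptide=True, in_frame=False),
    Construct("clone_3", has_signal_peptide=False, in_frame=True),
]

if __name__ == "__main__":
    print(select_on_sucrose(TOY_LIBRARY))
```

In this model only the transformant whose insert both encodes a signal peptide and is in frame survives selection, which mirrors the idea that growth on sucrose reports the presence of a functional signal sequence.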