General Background. Proteins are the most prominent biomolecules in living organisms; in addition to their role as structural components and catalysts, they play a crucial role in regulatory processes. Both cell proliferation and metabolic functions are largely controlled and effected by the cooperation of numerous cellular and extracellular proteins. Lehninger, A. L., 1975, "Biochemistry", Worth Publishers Inc., New York, N.Y. For example, signal transduction pathways of many kinds that affect critical physiological responses operate through proteins by way of their intermolecular interactions. Metzler, D. E., 1977, "Biochemistry", Academic Press Inc., London. Furthermore, the transcription of genes and the regulation of such transcription are dependent upon and controlled by the interplay of numerous protein factors. Wainwright, S. D., "Control Mechanisms and Protein Synthesis", Columbia University Press, New York and London.
Proper functioning of a multicellular organism depends not only on the interaction of biomolecules within the cell but also on appropriate communication between individual cells. Such intercellular communication, and the interaction of cells with the environment, is often realized by the actions of receptors on the extracellular surface and associated intracellular signal transduction mechanisms. Poste, G., Nicholson, G. L., 1976, "The Cell Surface", Elsevier, Amsterdam. The information is communicated through the cell environment to regulate gene expression or protein activities in the cell. Secreted proteins in the extracellular environment thereby exert potent regulatory effects on certain cellular functions.
In view of the very simplified paradigm of cell function outlined above, particular properties of a protein, including cellular localization, structure, affinity to a binding partner, or enzymatic activity under physiological conditions, appear to be highly indicative of its type of function. With respect to a particular cellular localization, secreted proteins, for example, are likely to function as intercellular communicators of signals, while membrane-associated receptors having an extracellular and an intracellular domain most likely transmit an extracellular signal into the cell. Cytoplasmic proteins may function as intracellular signal transmitters and coordinators. Jeter, J. R., Cameron, I. L., Padilla, G. M., Zimmerman, A. M., 1978, "Cell Cycle Regulation", London. Nuclear proteins are likely to be involved in certain aspects of gene regulation. Zawel et al., 1995, Annu. Rev. Biochem. 64:533-561. Mature proteins found in the Golgi or the ER may have regulatory roles in the post-translational processing of protein precursors, e.g., cleavage or addition of carbohydrates. Hirschberg, C., 1987, Annu. Rev. Biochem. 56:63-87.
Membrane-Associated Proteins. For many years, the paradigm of cell function has motivated numerous drug discovery programs to focus on identifying membrane-associated proteins, in particular new receptors, and their respective functions. Porter, R. and O'Connor, M., 1970, "Molecular Properties of Drug Receptors, Ciba Foundation Symposium", J&A Churchill, London. Many examples in fact compel the conclusion that improper function of membrane receptors is a significant factor in the development of serious metabolic and proliferative diseases such as cancer. For example, a certain form of diabetes mellitus, i.e., non-insulin-dependent diabetes mellitus (NIDDM), may be caused by mutations in the insulin receptor. Ullrich et al., 1985, Nature 313:756-761; Taira et al., 1989, Science 245:63-66. Furthermore, 30% of all mammary carcinomas are associated with amplification of the receptor tyrosine kinase HER2. Bargman et al., 1986, Cell 45:649-657; Slamon et al., 1989, Science 244:707-712. In addition to traditional drug discovery programs targeting receptors, the identification of membrane-associated receptors as possible gene therapy targets has become an ambitiously pursued objective; this work uses comparative genomics, which allows determination of changes in gene expression under, e.g., pathological conditions. Wels, et al., 1995, Gene 159(1):73-80.
Secreted Proteins. While receptors have mostly been considered as important potential therapeutic targets, secreted proteins are of particular interest as potential therapeutic agents. They often have a signalling or hormone function, and hence have a high and specific biological activity. Schoen, F. J., 1994, "Robbins Pathologic Basis of Disease", W. B. Saunders Company, Philadelphia. For example, secreted proteins control physiological reactions such as differentiation and proliferation, blood clotting and thrombolysis, somatic growth and cell death, and immune response. Schoen, F. J., 1994, "Robbins Pathologic Basis of Disease", W. B. Saunders Company, Philadelphia.
Significant resources and research efforts have been expended on the discovery and investigation of new secreted proteins controlling biological functions. Many such secreted proteins, including cytokines and peptide hormones, are manufactured and used as therapeutic agents. Zavyalov et al., 1997, APMIS 105(3):161-186. However, of the several thousand expected secreted proteins, only a few are currently used as therapeutic compounds. It can be expected that many of the as yet undiscovered secreted proteins of the human organism are effective in correcting physiological disorders and are thus promising candidates for new drugs.
In the past, novel cytokines and hormone proteins were identified by assaying a certain cell type for its response to protein fractions or purified proteins. Lauffenburger et al., 1996, Biotechnology and Bioengineering 52(1):61-80. Other investigators have used sequence similarities at the DNA level to clone novel interferons and interleukins. Nabori et al., 1992, Analyt. Biochem. 205(1):42-46. In yet another approach, differential display techniques were used to compare the expression patterns of stimulated versus unstimulated cells. Nagata et al., 1980, Nature 287:401-408. All of these methods may lead to the identification and isolation of certain secreted polypeptides.
Recently, a screening method for the identification of cDNA encoding novel secreted mammalian proteins in yeast using the invertase gene as a selection marker has been described. See U.S. Pat. No. 5,536,637 (the "'637 patent"). The disclosed technology relies on the concept that leader sequences of mammalian cDNAs are effective in exporting an invertase protein lacking its own leader sequence. This approach yields partial cDNAs which in turn can be used to screen a full-length cDNA library. The novel protein of interest can then be manufactured by standard, but laborious, techniques, including subcloning, transforming a recombinant host, expression, and development and implementation of a purification process. Furthermore, since the assays described in the '637 patent are performed in yeast, the glycosylation pattern of the isolated products will differ significantly from that of the natural product produced in mammalian cells. This difference is a major impediment in view of the fact that an extremely important feature of secreted proteins (as is true for the extracellular domain of receptors) is their glycosylation pattern and carbohydrate composition. Rademacher et al., 1988, Annu. Rev. Biochem. 57:785-838.
Nuclear Proteins. Both replication of DNA and transcription of genes are carried out in the nucleus. Many nuclear proteins are directly involved in these processes as transcription factors, as cell cycle regulators, or both. Some nuclear proteins are responsible for turning on expression of certain metabolic proteins in response to environmental changes. Zawel et al., 1995, Annu. Rev. Biochem. 64:533-561. Many others are directly involved in the regulation of cell proliferation. Jeter, J. R., Cameron, I. L., Padilla, G. M., Zimmerman, A. M., 1978, "Cell Cycle Regulation", London. The genes encoding proteins in this latter class fall into two general categories: (1) dominant transforming genes, including oncogenes; and (2) recessive cell proliferation genes, including tumor suppressor genes and genes encoding products involved in programmed cell death ("apoptosis").
Oncogenes generally encode proteins that are associated with the promotion of cell growth. Because cell division is a crucial part of normal tissue development and continues to play an important role in tissue regeneration, properly regulated oncogene activity is essential for the survival of the organism. However, inappropriate expression or improperly controlled activation of oncogenes may drive uncontrolled cell proliferation and result in the development of severe diseases, such as cancer. Weinberg, 1994, CA Cancer J. Clin. 44:160-170.
Tumor suppressor genes, on the other hand, normally act as "brakes" on cell proliferation, thus opposing the activity of oncogenes. Accordingly, inactivation of tumor suppressor genes, e.g., through mutations or the removal of their growth inhibitory effects, may result in the loss of growth control, and cell proliferative diseases such as cancer may develop. Weinberg, 1994, CA Cancer J. Clin. 44:160-170.
Related to tumor suppressor genes are genes whose products are involved in the control of apoptosis; rather than regulating the proliferation of cells, they influence the survival of cells in the body. In normal cells, surveillance systems are believed to ensure that the growth regulatory mechanisms are intact; if abnormalities are detected, the surveillance system switches on a suicide program that culminates in apoptosis.
Several genes that are involved in the process of apoptosis have been described. See, for example, Collins and Lopez Rivas, 1993, TIBS 18:307-308; Martin et al., 1994, TIBS 19:26-30. Cells that are resistant to apoptosis have an advantage over normal cells, and tend to outgrow their normal counterparts and dominate the tissue. As a consequence, inactivation of genes involved in apoptosis may result in the progression of tumors, and, in fact, is an important step in tumorigenesis.
Accordingly, the identification of nuclear proteins addresses two areas of interest. First, oncogenes are likely to be valuable targets for the development of highly specific drugs for the treatment of cancer. Second, tumor suppressors and apoptosis-inducing proteins can be useful as agents for the treatment of cancer.
Other Cellular Localizations. Many of the remaining cellular localizations are also associated with particular functions. For example, many metabolic enzymes are located in the mitochondria. Lehninger, A. L., 1975, "Biochemistry", Worth Publishers Inc., New York. Thus, mitochondrial proteins could reveal targets for the treatment of metabolic diseases. The Golgi apparatus and the ER are associated with post-translational processing of proteins; such processes are valuable targets for the treatment of diseases related to protein folding and glycosylation. Rademacher et al., 1988, Annu. Rev. Biochem. 57:785-838.
Enzymatic Activities. A particular enzymatic activity can be indicative of a protein's function. For example, kinases are frequently involved in signal transduction processes.
Structures. Protein structure is also indicative of protein function. Indeed, proteins with similar structures frequently share certain functional properties. Thus, identifying proteins having a structure related to that of a known protein having a particular function of interest can reveal additional proteins having such function.
Generally, a method would be desirable which allows one to pre-sort proteins according to a property of interest, e.g., localization in the cell, affinity to binding partners, enzymatic activity, structure and the like. Such a method would allow one to generate libraries of, e.g., secreted proteins, such as cytokines, membrane associated proteins, such as receptors, nuclear proteins, such as transcription factors, mitochondrial proteins, such as respiratory proteins, and so on. Further, it would be desirable to perform screening, isolation and production of any product in mammalian cells in order to achieve the proper glycosylation pattern. Finally, the most preferred method would additionally allow one to identify and isolate proteins of interest per se rather than a partial DNA sequence.
The present invention addresses this need. The present invention provides methods and expression systems for the generation of expression libraries encoding polypeptides of a predetermined property, including but not limited to cellular localization, structure, enzymatic function, or affinity to other molecules. The methods and expression systems of the present invention allow one to identify and isolate nucleic acids encoding novel proteins of interest. The methods and expression systems furthermore provide a powerful system for the identification of thus far unelucidated receptor/ligand relationships. Since the methods provided can be employed in a wide variety of host cell systems, including mammalian systems, they provide for expression products having an appropriate carbohydrate composition.