There is a need in the art for improved systems for the expression and screening of a variety of peptides and proteins, including proteins and protein complexes that are heterologous to the host in which they are expressed, and particularly including proteins with two or more heterogeneous or heteromeric domains, subunits and/or constituents. There is a need in the art for improved systems for the expression and screening of mammalian proteins and even more particularly, human proteins, in order to elucidate protein-protein interactions or interactions of proteins with other molecules, for directed molecular evolution strategies, and/or for the production and selection of recombinant complex proteins and/or engineered proteins for research, diagnostic and/or therapeutic applications. To name just a few non-limiting examples, there is a continued need in the art for improved systems for the expression and screening of complex proteins including, but not limited to, immunoglobulins (antibodies, including fragments and/or domains and/or derivatives thereof), other receptors, enzymes, hormones, lymphokines and DNA binding proteins. For example, the ability to engineer and rapidly identify useful therapeutic and/or diagnostic antibodies (or fragments and/or domains and/or derivatives thereof) or to provide an affinity-based screen for the selection of a variety of receptors and/or ligands is highly desirable. Proteins having two or more heterogeneous or variable domains, subunits, or constituents are particularly challenging to engineer and efficiently screen. Having a rapid method to identify proteins, including complex proteins, that are useful as therapeutic, diagnostic and/or research tools for use in mammalian, and particularly, human, applications, is therefore invaluable.
Moreover, in order to effectively screen for certain proteins, especially those with highly variable domains (e.g., immunoglobulins, T cell receptors, MHC-peptide complexes), the expression and screening of very large libraries, including libraries of libraries, may be desirable, if not necessary, in order to be able to select the best candidates for further development or to cover all of the possible permutations of protein structures encompassed by the variability in the proteins. With large libraries, it can be especially difficult and/or prohibitively time-consuming and/or costly to produce sufficient quantities of proteins to effectively screen and select candidates from the large pool of candidates, and then perform additional evaluation as needed to identify the best candidates, and/or perform further screening and selection to ensure that all of the best candidates are identified from the original pool. Accordingly, being able to efficiently and effectively express and select a desired protein from a large pool of proteins (and further evolve such proteins to a preferred candidate, if desired) in a system that is economical (cost-effective) and provides results in a relatively short time frame, and then readily produce such proteins on an economical, large-scale production basis, is highly desirable. If most or all of these goals could be achieved in a single host organism or cell, the advantages would be great. There therefore remains a pressing need for new approaches to the characterization of proteins and polypeptides (the term “protein” as used hereinafter should be understood to encompass peptides and polypeptides as well), and to the design, identification, and/or modification, and isolation of the genes encoding these proteins, so as to enable the modification and/or production of the proteins.
One approach to the problem of expressing proteins or polypeptides is through the expression of a genomic DNA library in a bacterium such as E. coli, where the expressed proteins are screened for a property or activity of interest. This approach suffers from several serious disadvantages, one of which is that bacteria typically do not effectively express genes having introns. Eukaryotic genomes of higher organisms are generally too complex for comprehensive expression of DNA libraries in bacteria. When all eukaryotic species are considered, bacteria represent only about 0.3% of all known species (E. O. Wilson, “The Current State of Biological Diversity”, in Biodiversity, National Academy Press, Wash. D.C., 1988, Chapter 1); thus the fraction of the world's genetic diversity accessible to bacterial expression systems is extremely limited.
To avoid problems with introns, it is possible to prepare a cDNA library and express it in bacteria. However, this approach relies upon the presence of RNA transcripts, and any genes not actively being transcribed will not be represented in the library. Many desirable proteins are expressed only under specific conditions (e.g., virulence factors in pathogenic fungi) and these conditions may not exist at the time the mRNA is harvested. In addition, in order to obtain sufficient RNA to prepare a cDNA library, it is necessary to culture suitable quantities of the organism or host cell of interest. In contrast, sufficient genomic DNA can be obtained from a very small number of individual cells by PCR amplification, using either random primers or primers designed to favor certain classes of genes. Finally, genes that are highly expressed in an organism or host cell will tend to be over-represented in the mRNA, and thus over-represented in a cDNA library at the expense of minimally expressed genes, which are often among the more interesting genes. In order to have a high level of coverage of the mRNA species present, a much larger number of clones must be screened if a cDNA library is employed instead of a genomic library, since the latter will have a more nearly equal representation of the variety of genes present. Clearly it is more desirable to screen a genomic DNA library if at all possible.
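The effect of unequal representation on screening effort can be illustrated with a rough calculation. The sketch below is not taken from the cited art; it assumes the standard sampling estimate N = ln(1 - P) / ln(1 - f), where f is the fraction of the library made up of the sequence of interest and P is the desired probability of recovering it at least once. The function name and the numerical values (10,000 equally represented genes; a rare transcript at one part per million of the mRNA) are hypothetical and chosen only for illustration.

```python
import math


def clones_needed(fraction: float, probability: float = 0.99) -> int:
    """Estimate how many clones must be screened so that a sequence making up
    `fraction` of the library is sampled at least once with the given
    probability, using N = ln(1 - P) / ln(1 - f)."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# Hypothetical genomic library: 10,000 genes, each roughly equally represented.
genomic = clones_needed(1.0 / 10_000)

# Hypothetical cDNA library: a rare transcript at 1 part per million of the mRNA.
cdna_rare = clones_needed(1.0 / 1_000_000)

print(f"genomic library clones to screen: {genomic}")
print(f"cDNA library clones to screen:    {cdna_rare}")
print(f"approximate fold difference:      {cdna_rare / genomic:.0f}")
```

Under these assumed values, the number of clones required scales roughly with the inverse of the representation of the gene of interest, which is why a weakly expressed gene demands far more screening in a cDNA library than in a genomic library of comparable complexity.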
Also, most bacteria, including E. coli, are incapable of secretion of many proteins, and thus are undesirable as a host cell for screening purposes where the screening relies upon secretion of the gene product. An additional disadvantage for E. coli, and for bacterial hosts in general, is that prokaryotes cannot provide many of the post-translational modifications required for the activity of numerous eukaryotic proteins. Moreover, expression of complex multi-domain or multi-subunit proteins (e.g., immunoglobulin) is not readily feasible in E. coli. In addition to glycosylation, subunit cleavage, disulfide bond formation, and proper folding of proteins are examples of the post-translational processing often required to produce an active protein.
To ensure such processing one can sometimes use mammalian cells, but mammalian cells are difficult to maintain, require expensive media, and are not generally transformed with high efficiency, and development of stable production cell lines requires long timeframes. Such transformation systems are therefore not convenient for high-throughput screening of proteins, although efforts have been made to employ mammalian cells as hosts for cDNA library screening (Schouten et al., WO 99/64582). An approach involving fusion of transformed protoplasts with mammalian cells prior to library screening has been described (U.S. Pat. No. 5,989,814), but expression of the protein library occurs in bacteria or yeast prior to cell fusion. There have been efforts to modify glycosylation patterns enzymatically after expression in host cells (Meynial-Salles and Combes, J. Biotechnol., 46:1-14 (1996)), but such methods must be tailored for specific products and are not suitable for expression of proteins from a DNA library. More recently, Maras et al., Eur. J. Biochem., 249:701-707 (1997) (see also U.S. Pat. No. 5,834,251) have described a strain of Trichoderma reesei engineered to express human GlcNAc transferase I. The enzyme transfers N-acetylglucosamine to mannose residues on other expressed exogenous proteins, a first step toward more closely approximating natural mammalian products.
The use of yeast as host cells solves some of the above problems, but introduces others. Yeast tend to hyper-glycosylate exogenous proteins (Bretthauer and Castellino, 1999, Biotechnol. Appl. Biochem. 30:193-200), and the altered glycosylation patterns often render expressed mammalian proteins highly antigenic (C. Ballou, in Molecular Biology of the Yeast Saccharomyces, J. Strathern et al., eds., Cold Spring Harbor Laboratory Press, NY, 1982, 335-360). Although yeast are capable of coping with a limited number of introns, they are not generally capable of handling complex genes from higher species such as vertebrates. Even genes from filamentous fungi are usually too complex for yeast to transcribe efficiently, and this problem is compounded by differences in expression and splicing sequences between yeast and filamentous fungi (see e.g., M. Innis et al., Science 1985 228:21-26). Despite these drawbacks, transformation and expression systems for yeast have been extensively developed, generally for use with cDNA libraries. Yeast expression systems have been developed which are used to screen for naturally secreted and membrane proteins of mammalian origin (Klein, et al., Proc. Natl. Acad. Sci. USA 1996 93:7108-7113; Treco, U.S. Pat. No. 5,783,385), and for heterologous fungal proteins (Dalboge and Heldt-Hansen, Mol. Gen. Genet. 243:253-260 (1994)) and mammalian proteins (Tekamp-Olson and Meryweather, U.S. Pat. No. 6,017,731).
Proper intron splicing, and glycosylation, folding, and other post-translational modifications of fungal gene products would be most efficiently handled by a fungal host species, making filamentous fungi superior hosts for screening genomic DNA from soil and other samples. It also makes them excellent hosts for the production of fungal enzymes of commercial interest, such as proteases, cellulases, and amylases. It has also been found that filamentous fungi are capable of transcribing, translating, processing, and secreting the products of other eukaryotic genes, including mammalian genes. The latter property makes filamentous fungi attractive hosts for the production of proteins of biomedical interest (e.g., antibodies, other receptors, hormones, etc.). Glycosylation patterns introduced by filamentous fungi more closely resemble those of mammalian proteins than do the patterns introduced by yeast. For these reasons, a great deal of effort has been expended on the development of fungal host systems for expression of heterologous proteins, and a number of fungal expression systems have been developed. For reviews of work in this area, see Maras et al., Glycoconjugate J., 16:99-107 (1999); Peberdy, Acta Microbiol. Immunol. Hung. 46:165-174 (1999); Kruszewska, Acta Biochim. Pol. 46:181-195 (1999); Archer et al., Crit. Rev. Biotechnol. 17:273-306 (1997); and Jeenes et al., Biotech. Genet. Eng. Rev. 9:327-367 (1991).
High-throughput expression and assaying of DNA libraries derived from fungal genomes would also be of use in assigning functions to the many mammalian genes that are currently of unknown function. For example, once a fungal protein having a property or activity of interest is identified, the sequence of the encoding gene may be compared to the human genome sequence to look for homologous genes.
Yelton et al., in U.S. Pat. No. 4,816,405, disclose the modification of filamentous Ascomycetes to produce and secrete heterologous proteins. Buxton et al., in U.S. Pat. No. 4,885,249, and Buxton and Radford, in Mol. Gen. Genet. 196:339-344 (1984), disclose the transformation of Aspergillus niger by a DNA vector that contains a selectable marker capable of being incorporated into the host cells. McKnight et al., in U.S. Pat. No. 4,935,349, and Boel, in U.S. Pat. No. 5,536,661, disclose methods for expressing eukaryotic genes in Aspergillus involving promoters capable of directing the expression of heterologous genes in Aspergillus and other filamentous fungi. Royer et al., in U.S. Pat. No. 5,837,847, and Berka et al., in WO 00/56900, disclose expression systems for use in Fusarium venenatum employing natural and mutant Fusarium spp. promoters. Conneely et al., in U.S. Pat. No. 5,955,316, disclose plasmid constructs suitable for the expression and production of lactoferrin in Aspergillus. Cladosporium glucose oxidase has been expressed in Aspergillus (U.S. Pat. No. 5,879,921).
Similar techniques have been used in Neurospora. Lambowitz, in U.S. Pat. No. 4,486,533, discloses an autonomously replicating DNA vector for filamentous fungi and its use for the introduction and expression of heterologous genes in Neurospora. Stuart et al. describe co-transformation of Neurospora crassa spheroplasts with mammalian genes and endogenous transcriptional regulatory elements in U.S. Pat. No. 5,695,965, and an improved strain of Neurospora having reduced levels of extracellular protease in U.S. Pat. No. 5,776,730. Vectors for transformation of Neurospora are disclosed in U.S. Pat. No. 5,834,191. Takagi et al. describe a transformation system for Rhizopus in U.S. Pat. No. 5,436,158. Sisniega-Barroso et al. describe a transformation system for filamentous fungi in WO 99/51756, which employs promoters of the glutamate dehydrogenase genes from Aspergillus awamori. Dantas-Barbosa et al., FEMS Microbiol. Lett. 1998 169:185-190, describe transformation of Humicola grisea var. thermoidea to hygromycin B resistance, using either the lithium acetate method or electroporation.
Fungal expression systems in Aspergillus and Trichoderma, for example, are disclosed by Berka et al., in U.S. Pat. No. 5,578,463; see also Devchand and Gwynne, J. Biotechnol. 17:3-9 (1991) and Gouka et al., Appl. Microbiol. Biotechnol. 47:1-11 (1997). Examples of transformed strains of Myceliophthora thermophila, Acremonium alabamense, Thielavia terrestris and Sporotrichum cellulophilum are presented in WO 96/02563 and U.S. Pat. Nos. 5,602,004, 5,604,129 and 5,695,985, which describe certain drawbacks of the Aspergillus and Trichoderma systems. In addition, U.S. Pat. No. 6,573,086 and PCT Publication No. WO 00/20555 describe a transformation and expression system using filamentous fungal hosts, particularly Chrysosporium hosts, as well as other filamentous fungi.
Methods for the transformation of organisms from phyla other than Ascomycetes are known in the art; see for example Munoz-Rivas et al., Mol. Gen. Genet. 1986 205:103-106 (Schizophyllum commune); van de Rhee et al., Mol. Gen. Genet. 1996 250:252-258 (Agaricus bisporus); Arnau et al., Mol. Gen. Genet. 1991 225:193-198 (Mucor circinelloides); Liou et al., Biosci. Biotechnol. Biochem. 1992 56:1503-1504 (Rhizopus niveus); Judelson et al., Mol. Plant Microbe Interact. 1991 4:602-607 (Phytophthora infestans); and de Groot et al., Nature Biotechnol. 1998 16:839-842 (Agaricus bisporus).
In addition to the usual methods of transformation of filamentous fungi, such as, for example, protoplast fusion, Chakraborty and Kapoor, Nucleic Acids Res. 18:6737 (1990), describe the transformation of filamentous fungi by electroporation. De Groot et al., in Nature Biotechnol. 16:839-842 (1998), describe Agrobacterium tumefaciens-mediated transformation of several filamentous fungi. Biolistic introduction of DNA into fungi has been carried out; see for example Christiansen et al., Curr. Genet. 29:100-102 (1995); Durand et al., Curr. Genet. 31:158-161 (1997); and Barcellos et al., Can. J. Microbiol. 44:1137-1141 (1998). The use of magnetic particles for “magneto-biolistic” transfection of cells is described in U.S. Pat. Nos. 5,516,670 and 5,753,477, and is expected to be applicable to filamentous fungi.
Most prior efforts in the field of filamentous fungal expression systems have been directed to the identification of strains suitable for industrial production of enzymes, and therefore attention has been focused on culture viscosity, stability of transformation, yield of heterologous protein per unit volume, and yield as a percentage of biomass. DNA libraries have been expressed in fungi; see for example Gems and Clutterbuck, Curr. Genet. 1993 24:520-524, where an Aspergillus nidulans library was expressed in A. nidulans, and Gems et al., Mol. Gen. Genet. 1994 242:467-471, where a genomic library from Penicillium was expressed in Aspergillus. The cloning of an Aspergillus niger invertase gene by expression in Trichoderma reesei was described by Berges et al., Curr. Genet. 1993 24:53-59.
U.S. patent application Ser. No. 09/548,938, now U.S. Pat. No. 6,573,086; U.S. patent application Ser. No. 09/834,434, now U.S. Pat. No. 7,122,330; PCT Publication No. WO 01/25468; PCT Publication No. WO 01/79558; and PCT Publication No. NL/99/00618 describe a system for expression of heterologous proteins in fungal host cells, and methods for expressing the gene products of a DNA library, including genomic and/or eukaryotic genomic DNA libraries. These applications also disclose mutant fungal strains that have partially lost their filamentous phenotype and thus provide low-viscosity cultures.
The present invention fulfills a continued need in the art for improved fungal host cell strains, vectors, and methods for the expression and screening of complex DNA libraries, including combinatorial libraries expressing proteins having one, two or more domains, subunits, or constituents, such as immunoglobulins and other receptors or protein complexes.