The present invention relates to the discovery of new bio-active molecules, such as antibiotics, anti-virals, anti-tumor agents and regulatory proteins. More particularly, the invention relates to a system for capturing genes potentially encoding novel biochemical pathways of interest in prokaryotic systems, and screening for these pathways utilizing high throughput screening assays.
Within the last decade there has been a dramatic increase in the need for bioactive compounds with novel activities. This demand has arisen largely from changes in worldwide demographics coupled with the clear and increasing trend in the number of pathogenic organisms that are resistant to currently available antibiotics. For example, while there has been a surge in demand for antibacterial drugs in emerging nations with young populations, countries with aging populations, such as the US, require a growing repertoire of drugs against cancer, diabetes, arthritis and other debilitating conditions. The death rate from infectious diseases increased 58% between 1980 and 1992, and it has been estimated that the emergence of antibiotic resistant microbes has added in excess of $30 billion annually to the cost of health care in the US alone. (Adams et al., Chemical and Engineering News, 1995; Amann et al., Microbiological Reviews, 59, 1995). In response to this trend, pharmaceutical companies have significantly increased their screening of microbial diversity for compounds with unique activities or specificities.
There are several common sources of lead compounds (drug candidates), including natural product collections, synthetic chemical collections, and synthetic combinatorial chemical libraries, such as nucleotides, peptides, or other polymeric molecules. Each of these sources has advantages and disadvantages. The success of programs to screen these candidates depends largely on the number of compounds entering the programs, and pharmaceutical companies have to date screened hundreds of thousands of synthetic and natural compounds in search of lead compounds. Unfortunately, the ratio of novel to previously discovered compounds has diminished with time. The discovery rate of novel lead compounds has not kept pace with demand despite the best efforts of pharmaceutical companies. There exists a strong need for accessing new sources of potential drug candidates.
The majority of bioactive compounds currently in use are derived from soil microorganisms. Many microbes inhabiting soils and other complex ecological communities produce a variety of compounds that increase their ability to survive and proliferate. These compounds are generally thought to be nonessential for growth of the organism and are synthesized with the aid of genes involved in intermediary metabolism, hence their name, “secondary metabolites”. Secondary metabolites that influence the growth or survival of other organisms are known as “bioactive” compounds and serve as key components of the chemical defense arsenal of both micro- and macroorganisms. Humans have exploited these compounds for use as antibiotics, antiinfectives and other bioactive compounds with activity against a broad range of prokaryotic and eukaryotic pathogens. Approximately 6,000 bioactive compounds of microbial origin have been characterized, with more than 60% produced by the gram positive soil bacteria of the genus Streptomyces. (Barnes et al., Proc. Nat. Acad. Sci. U.S.A., 91, 1994). Of these, at least 70 are currently used for biomedical and agricultural applications. The largest class of bioactive compounds, the polyketides, includes a broad range of antibiotics, immunosuppressants and anticancer agents which together account for sales of over $5 billion per year.
Despite the seemingly large number of available bioactive compounds, it is clear that one of the greatest challenges facing modern biomedical science is the proliferation of antibiotic resistant pathogens. Because of their short generation time and ability to readily exchange genetic information, pathogenic microbes have rapidly evolved and disseminated resistance mechanisms against virtually all classes of antibiotic compounds. For example, there are virulent strains of the human pathogens Staphylococcus and Streptococcus that can now be treated with but a single antibiotic, vancomycin, and resistance to this compound would require only the transfer of a single gene, vanA, from resistant Enterococcus species. (Bateson et al., System. Appl. Microbiol, 12, 1989). When this crucial need for novel antibacterial compounds is superimposed on the growing demand for enzyme inhibitors, immunosuppressants and anti-cancer agents, it becomes readily apparent why pharmaceutical companies have stepped up their screening of microbial diversity for bioactive compounds with novel properties.
The approach currently used to screen microbes for new bioactive compounds has been largely unchanged since the inception of the field. New isolates of bacteria, particularly gram positive strains from soil environments, are collected and their metabolites tested for pharmacological activity. A more recent approach has been to use recombinant techniques to synthesize hybrid antibiotic pathways by combining gene subunits from previously characterized pathways. This approach, called “combinatorial biosynthesis,” has focused primarily on the polyketide antibiotics and has resulted in a number of structurally unique compounds which have displayed activity. (Betz et al., Cytometry, 5, 1984; Davey et al., Microbiological Reviews, 60, 1989). However, compounds with novel antibiotic activities have not yet been reported, an observation that may be due to the fact that the pathway subunits are derived from those encoding previously characterized compounds. Dramatic success in using recombinant approaches to small molecule synthesis has been recently reported in the engineering of biosynthetic pathways to increase the production of desirable antibiotics. (Diaper et al., Appl. Bacteriol., 77, 1994; Enzyme Nomenclature, Academic Press: NY, 1992).
There is still tremendous biodiversity that remains untapped as a source of lead compounds. However, the currently available methods for screening and producing lead compounds cannot be applied efficiently to these under-explored resources. For instance, it is estimated that at least 99% of marine bacterial species do not survive on laboratory media, and commercially available fermentation equipment is not optimal for use in the conditions under which these species will grow; hence these organisms are difficult or impossible to culture for screening or re-supply. Recollection, growth, strain improvement, media improvement and scale-up production of the drug-producing organisms often pose problems for synthesis and development of lead compounds. Furthermore, the need for the interaction of specific organisms to synthesize some compounds makes their use in discovery extremely difficult. New methods to harness the genetic resources and chemical diversity of these untapped sources of compounds for use in drug discovery are very valuable. The present invention provides a path to access this untapped biodiversity and to rapidly screen for activities of interest utilizing recombinant DNA technology. This invention combines the benefits associated with the ability to rapidly screen natural compounds with the flexibility and reproducibility afforded by working with the genetic material of organisms.
The present invention allows one to identify genes encoding bioactivities of interest from complex environmental gene expression libraries, and to manipulate cloned pathways to evolve recombinant small molecules with unique activities. Bacteria and many eukaryotes have a coordinated mechanism for regulating genes whose products are involved in related processes. The genes are clustered, in structures referred to as “gene clusters,” on a single chromosome and are transcribed together under the control of a single regulatory sequence, including a single promoter which initiates transcription of the entire cluster. The gene cluster, the promoter, and additional sequences that function in regulation altogether are referred to as an “operon” and can include up to 20 or more genes, usually from 2 to 6 genes. Thus, a gene cluster is a group of adjacent genes that are either identical or related, usually as to their function. Gene clusters are of interest in drug discovery processes since product(s) of gene clusters include, for example, antibiotics, antivirals, antitumor agents and regulatory proteins.
Some gene families consist of one or more identical members. Clustering is a prerequisite for maintaining identity between genes, although clustered genes are not necessarily identical. Gene clusters range from extremes where a duplication is generated of adjacent related genes to cases where hundreds of identical genes lie in a tandem array. Sometimes no significance is discernible in a repetition of a particular gene. A principal example of this is the expressed duplicate insulin genes in some species, whereas a single insulin gene is adequate in other mammalian species.
Gene clusters undergo continual reorganization and, thus, the ability to create heterogeneous libraries of gene clusters from, for example, bacterial or other prokaryote sources is valuable in determining sources of novel bioactivities, including enzymes such as, for example, the polyketide synthases that are responsible for the synthesis of polyketides having a vast array of useful activities.
Polyketides are molecules which are an extremely rich source of bioactivities, including antibiotics (such as tetracyclines and erythromycin), anti-cancer agents (daunomycin), immunosuppressants (FK506 and rapamycin), and veterinary products (monensin). Many polyketides (produced by polyketide synthases) are valuable as therapeutic agents. Polyketide synthases (PKSs) are multifunctional enzymes that catalyze the biosynthesis of a huge variety of carbon chains differing in length and patterns of functionality and cyclization. Despite their apparent structural diversity, they are synthesized by a common pathway in which units derived from acetate or propionate are condensed onto the growing chain in a process resembling fatty acid biosynthesis. The intermediates remain bound to the polyketide synthase during multiple cycles of chain extension and (to a variable extent) reduction of the β-ketone group formed in each condensation. The structural variation between naturally occurring polyketides arises largely from the way in which each PKS controls the number and type of units added, and from the extent and stereochemistry of reduction at each cycle. Still greater diversity is produced by the action of regiospecific glycosylases, methyltransferases and oxidative enzymes on the product of the PKS.
Polyketide synthase genes fall into gene clusters. At least one type (designated type I) of polyketide synthase has large genes and encoded enzymes, complicating genetic manipulation and in vitro studies of these genes/proteins. Progress in understanding the enzymology of such type I systems has previously been frustrated by the lack of cell-free systems to study polyketide chain synthesis by any of these multienzymes, although several partial reactions of certain pathways have been successfully assayed in vitro. Cell-free enzymatic synthesis of complex polyketides has proved unsuccessful, despite more than 30 years of intense efforts, presumably because of the difficulties in isolating fully active forms of these large, poorly expressed multifunctional proteins from naturally occurring producer organisms, and because of the relative lability of intermediates formed during the course of polyketide biosynthesis. In an attempt to overcome some of these limitations, modular PKS subunits have been expressed in heterologous hosts such as Escherichia coli and Streptomyces coelicolor. Whereas the proteins expressed in E. coli are not fully active, heterologous expression of certain PKSs in S. coelicolor resulted in the production of active protein. Cell-free enzymatic synthesis of polyketides from PKSs with substantially fewer active sites, such as the 6-methylsalicylate synthase, chalcone synthase, tetracenomycin synthase, and the PKS responsible for the polyketide component of cyclosporin, has been reported.
Hence, studies have indicated that in vitro synthesis of polyketides is possible; however, synthesis was always performed with purified enzymes. Heterologous expression of genes encoding PKS modular subunits has allowed synthesis of functional polyketides in vivo; however, this approach presents several challenges which had to be overcome. The large sizes of modular PKS gene clusters (greater than 30 kb) make their manipulation on plasmids difficult. Modular PKSs also often utilize substrates which may be absent in a heterologous host. Finally, proper folding, assembly, and posttranslational modification of very large foreign polypeptides are not guaranteed.
Novel systems to clone and screen for bioactivities of interest in vitro are desirable. The method(s) of the present invention allow the cloning and discovery of novel bioactive molecules in vitro, and in particular novel bioactive molecules derived from uncultivated samples. Large size gene clusters can be cloned and screened using the method(s) of the present invention. Unlike previous strategies, the method(s) of the present invention allow one to clone utilizing well known genetic systems, and to screen in vitro with crude (impure) preparations.
The present invention allows one to clone genes potentially encoding novel biochemical pathways of interest in prokaryotic systems, and screen for these pathways utilizing a novel process. Sources of the genes may be isolated, individual organisms (“isolates”), collections of organisms that have been grown in defined media (“enrichment cultures”), or, most preferably, uncultivated organisms (“environmental samples”). The use of a culture-independent approach to directly clone genes encoding novel bioactivities from environmental samples is most preferable since it allows one to access untapped resources of biodiversity.
“Environmental libraries” are generated from environmental samples and represent the collective genomes of naturally occurring organisms archived in cloning vectors that can be propagated in suitable prokaryotic hosts. Because the cloned DNA is initially extracted directly from environmental samples, the libraries are not limited to the small fraction of prokaryotes that can be grown in pure culture. Additionally, a normalization of the environmental DNA present in these samples could allow more equal representation of the DNA from all of the species present in the original sample. This can dramatically increase the efficiency of finding interesting genes from minor constituents of the sample which may be under-represented by several orders of magnitude compared to the dominant species.
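The effect of normalization on library representation can be illustrated with a small arithmetic sketch. The species names and abundance figures below are hypothetical, and the normalization is idealized (every species is assumed to end up contributing an equal share of DNA), which a real normalization procedure would only approximate.

```python
def representation(abundances):
    """Return each species' fraction of the total DNA in the sample."""
    total = sum(abundances.values())
    return {name: amount / total for name, amount in abundances.items()}


def normalize(abundances):
    """Idealized normalization: every species contributes an equal amount."""
    return {name: 1.0 for name in abundances}


# Hypothetical relative DNA amounts; the rare species is under-represented
# by three orders of magnitude compared to the dominant species.
sample = {
    "dominant_species": 1000.0,
    "common_species": 100.0,
    "rare_species": 1.0,
}

before = representation(sample)
after = representation(normalize(sample))

# If clones are drawn in proportion to DNA share, the expected number of
# clones examined before one from the rare species appears is 1 / share.
clones_before = 1.0 / before["rare_species"]
clones_after = 1.0 / after["rare_species"]

print(f"rare share before: {before['rare_species']:.5f}, after: {after['rare_species']:.5f}")
print(f"expected clones to reach rare species: {clones_before:.0f} vs {clones_after:.0f}")
```

Under these assumed numbers the rare species rises from roughly one clone in 1,101 to one clone in 3, which is the sense in which normalization improves the odds of finding genes from minor constituents.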
In the evaluation of complex environmental expression libraries, a rate limiting step occurs at the level of discovery of bioactivities. The present invention allows the rapid screening of complex environmental expression libraries, containing, for example, thousands of different organisms.
In the present invention, for example, gene libraries generated from one or more uncultivated microorganisms are screened for an activity of interest. Potential pathways encoding bioactive molecules of interest are first captured in prokaryotic cells in the form of gene expression libraries; crude or partially purified extracts, or pure proteins from metabolically rich cell lines are then combined with the gene expression libraries to create potentially active molecules; and the combination is screened for an activity of interest. Common approaches to drug discovery involve screening assays in which disease targets (macromolecules implicated in causing a disease) are exposed to potential drug candidates which are tested for therapeutic activity. In other approaches, whole cells or organisms that are representative of the causative agent of the disease, such as bacteria or tumor cell lines, are exposed to the potential candidates for screening purposes. Any of these approaches can be employed with the present invention.
The present invention also allows for the transfer of cloned pathways derived from uncultivated samples into metabolically rich hosts for heterologous expression and downstream screening for bioactive compounds of interest using a variety of screening approaches briefly described above.
Accordingly, in one aspect, the present invention provides a process for identifying clones encoding a specified activity of interest, which process comprises (i) generating one or more expression libraries derived from nucleic acid directly isolated from the environment; and (ii) combining the expression libraries with crude or partially purified extracts, or pure proteins from metabolically rich cell lines; and (iii) screening said libraries utilizing any of a variety of screening assays to identify said clones.
In another aspect, the present invention provides a process for identifying clones encoding a specified activity of interest, which process comprises (i) generating one or more expression libraries derived from nucleic acid directly isolated from the environment; and (ii) transferring the clones into a metabolically rich cell line; and (iii) screening said cell line utilizing any of a variety of screening assays to identify said clones.
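The two processes just described share the same overall shape: a library of clones is placed in a metabolic context (an extract combined with the library, or a host cell line receiving the clones) and then screened. The following is a minimal schematic sketch of that shape only. The names `Clone`, `identify_clones`, and `toy_assay`, the signal values, and the threshold are all hypothetical placeholders; no actual assay or library format is implied.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Clone:
    clone_id: str  # hypothetical identifier for one library clone
    source: str    # where the cloned nucleic acid came from


# An assay takes a clone and the context it is tested in (an extract in the
# first process, a host cell line in the second) and returns a signal.
Assay = Callable[[Clone, str], float]


def identify_clones(library: List[Clone], context: str,
                    assay: Assay, threshold: float) -> List[Clone]:
    """Step (ii): place each clone in the context; step (iii): keep clones
    whose assay signal meets the threshold."""
    return [clone for clone in library if assay(clone, context) >= threshold]


# Step (i), represented here by a toy library of three clones.
library = [
    Clone("clone-001", "environmental sample"),
    Clone("clone-002", "environmental sample"),
    Clone("clone-003", "environmental sample"),
]

# Made-up signals standing in for real screening measurements.
TOY_SIGNALS = {
    ("clone-002", "crude extract"): 0.9,
    ("clone-003", "metabolically rich host"): 0.8,
}


def toy_assay(clone: Clone, context: str) -> float:
    """Placeholder assay: looks up a made-up signal, defaulting to zero."""
    return TOY_SIGNALS.get((clone.clone_id, context), 0.0)


hits_extract = identify_clones(library, "crude extract", toy_assay, 0.5)
hits_host = identify_clones(library, "metabolically rich host", toy_assay, 0.5)

print([c.clone_id for c in hits_extract])
print([c.clone_id for c in hits_host])
```

Treating the extract and the host as interchangeable "contexts" is a simplification made for this sketch; in practice the two processes differ substantially in how step (ii) is carried out.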