Numerous agricultural and industrial production systems and processes depend on specific organisms, such as plants, algae, bacteria, fungi, yeasts, protozoa and cultured animal cells, for production of useful materials and compounds, such as food, fiber, structural materials, fuel, chemicals, pharmaceuticals, or feedstocks thereof. In the process of the current shift to biological production systems for a variety of chemicals and fuels, a wide assortment of organisms will be used for production, most of them microbes, with an increasing tendency towards photosynthetic organisms (Dismukes 2008). The ability to grow robustly, and the ability to efficiently produce the materials and compounds of interest, are desirable properties of these organisms.
Optimization of the growth of these organisms and augmentation of their yield of useful materials and compounds is an ongoing activity of many companies and individuals, with the goal of achieving a higher productivity or yield, or lower production cost of commercially important materials and compounds. Such improvements can occur through the modification of production systems, or through the modification of the organisms themselves.
Genetic or epigenetic changes in organisms can be particularly powerful ways of improving the organisms' performance and raising their productivities. All organisms in use by humans have been selected for specific genetic compositions that maximize their productivity and usefulness. In addition, various techniques can be employed to increase the range of characteristics or phenotypes displayed by these organisms, enabling the selection of superior strains and varieties. Among these techniques are mutagenesis, genetic engineering, transgenesis, metabolic engineering, breeding, adaptive mutation and others. Application of such techniques has allowed rapid progress in the improvement of organisms.
Deregulating genetic checkpoints is a general strategy for modifying the growth properties and yield of useful organisms. Genetic checkpoints have generally evolved to allow an organism to alter its growth, metabolism or progression through the cell cycle, enabling it to survive periods of stress or nutrient limitation. In multicellular organisms, checkpoints are also in place to inhibit cell divisions once a tissue or organ is mature and fully formed. Relieving these checkpoints is often desirable for maximizing growth, yield and productivity of an organism being cultured or grown in cultivation, where conditions of stress may be controllable and avoidable.
Among the genetic engineering methods developed in the past are gain-of-function approaches, through which one or more homologous or heterologous polynucleotides are introduced into an organism's genome. Typically, such polynucleotides are constructed such that the polynucleotide product will be overexpressed in the organism, thus imparting a novel or altered function to that organism. Mutagenesis can also result in gain-of-function changes in a cell or an organism, although in response to mutagenesis such changes are rarer than loss-of-function changes, in which the activity of a polynucleotide or polynucleotide product is impaired or destroyed by the genetic change.
Polynucleotides tend to have specific functions which are a product of the polynucleotide sequence and of the biochemical properties of the encoded RNA or protein. The sequence and biochemical properties of a protein or RNA govern its structure, biochemical activity, localization within a cell, and association with other cellular components, allowing appropriate activity of the protein or RNA, and proper regulation of that activity. Alteration of a polynucleotide sequence resulting in abnormal properties of the encoded protein or RNA, affecting its biochemical and structural properties, sub-cellular localization and/or association with other proteins or RNAs, can have profound consequences on the characteristics or phenotype of the organism. Polynucleotide fusion, which involves joining intact or partial open reading frames encoded by separate polynucleotides, is a known way of altering a polynucleotide sequence to change the properties of the encoded RNA or protein and to alter the phenotype of an organism.
There are two general mechanisms by which polynucleotide fusions can alter an organism's phenotype. These two mechanisms can be illustrated with the case of polynucleotide A (encoding protein A′) fused to polynucleotide B (encoding protein B′), in which proteins A′ and B′ have different functions or activities and/or are localized to different parts of the cell. The first mechanism applies to sub-cellular localization of the two proteins. The fusion protein encoded by the polynucleotide fusion of the two polynucleotides may be localized to the part of the cell where protein A′ normally resides, or to the part of the cell where protein B′ normally resides, or to both. This alteration of cellular distribution of the activities of proteins A′ and B′ may cause a phenotypic change in the organism. A schematic illustration of the altered localization of two proteins as a result of their fusion is shown in FIG. 1.
The second general mechanism by which fusion proteins alter the phenotypic property of a cell or organism relates to the direct association of two different, normally separate functions or activities in the same protein. In the case of proteins A′ and B′, their fusion may lead to an altered activity of protein A′ or of protein B′ or of the multiprotein complex in which these proteins normally reside, or of combinations thereof. The altered activity includes but is not limited to: qualitative alterations in activity; altered levels of activity; altered specificities of activity; altered regulation of the activity by the cell; altered association of the protein with other proteins or RNA molecules in the cell, leading to changes in the cell's biochemical or genetic pathways. A schematic illustration of phenotypic changes arising in a cell as a consequence of expressing a fusion protein is shown in FIG. 1.
Gene fusion, the function-generating principle on which the technology is based, is not a regularly occurring biological mechanism (Ashby 2006, Babushok 2007, Whitworth 2009, Zhang 2009, Eisenbeis 2010), but it has been observed sufficiently often to confirm the validity of the strategy. Apart from occurring in evolutionary time, for example in the evolution of new gene sequences by exon shuffling (Gilbert 1978), gene fusions are frequent events in oncogenesis, where the fusion of two proto-oncogenes contributes to uncontrolled proliferation of cancer cells (Mitelman 2004, Mitelman 2007, Rabbitts 2009, Inaki 2012). Examples of altered activity arising from polynucleotide fusion include the BCR-ABL oncogene involved in promoting uncontrolled cell growth in chronic myeloid leukemia (Sawyers 1992, Melo 1996), the mixed-lineage leukemia (MLL) polynucleotides coding for histone-lysine N-methyltransferase that are involved in aggressive acute leukemia (Marshalek 2011), prokaryotic two-component signal transduction proteins (Ashby 2006, Whitworth 2009) and multifunctional bacterial antibiotic resistance polynucleotides (Zhang 2009). Despite these examples, however, polynucleotide fusions are relatively rare in biology compared to other genetic changes such as point mutations, and tend to occur at a frequency that is more appropriately measured over evolutionary time than per cell generation (Babushok 2007, Eisenbeis 2010). As a result, a system for creating artificial polynucleotide fusions has the potential to create many phenotypes that are rarely or never found in nature. Fusion proteins capable of bypassing a variety of genetic checkpoints in various useful organisms will allow the isolation of faster-growing and higher-yielding strains and varieties.
To date, no attempt has been made to take advantage of the function-generating capability of fusion genes or polypeptides in a large-scale and systematic manner. There are no published examples of large-scale collections of randomized, in-frame polynucleotide fusions. Previous examples of fusion proteins have been generated in a limited and directed fashion with specific outcomes in mind. The present invention describes the creation and use of systematic, randomized, large-scale and in-frame gene fusions or polynucleotide fusions for the purpose of altering gene function, generating new gene functions and new protein functions, and/or generating novel phenotypes of interest in biological organisms.
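By way of illustration only, the notion of an in-frame fusion between two open reading frames can be sketched in silico. The following Python sketch is a simplified assumption and not the method of the invention: the function names (strip_stop, fuse_in_frame, all_pairwise_fusions), the optional linker parameter and the toy sequences are hypothetical. It removes the terminal stop codon of the upstream open reading frame, joins the downstream open reading frame so that the reading frame is preserved, and enumerates every ordered pair in a small collection, whereas a randomized library would sample such pairs rather than list them exhaustively.

```python
from itertools import permutations
from typing import Dict, Tuple

# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(orf: str) -> str:
    """Return the ORF without its terminal stop codon (hypothetical helper)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf[-3:] in STOP_CODONS:
        return orf[:-3]
    return orf


def fuse_in_frame(upstream: str, downstream: str, linker: str = "") -> str:
    """Join two ORFs so the downstream ORF stays in the upstream reading frame.

    The optional linker is an assumption of this sketch; its length must be a
    multiple of 3 so that it does not shift the frame. Internal stop codons
    are not checked here.
    """
    downstream = downstream.upper()
    if len(downstream) % 3 != 0 or len(linker) % 3 != 0:
        raise ValueError("downstream ORF and linker lengths must be multiples of 3")
    return strip_stop(upstream) + linker.upper() + downstream


def all_pairwise_fusions(orfs: Dict[str, str]) -> Dict[Tuple[str, str], str]:
    """Enumerate every ordered pair (A-B and B-A) of distinct ORFs as a fusion."""
    return {
        (a, b): fuse_in_frame(orfs[a], orfs[b])
        for a, b in permutations(orfs, 2)
    }


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_orfs = {"A": "ATGGCTTAA", "B": "ATGAAATGA", "C": "ATGTTTTAG"}
    for (first, second), seq in all_pairwise_fusions(toy_orfs).items():
        print(first, second, seq)
```

In this sketch, a collection of n open reading frames yields n × (n − 1) ordered fusions, which indicates how quickly the number of possible fusions grows with the size of the input collection.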
The present invention is distinct from gene and protein evolution methods such as gene shuffling (Stemmer 1994, Stemmer 1994a) that randomly recombine homologous sequences in order to create new variants of specific genes and proteins. The present invention uses collections of sequences that are substantially non-homologous as input sequences to create random, recombinant and novel coding sequences.