Biocatalysis, defined as the enzymatic synthesis of target molecules by biological means, is becoming increasingly popular as a strong alternative to chemical synthesis in terms of cost, time, number of purification steps, and ease of use. However, introducing any new biocatalytic process on an industrial scale requires (i) identifying the enzyme or enzymes that specifically convert the supplied substrate into the desired product, and (ii) identifying the enzyme or enzymes that can carry out the catalysis stably under the particular conditions of the industrial process (thermostability, pH, or tolerance to the denaturing conditions imposed by organic solvents).
Because they are found everywhere, including in the most extreme environments, microorganisms are known to be capable of performing entirely novel enzymatic functions under conditions compatible with the industrial processes mentioned above.
However, the promising approach of exploiting these bacterial functions has always been severely limited by a technological obstacle: isolating and culturing in vitro the bacteria that make up this enormous diversity. Most bacteria growing in complex natural environments (soils and sediments, aquatic environments, digestive systems) have not been cultivated because their optimal culture conditions are unknown or too difficult to reproduce. Numerous scientific studies have established this fact, and it is now widely accepted that only between 0.1 and 1% of bacterial diversity, all environments taken together, has been isolated and cultivated (Amann et al., 1995, Microbiol. Rev., 59: 143-169). Although the search for novel biocatalytic pathways within collections of microbial strains has proved effective, it has the major disadvantage of exploiting only a tiny part of bacterial biodiversity.
New approaches have been developed to bypass this critical step of isolating bacteria and to gain access to the enormous genetic potential offered by the adaptation systems that bacteria have developed over their long evolution. This approach is called metagenomics because it deals with the set of genomes of a bacterial community, without distinction between them (the metagenome).
Metagenomics involves the direct extraction of DNA from environmental samples, followed by its propagation and expression in cultivatable bacterial hosts. Metagenomics in the strict sense was first used to identify new bacterial phyla (Pace, 1997, Science, 276: 734-740). This approach is based on the specific cloning of genes recognised for their phylogenetic interest, such as 16S rDNA. Other developments have been implemented to identify new enzymes of environmental or industrial interest (Terragen Diversity, U.S. Pat. No. 6,441,148). In both approaches, metagenomics starts with a selection of the desired genes. This selection is made by PCR (Polymerase Chain Reaction), generally before the cloning step. In the latter case, the cloning vector is preferably an expression vector (i.e. it contains regulatory sequences upstream of the cloned DNA fragment, enabling the cloned DNA to be expressed in a given expression host).
More recent developments consider the metagenome as a whole. No selection or identification is made before the metagenomic DNA library is created, so the library is built in an entirely random fashion. This approach therefore gives access to the whole genetic potential of the bacterial community being explored, without any prior assumption.
In general, bacteria play an important role in the functioning of ecosystems, and they are abundant in them. For example, it is estimated that one gram of soil can contain between 1,000 and 10,000 different species of bacteria and between 10⁷ and 10⁹ cells, counting both cultivatable and non-cultivatable bacteria. Reproducing this whole diversity in metagenomic DNA libraries requires the ability to generate and manage a large number of clones.
In this latter approach, the DNA libraries are made up of several tens or hundreds of thousands, or even several million, recombinant clones which differ from one another in the DNA they have incorporated. The average size of the cloned metagenomic inserts is therefore of the utmost importance in the search for bacterial biosynthesis pathways, because in bacteria these pathways are most often organised in clusters. The larger the cloned DNA fragments (larger than 30 kb), the fewer clones need to be analysed and the greater the chance of recovering complete metabolic pathways which make it possible to convert a substrate {A} into a target product {B} and into a source of growth.
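The relationship between insert size and the number of clones to be handled can be illustrated with a rough calculation. The following is only a sketch and is not part of the methods described here: it assumes the classical Clarke-Carbon estimate N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to total metagenome size and P is the desired probability that a given sequence is present in the library. The community used (1,000 equally abundant genomes of 4 Mb each) and the function name are hypothetical and chosen purely for illustration.

```python
import math


def clones_needed(insert_kb, metagenome_kb, probability=0.99):
    """Clarke-Carbon estimate of the number of clones required so that
    any given sequence is present in the library with `probability`.

    Assumes random, unbiased cloning and equally abundant genomes.
    """
    fraction = insert_kb / metagenome_kb  # share of the metagenome per clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# Hypothetical community: 1,000 genomes of 4,000 kb each (assumption).
METAGENOME_KB = 1000 * 4000

for insert_kb in (5, 40, 100):
    n = clones_needed(insert_kb, METAGENOME_KB)
    print(f"{insert_kb} kb inserts: about {n:,} clones")
```

Under these assumptions the required number of clones falls roughly in proportion to the insert size. The estimate ignores unequal species abundance and cloning bias, and it concerns the presence of a single locus rather than of an intact gene cluster, so real libraries would be expected to need more clones.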
Given the large number of recombinant clones to be studied and the number of tests to be carried out, many laboratories are turning to high density hybridisation systems (high density membranes or DNA chips), in particular for the characterisation of bacterial communities (for a review, see Zhou et al., 2003, Curr. Opin. Microbiol., 6: 288-294).
Although none of these data relate to metagenomic libraries, they nevertheless provide a great deal of information, such as the quantification of different functional genes (Cho et al., 2003), the study of functional genes and their diversity (Wu et al., 2001, Appl. Environ. Microbiol., 67: 5780-5790) and the direct detection of 16S rDNA genes (Small et al., 2001). Only one study relates to the use of metagenomics in combination with DNA chips (Sebat et al., 2003, Appl. Environ. Microbiol., 69: 4927-4934), for identifying clones containing DNA from non-cultivatable bacteria and selecting them for additional analysis.
The screening of enzymatic or antibacterial activities from metagenomic libraries has been widely described in the scientific literature. The studies have related, for example, to the direct detection of chitinase (Cottrell et al., 1999, Appl. Environ. Microbiol., 65: 2553-2557), lipase (Henne et al., 2000, Appl. Environ. Microbiol., 66: 3113-3116), and DNase and amylase (Rondon et al., 2000, Appl. Environ. Microbiol., 66: 2541-2547) activity. In these studies, the host bacteria containing the recombinant clones are cultured on a medium supplemented with the substrate whose metabolisation is sought, and the activity is generally detected by the appearance of haloes or precipitates around the colonies, or by a change in the appearance of the colonies which metabolise the substrate being studied. It should be noted that the enzymatic activities detected in these examples are new to the host bacterium but are not essential for its growth. A similar approach was described in the Chromaxome patent (No. 5,783,431). This patent describes a method of activity screening based on the encapsulation of individual or pooled clones from a library in a stable, inert and porous matrix (advantageously alginate), in the form of macro- or micro-droplets. The droplets are, for example, placed in a liquid culture containing the nutrients necessary for bacterial growth and a substrate (for example X-glucosaminide, X-acetate, X-glucopyranoside) whose metabolisation is revealed by the appearance of blue colouring.
Alternatively, the phenotypic screening described in the Proteus patent (No. FR 2 786 788) is based on the prior preparation of the nucleic acid sequences encoding the target protein (upstream and downstream elements necessary for the transcription and translation of the target genes), followed by in vivo transcription and translation, and then by the detection and measurement of the activity of the target proteins.
All of these screening methods require high throughput systems because every clone must be subjected to the screening test in order to identify the clones of interest, i.e. those which respond positively. For this purpose, the company Diversa, a leader in the discovery of new molecules, has developed a unique platform called the GigaMatrix, which enables ultra-high throughput screening of around 1 billion clones per day (see Worldwide Website: diversa.com/techplat/gigamatrix/default.asp).
Another approach has already been described in patent WO 00/22170 of Microgenomics (U.S. Pat. No. 6,368,793 B1). This patent describes a methodology for identifying a metabolic pathway which transforms a substrate S into a desired product T by creating or identifying a genetically manipulated organism whose capability of implementing this reaction is placed under the control of an inducible promoter. This organism is used for screening nucleic acid fragments in order to detect a gene involved in the transformation of a substrate into a product. Implementing this method requires the identification and genetic characterisation of the genes responsible for the degradation of T in the expression host, so that they can be placed under the control of an inducible promoter. This type of construct is not always feasible, in particular when the genes in question are spread over the genome, and there is a risk of "leaky" expression from the inducible promoter. Moreover, it represents a considerable amount of work which has to be repeated for every product T studied. Finally, in this approach, the organism used must be capable of taking up and metabolising both S and T. All of these points demonstrate the limits of the efficacy of this type of approach.
The majority of these technologies, with the exception of that described by Microgenomics, therefore require the prior organisation of libraries, i.e. the individualisation, storage and preservation of the clones in formats compatible with the screening systems mentioned above. Moreover, the suitability of a metagenomic library for a given problem (for example the search for a specific enzymatic function) can only be established once all of the clones making up the library have been screened. Several hundred thousand clones must often be screened in order to detect perhaps just one clone of interest. The creation of a metagenomic library is in fact constrained by a number of prior choices, such as the environment being explored, the bacterial community (or communities) considered within this environment, the cloning or expression vector, the sizes of the cloned inserts, and the host organism likely to best express the heterologous metagenomic DNA.
The time and resources required to create a metagenomic library and then to screen it are therefore considerable, with little hope of success. Increasing the chances of discovery would necessarily require creating a metagenomic library specific to each problem, in order to best meet the objectives set.