Increasingly, bacterial genes are being used in industrial and agricultural applications such as insect-resistant crops, herbicide-tolerant crops, and improved industrial processes. Bacteria are capable of carrying out virtually every known biochemical process and are therefore a good source of proteins and enzymes for use in a wide variety of commercial processes. Bacterial genes of utility include those that encode proteins with insecticidal activity, enzymes that catalyze industrial processes, proteins responsible for antibiotic resistance, and virulence factors. While the use of biologically derived genes and proteins is increasing, discovering and characterizing genes that encode proteins viable for commercial application remains a cumbersome process. Traditional approaches to identifying commercially viable genes and proteins have relied on following the function of interest. Newer genomics approaches have attempted to sequence genes as quickly as possible and to identify their function by homology to known genes. It remains unclear how efficient it is to sequence the entire genome of a given organism in order to identify new genetic activities. Efforts to characterize the genomes of organisms have been ongoing since the tools of molecular biology became available for this purpose. These studies often examine the relatedness of different species or the degree of difference between two or more organisms. There have been no systematic efforts to characterize the specific genes carried by plasmids, which are small, discrete genetic elements of bacteria, or to use such characterization as a means to rapidly identify bacterial genes with commercial applications.
Bacterial species often carry genetic elements called plasmids that include a variety of genes. Often these plasmid-encoded genes give a strain of a given bacterium commercially important characteristics. For instance, many Bacillus thuringiensis (Bt) strains are used as microbial pesticides, and the genes responsible for producing the insecticidal proteins of these strains are plasmid-encoded. Bt strain HD-1 has been used for decades as a microbial spray against various lepidopteran pests. Because many genes of commercial utility reside on plasmids rather than within the chromosomal DNA, whole-genome-based genomics approaches to discovering new genes are inefficient: much of the sequencing effort is spent repeatedly sequencing chromosomal DNA. A number of techniques have been developed to increase the efficiency of gene discovery.
The use of microarrays allows comparison of several species (the test strains) to a known, sequenced species (the reference strain). To perform this method, one must generate the entire DNA sequence of a genome (the reference genome), then synthesize oligonucleotides corresponding to much of the reference genome, and embed these oligonucleotides onto a matrix, such as a chip. One drawback of this method is that one must have the DNA sequence of a closely related reference strain. Only regions of similarity are identified, while regions of non-similarity must be inferred. Furthermore, this method does not provide a way to determine the nucleotide sequences of the variant regions present in the test strain.
Polymorphism mapping involves digestion of the genome with rare-cutting restriction enzymes and separation of the resulting fragments by pulsed-field gel electrophoresis (PFGE) or field-inversion gel electrophoresis (FIGE). This method can be used to screen related strains to determine their degree of relatedness, and to map regions that are dissimilar between strains. However, this method does not generate any sequence information about the novel regions present in strains.
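The limitation of polymorphism mapping can be illustrated with a minimal in silico sketch. It assumes a linear toy sequence and uses the 8-bp NotI recognition site (GCGGCCGC, cut after the second base) as an example of a rare-cutting enzyme; the function name, the toy strains, and the inserted bases are hypothetical and serve only as illustration. The output is a list of fragment lengths, which is roughly what a gel reports.

```python
def digest_fragment_lengths(sequence, site, cut_offset):
    """Return fragment lengths from cutting a linear sequence at every
    occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the cut falls.
    """
    sequence = sequence.upper()
    site = site.upper()
    cut_positions = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cut_positions.append(idx + cut_offset)
        start = idx + 1  # continue searching just past this match
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Hypothetical toy strains: strain_b carries a 5-base insertion
# that is absent from strain_a.
NOTI_SITE = "GCGGCCGC"
strain_a = "A" * 10 + NOTI_SITE + "T" * 20
strain_b = "A" * 10 + NOTI_SITE + "T" * 10 + "CCCCC" + "T" * 10

print(digest_fragment_lengths(strain_a, NOTI_SITE, 2))  # [12, 26]
print(digest_fragment_lengths(strain_b, NOTI_SITE, 2))  # [12, 31]
```

The second fragment differs in length between the two toy strains, which flags a region of dissimilarity, but nothing in the fragment lengths reveals the sequence of the inserted bases.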
Differential hybridization techniques have the ability to identify regions of difference between strains, and to identify clones likely to contain differences. However, differential hybridization techniques are well known for their technical difficulty. The presence of repetitive DNA elements in genomes can substantially interfere with this method. While differential hybridization techniques based on hybridization of bulk PCR reactions are somewhat more technically feasible, none of these techniques has been used for rapid testing and characterization of plasmid sequences.
Because of the enormous genetic diversity among bacterial plasmids, methods are needed to facilitate the rapid and efficient identification of useful nucleotide sequences. In particular, there is a need to identify more bacterial genes with commercial relevance for industrial and agricultural applications, and to do so rapidly and efficiently.