Complex microbial communities are often characterized by culture-independent methods, frequently based on 16S rRNA gene heterogeneity. Other gene targets can also be exploited in these studies. The gene encoding the 60 kDa chaperonin, cpn60 (also called groEL or hsp60), has proven particularly useful because a phylogenetically informative fragment of the gene can be amplified with degenerate PCR primers. This target region (corresponding to nucleotides 274 to 828 of the E. coli cpn60 sequence) generally provides more discriminating power than 16S rRNA sequences, especially for closely related organisms (Goh et al., 1997; Goh et al., 2000; Brousseau et al., 2001). The cpn60 target has been employed in studies of complex microbial communities (Hill et al., 2002; Hill et al., 2005a; Hill et al., 2005b), and a large reference database of chaperonin sequences is now available (Hill et al., 2004).
Staley and Konopka (1985) used the term “great plate count anomaly” to describe the fact that many members of complex microbial communities cannot be cultured in the laboratory and are therefore not represented in culture-based studies of these communities. However, it has also been observed that some organisms that can be cultured from complex communities are not detected in culture-independent studies. For example, Bifidobacterium spp. have been identified in porcine and human feces at levels of 10 cfu/g (Benno et al., 1985; Hartemink and Rombouts, 1999). Nevertheless, these organisms and other high G+C content organisms expected to be present, such as other Actinobacteria, were not detected in PCR and sequence-based studies of human or porcine feces using either the 16S rRNA gene or cpn60 as a target (Wilson and Blitchington, 1996; Suau et al., 1999; Hill et al., 2002). Furthermore, in the cpn60-based study, sequences identical to those of Bifidobacterium spp. were detected in the template DNA mixture using genus-specific primers, indicating that the absence of these sequences from the library was not completely accounted for by a failure to isolate genomic DNA from the organisms during template preparation from the starting material. Organisms in the genus Bifidobacterium have G+C contents of approximately 60% (58-61% for 16S rRNA sequences of 24 strains; 59-64% for partial cpn60 sequences of 84 strains). A major contributor to the under-representation of high G+C organisms in PCR product libraries may in fact be the relative inefficiency of Taq DNA polymerase amplification from these templates.
Another likely contributing factor to the under-representation of high G+C content organisms in PCR product libraries is primer annealing bias. The primers used for amplification of bacterial 16S rRNA gene segments are generally non-degenerate because of the nearly perfect conservation of the annealing sites among bacteria, and therefore one would not expect annealing bias to be a factor in these studies. However, the PCR primers used to amplify the 549-567 bp “universal target” region of the cpn60 gene (H279 and H280; Table 1) are degenerate and contain inosine residues in some positions to minimize degeneracy (Goh et al., 1996). There is an approximately 100-fold difference in the thermodynamic stability of inosine paired with each of the four nucleotides, in the order I:C > I:A > I:G > I:T (Martin et al., 1985; Kawase et al., 1986). Stacking interactions between sequential inosine residues and variations in the local structure of the DNA duplex can also affect the efficiency of base-pairing and introduce annealing bias when inosine-containing primers are used in PCR. We routinely use cpn60 amplification and sequencing in our laboratory for the identification of bacterial isolates and have observed that high G+C templates are more recalcitrant to amplification with the H279 and H280 primer pair. In fact, an analysis of approximately 1500 cpn60 universal target sequences generated in our laboratory (G+C content from 29% to 71%) led to the observation that the problematic templates were those with at least 58% G+C content.
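The G+C screening described above can be illustrated with a minimal sketch. The function names `gc_content` and `is_high_gc` are hypothetical and are not taken from the study; the 58% cutoff is the value reported in the text, and the choice to ignore ambiguous bases when computing the percentage is an assumption made for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted;
    any other symbol (e.g. N) is ignored.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def is_high_gc(seq: str, threshold: float = 58.0) -> bool:
    """Flag a template whose G+C content is at or above the threshold.

    The default of 58% is the value above which templates were
    reported as problematic in the text.
    """
    return gc_content(seq) >= threshold
```

Applied to a set of universal target sequences, `is_high_gc` would simply partition them into templates below and at or above the 58% cutoff; it says nothing about why amplification fails.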
We attempted to alleviate this problem by using fully degenerate versions of primers H279 and H280, containing N (A, C, T or G) in all of the positions currently occupied by inosine residues. Similar primers have been successfully applied to the amplification of Bifidobacterium cpn60 sequences (Jian et al., 2001); however, we found that when applied to most templates, especially complex templates containing a mixture of genomes, these primers yielded unacceptable levels of inappropriate PCR products.
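To give a sense of what replacing inosine with N does to primer complexity, the following sketch counts the number of distinct oligonucleotides encoded by a degenerate primer. The primer string used here is hypothetical and is not the sequence of H279 or H280 (those are given in Table 1); `degeneracy` is an illustrative helper, and treating inosine as a single species is an assumption that reflects how it is synthesized, not how it pairs.

```python
# Number of bases represented by each standard IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str, inosine_as_n: bool = False) -> int:
    """Count the distinct sequences in a degenerate primer pool.

    'I' (inosine) counts as one species unless inosine_as_n is True,
    in which case each inosine is treated as N (four species).
    """
    total = 1
    for base in primer.upper():
        if base == "I":
            total *= 4 if inosine_as_n else 1
        else:
            total *= len(IUPAC[base])
    return total


# Hypothetical primer with three inosines and one Y position.
example_primer = "GAIIGCIGGYAC"
with_inosine = degeneracy(example_primer)                    # 2
fully_degenerate = degeneracy(example_primer, inosine_as_n=True)  # 128
```

Each inosine replaced by N multiplies the pool size by four, so the concentration of any single primer species falls accordingly. This is one plausible reason, though not one demonstrated here, that fully degenerate primers can produce more inappropriate products on complex templates.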