The interface between genetics and biochemistry sits at the level of translation, where mRNAs are selected from a large pool of competing transcripts and fed into ribosomes for protein synthesis. Choosing which mRNAs enter a translation system out of the thousands of competing substrates is a highly orchestrated process that requires the activities of factors that recognize sequence determinants in messages during translation initiation (Malys and McCarthy, 2011). Aside from the transcriptional regulation that synthesizes the RNA pool, important complementary control systems rapidly sculpt transcriptomes by promoting RNA turnover (Arraiano et al., 2010; Burger et al., 2011). Thus, transcribed genetic information flows from the genome and partitions either into a decoding event or into a decay event. Once a message has been translated, a partitioning decision is again made, so the entire proteome is a reflection of a single molecular event that decides the fate of mRNAs.
Of the ~4,400 Escherichia coli (E. coli) open reading frames, roughly a third have not had their encoded protein functions experimentally verified (1-6). This collection of genes creates a significant black box in our understanding of fundamental cellular physiology, especially when considering those genes of unknown function that are essential for viability.
Much of what is known about gene function has followed from studies in bacteria, where suites of powerful tools are available that have been refined in model systems such as E. coli. Despite astounding advances in DNA sequencing and synthesis technologies, a fundamental question remains that has piqued the interest of hundreds of investigators: what are the minimal requirements for life? The predominant approaches to answering this question have been either to computationally compare genomes and identify conserved core genes or to randomly disrupt genomes using transposons and deduce which genes do not tolerate interruption, an approach often referred to as “genetic footprinting” (1,2,14-16). These combined strategies form the cornerstone of our understanding of what comprises an essential genome.
Numerous bacterial genera have had their genomes interrogated both computationally and experimentally in an effort to reveal the most important features of a genome. These include early comparisons of Mycoplasma and Haemophilus (14-18), followed by Bacillus (19,20), Mycobacterium (21,22), Pseudomonas (23,24), Helicobacter (25), Salmonella (26), and Escherichia (1,2,27). Using genetic tools to directly test the importance of protein-encoding genes, Baba et al. determined that approximately 300 ORFs appear to be essential for E. coli growth (1). The recent additions of open reading frames that were not annotated at the time the footprinting and recombineering studies were performed may be among the most dramatic changes to the list of E. coli genes. Many of these ORFs (~60) are very small (<50 amino acids) yet clearly play a role in cellular physiology (3,4). Likewise, other genes (~100) that encode small RNAs and appear to be important are also new additions (4,28,29). Thus, even the count of how many genes E. coli has is changing on a regular basis.
In bacteria, orthologs of the ribosomal protein S1 are the gatekeepers that shuttle mRNA into the translation pool by promoting associations with the small subunit during translation initiation. Of these, the S1 protein of Escherichia coli is the best characterized (Delvillani et al., 2011; Feng et al., 2001; Moll et al., 2002a; Nakagawa et al., 2010; Subramanian, 1983; Suryanarayana and Subramanian, 1983). Despite being the largest ribosomal protein in E. coli, it is a weakly associated factor that may cycle on and off ribosomes during the translation cycle (Culver and Noller, 1999; Held et al., 1973; Subramanian, 1983). S1 is composed of multiple RNA-binding domains and tethers mRNAs to the small subunit via an interaction that requires the presence of ribosomal protein S2 (Boileau et al., 1981; Bollen et al., 1979; Moll et al., 2002a). An added feature of E. coli S1 is that it also interacts directly with RNase E and PNPase, both components of the degradosome, which is responsible for bulk RNA turnover (Arraiano et al., 2010; Burger et al., 2011; Feng et al., 2001; Prud'homme-Généreux et al., 2004; Py et al., 1996).