In a wide variety of eukaryotic species, the largest subunit of nuclear RNA polymerase II (RPII) contains a region known as the C-terminal domain ("CTD"). The CTD of humans and other mammals, such as mice, consists of 52 repeats of the consensus heptamer Tyr-Ser-Pro-Thr-Ser-Pro-Ser, while the CTDs of most lower eukaryotes consist of fewer repeats of the same consensus sequence. For example, the CTD of the yeast Saccharomyces cerevisiae contains 26 repeats of this heptamer, the CTD of the fruit fly Drosophila contains 45 repeats, and the CTD of the malarial parasite Plasmodium falciparum contains 17 repeats. The repeating heptamers need not match the consensus sequence exactly. In the CTD of Saccharomyces cerevisiae, for example, 17 of the 26 repeats exactly match the consensus heptamer Tyr-Ser-Pro-Thr-Ser-Pro-Ser, whereas in the CTD of Drosophila only two of the 45 repeats are exact matches. A CTD region is not found in the homologous subunits of RNA polymerases I or III, or in the prokaryotic β′ subunit.
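The distinction between the total number of repeats and the number of exact consensus matches can be illustrated with a minimal sketch. It assumes that the CTD is supplied in one-letter amino acid code, that the repeats are in register from the first residue, and that any trailing residues shorter than a full heptamer are ignored. The function name `count_consensus_heptads` and the toy sequence are hypothetical and are not taken from any real polymerase.

```python
# Consensus heptamer Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter code.
CONSENSUS = "YSPTSPS"


def count_consensus_heptads(ctd_seq, consensus=CONSENSUS):
    """Return (total_heptads, exact_matches) for a CTD-like sequence.

    Assumption: the sequence is read as consecutive, non-overlapping
    windows of len(consensus) residues starting at the first residue.
    Residues left over at the end are not counted.
    """
    seq = ctd_seq.upper()
    size = len(consensus)
    # Split into consecutive full-length windows only.
    heptads = [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]
    exact = sum(1 for h in heptads if h == consensus)
    return len(heptads), exact


# Hypothetical toy sequence: three exact repeats and two degenerate ones.
toy_ctd = "YSPTSPS" * 3 + "YSPTSPA" + "YTPQSPS"
total, exact = count_consensus_heptads(toy_ctd)
print(f"{exact} of {total} repeats exactly match the consensus")
```

A real CTD may contain linker residues or out-of-register repeats, so a fixed-window count of this kind is only a first approximation.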
Although the repetitive CTD is conserved among a wide range of eukaryotic organisms, the RNA polymerase II of some eukaryotes contains a carboxy-terminus extension (CTE) rather than a CTD region. For example, the largest subunit of Trypanosoma brucei RNA polymerase II has a CTE of 228 amino acids that is rich in serine and proline.
The CTD is essential for viability: yeast or mouse cells containing RNA polymerase II from which all or most of the repeats have been removed do not grow. A notable feature of the CTD is that it is subject to hyperphosphorylation. One consequence of hyperphosphorylation is that the mobility of the largest RNA polymerase II subunit in SDS gels is markedly reduced. The mobility-shifted, hyperphosphorylated largest subunit is referred to as IIo, whereas the unphosphorylated subunit is referred to as IIa.