The translation of genetic information into protein depends on RNA as a means for storing and decoding DNA polynucleotide sequences. The first step in this process is the transcription of DNA into RNA, which is chemically similar to DNA and retains all the genetic information encoded in DNA. The RNA transcript undergoes various processing steps, including splicing and polyadenylation. The mature RNA transcript, called messenger RNA (mRNA), is translated into protein by the ribosomal machinery.
Nascent RNA transcripts are spliced in the nucleus by the spliceosomal complex, which catalyzes the removal of introns and the rejoining of exons. The spliceosomal complex is composed of five small nuclear ribonucleoprotein particles (snRNPs) designated U1, U2, U4, U5, and U6. Each snRNP contains a single species of RNA and about 10 proteins. The RNA components of some snRNPs recognize and base pair with intron consensus sequences. The protein components mediate spliceosome assembly and the splicing reaction. snRNP proteins and other nuclear RNA-binding proteins are generally referred to as RNPs and are characterized by an RNA recognition motif (RRM). (Reviewed in Birney, E. et al. (1993) Nucleic Acids Res. 21:5803-5816.) The RRM is about 80 amino acids in length and forms four β-strands and two α-helices arranged in an α/β sandwich. The RRM contains a core RNP-1 octapeptide motif along with surrounding conserved sequences. In addition to snRNP proteins, examples of RNA-binding proteins that contain the above motifs include heterogeneous nuclear ribonucleoproteins (hnRNPs), which stabilize nascent RNA, and factors that regulate alternative splicing. Alternative splicing factors include developmentally regulated proteins that have been identified in lower eukaryotes such as Drosophila melanogaster and Caenorhabditis elegans. These proteins play key roles in developmental processes such as pattern formation and sex determination, respectively. (See, for example, Hodgkin, J. et al. (1994) Development 120:3681-3689.)
Although most RNPs contain an RRM or RNP-1 motif, there are exceptions. The A' polypeptide is a unique component of the U2 snRNP that does not contain these motifs (Sillekens, P. T. et al. (1989) Nucleic Acids Res. 17:1893-1906). A' is 255 amino acids in length with a predicted molecular weight of 28,444 daltons. Notable features of A' include a leucine-rich amino-terminal half and an extremely hydrophilic carboxy-terminal half. The latter region may be involved in RNA binding, while the former region may mediate protein-protein interactions.
In addition to splicing, aspects of RNA metabolism include alteration and regulation of RNA conformation and secondary structure. These processes are mediated by RNA helicases, which use energy derived from ATP hydrolysis to destabilize and unwind RNA duplexes. The best-characterized and most ubiquitous family of RNA helicases is the DEAD-box family, so named for the conserved B-type ATP-binding motif that is diagnostic of proteins in this family. Over 40 DEAD-box helicases have been identified in organisms as diverse as bacteria, insects, yeast, amphibians, mammals, and plants. For example, a recent addition to the DEAD-box family is the RNA helicase encoded by the Drosophila hlc gene (Maleszka, R. et al. (1998) Proc. Natl. Acad. Sci. USA 95:3731-3736). DEAD-box helicases function in diverse processes such as translation initiation, splicing, ribosome assembly, and RNA editing, transport, and stability. Some DEAD-box helicases play tissue- and stage-specific roles in spermatogenesis and embryogenesis. All DEAD-box helicases contain several conserved sequence motifs spread out over about 420 amino acids. These motifs include an A-type ATP-binding motif, the DEAD-box/B-type ATP-binding motif, a serine/alanine/threonine (SAT) tripeptide of unknown function, and a C-terminal glycine-rich motif with a possible role in substrate binding and unwinding. In addition, alignment of divergent DEAD-box helicase sequences has shown that 37 amino acid residues are identical among these sequences, suggesting that conservation of these residues is important for helicase function. (Reviewed in Linder, P. et al. (1989) Nature 337:121-122.)
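As an illustration of how the conserved motifs described above might be located in a candidate protein sequence, the following minimal Python sketch scans a one-letter amino acid string for two of them. The pattern used for the A-type ATP-binding motif is the widely used simplified P-loop consensus (G/A, any four residues, G-K, then S/T); it is an assumption introduced here and is not taken from the references cited above. The function name `find_motifs`, the dictionary keys, and the toy sequence are likewise hypothetical. A match to these short patterns is only suggestive and would not by itself establish membership in the DEAD-box family.

```python
import re

# Simplified, illustrative patterns (assumptions, not from the cited references).
MOTIF_PATTERNS = {
    # A-type ATP-binding motif: [A or G], any four residues, G, K, [S or T]
    "walker_a": r"[AG].{4}GK[ST]",
    # B-type ATP-binding motif: the Asp-Glu-Ala-Asp (D-E-A-D) box itself
    "dead_box": r"DEAD",
}


def find_motifs(protein_seq):
    """Return {motif name: [1-based start positions]} for each pattern."""
    seq = protein_seq.upper()
    hits = {}
    for name, pattern in MOTIF_PATTERNS.items():
        # The zero-width lookahead reports overlapping matches as well.
        lookahead = "(?=" + pattern + ")"
        hits[name] = [m.start() + 1 for m in re.finditer(lookahead, seq)]
    return hits


# Hypothetical toy sequence for demonstration; not a real helicase.
toy_seq = "MKTAQTGSGKTLAFVLDEADRMLSATFPK"

print(find_motifs(toy_seq))  # {'walker_a': [4], 'dead_box': [17]}
```

Positions are reported as 1-based residue numbers, following the usual convention for protein sequences.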
Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the progression of neuroblastoma (Nb) and retinoblastoma (Rb) tumors (Godbout, R. et al. (1998) J. Biol. Chem. 273:21161-21168). Nb and Rb tumor progression is promoted by the amplification of the proto-oncogene encoding MYCN, a transcription factor. However, amplification of both the MYCN gene and the DDX1 gene, which maps in proximity to the MYCN gene on chromosome 2, is correlated with significantly higher rates of tumor progression. Amplification of the DDX1 gene results in increased levels of DDX1 RNA and protein, the latter being aberrantly localized in Nb and Rb cells. These observations suggest that DDX1 may promote or enhance tumor progression by altering the normal secondary structure and expression levels of RNA in cancer cells. In addition, cancer cells that have amplified both DDX1 and MYCN genes may have a selective advantage over cancer cells that have amplified only the MYCN gene.
Other DEAD-box helicases have been implicated either directly or indirectly in tumorigenesis. (Discussed in Godbout, supra.) For example, murine p68 is mutated in ultraviolet light-induced tumors, and human DDX6 is located at a chromosomal breakpoint associated with B-cell lymphoma. Similarly, a chimeric protein composed of DDX10 and NUP98, a nucleoporin, may be involved in the pathogenesis of certain myeloid malignancies.
The discovery of new human RNA binding proteins and the polynucleotides encoding them satisfies a need in the art by providing new compositions which are useful in the diagnosis, prevention, and treatment of cancer, immune disorders, and developmental disorders.