MicroRNAs: Novel Regulators of Gene Expression
MicroRNAs (miRNAs) are an abundant class of short endogenous RNAs that act as post-transcriptional regulators of gene expression by base-pairing with their target mRNAs. The ~22 nucleotide (nt) mature miRNAs are processed sequentially from longer hairpin transcripts (primary miRNA/pri-miRNA or precursor miRNA) by the RNase III ribonucleases Drosha (Lee et al. 2003) and Dicer (Hutvagner et al. 2001, Ketting et al. 2001). To date, more than 3400 miRNAs have been annotated in vertebrates, invertebrates and plants according to the miRBase microRNA database release 7.1 in October 2005 (Griffiths-Jones 2004, Griffiths-Jones et al. 2006), and many miRNAs that correspond to putative miRNA genes have also been bioinformatically predicted. More than half of all known mammalian miRNAs are hosted within the introns of pre-mRNAs or long ncRNA transcripts (Rodriguez et al. 2004). Many miRNA genes are arranged in genomic clusters (Lagos-Quintana et al. 2001). For example, ca. 40% of human miRNA genes appear in clusters of two or more, with the largest cluster of 40 miRNA genes being located in the human imprinted 14q32 domain (Seitz et al. 2004; Altuvia et al. 2005). In plants, 117 miRNA genes have been identified in Arabidopsis thaliana, while the number of miRNAs identified in rice is currently 178 (Griffiths-Jones 2004, Griffiths-Jones et al. 2006). The miRNAs identified to date most likely represent the tip of the iceberg, and the number of miRNAs might turn out to be very large. Recent bioinformatic predictions combined with array analyses, small RNA cloning and Northern blot validation indicate that the total number of miRNAs in vertebrate genomes is significantly higher than previously estimated and may be as many as 1000 (Bentwich et al. 2005, Berezikov et al. 2005, Xie et al. 2005).
The first miRNA genes to be discovered, lin-4 and let-7, base-pair incompletely to repeated elements in the 3′ untranslated regions (UTRs) of heterochronic genes, and control developmental timing in C. elegans by regulating translation directly and negatively via antisense RNA-RNA interaction (Lee et al. 1993, Reinhart et al. 2000). The majority of plant miRNAs have perfect or near-perfect complementarity with their target sites and direct RISC-mediated target mRNA cleavage (for review, see Bartel 2004). A large fraction of the plant miRNAs appear to regulate genes with roles in developmental processes, such as control of meristem identity, cell proliferation, developmental timing and patterning (Kidner and Martienssen 2005). In contrast, most animal miRNAs recognise their target sites located in 3′-UTRs by incomplete base-pairing, resulting in translational repression of the target genes (Bartel 2004). An increasing body of research shows that animal miRNAs play fundamental biological roles in cell growth and apoptosis (Brennecke et al. 2003), hematopoietic lineage differentiation (Chen et al. 2004), life-span regulation (Boehm and Slack 2005), photoreceptor differentiation (Li and Carthew 2005), homeobox gene regulation (Yekta et al. 2004, Hornstein et al. 2005), neuronal asymmetry (Johnston et al. 2004), insulin secretion (Poy et al. 2004), brain morphogenesis (Giraldez et al. 2005), muscle proliferation and differentiation (Chen, Mandel et al. 2005, Kwon et al. 2005, Sokol and Ambros 2005), cardiogenesis (Zhao et al. 2005) and late embryonic development in vertebrates (Wienholds et al. 2005). Several studies have identified sub-classes of miRNAs directly implicated in the regulation of mammalian brain development and neuronal differentiation (Krichevsky et al. 2003, Miska et al. 2004, Sempere et al. 2004, Smirnova et al. 2005).
Interestingly, many neural miRNAs appear to be temporally regulated in cortical cultures and to copurify with polyribosomes, suggesting that they may control localized translation of dendrite-specific mRNAs (Kim et al. 2004).
The number of regulatory mRNA targets of vertebrate miRNAs has been estimated by identifying conserved complementarity to the miRNA seed sequences (nucleotides 2-7 of the miRNA), suggesting that ~30% of human genes may be miRNA targets (Lewis et al. 2005). Computational predictions in Drosophila provide evidence that a given miRNA has on average ~100 mRNA target sites in the fly, while another recent study reported that vertebrate miRNAs may target ~200 mRNAs each, further supporting the notion that miRNAs can regulate the expression of a large fraction of the protein-coding genes in multicellular eukaryotes (Brennecke et al. 2005, Krek et al. 2005). Most recent reports indicate that miRNAs may not function as developmental switches, but rather play a role in maintaining tissue identity by conferring accuracy to gene-expression programs (Giraldez et al. 2005, Lim et al. 2005, Stark et al. 2005, Farh et al. 2005, Wienholds et al. 2005).
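The seed-based target search described above can be illustrated with a minimal sketch. It extracts nucleotides 2-7 of a miRNA and scans a 3′-UTR for perfectly complementary sites. The function names and the UTR sequence are hypothetical and chosen only for illustration; the miRNA shown is the commonly cited let-7a sequence. The sketch does not model the cross-species conservation filter or any other feature of the cited prediction methods.

```python
# Toy seed-match search; not the algorithm of any cited study.

def reverse_complement(rna):
    """Return the reverse complement of an RNA string (Watson-Crick pairs only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))

def seed_sites(mirna, utr, start=2, end=7):
    """Find perfect matches to the miRNA seed (1-based nucleotides start..end).

    Both sequences are written 5'->3'. Returns the seed, the complementary
    site as it would appear in the mRNA, and 0-based match positions in the UTR.
    """
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)
    positions = []
    idx = utr.find(site)
    while idx != -1:
        positions.append(idx)
        idx = utr.find(site, idx + 1)  # allow overlapping matches
    return seed, site, positions

mirna = "UGAGGUAGUAGGUUGUAUAGUU"       # let-7a, used for illustration
toy_utr = "AAGCUACCUCAUUUGGCUACCUCAA"  # invented sequence, not a real UTR

seed, site, positions = seed_sites(mirna, toy_utr)
print(seed, site, positions)
```

A realistic target predictor would additionally require that each site be conserved across related genomes, which is the step that gives the estimates cited above their specificity.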
MicroRNAs in Human Disease
The expanding inventory of human miRNAs, along with their highly diverse expression patterns and high number of potential target mRNAs, suggests that miRNAs are involved in a wide variety of human diseases. One is spinal muscular atrophy (SMA), a paediatric neurodegenerative disease caused by reduced protein levels or loss-of-function mutations of the survival of motor neurons (SMN) gene (Paushkin et al. 2002). A mutation in the target site of miR-189 in the human SLITRK1 gene was recently shown to be associated with Tourette's syndrome (Abelson et al. 2005), while another recent study reported that the hepatitis C virus (HCV) RNA genome interacts with a host-cell miRNA, the liver-specific miR-122a, to facilitate its replication in the host (Jopling et al. 2005). Other diseases in which miRNAs or their processing machinery have been implicated include fragile X mental retardation (FXMR), caused by the absence of the fragile X mental retardation protein (FMRP) (Nelson et al. 2003, Jin et al. 2004), and DiGeorge syndrome (Landthaler et al. 2004). In addition, perturbed miRNA expression patterns have been reported in many human cancers. For example, the human miRNA genes miR-15a and miR-16-1 are deleted or down-regulated in the majority of B-cell chronic lymphocytic leukemia (CLL) cases, where a unique signature of 13 miRNA genes was recently shown to associate with prognosis and progression (Calin et al. 2002, Calin et al. 2005). The role of miRNAs in cancer is further supported by the fact that more than 50% of the human miRNA genes are located in cancer-associated genomic regions or at fragile sites (Calin et al. 2004). Recently, systematic expression analysis of a diversity of human cancers revealed a general down-regulation of miRNAs in tumours compared to normal tissues (Lu et al. 2005). Interestingly, miRNA-based classification of poorly differentiated tumours was successful, whereas mRNA profiles were highly inaccurate when applied to the same samples.
miRNAs have also been shown to be deregulated in lung cancer (Johnson et al. 2005) and colon cancer (Michael et al. 2004), while the miR-17~92 cluster, which is amplified in human B-cell lymphomas, and miR-155, which is upregulated in Burkitt's lymphoma, have been reported as the first human miRNA oncogenes (Eis et al. 2005, He et al. 2005). Thus, human miRNAs may not only be highly useful as biomarkers for future cancer diagnostics, but are also rapidly emerging as attractive targets for disease intervention by antisense oligonucleotide technologies.
Human Breast Cancer
Breast cancer is one of the most common cancers in women; it is a complex, inadequately understood, and often fatal disease. Studies in many laboratories over the past few decades have demonstrated that breast cancer (and indeed cancer in general) results from a series of mutations affecting multiple classes of genes. Natural selection favours the growth of cells containing mutations that confer growth advantages and prevent the functioning of normal growth inhibitory mechanisms such as apoptosis. Mutations affecting both proto-oncogenes and tumour suppressor genes contribute to breast cancer and affect diverse cellular processes including signal transduction, DNA replication and repair, transcription, translation, apoptosis and differentiation. Factors known to be important for characterization, diagnosis and prognosis of breast tumours include the status of the estrogen receptor (ER), epidermal growth factor receptor (EGFR), human EGF receptor 2 (HER2) and p53, and some of these can be used as targets for therapeutic intervention (Colozza et al., 2005). From a clinical perspective, ER and HER2 are considered the main molecular targets, since effective drugs exist to treat these tumour types: tamoxifen and aromatase inhibitors for ER+ tumours and Herceptin for HER2-overexpressing tumours, respectively.
Human Lung Cancer
Lung cancer is the leading cause of cancer deaths in both women and men in the United States and throughout the world; in women, it has surpassed breast cancer as the leading cause of cancer deaths. In the United States in 2004, 160,440 people were projected to die from lung cancer compared with a projected 127,210 deaths from colorectal, breast, and prostate cancer combined. Only about 14% of all people who develop lung cancer survive for 5 years.
The lungs are a common site for metastasis, i.e. the spreading of tumours to nearby lymph nodes or to other organs via the blood system. Lung cancers are usually divided into two groups, which together account for about 95% of all cases. The division is based on the type of cells that comprise the cancer, specifically the cell size of the tumour. The two groups are called small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), where the latter includes several types of tumours. SCLCs are less common; however, they grow more rapidly and are more likely to metastasize than NSCLCs. SCLCs have often already spread to other parts of the body when the disease is diagnosed; it is therefore of high importance to detect lung cancer at an early stage using efficient markers.
About 5% of lung cancers are of rare cell types, such as carcinoid tumour or lymphoma, or are metastatic (cancers from other parts of the body that have spread to the lungs).
The specific types of primary lung cancer are as follows. Adenocarcinoma (an NSCLC) is the most common type of lung cancer, making up 30-40% of all cases. A subtype of adenocarcinoma is called bronchoalveolar cell carcinoma, which creates a pneumonia-like appearance on chest x-ray films. Squamous cell carcinoma (an NSCLC) is the second most common type of lung cancer, making up about 30% of all lung cancers. Large cell cancer makes up 10% of all cases, SCLC 20%, and carcinoid lung cancer 1%.
Cancer Diagnosis and Identification of Tumour Origin
Cancer classification relies on the subjective interpretation by eye of both clinical and histopathological information, with the aim of classifying tumours in generally accepted categories based on the tissue of origin of the tumour. However, clinical information can be incomplete or misleading. In addition, there is a wide spectrum in cancer morphology, and many tumours are atypical or lack morphologic features that are useful for differential diagnosis. These difficulties may result in diagnostic confusion, leading to the need for mandatory second opinions in all surgical pathology cases (Tomaszewski and LiVolsi 1999, Cancer 86: 2198-2200).
Another problem for cancer diagnostics is the identification of tumour origin for metastatic carcinomas. For example, in the United States, 51,000 patients (4% of all new cancer cases) present annually with metastases arising from occult primary carcinomas of unknown origin (ACS Cancer Facts & Figures 2001: American Cancer Society). Adenocarcinomas represent the most common metastatic tumours of unknown primary site. Although these patients often present at a late stage, the outcome can be positively affected by accurate diagnoses followed by appropriate therapeutic regimens specific to different types of adenocarcinoma (Hillen 2000, Postgrad. Med. J. 76: 690-693). The lack of a unique microscopic appearance among the different types of adenocarcinoma makes morphological diagnosis of adenocarcinomas of unknown origin challenging. The application of tumour-specific serum markers in identifying cancer type could be feasible, but such markers are not available at present (Milovic et al. 2002, Med. Sci. Monit. 8: MT25-MT30). Microarray expression profiling has been used to successfully classify tumours according to their site of origin (Ramaswamy et al. 2001, Proc. Natl. Acad. Sci. U.S.A. 98: 15149-15154), but the lack of a standard for array data collection and analysis makes microarrays difficult to use in a clinical setting. SAGE (serial analysis of gene expression), on the other hand, measures absolute expression levels through a tag counting approach, allowing data to be obtained and compared from different samples. The drawback of this method is, however, its low throughput, making it inappropriate for routine clinical applications. Quantitative real-time PCR is a reliable method for assessing gene expression levels from relatively small amounts of tissue (Bustin 2002, J. Mol. Endocrinol. 29: 23-39).
Although this approach has recently been successfully applied to the molecular classification of breast tumours into prognostic subgroups based on the analysis of 2,400 genes (Iwao et al. 2002, Hum. Mol. Genet. 11: 199-206), the measurement of such a large number of randomly selected genes by PCR is still not clinically practical.
Another obstacle to the further study and identification of potential biomarkers is the difficulty of conducting retrospective studies with archived tumour samples (Ludwig and Weinstein, 2005). To be useful for subsequent studies, tumour samples obtained from the operating room must be transported to the pathology laboratory in a timely manner, where the samples are sectioned and stored either frozen or formalin-fixed and paraffin-embedded (FFPE) for archiving in a tumour bank. The quality of the RNA in frozen and FFPE samples is often compromised and is thus unsuitable for conducting accurate molecular tests. In this respect, the small size of miRNAs offers a unique advantage: these short RNA molecules are more stable and less prone to enzymatic degradation by RNases, and miRNA levels can therefore be assessed accurately in archival tumour samples.