(a) Field of the Invention
This invention relates to splice site selection, a process required for the generation of mRNAs encoding different proteins.
(b) Description of Prior Art
The completion of genome sequencing efforts for the Drosophila, the mouse and the human genomes has led to the conclusion that complex organisms have a smaller than expected set of protein-coding genes. In contrast, the full complement of proteins found in complex animals is much more diverse. While post-translational modifications probably account for a good fraction of protein diversity, the principal mechanism used to generate protein diversity is likely alternative pre-mRNA splicing, which acts post-transcriptionally (Maniatis, T. and Tasic, B., (2002) Nature 418:236; Black, D. L., (2003), Annu. Rev. Biochem. 72:291-336).
Recent estimates based on analyses of Expressed Sequence Tags (ESTs) corresponding to mRNAs predict that at least 35% of all human genes are alternatively spliced. Given that ESTs only cover a portion of the mRNA transcript, often corresponding to the non-coding 3′ end of the mRNA, this number is likely to be an underestimate. A recent analysis of chromosome 22 estimates the number of genes expressed that are alternatively spliced to be on the order of 59%.
Eukaryotic mRNAs are transcribed as precursors, or pre-mRNAs, which contain intronic sequences. These intronic sequences are excised and the exons are spliced together to form mature mRNA. The basic biochemical reactions involved in splicing are relatively well-known. A transcribed pre-mRNA contains a 5′ exon-intron junction, or splice site, which is marked by the consensus sequence CAG/GTAAGT (where / is the exon-intron junction); a 3′ splice site marked by the consensus sequence YnCAG/ (Y = pyrimidine and n = 3 to 12); a branchpoint about 25-100 nucleotides upstream of the 3′ splice site; and a polypyrimidine tract. The splicing event itself requires the binding of several RNA binding proteins and ribonucleoprotein particles (e.g. snRNPs) to form the spliceosome. After spliceosome assembly, two transesterification reactions follow which result in the fusion of the two exon sequences and the release of the lariat-shaped intron.
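The consensus sequences quoted above can be illustrated with a minimal sketch that scans a DNA-alphabet sequence for candidate splice sites. The function names, the `min_matches` threshold, the requirement for GT at the start of the intron, and the example sequence are illustrative assumptions and are not taken from the cited literature. Authentic splice sites often deviate from the consensus, and practical predictors generally rely on weighted scoring rather than literal matching, so this is a toy illustration only.

```python
import re

# 5' splice site consensus CAG/GTAAGT; the junction follows the third base.
DONOR_CONSENSUS = "CAGGTAAGT"

def donor_candidates(seq, min_matches=7):
    """Return (junction_index, matches) for windows resembling CAG/GTAAGT.

    junction_index is the index of the first intron base.
    min_matches is a hypothetical threshold chosen for illustration.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(DONOR_CONSENSUS) + 1):
        window = seq[i:i + len(DONOR_CONSENSUS)]
        # Assumption: require GT as the first two intron bases.
        if window[3:5] != "GT":
            continue
        matches = sum(a == b for a, b in zip(window, DONOR_CONSENSUS))
        if matches >= min_matches:
            hits.append((i + 3, matches))
    return hits

# 3' splice site consensus YnCAG/ with Y = C or T and n = 3 to 12.
# The lookahead lets overlapping start positions be examined.
ACCEPTOR_PATTERN = re.compile(r"(?=([CT]{3,12}CAG))")

def acceptor_candidates(seq):
    """Return sorted junction indices (first exon base after CAG)."""
    seq = seq.upper()
    junctions = {m.start(1) + len(m.group(1))
                 for m in ACCEPTOR_PATTERN.finditer(seq)}
    return sorted(junctions)

# Made-up sequence containing one perfect donor and one acceptor motif.
example = "AAACAGGTAAGTCCCTTTTTTTTCTCCAGGAAA"
donors = donor_candidates(example)
acceptors = acceptor_candidates(example)
```

In this made-up sequence the only donor candidate is the perfect nine-base match, and the only acceptor candidate is the CAG that closes the pyrimidine run. The sketch ignores the branchpoint and the protein factors that contribute to splice site recognition.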
Given the number of introns and the potential splice sites within a given gene, alternative splicing can produce a variety of mRNA products from one pre-mRNA molecule. The consequences of alternative splicing range from controlling protein expression, by excluding and including stop codons, to allowing for the diversification of protein products. Alternative splicing has an extremely important role in expanding the protein repertoire of any given species by allowing for more transcripts and therefore protein products from a single gene.
While genes that contain a single alternative splicing unit can produce two spliced isoforms, it is not uncommon for genes hosting multiple alternative splicing units to generate ten or more distinct mRNAs. For example, the alternative splicing of troponin T and CD44 pre-mRNAs can generate 64 and more than 2000 isoforms, respectively. The most striking example to date is the splicing of the Drosophila gene that codes for DSCAM, a protein involved in axon guidance. Due to 95 different exons distributed in four alternatively spliced regions, a single DSCAM gene has the potential to generate 38,016 different DSCAM proteins, a number which is three times the total number of genes in Drosophila. If we assume a conservative average of five isoforms per alternatively spliced gene, the identity of more than 85% of the whole collection of human proteins would be determined by alternative pre-mRNA splicing.
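The arithmetic behind these figures can be checked with a short sketch. Two inputs are assumptions not stated in the text: the per-cluster exon counts (12, 48, 33 and 2, the commonly cited sizes of the four DSCAM alternative exon clusters), and the use of the 59% chromosome 22 estimate as the fraction of alternatively spliced genes when reproducing the "more than 85%" figure. Each gene that is not alternatively spliced is counted as contributing one protein.

```python
from math import prod

# Assumption: commonly cited sizes of the four DSCAM alternative exon
# clusters; the text gives only the total (95) and the product (38,016).
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

def isoform_count(cluster_sizes):
    """One exon is chosen per cluster, so the choices multiply."""
    return prod(cluster_sizes)

def fraction_from_alt_splicing(alt_fraction, isoforms_per_gene):
    """Share of all proteins that come from alternatively spliced genes.

    alt_fraction: fraction of genes that are alternatively spliced.
    isoforms_per_gene: average isoforms per alternatively spliced gene.
    Genes that are not alternatively spliced count as one protein each.
    """
    from_alt = alt_fraction * isoforms_per_gene
    from_single = 1.0 - alt_fraction
    return from_alt / (from_alt + from_single)

total_exons = sum(DSCAM_CLUSTERS.values())
dscam_isoforms = isoform_count(DSCAM_CLUSTERS.values())
share_at_59 = fraction_from_alt_splicing(0.59, 5)  # chromosome 22 estimate
share_at_35 = fraction_from_alt_splicing(0.35, 5)  # EST-based lower bound
```

Under these assumptions the cluster sizes reproduce both the 95 exons and the 38,016 combinations. The share of proteins determined by alternative splicing is about 88% with the 59% estimate and about 73% with the 35% estimate, so the "more than 85%" statement appears to rest on the higher figure.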
Although alternative pre-mRNA splicing is a powerful contributor to protein diversity in mammals, relatively little is known about the identity of modulating factors and the underlying molecular mechanisms that control splice site selection. Recent progress has identified a variety of non-splice site elements that can positively or negatively affect splice site recognition. In addition, splicing enhancers, RNA binding proteins, and silencer elements have also been shown to play a role in the natural regulation of alternative splicing.
Alternatively including or excluding exons, portions of exons, or introns can have a broad range of effects on the structure and activity of proteins. In some transcripts, whole functional domains (e.g., DNA binding domain, transcription-activating domain, membrane-anchoring domain, localization domain) can be added or removed by alternative splicing. In other examples, the inclusion of an exon carrying a stop codon can yield a shortened and sometimes inactive protein. In other systems, the introduction of an early stop codon can result in a truncated protein, transforming a membrane bound protein into a soluble protein, for example, or in an unstable mRNA. The differential use of splice sites is often regulated in a developmental-, cell-, tissue-, and sex-specific manner. The functional impact of alternative splicing in a variety of cellular processes including neuronal connectivity, electrical tuning in hair cells, tumor progression, apoptosis, and signaling events, is just starting to be documented.
Perturbations in alternative splicing have been associated with human genetic diseases and cancer. There are many examples of cancers where an alternatively spliced isoform of a protein has increased ligand affinity or loss of tumor suppressor activity which contributes to neoplastic growth. For example, the inappropriate inclusion of exons in BIN1 mRNA results in the loss of tumor suppressor activity.
Also of particular interest is the contribution of alternative splicing to the control of apoptosis, or programmed cell death. Overexpression of anti-apoptotic proteins (e.g., Bcl-2, Bcl-xL, Bcl-w, Mcl-1) or blocking the expression of pro-apoptotic proteins (e.g., Bax, Bim, Bcl-xS, Bcl-G) protects cells against death stimuli. In contrast, preventing the expression of anti-apoptotic forms promotes cell death or sensitizes cells to death stimuli, a situation also observed when pro-apoptotic Bcl-2 family members are overexpressed. Thus, apoptotic pathways are controlled via a delicate balance between pro- and anti-apoptotic activities, and alternative splicing is one mechanism used for careful regulation of the cellular response to death signals.
In a number of cancers and cancer cell lines, the ratio of the splice variants is frequently shifted to favor production of the anti-apoptotic form. For example, overexpression of Bcl-xL is associated with decreased apoptosis in tumors, resistance to chemotherapeutic drugs, and poor clinical outcome. Given that many genes are alternatively spliced to produce proteins with opposing effects on apoptosis, perturbations that would shift alternative splicing toward the pro-apoptotic forms may help reverse the malignant phenotype of cancer cells. Thus, the ability to shift splice site selection in favor of pro-apoptotic variants could become a valuable anti-cancer strategy.
Because alternative splicing controls the production and activity of many types of proteins implicated in a variety of pathways, reprogramming splice site selection by preventing the use of one site to the benefit of another competing site would enable the manipulation of protein production and protein function in a general manner. Every aspect of the life of a cell, a tissue or an organism could therefore be affected by methods that block or influence the use of specific splice sites. Alternative splicing has been documented for kinases, transcription factors, trans-membrane proteins and receptors, nucleic-acid binding proteins, metabolic enzymes, secreted proteins, extracellular matrix proteins, as well as other proteins. Accordingly, reprogramming the alternative splicing of any of these proteins has the potential to affect the function of each of these proteins.
Given the pivotal role that alternative splicing plays in the diversification of protein function, strategies capable of controlling or reprogramming splice site selection could have an immense impact on our ability to address the function of individual isoforms, as well as providing novel and specific tools to modify or reprogram cellular processes. Approaches that target alternative splicing could therefore provide specific ways to modulate the expression of spliced isoforms with distinct activities. In addition to treating cancer, splicing interference strategies have potential therapeutic value in a wide range of genetic diseases that are caused by point mutations affecting splice site selection. In fact, 15% of all genetic defects (e.g., thalassemia, haemophilia, retinoblastoma, cystic fibrosis, analbuminemia, Lesch-Nyhan syndrome) are caused by splice site mutations.
It is clear that there remains a need for effective methods for controlling or reprogramming splice site selection. Such strategies could have an immense impact on our ability to address the function of individual protein isoforms, as well as providing novel and specific tools to modify or reprogram cellular processes such as apoptosis for the treatment of human disease.
Exons represent approximately 1% of the human genome and range in size from 1 to 1000 nt, with a mean size for internal exons of 145 nt. In contrast, introns constitute 24% of our genome with sizes ranging from 60 to more than 200,000 nt. The mean size of human introns is more than 3,300 nt and nearly 20% of human introns are longer than 5 kb. The efficient and accurate removal of introns is crucial for the production of functional mRNAs. For long introns, it is easy to envision the difficulties associated with finding and committing a pair of splice sites when such sites are separated by several thousands of nucleotides. The presence of intronic sequences that resemble splicing signals may also promote a multitude of weaker and non-productive interactions that will decrease the pairing efficiency of correct splice sites. Finally, the long distance separating these splicing partners means that they will be synthesized at different times. Consequently, the 5′ splice site must remain available until the authentic 3′ splice site has been synthesized. These potential problems may explain why short introns are more prevalent in highly expressed genes. Understanding how the removal of long introns occurs efficiently and accurately remains a tremendous challenge for which very little experimental work has been accomplished. In Drosophila, the removal of a 74 kb-long intron in the Ultrabithorax gene has been shown to occur by successive steps, each one regenerating a 5′ splice site which is used in the next step until complete intron removal has been carried out. In mammals, intron size can influence alternative splicing (Bell, M. V., et al., (1998), Mol. Cell. Biol. 18:5930-5941) but the mechanisms that enforce the efficient removal of long introns have not yet been investigated.
Some of the decisions associated with the removal of long introns are similar to the choices made by the splicing machinery during the selection of alternative splice sites. Selecting the appropriate pair of splice sites in alternative splicing units requires the contribution of many types of elements that are recognized by different classes of proteins including SR and hnRNP proteins. hnRNP A1 was the first protein of its class to be attributed a function in splice site selection, based on its ability to antagonize the activity of the SR protein SF2/ASF in a 5′ splice site selection assay. A role for the hnRNP A/B proteins in the alternative splicing of many mammalian and viral pre-mRNAs has now been documented (Chabot et al., (2003), Regulation of alternative splicing, Springer-Verlag GmbH & Co., Heidelberg, vol 31, pp. 59-88).
It would be highly desirable to be provided with methods of modulating splice site selection.