The spatial and temporal coordination of gene expression during embryogenesis involves a variety of regulatory mechanisms, of which those acting at the transcriptional level have been most intensively studied (Davidson et al., 1998; Gellon and McGinnis, 1998; Gray and Levine, 1996; Mannervik et al., 1999). Less is known about post-transcriptional mechanisms that control differential production and accumulation of specific proteins at various sites in the developing embryo, for example by causing the RNA transcript to be spliced appropriately or by regulating transport of the spliced mRNA to the cytoplasm. Recent interest in the role of differential splicing in development and the factors and mechanisms by which it is accomplished (Chabot, 1996; Lopez, 1998), and a growing understanding of the determinants of nucleo-cytoplasmic transport of particular mRNAs (Piñol-Roma and Dreyfuss, 1992; Siomi and Dreyfuss, 1997; Weis, 1998), have set the stage for a systematic analysis of how these RNA processing factors contribute to regional and cell type specificity in the embryo.
The ribonucleoprotein hnRNP A1 is of particular interest in this regard, as it functions in both RNA splice site selection and nucleus-to-cytoplasm transport of mRNA. In its capacity as a splicing factor, this protein modulates 5′ splice site selection in a group of gene products, some of which contain a well-characterized RNA sequence determinant (Burd and Dreyfuss, 1994). Among these are the pre-mRNAs of the HIV type 1 tat protein (Del Gatto-Konczak et al., 1999), FGF receptor 2 (FGFR2) (Del Gatto-Konczak et al., 1999), and hnRNP A1 itself (Chabot et al., 1997). In its role in nucleus-to-cytoplasm transport, hnRNP A1 acts as a “shuttle” protein (Piñol-Roma and Dreyfuss, 1992), and is characterized by a novel amino acid motif termed M9, which contains both nuclear localization and nuclear export activities (Michael et al., 1995; Siomi and Dreyfuss, 1995).
The known functions of hnRNP A1 as an RNA shuttle protein and in splice choice selection are exerted in a gene product-specific fashion (Dreyfuss et al., 1993). The tissue-restricted spatiotemporal patterns in the protein's expression reported are therefore likely to be a causal component of the process by which cell types become distinctive from one another during organogenesis. A subclass of primary transcripts (including hnRNP A1 itself, Chabot et al., 1997; Mayeda et al., 1994) is differentially spliced by a process that depends on hnRNP A1. Control of splice choice appears to involve the antagonism of constitutive splicing factors such as SF2/ASF by members of the hnRNP A/B family of proteins (Mayeda and Krainer, 1992, Mayeda et al., 1993, 1994; Del Gatto-Konczak et al., 1999). Once spliced, these RNAs are transported into the cytoplasm by a process that involves hnRNP A1 and transportin 1 (Nakielny and Dreyfuss, 1998, Nakielny et al., 1999). This implies that the regulation of hnRNP A1 levels within living cells during development plays a key role in cell type diversification.
Earlier studies have surveyed the distribution of hnRNP A1 in a limited set of adult cell types (Kamma et al., 1995; Faura et al., 1995), including the developing germ cells of postnatal mice. A study by the inventors and their colleagues documenting the characterization of the sequence of chicken hnRNP A1 and its spatiotemporal and organ-specific expression during embryogenesis is hereby incorporated by reference in its entirety (Bronstein et al., 2001). hnRNP A1 protein is abundantly expressed in early stage epithelia such as skin, extraembryonic membranes, and neuroectoderm; epithelioid tissues, such as liver; as well as “secondary” epithelia and epithelioid tissues derived from mesenchymes, e.g., heart muscle, skeletal muscle, kidney tubules, sinusoidal vascular endothelium, and precartilage condensations. It is not clear, however, whether this pattern represents an authentic expression theme or simply the prevalence both of epithelioid tissues in the early embryo and of hnRNP A1 expression (Bronstein et al., 2001).
The expression of hnRNP A1 in differentiating neuroectoderm and dorsal root ganglia broadly coincides with patterns of expression of members of the Hu class of RNA binding protein genes in the chicken (Wakamatsu and Weston, 1997) and it is significant in this regard that the Hu family of proteins results from extensive alternative splicing of Hu gene products during neurogenesis (Okano and Darnell, 1997). However, while the expression of the two RNA binding proteins may be partly overlapping, they are not entirely so: the typical DRG cell nuclei expressing hnRNP A1 are larger than those expressing Hu and vertebral body cartilage expresses hnRNP A1 but not Hu.
The transcription of natural antisense RNA cognate to exonic sequences of the hnRNP A1 gene in many of the same tissues that are producing sense transcript is an unusual phenomenon, but one that is not as rare as previously thought (Dolnick, 1997; Vanhee-Brossollet and Vaquero, 1998). Antisense RNA probably functions as a post-transcriptional inhibitor of gene expression (Knee and Murphy, 1997). This regulatory mechanism may be particularly relevant during development—natural antisense transcripts of several developmentally active growth factors—fibroblast growth factor-2 (Savage and Fallon, 1995), bone morphogenetic protein-2 (Feng et al., 1997), and transforming growth factor-2 (Coker et al., 1998)—have been detected at significant levels, the first two in embryonic tissues. There is one previous report of differential expression of natural antisense RNA expression of a splicing factor gene (Sureau et al., 1997). Moreover, since hnRNP A1 can promote RNA-RNA strand annealing (Cobianchi et al., 1993; Idriss et al., 1994) it is itself a potential component of natural antisense regulatory mechanisms (Oberosler and Nellen, 1997).
The organ- and tissue-specific sense and antisense hnRNP A1 RNA expression patterns seen at different stages are consistent with the idea that antisense expression may be playing a regulatory role during development. For example, in kidney and liver virtually all cells at the early stages express the sense transcript. But whereas the protein product is also broadly distributed in liver the more limited distribution of the protein in kidney may be related to the more localized distribution of antisense RNA during development of this organ.
Because hnRNP A1 helps regulate nuclear-cytoplasmic transport and alternative splicing for well-defined classes of transcripts, its own regulation can provide the basis for post-transcriptional control of the partitioning of organ primordia into distinct gene expression domains. It is therefore significant that the hnRNP A1 gene is widely transcribed throughout the early embryo and its encoded protein is subject to numerous demonstrated and potential autoregulatory effects at the post-transcriptional level: it helps splice its own pre-mRNA, it may transport its own mRNA from the nucleus to the cytoplasm, and it may participate in the regulation of its own synthesis by its gene's antisense transcript. Small changes in the balance of any of these processes, or of other possible but speculative ones, such as unmasking of maternally inherited hnRNP A1, or even transfer of the protein from one cell to another, could thus activate a post-transcriptional cascade leading to the local expression of hnRNP A1, and with it the expression of its target gene products.
The recent recognition that a large proportion of the genes constituting the human genome are alternatively spliced (Ewing and Green, 2000) (a recent estimate indicates that 38% of human mRNAs contain possible alternative splice forms; Bretta et al., 2000) highlights the centrality of the developmental regulation of hnRNP A1 and other nonconstitutive splicing factors in the generation of complexity in vertebrate organisms.
Classes of hnRNP Proteins
In eukaryotes, heterogeneous nuclear RNAs (hnRNAs), which are the products of RNA polymerase II, are extensively processed to produce messenger RNAs (mRNAs). mRNA processing includes capping, splicing, and polyadenylation (Dreyfuss et al., 1993) and involves the association of the hnRNAs with nuclear proteins to form complexes collectively known as ribonucleoprotein (RNP) complexes (Dreyfuss et al., 1993; Michael et al., 1995). RNPs that directly bind to hnRNAs are classified as the hnRNPs and are involved in the splicing and shuttling of pre-mRNAs. Others are categorized into special classes such as small nuclear ribonucleoproteins (snRNPs) and include the U snRNPs (Bandziulis et al., 1989; Dreyfuss et al., 1993; Dreyfuss et al., 1988; Luhrmann, 1990; Steitz, 1988; Zieve and Sauterer, 1990). The mature transcript produced from the hnRNA-hnRNP-snRNP complex is transported to the cytoplasm by specific hnRNPs, where it may associate with yet another set of RNPs involved in translational regulation and mRNA stability (Bandziulis et al., 1989; Dreyfuss et al., 1993; Luhrmann, 1990; Steitz, 1988; Zieve and Sauterer, 1990).
hnRNP proteins are highly conserved throughout the vertebrates and have sequence homologs in the invertebrate Drosophila (Amrein et al., 1988; Robinow and White, 1988; Bell et al., 1991; Dreyfuss et al., 1993; Inoue et al., 1990; Kay et al., 1990; Roth et al., 1991; Voelker et al., 1990; Von Besser, 1990), and are the most abundant proteins found in the nucleus (Dreyfuss, 1986; Dreyfuss, 1993). In HeLa cells, two-dimensional gel electrophoresis has resolved 20 major groups of proteins. These proteins are designated as the heterogeneous nuclear ribonucleoproteins (hnRNPs) A1 (˜34 kDa) to hnRNP U (˜120 kDa), and categorized by structural motifs (Cobianchi, 1990; Dreyfuss et al., 1993; Matunis et al., 1992; Pinol-Roma et al., 1988). Furthermore, sequence analysis has determined that hnRNPs have one or more RNA-binding modules, referred to as the RNP motif or RNA Recognition Motif (RRM), in addition to at least one other auxiliary domain (Dreyfuss et al., 1993). The RNP motif contains two consensus sequences, RNP1 and RNP2, located about 30 amino acids from each other within a domain of approximately 90 amino acid residues (Dreyfuss et al., 1993; Dreyfuss et al., 1988). The RNP1 module is an octapeptide, Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe-Val-X-Phe/Tyr, SEQ ID NO: 7 in the Sequence Listing (Adam et al., 1986; Dreyfuss et al., 1993), while the RNP2 module is a hexapeptide rich in aromatic and aliphatic amino acids and is less well conserved (Dreyfuss et al., 1993; Dreyfuss et al., 1988). Both of these consensus sequences are directly related to RNA binding (Dreyfuss et al., 1993; Merrill et al., 1988).
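The RNP1 consensus octapeptide described above can be written as a pattern over one-letter amino acid codes. The following is a minimal sketch in Python; the function name `find_rnp1` and the toy sequence are hypothetical and are not taken from the Sequence Listing. A pattern match only flags RNP1-like octapeptides and says nothing about whether a sequence actually binds RNA.

```python
import re

# RNP1 consensus from the text, in one-letter amino acid codes:
# Lys/Arg - Gly - Phe/Tyr - Gly/Ala - Phe - Val - X - Phe/Tyr
# A lookahead is used so that overlapping candidates are all reported.
RNP1_PATTERN = re.compile(r"(?=([KR]G[FY][GA]FV[A-Z][FY]))")


def find_rnp1(protein):
    """Return (1-based start, octapeptide) for every RNP1-like match."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in RNP1_PATTERN.finditer(seq)]


# Hypothetical toy sequence for illustration only (not a real protein).
toy_sequence = "MSAAKGFGFVTYDDLLRGYAFVEFPPP"

if __name__ == "__main__":
    for start, octapeptide in find_rnp1(toy_sequence):
        print(start, octapeptide)
```

On the toy sequence this reports two RNP1-like octapeptides, one starting at residue 5 and one at residue 17.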
Functional and structural categories of human hnRNPs include:
(i) hnRNP A2/B1 complexes with the snRNPs and plays a role in splicing pre-mRNAs. Though localized in the nucleus of most tissues, A2 is also found in the cytoplasm of the squamous epithelium of the skin and the esophagus, and abundant amounts of A2 are found in the medulla, but not the cortex, of the adrenal gland. Both A2 and B1 are found throughout spermatogenesis, while A1 expression is repressed in spermatocytes (Kamma et al., 1999).
(ii) hnRNP C1 is involved in the post-transcriptional base change of cytosine to uracil in the apolipoprotein (apo) B mRNA which codes for the catalytic subunit APOBEC-1, a protein involved in spliceosome assembly. C1 may regulate apoB mRNA editing, thus restricting the activity of the catalytic subunit (Greeve et al., 1998).
(iii) hnRNP D is involved in the immunoglobulin heavy chain recombination process by binding to the switching regions in conjunction with a B cell-specific duplex DNA binding factor (Dempsey et al., 1999), while transcriptional regulation of the complement receptor 2 (CR2) is achieved by hnRNP DOB through its binding of both single and double stranded DNA (Tolnay et al., 1997; Tolnay et al., 1999).
(iv) hnRNP K may play a role in cytosine-rich pre-mRNA metabolism and cell cycle progression. Highly upregulated levels of K have been found in transformed keratinocytes (Dejgaard et al., 1994).
(v) hnRNP H and H′ are posttranslationally cleaved to produce the C-terminal proteins H(C) and H(C′), both having a molecular weight of 35 kDa and localized primarily in the nucleus. In contrast, the localization of hnRNP F varies with cell type and is predominantly cytoplasmic in some cells, which may be important to its function (Honore et al., 1999).
(vi) Autoantibodies to hnRNP A1, A2, and B have been found in individuals with connective tissue diseases. In addition to the A/B proteins, hnRNP I has been found in patients with systemic sclerosis (SSc) and, in particular, in individuals with pre-SSc or limited SSc. The A/B and I protein complexes may elicit autoimmune responses (Montecucco et al., 1996).
(vii) The hnRNP L protein, having an unknown function, is found both as a component of the hnRNP complex and in discrete nonnucleolar structures of the nucleoplasm in HeLa cells (Pinol-Roma et al., 1989).
(viii) Finally, hnRNP R, an hnRNP P-like protein, was isolated from yet another individual with autoimmune symptoms and may be a component of subcellular particles that are found in autoimmune diseases (Hassfeld et al., 1998). This protein may have some relationship to the gene product of the TLS/FUS gene, an RNA binding protein identical to hnRNP P2, and first identified as a fusion protein in human myxoid liposarcomas (Calvio et al., 1995; Crozat et al., 1993; Hassfeld et al., 1998; Rabbitts et al., 1993).
In addition, the hnRNP classes of RNA-binding proteins have been shown to be developmentally important in many embryonic tissues and processes, including the formation and maintenance of the nervous system (Dreyfuss et al., 1993), sex determination in Drosophila melanogaster (Bandziulis et al., 1989; Del Gatto-Konczak et al., 1999; Lynch and Maniatis, 1995; Lynch and Maniatis, 1996), neuronal splice activation (Del Gatto-Konczak et al., 1999; Min, 1997) and maintenance (Dreyfuss, 1993), and epithelial/mesenchymal differentiation (Johnson and Williams, 1993). In Drosophila, the embryonic lethal abnormal visual (ELAV) system proteins are required for correct differentiation and maintenance of neurons. In mammals, the ELAV-like neuronal RNA-binding proteins HuB, HuC, and HuD are implicated in neuronal development and differentiation in both the central and peripheral nervous systems (Akamatsu et al., 1999; Kasashima et al., 1999). In other systems, such as the human immunodeficiency virus (HIV-1), hnRNPs are involved in regulating the splicing of exon 2 of the tat gene (Del Gatto-Konczak et al., 1999; Si, 1997).
hnRNP A1
The hnRNP A1 protein contains two RNP consensus motifs, a glycine-rich auxiliary domain at its carboxy-terminus (Burd and Dreyfuss, 1994; Burd et al., 1989; Buvoli et al., 1990; Merrill et al., 1988), as well as an RGG box, also at its carboxy-terminus (Kiledjian and Dreyfuss, 1992). In addition to these motifs, the hnRNP A1 class of proteins contain a nuclear localization signal, within a domain of approximately 38 amino acids at the carboxy-terminal region of the protein (Michael et al., 1995). This motif, referred to as M9, is a novel nuclear localization signal (NLS)/nuclear export signal (NES) and is not homologous to the classical nuclear localization signal (NLS) found, for example in either the large T antigen of the SV40 virus or the bipartite basic NLS of nucleoplasmin (Izaurralde et al., 1997b; Kalderon et al., 1984; Michael et al., 1995; Robbins et al., 1991; Weighardt et al., 1995). The presence of the M9 motif allows hnRNPs to shuttle continuously between the nucleus and the cytoplasm (Dreyfuss et al., 1993). hnRNPs of the A1, A2/B1, D, E, I and K classes have this capability, while those of the C1, C2, and U class are found restricted to the nucleus (Izaurralde et al., 1997b; Michael et al., 1995; Pinol-Roma and Dreyfuss, 1992). Furthermore, hnRNP A1 is found bound to the poly (A)+ tail of RNA polymerase II transcripts in both the nucleus and the cytoplasm and data suggest that the hnRNP A1 protein is transported out of the nucleus with the mature message during the export process (Pinol-Roma and Dreyfuss, 1992). FIG. 1a shows the cDNA sequence designated SEQ ID NO:1 and FIG. 1b shows the amino acid sequence of chicken hnRNP A1 (indicated by CHKA1) designated SEQ ID NO:2 compared to the human hnRNP A1 amino acid sequence (indicated by HUMA1) designated SEQ ID NO:3.
FIG. 2 illustrates the structure of the human core hnRNP proteins A1, A1B, A2 and B1. The RNP-2 and RNP-1 conserved submotifs of RRM1 and RRM2, and the G domains of each protein, are shown. hnRNP A1 (SEQ ID NO. 8) and A1B, and likewise hnRNP A2 (SEQ ID NO. 9) and B1 (SEQ ID NO. 10), are identical except for the extra amino acid regions indicated by boxes. The sequences of the RNP-1 and RNP-2 submotifs are aligned. The dots in the alignment indicate amino acid identities. All recombinant proteins are in authentic form except for post-translational modifications. The numbers indicate the position of amino acid residues from the initiation codon Met1. The figure is based on published cDNA sequences (Burd, 1989; Buvoli, 1990). After Mayeda et al. (1994).
hnRNP A1 and Splice Choices
In a multi-step process, uridine-rich small nuclear ribonucleoproteins (U snRNPs), in association with the core hnRNPs A1, A2, B1, B2, C1, C2, and C3 (classified by increasing molecular weight), bind to the pre-mRNAs in an ordered manner at specific sequences, forming the spliceosome (Beyer et al., 1977; Chung and Wooley, 1986; Del Gatto, 1996; Dreyfuss, 1986; Kumar et al., 1986; Mayeda and Krainer, 1992). Alternative splicing allows for the functional and structural diversity of gene products by the addition or deletion of elements as small as a single amino acid (as seen in the Pax-3 and Pax-7 gene products) (Lopez, 1998). Additional means of obtaining protein variants from a single transcript in a cell-specific manner include splice activation and splice repression (Del Gatto-Konczak et al., 1999).
Alternative splicing may involve the use of alternative 5′ or 3′ splice sites, optional exons, exclusive exons, or retained introns (Lopez, 1998). Except for intron retention, splicing patterns are under competitive control of splicing proteins (Lopez, 1998). Splice activation may involve multi-protein complexes on pre-mRNAs. An example of this is seen in the activation of the female-specific dsx exon of Drosophila melanogaster by the female-specific proteins tra (transformer) and tra-2 together with SR proteins (splice regulator proteins rich in serine/arginine repeats) (Del Gatto-Konczak et al., 1999; Lynch and Maniatis, 1995; Lynch and Maniatis, 1996; Wang et al., 1998). In the mouse, the c-src exon N1 is activated by the KSRP splicing factor (KH-type splicing regulator) (Min, 1997; Wang and Manley, 1997), which induces the assembly of five other proteins including hnRNP F, a pre-mRNA splicing factor which is associated with the TATA-binding protein, essential for transcription initiation (Del Gatto-Konczak et al., 1999; Min, 1997; Yoshida et al., 1999). This multiprotein complex activates the intronic splicing enhancer that splices the neuron-specific c-src N1 exon in vitro (Del Gatto-Konczak et al., 1999; Min, 1997).
Splice repression involves protein binding to an intronic 3′ splice site and is seen in the female-specific Sxl protein of Drosophila. This interaction effectively blocks U2 snRNP and U2AF (U2 snRNP auxiliary factor) (Del Gatto, 1996; Del Gatto-Konczak et al., 1999; Lopez, 1998; Valcarcel et al., 1993). Other protein complexes may use exon sequences for splice repression.
Vertebrate genes including the human fibroblast growth factor receptor 2 gene (fgfr2), and the human immunodeficiency virus type 1 (HIV-1) tat gene contain exons that have sequences acting as exonic splice silencers (ESS) (Amendt et al., 1994; Amendt et al., 1995; Baba-Aissa et al., 1998; Caputi et al., 1994; Caputi et al., 1999; Del Gatto, 1995; Del Gatto, 1996; Del Gatto-Konczak et al., 1999; Gallego et al., 1996; Graham et al., 1992; Si, 1997).
The ESS of the human FGFR2 pre-mRNA contains a UAGG sequence in the kgfr exon (keratinocyte growth factor receptor-exon 8)(Del Gatto, 1996; Del Gatto and Breathnach, 1995; Del Gatto-Konczak et al., 1999). This sequence has homology to the high affinity consensus sequence 5′-UAGGGA/U-3′ recognized by hnRNP A1 (Del Gatto-Konczak et al., 1999). In in vitro studies, Del Gatto-Konczak et al. (1999) have demonstrated that hnRNP A1 can modulate splice choices by binding to a 10 mer ESS designated S10 (5′-UAGGGCAGGC-3′, SEQ ID NO: 5 in the Sequence Listing) or to a 6 mer ESS designated S6 (5′-UAGGGC-3′).
In in vitro studies, RNA molecules containing the splicing silencer sequence from the human fibroblast growth factor receptor 2 kgfr exon (IIIb) were capable of directing splice choice selection by the recruitment of hnRNP A1 (Del Gatto-Konczak et al., 1999). When point mutations were introduced into the S6 ESS to give UCGGGC or UACGGC, a two-fold decrease in hnRNP A1 binding was detected (Del Gatto-Konczak et al., 1999). Furthermore, it was determined that the targeting of hnRNP A1 to the ESS domain was through the glycine-rich motif at the C-terminus of the protein. In the human hnRNP A1 protein, the glycine-rich domains are found between residues 189-320: the RGG motif is specifically located at residues 189-247, followed by another glycine-rich motif from residues 239-320 (Del Gatto-Konczak et al., 1999). Silencing of the k-sam (kgfr) exon in these in vitro studies required the entire glycine-rich motif. By examining the corresponding sequence in the chicken kgfr exon (IIIb exon 8) of fgfr2, it has been determined that the sequence corresponding to the human ESS is 5′-UAGGGAGGGC-3′ (SEQ ID NO: 6 in the Sequence Listing).
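The relationship between the ESS sequences quoted above and the high-affinity consensus 5′-UAGGGA/U-3′ can be checked by simple string matching. The following is a minimal sketch in Python; the names `classify` and `ESS_SEQUENCES` and the three category labels are illustrative assumptions. Sequence matching of this kind is not a prediction of binding affinity, which the cited studies measured experimentally.

```python
import re

# High-affinity hnRNP A1 consensus from the text: 5'-UAGGGA/U-3'
FULL_CONSENSUS = re.compile(r"UAGGG[AU]")
# Shorter UAGG element noted in the kgfr exon ESS
CORE_ELEMENT = "UAGG"

# RNA sequences quoted in the text (5' to 3')
ESS_SEQUENCES = {
    "human S10": "UAGGGCAGGC",
    "human S6": "UAGGGC",
    "S6 point mutant 1": "UCGGGC",
    "S6 point mutant 2": "UACGGC",
    "chicken ESS": "UAGGGAGGGC",
}


def classify(rna):
    """Label a sequence by how much of the consensus it contains."""
    if FULL_CONSENSUS.search(rna):
        return "full consensus"
    if CORE_ELEMENT in rna:
        return "core only"
    return "no match"


if __name__ == "__main__":
    for name, rna in ESS_SEQUENCES.items():
        print(f"{name:18s} {rna:12s} {classify(rna)}")
```

Under this purely textual comparison, the human S10 and S6 elements carry the UAGG core but not the full six-nucleotide consensus, both point mutants lose the core, and the chicken sequence contains UAGGGA.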
Studies of hnRNP A1 have demonstrated that it can promote the base pairing of RNA molecules into double-stranded structures, thereby influencing pre-mRNA splicing by snRNPs (Burd and Dreyfuss, 1994; Buvoli et al., 1992; Eperon et al., 1993; Kumar and Wilson, 1990; Munroe and Dong, 1992; Pontius and Berg, 1990; Portman and Dreyfuss, 1994). In in vitro assays, hnRNP A1, as well as the RNA binding protein splicing factor 2 (ASF/SF2) (a member of the SR nuclear phosphoprotein family), were capable of making splice choices at the 5′ splice site of pre-mRNAs that contain multiple 5′ splice sites and are essential for constitutive splicing (Caceres et al., 1997; Caceres et al., 1998; Del Gatto, 1996; Fu, 1995; Ge and Manley, 1990; Krainer et al., 1990; Manley and Tacke, 1996; Mayeda et al., 1993; Mayeda and Krainer, 1992; Mayeda et al., 1994; Munroe and Dong, 1992; Zahler et al., 1993).
In vitro studies suggest that hnRNP A1 and ASF/SF2 may act antagonistically and that the hnRNP A/B family of splicing proteins regulates the SR family both in vitro and in vivo (Caceres et al., 1998; Caceres et al., 1994; Mayeda and Krainer, 1992; Yang et al., 1994). In in vitro experiments, excess hnRNP A1 favored the distal 5′ splice site, in contrast to excess ASF/SF2 favoring proximal 5′ splice sites in a concentration-dependent manner resulting in alternate splicing patterns of many genes in specific cell types (Del Gatto, 1996; Mayeda et al., 1993; Mayeda and Krainer, 1992; Mayeda et al., 1994; Munroe and Dong, 1992). Burd and Dreyfuss (1994) have shown that the consensus sequence 5′-UAGGGA/U-3′ is a high affinity binding site of hnRNP A1 and that this sequence is similar to the 5′ and 3′ splice sites in vertebrate pre-mRNAs. In addition, the ability of hnRNP A1 to bind to this consensus sequence increased if it was duplicated and separated by two nucleotides, resulting in a dissociation constant of 1×10⁻⁹ M. While hnRNP A1 proteins are capable of binding to other pre-mRNA sites, binding affinity varies greatly over a >100 fold range, therefore classifying these proteins as sequence specific RNA binding proteins (Burd and Dreyfuss, 1994).
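To give a sense of what a dissociation constant of 1×10⁻⁹ M and a greater than 100-fold affinity range mean in practice, the following is a minimal sketch in Python. It assumes simple one-site equilibrium binding with free protein in excess over RNA, so that the fraction of sites occupied is [P] / ([P] + Kd); the source does not state a binding model. The weaker Kd of 1×10⁻⁷ M is a hypothetical value chosen only to represent a site 100-fold lower in affinity.

```python
# Simple one-site binding isotherm (an assumed model, not from the source):
#   fraction bound = [P] / ([P] + Kd)

KD_HIGH_AFFINITY = 1e-9    # M, duplicated consensus site (value from the text)
KD_WEAK_SITE = 1e-7        # M, hypothetical site with 100-fold lower affinity


def fraction_bound(protein_conc_molar, kd_molar):
    """Fraction of RNA sites occupied at a given free protein concentration."""
    if protein_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_molar / (protein_conc_molar + kd_molar)


if __name__ == "__main__":
    # Illustrative free protein concentrations in mol/L
    for conc in (1e-10, 1e-9, 1e-8, 1e-7):
        high = fraction_bound(conc, KD_HIGH_AFFINITY)
        weak = fraction_bound(conc, KD_WEAK_SITE)
        print(f"[P] = {conc:.0e} M  high-affinity: {high:.3f}  weak: {weak:.3f}")
```

Under these assumptions a site is half occupied when the free protein concentration equals its Kd, so at 1 nM protein the high-affinity site is half occupied while the hypothetical weak site is about 1% occupied.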
hnRNP A1 is also involved in the splicing of its own pre-mRNA. The 4.6 kb human hnRNP A1 mRNA, containing 10 exons, encodes the 34 kDa hnRNP A1 protein. The pre-mRNA for hnRNP A1 can be differentially spliced to produce the A1 form and the A1B form (Buvoli et al., 1990). It has been shown that the human hnRNP A1B protein (FIG. 2), with an apparent molecular weight of 38 kDa, corresponds to the protein previously designated as hnRNP B2 (Buvoli et al., 1990). The A1B splice variant, which contains an extra exon in the C-terminal glycine-rich region (156 bp; 52 amino acids), has a higher affinity for ssDNA than the 34 kDa form, though its abundance in the cell is only ˜5% that of hnRNP A1 (Buvoli et al., 1990).
More recently, Blanchette and Chabot (1997) have shown that alternative splicing of the hnRNP A1 pre-mRNA yields the A1 and A1B forms via 5′ splice selection and exon skipping, and that this process requires conserved elements. Studies have shown that the addition of the alternate exon 7B in the mature mRNA produces the hnRNP A1B protein (Buvoli et al., 1990). Furthermore, Blanchette and Chabot have demonstrated that the conserved intron element (CE1) upstream from exon 7B favors distal 5′ splice site selection. SR proteins, including SF2, which favor the proximal 5′ splice selection site, require U1 snRNP and U2AF when involved in the 5′ splice site stimulation of a 3′ splice site, as seen in the male specific 3′ splice site of tra in Drosophila (Blanchette and Chabot, 1997; Valcarcel et al., 1993). Interestingly, the CE1 element does not interfere with U1 snRNP binding and led to the discovery of an additional element CE610, which is located downstream from exon 7B. CE610 is also involved in distal 5′ splice site selection by secondary structure formation and exon skipping (Blanchette and Chabot, 1997). Since the SR family of splice selection proteins and hnRNP A1 act antagonistically for 5′ splice choices, where the SRs choose the 5′ proximal site and the hnRNPs the 5′ distal site (Weighardt et al., 1996), hnRNP A1 may be involved in modulating its own splicing (Blanchette and Chabot, 1997; Chabot et al., 1997; Del Gatto-Konczak et al., 1999; Mayeda et al, 1994).
Fibroblast Growth Factors (FGFs), Fibroblast Growth Factor Receptors (FGFRs), and FGFR-2 Splice Variants
Fibroblast growth factors (FGFs) are important mitogens in both cell proliferation and differentiation, but in some cases may act as antagonists and inhibit differentiation. Examples of FGF induced differentiation are seen in the stimulation of pre-adipocyte fibroblasts (Broad and Ham, 1983; Johnson and Williams, 1993; Serrero and Khoo, 1982), and hippocampal neurite outgrowth (Johnson and Williams, 1993; Walicke et al., 1986). Developmental roles have been demonstrated in embryonic mesodermal induction in Xenopus (Kimelman and Kirschner, 1987; Slack et al., 1987), and the inhibition of differentiation of myotubes has been shown in skeletal muscle (Linkhart et al., 1981). In addition to acidic FGF (aFGF or FGF1) and basic FGF (bFGF or FGF2), the family of FGFs, including keratinocyte growth factor (KGF) have been shown to stimulate the proliferation of mesenchymal and neuroectodermal cell types (Burgess and Maciag, 1989; Johnson and Williams, 1993). Using immunohistochemical analysis on chick embryo sections, FGF2 has been localized to the heart, myotome, limbs and muscles (Han, 1997; Joseph-Silverstein, 1989) as well as to the notochord, neural tissue, gut cells, and tubules in the mesonephric and metanephric kidneys (Dono and Zeller, 1994; Han, 1997). In addition to the previously mentioned tissues, Han (1997) localized this mitogen to the developing pharyngeal arches, specifically the maxilla and mandible. FGF2 plays an important role in morphogenesis and pattern formation in the vertebrate limb (Han, 1997; Noji et al., 1993; Riley et al., 1993; Savage et al., 1993), as well as in kidney development (Dono and Zeller, 1994; Han, 1997).
Receptors for the 19 known fibroblast growth factors (FGFs) (Hu et al., 1998; Ohbayashi et al., 1998) include the tyrosine kinase fibroblast growth factor receptors (FGFRs) (Johnson and Williams, 1993), the CFR receptor or cysteine-rich FGFR (Burrus and Olwin, 1989), and the heparan sulfate proteoglycans (HSPGs). In chicken, the genes for fgfr1, 2, 3, and 4 (fgfr-related kinase or frek), as well as the kgfr (exon IIIb-keratinocyte growth factor receptor) and bek (exon IIIc-bacterially expressed kinase) splice variants for receptors 1 and 2, have been cloned (Szebenyi et al., 1995). The vertebrate FGFRs contain the domains described by Johnson and Williams (1993). Modifications of FGFR isoforms are due to alternative splicing of the pre-mRNAs for each gene. In a schematic representation of human FGFR1, the extracellular region of the molecule comprises a signal peptide region at its N-terminus, followed by three immunoglobulin-like (Ig-like) domains with an acid box between domains I and II. A membrane-proximal region precedes the transmembrane (TM) region. On the intracellular side, two tyrosine kinase domains that are separated by a kinase insert follow a juxtamembrane (JM) domain, and at the C-terminus is a C-tail domain. The third Ig-loop of FGF receptor 2 is involved in the chondrogenic process and can contain either the IIIa and IIIb (kgfr) or the IIIa and IIIc (bek) exonic sequences (Johnson and Williams, 1993).
Using chick limb micromass culture, Szebenyi et al. (1995) have looked at changes in the expression of the FGFRs in differentiated cartilage and have found transcripts for fgfr1 in undifferentiated proliferating mesenchyme, fgfr2 in precartilage condensations, and fgfr3 in differentiating cartilage nodules suggesting spatiotemporal regulation in limb development. Binding of the FGFs to their receptors plays an important role in limb development through the regulation of cell survival, proliferation, and precartilage cell differentiation (Fallon et al., 1994; MacCabe et al., 1991; Niswander et al., 1993; Schofield and Wolpert, 1990; Szebenyi et al., 1995; Watanabe and Ide, 1993).
The messenger RNA splice variants IIIb (kgfr) and IIIc (bek) from fgfr1 and fgfr2, as well as fgfr3 were detected in nuclease protection assays on chicken limbs (Szebenyi et al., 1995). In addition, micromass cultures of stage 23-24 wing buds and in situ hybridization of stage 18, 23, 26, and 36 wings showed a spatial distribution of messenger RNA for fgfr1, 2, and 3. Furthermore, the probes for fgfr1 and 2 contained sequences for both the kgfr and bek splice variants and did not allow for the in vivo or in vitro detection of either of these isoforms. This is critical since there is a cell type specific role for the fibroblast growth factor receptor 2 isoforms, (FGFR2) kgfr (exon 8-IIIb) and bek (exon 9-IIIc), in precartilage differentiation. In addition, fibroblast growth factors (FGFs) influence cell function in a tissue-specific or developmental manner that can lead to defects such as craniosynostosis and syndactyly (Del Gatto, 1996; Mayeda and Krainer, 1992; Oldridge et al., 1999). This differential splicing of the pre-mRNAs produced by a single gene allows for the production of splice variants and results in forms that respond to the different growth factor isoforms in a highly specific manner. Cells that will differentiate into epithelia splice only the kgfr exon, while mesenchymal cells, including fibroblasts, as well as other cell types including endothelial cells, splice the 5′ distal bek exon (Del Gatto, 1996; Fallon et al., 1994; Johnson and Williams, 1993; Rubin et al., 1989; Szebenyi et al., 1995).
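The lineage-dependent choice between the two FGFR2 isoforms described above can be summarized as a small lookup. The following is a minimal sketch in Python; the dictionary and function names are illustrative assumptions, and the mapping encodes only the generalization stated in the text (epithelial cells splice the kgfr exon; mesenchymal cells, fibroblasts, and endothelial cells splice the bek exon), not a model of the splicing machinery.

```python
# Exon content of the third Ig-like loop for each FGFR2 isoform, as described
# in the text: IIIa plus IIIb (kgfr, exon 8) or IIIa plus IIIc (bek, exon 9).
THIRD_IG_LOOP_EXONS = {
    "kgfr": ("IIIa", "IIIb"),
    "bek": ("IIIa", "IIIc"),
}

# Lineage-to-isoform generalization stated in the text.
ISOFORM_BY_LINEAGE = {
    "epithelial": "kgfr",
    "mesenchymal": "bek",
    "fibroblast": "bek",
    "endothelial": "bek",
}


def fgfr2_isoform(lineage):
    """Return (isoform name, third Ig-loop exons) for a cell lineage."""
    try:
        isoform = ISOFORM_BY_LINEAGE[lineage.lower()]
    except KeyError:
        raise ValueError(f"no splice choice recorded for lineage: {lineage!r}")
    return isoform, THIRD_IG_LOOP_EXONS[isoform]


if __name__ == "__main__":
    for lineage in ("epithelial", "fibroblast", "endothelial"):
        print(lineage, fgfr2_isoform(lineage))
```

The two isoforms are mutually exclusive in this table because each transcript carries either IIIb or IIIc in the second half of the third Ig-like loop, never both.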
Mutations in the various receptors cause various skeletal defects (Oldridge et al., 1999). Pfeiffer syndrome, caused by a mutation in the fgfr1 gene, presents with craniosynostosis as well as limb defects; Crouzon syndrome, a result of a mutation in the fgfr2 gene, presents with limb abnormalities; and type II achondroplasia, or dwarfism, is caused by a mutation in the fgfr3 gene (Szebenyi et al., 1995). Apert syndrome, or acrocephalosyndactyly type 1, presents with head, hand, and foot abnormalities (Anderson et al., 1999). Oldridge et al. (1999) looked at mutations in the fgfr2 gene of 260 unrelated Apert syndrome patients and found that 258 individuals had a missense mutation in exon 7, which lies between the 2nd and 3rd Ig-like loop domains. The remaining two individuals had an ˜360 bp insertion of an Alu element either 5′ to exon 9 or within exon 9, which arose as de novo mutations in the paternal chromosome (Oldridge et al., 1999). Exon 9 corresponds to the 3rd Ig-like loop domain and contains the bek (IIIc) sequence (Oldridge et al., 1999). In early studies involving the role of the fgfr2 splice variants IIIb (exon 8) and IIIc (exon 9), Rubin et al. (1989) found that keratinocytes expressed the FGFR2 IIIb isoform and were stimulated by KGF. In addition, fibroblasts and endothelial cells expressed the FGFR2 IIIc isoform and responded to FGF2 (bFGF) (Johnson and Williams, 1993; Rubin et al., 1989). In an RNA analysis of fibroblasts obtained from two Apert and two Pfeiffer syndrome patients having mutations in exon 9, severity of limb abnormalities directly corresponded to ectopic expression of the IIIb (kgfr) form of FGFR2. These data provided evidence of the role of signal transduction pathways through the KGFR form of the receptor in relation to syndactyly in Apert syndrome (Oldridge et al., 1999; Park et al., 1995; Wilkie et al., 1995).
FIG. 4 illustrates FGFR2 with positional mutations, polymorphic nucleotides, and primers used in the Oldridge study. Top shows leader sequence (L), acid box (A), three Ig-like domains (IgI, IgII, and IgIII), a transmembrane region (TM), and a split tyrosine-kinase domain (TK1 and TK2). Exons 8 and 9 encode the alternative splice variants of the second half of the IgIII domain, which are depicted by IgIIIb (kgfr isoform) and IgIIIc (bek isoform), respectively. Positional mutations of two Apert syndrome patients (1 and 2) with Alu insertions, as well as two patients with Pfeiffer syndrome (3 and 4) with nucleotide substitutions, are also shown. After Oldridge et al. (1999).
Alternative Splicing
An embodiment of the present invention describes a method and reagents that influence alternative splicing in living cells. Alternative splicing is a mechanism by which a single gene may eventually give rise to several different proteins. Alternative splicing is accomplished by the concerted action of a variety of different proteins, termed “alternative splicing regulatory proteins,” that associate with the pre-mRNA in the cell nucleus, and cause distinct alternative exons to be included in the mature mRNA. These alternative forms of the gene's transcript give rise to distinct isoforms of the specified protein. The virulence of the HIV virus associated with AIDS depends on particular alternative splice choices, and several cancers, rheumatoid and osteoarthritis, and other inflammatory diseases, exhibit aberrant splice choices when compared to corresponding non-diseased tissues.
An embodiment of the present invention describes a novel means for influencing splice choice in living cells using polynucleotide-based reagents that compete for binding sites in alternative splicing regulatory proteins, and novel methods for using these reagents as therapeutics.
An embodiment of the present invention contains the following novel aspects, which will be taken up in order:
1. A novel method for influencing splice choice in living cells using polynucleotide-based reagents that compete for binding sites in alternative splicing regulatory proteins.
Sequences in pre-mRNA molecules that bind to alternative splicing regulatory proteins can be found in introns or exons, and are known by the terms intronic splicing silencers or enhancers, and exonic splicing silencers or enhancers (ISS, ISE, ESS, ESE). No published paper in the Medline database reports the introduction into living cells of polynucleotide-based competitors for ISS, ISE, ESS, or ESE binding sites in alternative splicing factors. Burd, C. G., and Dreyfuss, G. (1994) identified a 20-mer RNA sequence that binds the alternative splicing factor hnRNP A1, but this work was done with isolated protein and nucleic acid components, not within living cells. Blanchette, M., and Chabot, B. (1999) and Breathnach and co-workers (Del Gatto, F., and Breathnach, R., 1995; Del Gatto, F., Gesnel, M. C., and Breathnach, R., 1996; Del Gatto, F., Plet, A., Gesnel, M. C., Fort, C., and Breathnach, R., 1997; Del Gatto-Konczak, F., Olive, M., Gesnel, M. C., and Breathnach, R., 1999) have investigated the effects of various ISS-, ISE-, ESS-, and ESE-related sequences on splice choice, but all these experiments have been done in cell-free extracts, not within living cells.
2. Novel methods for using the reagents described above as therapeutics.
Although it has been recognized for some time that the life cycle of the AIDS virus HIV involves alternative splicing (Amendt, B. A., Si, Z. H., and Stoltzfus, C. M., 1995; Si, Z., Amendt, B. A., and Stoltzfus, C. M., 1997; Si, Z. H., Rauch, D., and Stoltzfus, C. M., 1998; Del Gatto-Konczak, F., Olive, M., Gesnel, M. C., and Breathnach, R., 1999), none of these studies, nor any other, proposes treating the disease with competitors of ISSs, ISEs, ESSs, or ESEs.
However, it is likely that if agents that compete with alternative splicing regulatory proteins such as hnRNP A1 for the HIV tat protein ESS were introduced into HIV-infected cells, the viral infection would be attenuated (Purcell, D. F., and Martin, M. A., 1993). The Dissertation of one of the inventors, “Avian hnRNP A1, an mRNA Shuttle Protein-Exon Splicing Silencer: Developmental Regulation and Role in Chondrogenesis” (Department of Cell Biology and Anatomy, New York Medical College), which is herein incorporated by reference in its entirety, indicates that the method is feasible and effective. Indeed, splicing of HIV-1 pre-mRNA must be inefficient to provide a pool of unspliced messages which encode viral proteins and serve as genomes for new virions (Caputi, M., Mayeda, A., Krainer, A. R., and Zahler, A. M., 1999), and virus production is arrested in a natural HIV variant that has an aberrant ESS (Wentz, M. P., Moore, B. E., Cloyd, M. W., Berget, S. M., and Donehower, L. A., 1997).
With regard to cancer, it has been found that certain tumors, such as mammary carcinomas (Stickeler, E., Kittrell, F., Medina, D., and Berget, S. M., 1999) and colon adenocarcinomas (Ghigna, C., Moroni, M., Porta, C., Riva, S., and Biamonti, G., 1998) contain levels of hnRNP A1 and other alternative splicing regulatory proteins that are altered relative to related normal tissues. Moreover, this abnormality is reflected in aberrant splicing patterns of certain alternatively spliced gene products, such as the cell adhesive protein CD44, although the specific role of splice variants of CD44 in tumorigenicity and metastasis is unresolved (Sneath, R. J., and Mangham, D. C., 1998).
The neoplastic state is characterized by numerous other gene products that show aberrant alternative splicing patterns. These include the extracellular matrix protein fibronectin (Midulla, M., Verma, R., Pignatelli, M., Ritter, M. A., Courtenay-Luck, N. S., and George, A. J., 2000), the proteolytic enzyme cathepsin B (Keppler, D., and Sloane, B. F., 1996), the breast cancer susceptibility gene BRCA2 (Bieche, I., and Lidereau, R., 1999), and the apoptosis-associated gene products Bcl-x (Xerri, L., Hassoun, J., Devilard, E., Birnbaum, D., and Birg, F., 1998), Bax (Oltvai, Z. N., Milliman, C. L., and Korsmeyer, S. J., 1993), and caspase 2 (Ich-1) (Jiang, Z. H., and Wu, J. Y., 1999). The apparent causal relationship of some of these aberrant splicing patterns to the neoplastic state, coupled with the emerging evidence that tumors express abnormal levels of alternative splicing regulatory proteins, suggests that treatment with agents that specifically inhibit these regulatory proteins, such as the methods and reagents disclosed and claimed herein, represents a promising approach to cancer therapy.
Inflammatory diseases such as rheumatoid and osteoarthritis also involve proteins (e.g., CD44) that exhibit abnormal alternative splicing patterns (Croft, D. R., Dall, P., Davies, D., Jackson, D. G., McIntyre, P., and Kamer, L. M., 1997; Boyle, D. L., Shi, Y., Gat, S., and Firestein, G. S., 2000), and it is reasonable to hypothesize that the resulting aberrant proteins, among which are secreted and cell surface molecules, contribute to the immune-mediated manifestations of these diseases. Again, these data suggest that treatment with agents that specifically target alternative splicing factors represents a promising therapeutic approach.
Several publications have suggested using an antisense strategy to alter splicing patterns as a therapeutic approach for cancer and certain other diseases (but not AIDS) (Sierakowska, H., Gorman, L., Kang, S. H., and Kole, R., 2000; Mercatante, D., and Kole, R., 2000). The invention described herein is not an antisense strategy, and has many advantages over such a strategy.
Current treatment of AIDS uses multiple reagents (AZT, protease inhibitors) directed against different biological functions of HIV. The method and reagents according to an embodiment of the present invention are directed against a distinct cell-virus interactive function, alternative splicing, and should add productively to the spectrum of agents available for treatment of this disease. Current treatment for cancer involves the use of agents that are frequently highly toxic and nonspecific. The method and reagents according to an embodiment of the present invention will constitute therapeutics with high specificity for a biological function, alternative splicing, that is aberrant in many cancers.