The present invention provides a method of isolation of compounds that bind to ribonucleic acid (RNA). In particular, this method is suitable for use in identifying compounds for which a cognate RNA binding site is known. This method allows the isolation of new compounds that bind specifically to RNA and may thus be used in the study of the interactions that are important for the normal function of the cell or for the activity of viruses. This method may also be used to identify RNA binding compounds with potential efficacy as pharmaceuticals.
Proteins have important roles in the structure and function of all types of RNA (mRNA, rRNA, tRNA and viral RNA), ranging from packaging nucleic acid molecules in a stable structural configuration to mediating all aspects of RNA metabolism. Many interactions between RNA and proteins are specific for consensus recognition sites defined by the nucleic acid sequence of the mRNA. It is these interactions in particular that often have profound effects on the relative stabilities of certain RNA species, or that affect the rate and degree of translation from these mRNAs and their locations within a cell.
A rapidly increasing number of examples provide evidence that gene expression can be regulated in the cytoplasm of eukaryotic cells at the level of mRNA stability, translation, or the subcellular localisation of mRNAs (reviewed by Sachs 1993; Merrick 1992). The biological processes controlled by such mechanisms are as diverse as early embryonic development, sexual differentiation, intelligence and viral replication. Common to these post-transcriptional methods of control is the role played by specific interactions between cis-acting sequences in the mature mRNAs and regulatory binding proteins. Most commonly, the regulatory sequences are contained within the untranslated regions at the 5′ and 3′ ends of the mRNA transcripts.
In prokaryotes, numerous examples of post-transcriptional regulation by repressor proteins exist. In most cases, the repressor proteins directly or indirectly occlude the Shine-Dalgarno sequence and/or the initiation codon.
In eukaryotes, one of the best examples of specific interactions between RNA and binding proteins that has been elucidated to date is the regulation of ferritin and erythroid 5-aminolevulinate synthase mRNAs by iron regulatory proteins (IRPs). IRP1 or IRP2 binds to the iron responsive element (IRE) in the 5′ untranslated region of an mRNA in a position proximal to the CAP binding site and controls the initiation step of translation. It is a mutation in an IRE in the ferritin L-chain mRNA that leads to hyperferritinaemia/cataract syndrome in human patients (Beaumont et al, 1995).
More recently, further examples have been identified in eukaryotic cells, such as the L32 yeast ribosomal protein (Dabeva and Warner, 1993) and proteins acting on Drosophila spermatogenesis that are further candidates for repressors that are functionally similar to IRP. Examples have also been found that bind further downstream from the CAP structure, such as thymidylate kinase (Chu et al., 1993). LOX-BP binds to the 3′ region of 15-lipoxygenase mRNA and represses its translation (Ostareck-Lederer et al, 1994).
The biological importance of RNA binding protein-binding site effector pairs means that a useful mechanism is needed by which these RNA binding proteins can be identified and further examined. Such a strategy has not proven easy to devise, partly because, in many cases, RNA-binding proteins exert their functions in cells or tissues that are not readily amenable to biochemical analysis, such as specific areas of the brain or in the germ line. The unavailability of biochemical material imposes cumbersome limitations on the identification and cloning of biologically important RNA-binding proteins operating in such systems, particularly when compounded by a lack of possible genetic approaches.
To circumvent such limitations, alternative strategies to identify and study RNA-protein interactions have been devised, based on phage display (Laird-Offringa et al., 1995), transcription termination (Harada et al., 1996) or translation in E. coli (Jain and Belasco, 1996), or on transcription initiation in yeast (SenGupta et al., 1996; Putz et al, 1996).
One approach to study RNA-protein interactions has previously been developed by the present inventors. This approach is based on the realisation that the binding of protein to specific sites near the 5′ end of an mRNA molecule causes its translation to be repressed both in mammalian cells and in yeast (Stripecke and Hentze, 1992; Stripecke et al., 1994; Gray and Hentze, 1994). RNA binding proteins with physiological functions unrelated to eukaryotic mRNA translation were found to function as translational repressor proteins when their specific cognate binding site was introduced into the 5′UTR in a position similar to that of the IRE in ferritin mRNA (Stripecke et al., 1994).
Due to the step-wise nature of the translational initiation process, in which ribosome subunits and cofactors must have access to their binding sites on the mRNA molecule, translational repression using the Stripecke method appears not to be restricted by the normal physiological function of the RNA-binding protein used. The Stripecke method thus allows the study of almost any protein-RNA binding site pair.
In the Stripecke method, the cognate binding sites for the bacteriophage protein MS2-CP and the spliceosomal component U1A were each cloned into the 5′UTR of luciferase indicator mRNAs using a restriction enzyme site nine nucleotides downstream of the major transcription initiation site. The repressor proteins were cloned under an inducible GAL4/PGK promoter so that luciferase activity could be assayed in the presence and absence of binding protein. It was found that in the presence of both the cognate binding protein and the RNA binding site in the luciferase mRNA, luciferase activity decreased in correlation with the affinity between the binding protein and the binding site in the mRNA. This correlation was also shown to exist in HeLa cells. The translation of the luciferase indicator constructs could be decreased up to ten-fold by this mechanism of repression.
This system thus allows the study of the affinities between specific RNA binding sites and their cognate binding proteins. However, in order to be able to assay for luciferase activity the cells need to be harvested and lysed. This means that no linkage between phenotype (i.e. repression of translation caused by the presence of an efficacious binding protein) and genotype (the gene which coded for the active binding protein) is retained. This method is therefore only suited to the assessment of the degree of affinity with which a specific binding pair interact and essentially does not allow the isolation of novel RNA binding proteins or their coding genes.
There is thus a great need for a method of detection of nucleic acids that code for RNA binding compounds, particularly proteins.
According to the present invention there is provided a method for the isolation of nucleic acid coding for a RNA-binding compound, said method comprising the steps of:
a) expressing in a host cell
i) one or more candidate nucleic acids to be screened for coding for the RNA binding compound, and
ii) a nucleic acid comprising a coding region for a fluorescent marker protein, wherein the nucleic acid comprising the coding region for the fluorescent marker protein includes a binding site for the RNA-binding compound,
b) selecting a host cell that exhibits an altered level of fluorescence from levels found in a host cell containing ii) alone; and
c) isolating the nucleic acid coding for the RNA binding compound from the selected cell.
This approach is referred to as TRAP (Translational Repression Assay Procedure) and makes use of the discovery that the binding of a compound to a sequence that is present in a mRNA species prevents its translation. The mechanism of this prevention is thought to be by steric hindrance, that either precludes the binding of the ribosome at its binding site in the mRNA species or that prevents its passage along the mRNA molecule. This leads to a decrease in the amount of protein that is translated from the mRNA species.
In the method of the present invention, a binding site is introduced into the transcribed portion of a nucleic acid that codes for a selectable fluorescent marker. Preferably, the binding site is introduced into the Cap-proximal region of the 5′UTR of the transcribed mRNA. This altered nucleic acid is then introduced into a host cell, such as yeast.
A population of transformed host cells is co-transformed with a number of candidate nucleic acids for RNA binding compounds, such as, for example, are contained within a cDNA library. The candidate nucleic acids may code for proteins or for other nucleic acid molecules with RNA binding activity, such as ribozymes. In a preferred embodiment of the invention, the candidate nucleic acids are cloned under the control of an inducible promoter in order that expression of the encoded compound may be tightly controlled.
The candidate nucleic acid and the nucleic acid coding for the fluorescent marker may be introduced into the host cells as part of plasmids, or they may be integrated into the chromosome of the host cell.
The cells are then cultured in an environment that allows translation to occur. Co-transformants that carry a cDNA molecule encoding a compound that alters translation of the selectable fluorescent marker can then be identified. The binding of the compound to the mRNA prevents access of the translation machinery and so alters the degree of translation of the marker mRNA relative to cells which do not contain a compound that interacts with the RNA binding site. Some cells then exhibit altered levels of fluorescence relative to the level of fluorescence displayed by cells that only contain the nucleic acid that comprises the RNA binding site and encodes the fluorescent marker. These cells will possess altered levels of the fluorescent marker protein and therefore are likely to contain a candidate nucleic acid encoding a compound that binds to the RNA binding site of interest.
The method of the present invention is ideally suited to the selection of cells in which translation has been repressed. In this instance, cells will be selected that exhibit reduced levels of fluorescence relative to the mean fluorescence level.
It has been found that in order to identify RNA binding compounds of interest using this method, it is preferable to subject the cell population to multiple cycles of enrichment, each time selecting for cells that exhibit altered levels of fluorescence. Ideally, at least three cycles are performed.
This strategy allows the identification and cloning of nucleic acids encoding compounds, particularly proteins, that exist in very low abundance in vivo or for which there is a lack of suitable biological material to allow separation of the compound in sufficient amounts for cloning of the encoding gene. This method also allows the selection of nucleic acids encoding molecules other than proteins that possess an affinity for a particular RNA sequence motif. Such molecules include small bioactive peptides and catalytic RNA molecules such as ribozymes.
The method of the present invention allows the selection of a candidate nucleic acid that encodes an RNA binding compound. The RNA binding compound selected using the method of the present invention may have any number of physiological functions, such as recognition of a response element in an RNA sequence (such as in the case of the IRE), possession of ribonuclease activity, participation in the process of RNA splicing, contribution to the stability of an RNA molecule, tRNA synthetase activity or any other function or activity that derives from a specific interaction with an RNA sequence.
As used herein, the term “RNA binding compound” is taken to mean any compound that interacts specifically with a known RNA binding site. By specific interaction is meant that the affinity constant for the interaction between binding protein and RNA sequence is lower than 10 μM. Preferably, the affinity constant is lower than 1 μM. By “RNA binding site” is meant any motif that is present in the sequence of an RNA molecule. The sequence motif may comprise any number of nucleotides, although consensus sequence motifs for RNA binding proteins normally comprise between 5 and 40 nucleotides.
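The affinity thresholds in this definition can be illustrated with a minimal sketch. It assumes that the "affinity constant" is read as a dissociation constant expressed in micromolar, so that a lower value means tighter binding; the function name and the category labels are hypothetical and are used only for illustration.

```python
def classify_affinity(kd_micromolar: float) -> str:
    """Classify an interaction by its dissociation constant (micromolar).

    Assumption: the 'affinity constant' of the text is treated as a
    dissociation constant, so lower values mean tighter binding.
    """
    if kd_micromolar <= 0:
        raise ValueError("dissociation constant must be positive")
    if kd_micromolar < 1.0:
        # Below 1 micromolar: the preferred range stated in the text.
        return "preferred"
    if kd_micromolar < 10.0:
        # Below 10 micromolar: counts as a specific interaction.
        return "specific"
    return "non-specific"


if __name__ == "__main__":
    for kd in (0.5, 5.0, 50.0):
        print(kd, classify_affinity(kd))
```

The cut-offs are strict inequalities because the text speaks of values "lower than" 10 μM and 1 μM.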
The RNA binding compound may comprise any compound or combination of compounds that can be encoded by a DNA or RNA molecule, and may therefore comprise nucleic acid such as RNA or a ribozyme, or may comprise a peptide or polypeptide. Preferably, the RNA binding compound comprises a peptide or polypeptide.
If proteinaceous, the compound need not be an entire protein or protein module, but may instead be any peptide fragment or polypeptide component of a protein that possesses RNA binding activity. The compound may bind to RNA as a single entity or may dimerise to form homodimers or heterodimers that are then able to bind to a specific RNA sequence. In this instance, the complementary protein that is necessary for the binding activity must also be present in the cell.
Additionally, the peptide or polypeptide identified may not itself possess RNA binding activity, but may instead be necessary for the RNA binding activity of another compound that is present in the cell. For example, the identified peptide or polypeptide may form a heterodimer whose integrity is essential for the correct recognition of an RNA sequence motif.
According to a second embodiment of the present invention there is provided a method for the isolation of nucleic acid coding for a compound that alters the affinity of interaction between a RNA-binding site and a RNA binding compound, said method comprising the steps of:
a) expressing in a host cell
i) one or more candidate nucleic acids to be screened for coding for the compound that alters the affinity of interaction between the RNA-binding site and the RNA binding compound,
ii) a nucleic acid comprising a coding region for the RNA binding compound,
iii) a nucleic acid comprising a coding region for a fluorescent marker protein, wherein the nucleic acid comprising the coding region for the fluorescent marker protein includes a binding site for the RNA-binding compound,
b) selecting a host cell that exhibits an altered level of fluorescence from levels of fluorescence found in a host cell containing ii) and iii) alone; and
c) isolating from the selected cell the nucleic acid coding for the compound that alters the affinity of interaction between the RNA-binding site and the RNA binding compound.
The method of the present invention may also be used to identify compounds that alter the affinity of interaction between a known RNA binding compound and its binding site. For example, a cell population may be transformed with both a nucleic acid that comprises a binding site for a RNA binding compound and that encodes a fluorescent marker protein and an additional nucleic acid that encodes a RNA binding compound. This cell population may then be transformed with nucleic acids that encode, or incubated in the presence of, agents that are candidates for altering the affinity of interaction between the RNA binding site and its cognate binding protein. Such agents may be pharmaceutical drugs or small molecules such as bioactive peptides. Cells that contain agents that effectively alter the affinity of interaction between RNA binding compound and binding site will exhibit altered levels of fluorescence relative to levels in cells that only contain a nucleic acid comprising the binding site for the RNA binding compound and that encodes the fluorescent marker protein and the nucleic acid that encodes the RNA binding compound.
As used herein, by “candidate nucleic acid” is meant a nucleic acid that codes for a single nucleic acid or protein species. Accordingly, the candidate nucleic acid may be any fragment of a gene that encodes a compound with specific RNA binding activity or whose presence is required for the RNA binding activity. The candidate nucleic acid may be a naturally-occurring nucleic acid, or it may be partially or wholly synthetic. For example, the nucleic acid may comprise a wild type gene sequence that has been altered through the introduction into the gene of substitutions, mutations, insertions or deletions that alter the affinity of the compound for the RNA sequence motif. Gene fusions that combine two or more molecules so as to create a hybrid molecule with specific affinity for a specific RNA sequence may also be used.
The nucleic acid encoding the RNA-binding compound may be part of a gene library. In this scenario, the cDNAs representing the entire genotype of an organism are cloned into a suitable expression vector and a host strain is transformed with the vectors so that only one gene is present in each cell.
The nucleic acid may also be subjected to mutagenesis so as to create a library of different genes. The method of the present invention may then be used to identify mutations in a nucleic acid that give rise to an altered affinity in the encoded compound for a particular RNA sequence motif.
The candidate nucleic acid should ideally be cloned under the control of an inducible promoter, so that the system is easily manipulable. Transcription of the RNA-binding compound can then be controlled to coincide with the desired stage of growth of the host cell population.
Suitable inducible promoters are well known to those of skill in the art (see, for example, Sambrook et al., 1989; Molecular cloning; Cold Spring Harbor Laboratory Press). In yeast, the promoter of choice for the expression of the candidate nucleic acid is the PGK/GAL promoter. This promoter is silent in the presence of glucose in the growth medium, but when this sugar is replaced by galactose, expression of the RNA binding compound is induced.
The RNA sequence motif that forms the binding site for the compound of interest should be known in order that it can be introduced into the mRNA of the nucleic acid that codes for the fluorescent marker. However, the motif may be putative or hypothetical: it need not be known that any compound actually does interact with RNA containing this sequence. For example, it may be that a common consensus sequence found in an RNA species is hypothesised to represent a binding site. The method of the present invention then allows this theory to be accurately assessed. If the sequence is found to cause a decrease in the translation of the marker gene, then those cells in which lower translation levels are evident can be analysed for the presence of the nucleic acid of interest.
The sequence motif of interest may be inserted anywhere in the transcribed region of the nucleic acid that encodes the fluorescent marker protein. Preferably, the insertion site is in the 5′ or 3′ untranslated region (UTR) of the mRNA for the marker gene. Most preferably, the insertion site is in the Cap-proximal region of the 5′UTR, where it has been previously shown that binding of a protein can inhibit the access of the ribosome and so prevent efficient translation of the mRNA molecule. It is also preferable that the usual translation initiation signals that are present in the reporter mRNA are not disrupted, so that translation occurs normally in those instances when binding to the sequence motif does not occur.
The fluorescent marker protein encoded by the marker gene may be any protein whose presence in the cytoplasm of a cell causes the cell to fluoresce. Suitable proteins that are inherently fluorescent include the green fluorescent protein (GFP) that possesses a bimodal absorption spectrum and the blue-fluorescing GFP mutant (Heim et al., (1994) PNAS USA 91, 12501-12504). Other suitable mutants of GFP include the S65T mutant that possesses a single absorption peak centred at 490 nm or the S147P mutant which emits a stronger fluorescent signal than the GFP wild type at high temperature (Kimata et al., (1997) Biochem Biophys Res Commun 232(1) pp69-73). When the method of the present invention is performed using yeast as host cells, the selection marker is preferably the GFP mutant S65T. This is mainly because the background fluorescence in yeast is close to the optimal wavelength at which wild type GFP fluoresces. In contrast, the GFP mutant S65T possesses ideal physical properties for use in yeast cells.
Combinations of mutants may also be used in order to tailor the physical properties of the mutant GFPs to suit the application of choice (Heim and Tsien, (1996), Curr Biol 6(2), p178-182; Cormack et al, (1996) Gene, 173(1), pp33-38).
In an alternate embodiment of the invention, the protein need not necessarily be inherently fluorescent, although in this instance its concentration must be directly related to the level of fluorescence that is visualised in the cell. For example, the protein may possess an enzymatic activity that alters the fluorescence of a substrate in the cell. A suitable technique in this respect is described by Zlokarnik et al, (1998) Science, 279, pp84-88. In this instance, the marker gene used is β-lactamase, the substrate for which is a membrane-permeant fluorogenic ester derivative termed CCF2/AM. This is introduced into the growth medium and will fluoresce only in cells that express β-lactamase. In those cells that contain compounds that bind to β-lactamase mRNA and thus repress its translation, lower levels of this enzyme result, so causing a lower level of fluorescence in the cell. This technique thus allows the visualisation of gene expression within the cell in the absence of a need to permeabilise the cell. The link between genotype and phenotype is thus retained, so enabling multiple enrichment steps to be performed and allowing the analysis of the selected cells' genotype.
According to a further aspect of the present invention there is provided a recombinant nucleic acid encoding an RNA molecule operably linked to a promoter, the RNA comprising: i) a sequence encoding a fluorescent marker protein or fluorescent marker peptide, and ii) a binding site for a RNA-binding compound.
Preferably the promoter is an inducible promoter such as PGK/GAL. The nucleic acid may be cloned into a plasmid vector, suitable to allow expression in the host cell of choice.
The cDNAs for suitable marker genes are available in the literature. For example, the gene sequence for GFP is published in Chalfie et al, 1994 and the β-lactamase gene has been widely known for some years (Sutcliffe et al, (1978) PNAS USA, 75, pp3737). Other suitable selection markers will be well known to those of skill in the art.
Preferably, the fluorescent marker should be cloned under the control of a constitutive promoter, such as, for example, the TEF1 promoter (translation elongation factor promoter) or alcohol dehydrogenase promoter. This ensures that any differences in the degree of fluorescence that are observed between cells are likely to be solely due to the activity of the RNA-binding compound.
Host cells suitable for use in accordance with the present invention include bacteria, yeast and eukaryotic cells. Easily manipulated bacterial species include E. coli or S. typhimurium. Yeast species such as S. cerevisiae, S. pombe, and C. albicans are particularly suited to this method. However, eukaryotic cell lines such as mammalian, amphibian or insect cells may also be used, particularly in circumstances where the presence of other proteins is necessary for the RNA binding activity of the compound of interest. For example, when the RNA binding compound requires the presence of a protein that is only found in appropriate amounts in a certain cell type, the method will need to be performed in these host cells.
The host cell of choice is yeast, more preferably S. cerevisiae. The reason for this preference is that yeast is a unicellular organism with a short generation time that is eukaryotic and that can thus be expected to translate and process eukaryotic proteins into their native form. Yeast cells are also very easy and inexpensive to culture and can easily be manipulated genetically. There are also a number of available marker genes such as URA3 and TRP1 that allow the initial selection of double transformants that contain both the gene for the RNA binding compound and the gene for the fluorescent marker protein. Furthermore, a number of inducible expression systems are well documented, such as that using the PGK/GAL promoter.
In the method of the invention, the nucleic acid encoding the RNA binding compound and the nucleic acid encoding the fluorescent marker protein must be co-transformed into the same host cell. The nucleic acids may be plasmid-borne, or may be integrated into the genomic DNA of the host cell. In order to allow for selection of doubly transformed host cells, both nucleic acids must be linked to a selectable marker such as an antibiotic resistance gene (such as the ampicillin resistance gene) or a metabolic enzyme (such as TRP1).
In the preparation of a strain of host cells suitable for use in accordance with the method of the present invention, the nucleic acid encoding the fluorescent marker is initially cloned into the host strain. The nucleic acids (or gene library) encoding potential RNA binding compounds are then transformed into the already-transformed strain using a second selection marker. (Obviously, these transformation steps may be performed simultaneously). In yeast, two suitable markers may be, for example, the TRP1 and URA3 selection markers. Double transformants containing nucleic acids for both the fluorescent marker protein and the RNA binding compound are then selected on media that are deficient in both tryptophan and uracil.
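The logic of selecting double transformants on medium lacking both tryptophan and uracil can be sketched as a simple filter. This is an illustrative model only: the `Transformant` class, the clone names and the idea of representing a cell as a set of marker genes are assumptions made for the example, not part of the method as described.

```python
from dataclasses import dataclass

# Markers named in the text. Which construct carries which marker is not
# specified there, so no such pairing is assumed here.
REQUIRED_MARKERS = frozenset({"TRP1", "URA3"})


@dataclass(frozen=True)
class Transformant:
    """Hypothetical record of a yeast cell and the marker genes it carries."""
    name: str
    markers: frozenset


def grows_without_trp_and_ura(cell: Transformant) -> bool:
    """A cell grows on the double drop-out medium only if it has both markers."""
    return REQUIRED_MARKERS <= cell.markers


def select_double_transformants(cells):
    """Keep only the cells that would survive on medium lacking Trp and Ura."""
    return [cell for cell in cells if grows_without_trp_and_ura(cell)]


if __name__ == "__main__":
    population = [
        Transformant("one_marker_a", frozenset({"URA3"})),
        Transformant("one_marker_b", frozenset({"TRP1"})),
        Transformant("both_markers", frozenset({"TRP1", "URA3"})),
    ]
    for cell in select_double_transformants(population):
        print(cell.name)
```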
Any method of selection that allows the separation of non-fluorescent cells from fluorescent cells can be used to select cells expressing an RNA-binding compound that has successfully repressed translation of the fluorescent marker. The population of cells to be separated will depend upon whether repression or activation of translation is being selected for. However, in either case, the population of cells should be selected that are either more or less fluorescent than control cells. In the method of the first embodiment of the invention, control cells are cells that contain only the nucleic acid comprising the coding region for the fluorescent marker protein and the binding site for the RNA-binding compound. In the method of the second embodiment of the invention, control cells are those that contain the nucleic acid comprising the coding region for the RNA binding compound and the nucleic acid comprising the coding region for the fluorescent marker protein and the binding site for the RNA-binding compound.
Currently, the preferred technique that allows the fast quantitation of the fluorescence level of cells is flow cytometry. The most suitable technique for clonal selection of the cells of interest is fluorescence-activated cell sorting (FACS) that allows high-throughput screening of large numbers of cells. Furthermore, this procedure allows the recovery of living cells after sorting, meaning that multiple rounds of sorting may be performed in order to enrich for the cells of interest. The FACS technique has been shown to be applicable to most cell types, including yeast (Atkins et al, (1995) Curr Genet, 28(6) pp585-588), the host cell of choice in the method of the present invention.
FACS is normally used to select for cells that exhibit higher than average levels of fluorescence, although it may be easily adapted to select cells exhibiting lower than average fluorescence levels. The method of the present invention necessitates the selection of cells that exhibit altered levels of fluorescence relative to the mean fluorescence level in the cell population, since it is these cells in which the level of translation of the fluorescent protein has been altered. These cells may express RNA binding compounds that have successfully bound to the sequence motif in the mRNA of the marker compound and that have thus altered translation of the fluorescent marker protein.
Preferably, the method of the present invention is designed so that translation of the mRNA encoding the fluorescent marker protein is repressed, through the interaction of a RNA binding compound and the RNA sequence motif that is present in the mRNA of the fluorescent marker protein. This interaction prevents the access of the translation machinery to the RNA binding site and thus causes a decrease in the level of translation from this mRNA species.
Preferably, the level of fluorescence is altered by at least 10%, more preferably 20-50%, most preferably greater than 50% relative to the mean fluorescence of the cell population. Most preferably, the level of fluorescence is lowered by this amount relative to the mean level in the host cell population.
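The percentage thresholds given above can be expressed as a small calculation. The following is a minimal sketch, assuming that each cell is summarised by a single fluorescence value on a linear scale and that the population mean is already known; the function names and tier labels are hypothetical.

```python
def fractional_change(cell_value: float, population_mean: float) -> float:
    """Signed change of one cell relative to the population mean.

    A negative result means the cell is dimmer than the mean.
    """
    if population_mean <= 0:
        raise ValueError("population mean must be positive")
    return (cell_value - population_mean) / population_mean


def preference_tier(cell_value: float, population_mean: float) -> str:
    """Map the size of the change onto the tiers named in the text."""
    change = abs(fractional_change(cell_value, population_mean))
    if change > 0.5:
        return "most preferred"   # altered by more than 50%
    if change >= 0.2:
        return "more preferred"   # altered by 20-50%
    if change >= 0.1:
        return "preferred"        # altered by at least 10%
    return "not selected"


if __name__ == "__main__":
    mean = 100.0
    for value in (40.0, 70.0, 85.0, 95.0):
        print(value, preference_tier(value, mean))
```

The sketch uses the absolute change so that it covers both repression and activation; for the repression case that the text prefers, one would additionally require the signed change to be negative.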
It has also been found, surprisingly, that for the method of the present invention to be efficacious in correctly identifying cells that contain RNA binding compounds with specificity for the RNA sequence motif of interest, it is necessary to use multiple cycles of enrichment. It has been found that using just one selection step results in the identification of a large number of “false positive” clones that do not possess RNA binding compounds that bind to the binding site of interest. Hence, it has been found necessary to culture the cells selected in the selection step in order to expand the selected population. The cells are then subjected to a further round of selection, using, for example, FACS. Preferably, at least three iterative cycles of culture and selection are performed.
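Why repeated cycles reduce the proportion of false positives can be illustrated with a toy calculation. The sketch below assumes that, in every cycle, a genuine repressor clone falls inside the sorting gate with one fixed probability and a clone without a repressor falls inside it with a much smaller fixed probability. All numbers are illustrative assumptions and are not taken from the experiments described here.

```python
def true_positive_fraction(initial_fraction, pass_true, pass_false, cycles):
    """Fraction of genuine clones in the sorted pool after each cycle.

    initial_fraction: share of genuine clones in the starting library.
    pass_true:  assumed chance that a genuine clone lands in the sort gate.
    pass_false: assumed chance that any other clone lands in the sort gate.
    Regrowth between cycles is assumed to expand both classes equally,
    so it does not change the ratio and is not modelled.
    """
    if not 0 < initial_fraction < 1:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    if not (0 < pass_true <= 1 and 0 < pass_false <= 1):
        raise ValueError("pass probabilities must lie in (0, 1]")
    true_cells = initial_fraction
    false_cells = 1.0 - initial_fraction
    history = []
    for _ in range(cycles):
        true_cells *= pass_true
        false_cells *= pass_false
        history.append(true_cells / (true_cells + false_cells))
    return history


if __name__ == "__main__":
    # Illustrative values only: 1 genuine clone per 1000, 90% versus 5% gating.
    for cycle, fraction in enumerate(
        true_positive_fraction(0.001, 0.9, 0.05, cycles=3), start=1
    ):
        print(f"cycle {cycle}: {fraction:.3f}")
```

Under these assumed values a single sort leaves the pool dominated by false positives, whereas the genuine clones form the majority only after the third cycle, which is consistent with the preference for at least three cycles.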
In selecting cells displaying lowered levels of fluorescence, minimally-fluorescent cells are also selected. This population of cells is contaminated by cells that harbour mutations in the marker gene or that have lost the marker gene through deletion of the gene or loss of the plasmid, despite culture in selective medium.
In order to take this population of cells into account, the pool of minimally fluorescent cells that is selected in the final selection step is split into a number of groups, in decreasing order of fluorescence. Preferably, in order to strike a compromise between the least degree of work involved and the greatest efficiency of selection of positive clones, the final pool of cells is split into around five groups. It has been found that the group exhibiting the lowest fluorescence (group 5) includes deletants and revertants that have lost the marker gene. However, the population of cells displaying reduced but not minimal fluorescence (group 4) has been found to contain the greatest enrichment for RNA binding compounds of interest.
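The division of the final pool into groups of decreasing fluorescence can be sketched as a ranking followed by a split. The text does not state how the group boundaries are set, so the sketch assumes groups of near-equal size by rank; the function name is hypothetical.

```python
def split_into_groups(fluorescence_values, n_groups=5):
    """Split values into n_groups lists, brightest group first.

    Assumption: groups are of near-equal size by rank. Group 1 is the
    brightest and group n_groups the dimmest, matching the numbering
    used in the text (group 5 lowest, group 4 reduced but not minimal).
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    ranked = sorted(fluorescence_values, reverse=True)
    base, extra = divmod(len(ranked), n_groups)
    groups = []
    start = 0
    for index in range(n_groups):
        # Spread any remainder over the first few groups.
        size = base + (1 if index < extra else 0)
        groups.append(ranked[start:start + size])
        start += size
    return groups


if __name__ == "__main__":
    pool = [12.0, 3.5, 8.1, 0.4, 6.6, 1.2, 9.9, 2.7, 5.0, 0.9]
    groups = split_into_groups(pool)
    # Group 4 (index 3) is the one the text reports as most enriched.
    print("group 4:", groups[3])
    print("group 5:", groups[4])
```

In practice the boundaries would be set as gates on the cell sorter rather than computed after the fact; the sketch only shows the ordering and numbering of the groups.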
Additionally, the use of multiple cycles of enrichment allows counterselection against non-specific, constitutive loss of the marker gene, since when using an inducible system for the expression of candidate nucleic acids encoding the RNA binding compound, cells that display a specific reduction in fluorescence will shift to and can be recovered from the high fluorescent pool following incubation in glucose-containing medium (under conditions in which the candidate nucleic acids are not expressed).
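The counterselection described above amounts to comparing each clone under inducing and non-inducing conditions. The following is a minimal sketch, assuming one fluorescence reading per clone in galactose (candidate expressed) and one in glucose (candidate not expressed), and assuming a 10% reduction relative to control cells as the cut-off; the `Clone` class, its field names and the cut-off parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical record of one clone's fluorescence in two media."""
    name: str
    galactose: float  # candidate nucleic acid expressed
    glucose: float    # candidate nucleic acid not expressed


def is_specific_repressor(clone, control_mean, min_reduction=0.1):
    """True if the clone is dim only while the candidate is expressed.

    A clone that stays dim in glucose has probably lost or mutated the
    marker gene, so it is rejected as a non-specific reduction.
    """
    if control_mean <= 0:
        raise ValueError("control_mean must be positive")
    low_cutoff = control_mean * (1.0 - min_reduction)
    dim_when_induced = clone.galactose < low_cutoff
    recovers_when_silent = clone.glucose >= low_cutoff
    return dim_when_induced and recovers_when_silent


if __name__ == "__main__":
    control = 100.0
    clones = [
        Clone("candidate_repressor", galactose=40.0, glucose=98.0),
        Clone("marker_lost", galactose=40.0, glucose=42.0),
        Clone("no_effect", galactose=99.0, glucose=100.0),
    ]
    for clone in clones:
        print(clone.name, is_specific_repressor(clone, control))
```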
Once the final pool of cells has been isolated, cells that display a non-specific reduction in fluorescence that is not due to the presence in the cell of a RNA-binding compound that exhibits specific affinity for the sequence motif of interest are removed by shifting to glucose-containing medium. The genotype of the remaining population of cells can then be analysed. Suitable methods will be well known to those of skill in the art and will include the sequencing of the nucleic acid that encodes the RNA-binding compound. It may be that this nucleic acid can then be subjected to mutagenesis, so creating a library of mutagenised nucleic acids. This library can then be used once again in the method of the present invention to select for compounds with altered affinity for the RNA sequence motif.
In the case of a proteinaceous RNA binding compound, the protein or peptide could be purified and further analysed. For example, the protein may be used in structure/function studies in order to aid the rational design of mimetics that bind to the same site in RNA. Suitable methods of analysis are now routine in the art and will be readily apparent to the skilled man.
According to a preferred aspect of the present invention there is provided a method for the isolation of a nucleic acid coding for a RNA-binding compound, said method comprising the steps of:
a) expressing in a yeast cell
i) one or more candidate nucleic acids to be screened for coding for the RNA binding compound, wherein each candidate nucleic acid comprises a coding sequence operably linked to a promoter inducible in said yeast cell, and
ii) a nucleic acid comprising a coding region for a GFP mutant that detectably fluoresces at a wavelength distinguishable from wild type GFP, wherein the nucleic acid coding for the GFP mutant comprises a constitutive promoter that is operably linked to the coding sequence and includes in the 5′ untranslated region a binding site for the RNA-binding compound,
b) selecting by FACS a yeast cell that exhibits a lowered level of fluorescence from levels in a yeast cell containing ii) alone; and
c) isolating the nucleic acid coding for the RNA binding compound from the selected cell.
According to a further aspect of the invention there is provided a kit comprising in one or more containers a recombinant nucleic acid encoding an RNA molecule operably linked to a promoter, said RNA molecule comprising: i) a sequence encoding a fluorescent marker protein or peptide, and ii) a binding site for a RNA-binding compound. The kit may further comprise one or more candidate nucleic acids each of which comprises a coding sequence operably linked to a promoter, for screening for a RNA-binding compound. The candidate nucleic acids may comprise a cDNA library.
According to a still further embodiment of the invention there is provided the use of a recombinant nucleic acid as described above, optionally contained within a host cell, in a method according to the invention, as described in detail above.
Various aspects and embodiments of the present invention will now be described in more detail by way of example, with particular reference to a system in which yeast cells are used as hosts for expression of the candidate and marker nucleic acids. It will be appreciated that modification of detail may be made without departing from the scope of the invention.
FIG. 1: illustration of the functional principle of TRAP in a yeast expression system
FIG. 2: Description of the plasmids used.
FIG. 3: Sequences and nomenclature of the GFP reporter plasmids (SEQ ID NOS.3-9). 
FIGS. 4A-4F: Isolation of cells expressing cognate RNA-binding proteins by FACS.
FIG. 5: Gel retardation assay for IRE binding using extracts from sorted cells.
FIG. 6: Graph showing the four different pools of cells exhibiting varying levels of fluorescence.
FIG. 7: Gel retardation assay for pools R2-R5, showing enrichment of IRE-binding activity.
FIG. 8: Analysis of IRP-I expression in selected cells.
FIG. 9: Gel retardation assay for U1 loop2 binding using extracts from sorted cells.
FIG. 10: Graph showing the four different pools of cells exhibiting varying levels of fluorescence.
FIG. 11: Gel retardation assay for pools R2-R5, showing enrichment of U1A-binding activity.
FIG. 12: Analysis of U1A expression in selected cells.
FIG. 13: Gel retardation assay showing IRE-binding activity of sorted cells.
FIG. 14: Graph showing the different pools of cells exhibiting varying levels of fluorescence.