RNA reading is of considerable value to the biological and pharmaceutical industries. Identifying the expression levels of multiple genes across various biological samples enables genotyping, the study of disease pathways, and improved diagnosis and prognosis of diseases, among other applications.
Since the early 1990s, the simultaneous measurement of the expression of thousands of different RNA gene products in a biological sample, such as a cell lysate, has become feasible through the introduction of DNA microarrays (DNA chips). A DNA chip consists of numerous addressable locations. At each location, numerous copies of a specific single-stranded DNA molecule (a probe) are attached. When a sample containing a DNA strand that is complementary to one or more of the DNA molecules on the chip is brought into contact with the chip, hybridization takes place. With appropriate sample labeling strategies, a pattern indicating the identity of the DNA strands and their amounts is obtained. The chip, with its large number of probes, can identify, quantitate and compare the RNA sequences expressed in a set of samples (e.g. Nature Genetics, January 1999 Supplement).
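The probe-and-hybridization principle described above can be illustrated with a minimal sketch. It assumes exact Watson-Crick complementarity only, whereas real hybridization tolerates mismatches and depends on temperature and buffer conditions. The names `reverse_complement`, `hybridization_pattern`, and the spot labels are hypothetical and are introduced here for illustration only.

```python
# Toy model of a DNA chip: each addressable location holds one probe sequence.
# Assumption: a sample strand "hybridizes" only if it contains the exact
# reverse complement of the probe (a deliberate simplification).

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hybridization_pattern(probes, sample):
    """Count, per chip location, the sample strands that match the probe.

    probes: dict mapping a location label to its probe sequence.
    sample: list of single-stranded DNA sequences in the sample.
    """
    pattern = {}
    for location, probe in probes.items():
        target = reverse_complement(probe)
        pattern[location] = sum(1 for strand in sample if target in strand)
    return pattern


if __name__ == "__main__":
    probes = {"spot1": "AAGC", "spot2": "TTTT"}
    sample = ["GGGCTTGG", "CCGCTT", "CCCC"]
    print(hybridization_pattern(probes, sample))
```

The per-location counts play the role of the signal pattern: the location identifies the strand, and the count stands in for its amount.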
The technology of DNA chips has several major drawbacks: the design and production of chips is lengthy and expensive; assay performance takes several days and may include biased intermediate stages such as amplification; and quantitation is inaccurate and insensitive to mRNA isoforms. The most prominent drawback of these methods is that they allow only partial analysis of gene products. In the case of oligo chips, only a predetermined oligo sequence designed for that chip can be detected. In the case of cDNA chips, the content of the chip is obtained by ‘trial and error’ and hence gene coverage is not guaranteed. Moreover, any attempt to analyze hundreds of thousands of RNA isoforms would result in an impractical chip density, which would not make it possible to distinguish between RNA variants and isoforms. In addition, commercial off-the-shelf chips usually encompass well-known, recognized genes, and thus analysis is limited to the identification of such already well-known genes. Inadequate evaluation of expression magnitude is another disadvantage of the commercial chips, as it is common to have several spots that putatively cover the same gene and yet show gross differences in expression estimates, sometimes by a factor of 3 or more.
A protein synthesis monitoring (also termed hereinafter “PSM”) system and methods of using same are disclosed by the inventor of the present invention in International Patent Application No. PCT/IL03/01011, Publication No. WO2004/050825, which is incorporated herein in its entirety. PSM includes a plurality of markers, each marker comprising a pair of interacting labeling moieties, the first moiety being attached to a ribosome or a fragment thereof and the second moiety being attached to one of the following entities: the ribosome or the fragment thereof, tRNA or an amino acid. Monitoring of protein synthesis in PSM is carried out by detecting the signal sequences generated upon excitation of the markers. WO2004/050825 discloses that using the PSM system enables real-time monitoring of protein synthesis in vivo and further allows identification of the amino acid sequences of the protein being synthesized through a database interrogation process.
U.S. Pat. No. 5,706,498 discloses a gene database retrieval system for retrieving, from a gene database, gene sequences similar to given sequence data. The system is capable of storing the sequence data of genes whose structures or sequences have been analyzed and identified. The system includes a dynamic programming operation unit for determining the degree of similarity between target data and key data, using the base sequence data of a gene from the gene database as the target data and the base sequence data supplied for retrieval as the key data, and further contains a central processing unit for allowing access to the gene database in parallel with the operation for determining the degree of similarity. U.S. Pat. No. 5,706,498 merely provides an in silico database retrieval tool and does not teach or even suggest identification of mRNA molecules in cellular systems.
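By way of illustration only, the kind of dynamic programming comparison between key data and target data referred to above can be sketched as a generic global alignment score of the Needleman-Wunsch type. This is not asserted to be the algorithm of U.S. Pat. No. 5,706,498; the scoring values (match +1, mismatch -1, gap -2) and the names `similarity_score` and `retrieve` are assumptions introduced for this sketch.

```python
# Generic dynamic-programming similarity between a key sequence and each
# target sequence in a small in-memory "database".
# Assumptions: Needleman-Wunsch-style global alignment, linear gap penalty,
# illustrative scoring values; none of these are taken from the cited patent.


def similarity_score(key, target, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of key against target."""
    # prev[j] holds the best score aligning the processed key prefix
    # with the first j bases of target.
    prev = [j * gap for j in range(len(target) + 1)]
    for i in range(1, len(key) + 1):
        curr = [i * gap]
        for j in range(1, len(target) + 1):
            pair = match if key[i - 1] == target[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align the two bases
                            prev[j] + gap,        # gap in target
                            curr[j - 1] + gap))   # gap in key
        prev = curr
    return prev[-1]


def retrieve(key, database, top=3):
    """Return the `top` (score, name) pairs with the highest similarity."""
    scored = [(similarity_score(key, seq), name) for name, seq in database.items()]
    return sorted(scored, reverse=True)[:top]


if __name__ == "__main__":
    database = {"g1": "ACGT", "g2": "ACGA", "g3": "TTTT"}
    print(retrieve("ACGT", database, top=2))
```

Only two rows of the scoring table are kept at a time, so memory grows with the length of the target sequence rather than with the product of the two lengths.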
U.S. Pat. No. 5,856,928 discloses a system for characterizing and interpreting nucleotide and amino acid sequences. Natural numbers are assigned to represent DNA and mRNA nucleotide bases (n-numbers 0, 1, 2, 3), base pairing numbers in RNA (p-numbers 0, 1, 2, 3), and amino acids in protein (z-numbers comprising seventeen prime numbers and the odd numbers 1, 25 and 45, all smaller than 64). Gene and protein sequences may be represented, characterized and interpreted by their specific n-sums and z-sums. The system disclosed in U.S. Pat. No. 5,856,928 is in fact a representational scheme that facilitates in silico computation and characterization of nucleotide and amino acid sequences. This system cannot provide mRNA identification in cellular systems.
Nowhere in the background art is it taught or suggested that mRNA may be identified by utilizing the putative transcription activity. Moreover, there is an unmet need to measure RNA through its natural role, namely as a template for protein production, rather than through reverse transcription and/or hybridization techniques.