Non-coding RNA (ncRNA) is a general term for RNAs that do not encode protein, and ncRNAs are roughly divided into house-keeping RNAs and regulatory RNAs. ncRNAs vary in length, and ncRNA molecules of fewer than 200 bases are called small RNAs.
Known examples of house-keeping RNAs include ribosomal RNA (rRNA); transfer RNA (tRNA); small nuclear RNA (snRNA), which is involved in splicing; and small nucleolar RNA (snoRNA), which is involved in modification of rRNA.
In recent years, regulatory RNAs have been attracting attention as factors with important biological functions. It is becoming clear that regulatory RNAs regulate gene expression and the intracellular distribution of RNAs, and that they play important roles in a gene expression-suppressing mechanism. The gene expression-suppressing mechanism in which regulatory RNAs function is called RNA interference (RNAi). This mechanism was revealed by experiments using C. elegans in 1998, and the presence of similar mechanisms in Drosophila and mammalian cells was revealed thereafter. ncRNAs acting as regulatory RNAs have a chain length of about 20 to 25 bases, and their action mechanisms can be roughly divided into translational repression by microRNA (miRNA), and gene silencing by small interfering RNA (siRNA) through cleavage of a target mRNA and heterochromatinization of a target DNA region.
A miRNA is transcribed from genomic DNA as an RNA (precursor) having a hairpin-like structure. This precursor is cleaved by dsRNA-cleaving enzymes having RNase III cleavage activity (Drosha, Dicer), and is converted into a double-stranded form and then into single strands. It is thought that the antisense strand, which is one of the two strands, is incorporated into a protein complex called RISC, and that RISC is involved in suppression of translation of mRNA. Thus, a miRNA takes various forms in the various stages after its transcription. Therefore, when targeting (detecting) a miRNA, various forms including the hairpin structure, double-stranded structure, and single-stranded structure need to be taken into account. A miRNA is an RNA of 15 to 25 bases, and the presence of miRNAs has been confirmed in various organisms.
In recent years, it has been suggested that a large number of miRNAs are present not only in cells, but also in body fluids such as serum, plasma, urine, and spinal fluid, which are samples (biological samples) containing no cells, and that the expression levels of those miRNAs could serve as biomarkers for various diseases including cancers. As of June 2014, there were no fewer than 2500 kinds of miRNAs in humans (miRBase release 20) and, when a gene expression assay system such as a highly sensitive DNA microarray is used, expression of more than 1000 kinds of miRNAs among them can be detected simultaneously in serum or plasma. Thus, many studies are being carried out to find biomarkers in body fluids such as serum/plasma, urine, and spinal fluid using DNA microarrays.
On the other hand, it is well known that, when gene expression analysis is carried out using a DNA microarray, the obtained data will include some errors depending on the sample, experimenter, and experimental conditions. Thus, methods of correcting measured values that include such errors have been examined. Methods often used for the correction of the data are based on the principle that, when the measured expression levels of a plurality of genes are treated as a single cluster and regarded as a gene expression data group, the expression level of that group as a whole does not differ among samples. Examples of such methods include the global normalization method, quantile method, lowess method, and 75 percentile method. However, these correction methods have the drawback that they can be used only when comprehensive detection of more than a certain number of genes is carried out.
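As an illustration of the principle described above, the following is a minimal sketch of two of the named approaches, global normalization and the quantile method, assuming that each sample is given as a mapping from gene name to a positive signal value and that all samples contain the same genes. The function names and the choice of the median as the representative value are assumptions made for this example, not details taken from any cited publication.

```python
from statistics import mean, median


def global_normalize(samples):
    """Scale each sample so that its median signal matches a common target.

    samples: {sample_name: {gene_name: signal}} (hypothetical layout).
    The target is the median of the per-sample medians (an assumption).
    """
    medians = {name: median(values.values()) for name, values in samples.items()}
    if any(m <= 0 for m in medians.values()):
        raise ValueError("each sample needs a positive median signal")
    target = median(medians.values())
    return {
        name: {gene: v * target / medians[name] for gene, v in values.items()}
        for name, values in samples.items()
    }


def quantile_normalize(samples):
    """Give every sample the same signal distribution.

    The k-th smallest value in each sample is replaced by the mean of the
    k-th smallest values across all samples. Ties are broken by gene order,
    which is a simplification.
    """
    names = list(samples)
    genes = list(samples[names[0]])
    sorted_cols = [sorted(samples[n][g] for g in genes) for n in names]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(len(genes))]
    result = {}
    for n in names:
        order = sorted(genes, key=lambda g: samples[n][g])
        result[n] = {g: rank_means[i] for i, g in enumerate(order)}
    return result
```

Both functions rely on the whole set of detected genes to define the common scale, which is why such methods are unsuitable when only a few genes are measured.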
Alternatively, there are methods in which particular genes (for example, beta-actin and GAPDH) whose expression levels are the same among samples are used to correct the data from each sample such that the measured values of those particular genes become a constant value.
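The reference-gene correction described above can be sketched as follows, assuming linear-scale signals and a single reference gene. The function name, the data layout, and the default of using the mean reference signal as the constant value are assumptions made for this example.

```python
def correct_by_reference(samples, reference="GAPDH", target=None):
    """Rescale each sample so that the reference gene reads the same value.

    samples: {sample_name: {gene_name: signal}} (hypothetical layout).
    target:  constant value for the reference gene; defaults to the mean
             reference signal across samples (an assumption).
    """
    ref_values = [values[reference] for values in samples.values()]
    if any(v <= 0 for v in ref_values):
        raise ValueError("reference signal must be positive in every sample")
    if target is None:
        target = sum(ref_values) / len(ref_values)
    return {
        name: {gene: v * target / values[reference] for gene, v in values.items()}
        for name, values in samples.items()
    }
```

The correction is only as good as the assumption that the reference gene is expressed at the same level in every sample.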
Also, when small RNAs are analyzed with a DNA microarray, correction methods used for gene expression analysis, such as the global normalization method, quantile method, lowess method, or 75 percentile method described above, are used. However, those methods cannot be used when only particular genes are to be detected. On the other hand, as methods of performing the correction such that the expression levels of particular genes become a constant value, methods have been proposed in which, among the small RNAs expressed in samples, housekeeping RNAs (U1 snoRNA, U2 snoRNA, U3 snoRNA, U4 snoRNA, U5 snoRNA, U6 snoRNA, 5S rRNA, and 5.8S rRNA) are used for the correction (JP 2007-75095 A and JP 2007-97429 A).
In JP 2007-75095 A and JP 2007-97429 A, in detection of a miRNA, which is a small RNA, the miRNA detection results are corrected such that the detection value of 5S rRNA detected simultaneously becomes a constant value across all samples. However, there is no guarantee that the expression level of 5S rRNA is constant among the samples.
In JP 2014-007995 A, in detection of a miRNA, which is a small RNA, mRNAs are detected simultaneously, and their representative value is used to correct the miRNA detection results. That method, too, is applicable only when more than a certain number of mRNAs are detected and the values of the detected mRNAs can be assured to follow a normal distribution.
Thus, for correction of errors in the expression levels among experiments, methods have been proposed in which a nucleic acid standard substance is used in the course of the experiments, and the detected abundance of the standard substance is used to correct the errors among the experiments (JP 2011-239708 A, JP 5229895 B and US 2010/0184608 A). JP 2011-239708 A and JP 5229895 B propose the sequence of the nucleic acid standard substance, the sequence of the nucleic acid probe for detection of the standard substance, and how to design them, and show the accuracy in the amplification step and the detection step so that the performance of the detection methods can be evaluated therewith. However, those publications do not actually show correction of errors among experiments including the step of extraction of nucleic acid from samples. US 2010/0184608 A also shows the sequence of a nucleic acid standard substance for correction of errors in the detection values of gene expression in samples. However, since that sequence is used in the step after amplification of nucleic acid, it merely allows correction of errors in the amplification step among experiments.
That is, the methods shown in JP 2011-239708 A, JP 5229895 B and US 2010/0184608 A enable evaluation of the accuracy and correction of errors in the measurement step, including amplification and detection of nucleic acid, only when a sufficient amount of nucleic acid is extracted from the samples and the nucleic acid used can be quantified with high accuracy. However, when correction of measurement results is actually carried out among experiments, especially when small amounts of samples are used or when a body fluid is used as the sample, the amount of the target small RNA extracted is very small, and high-accuracy measurement is impossible because so little small RNA is available. Thus, correction by such methods is substantially impossible. It is therefore very important to carry out correction of errors among experiments including not only the step of detection of a small RNA, but also the step of its extraction from samples.
In view of the above, as methods of evaluating/correcting errors among experiments including the step of extraction from samples, methods using a standard substance have been studied. To date, correction using a standard substance that is a nucleic acid having a base length similar to those of small RNAs has been studied. For example, methods have been proposed in which a short RNA having a base length of about 20 bases, which is similar to those of miRNAs, is used as a standard substance, as shown in Nobuyoshi Kosaka (ed.), “Circulating MicroRNAs: Methods and Protocols (Methods in Molecular Biology)”, pp. 1-10, Humana Press, New York (2013). In those methods, a predetermined amount of this short RNA is added to the samples before extraction is carried out, so that errors in the step of extraction of a target small RNA can be corrected in each experiment.
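The spike-in idea described above can be sketched as follows, assuming that signals are already log2-transformed and that the same amount of standard substance was added to every sample before extraction. The function name, the key "spike_in", and the use of the mean spike-in signal as the common reference are assumptions made for this example, and the sketch says nothing about whether a given standard substance is recovered stably.

```python
def spike_in_correct(log2_signals, spike_in="spike_in"):
    """Shift each sample so that its spike-in signal matches a common level.

    log2_signals: {sample_name: {rna_name: log2 signal}} (hypothetical layout),
                  where each sample includes an entry for the spike-in.
    Returns the corrected target signals without the spike-in entry.
    """
    spike = {name: values[spike_in] for name, values in log2_signals.items()}
    # Common reference level: mean log2 spike-in signal (an assumption).
    target = sum(spike.values()) / len(spike)
    corrected = {}
    for name, values in log2_signals.items():
        # A sample with lower recovery has a lower spike-in signal,
        # so its target signals are shifted upward by the same amount.
        offset = spike[name] - target
        corrected[name] = {
            rna: x - offset for rna, x in values.items() if rna != spike_in
        }
    return corrected
```

Because the offset is estimated from a single added RNA, any sample-dependent variation in how that RNA is extracted carries over directly into the corrected values.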
For comparative analysis of the expression levels of target small RNAs among samples, correction of errors in the experimental conditions among the samples, especially correction of the difference in the extraction efficiency in the step of extraction of nucleic acid from the samples, is necessary. Although global normalization and normalization methods using house-keeping RNAs have been commonly used so far, they have drawbacks when applied to small RNAs, as described above: they require comprehensive detection of a large number of small RNAs, and there are no house-keeping RNAs whose constant expression can be assured among samples. Thus, those methods cannot be said to be effective for comparative analysis.
As described above, for the correction methods using a standard substance, correction using a nucleic acid standard substance having a base length similar to those of the target small RNAs has been studied. However, when an RNA having a base length similar to those of the target small RNAs is actually used as a standard substance, especially when a body fluid is used as the samples, the efficiency of extraction of the standard substance from the samples is unstable due to the influence of various conditions of the samples and various impurities contained therein, which results in instability of the measured values, and thus the accuracy cannot be secured. Therefore, the methods could not be used to correct measurement results among experiments.
Thus, to date, there has been no effective correction method utilizing a standard substance for comparative analysis of the expression levels of target small RNAs extracted from each sample, which method allows accurate correction of the measured values of the expression levels among the samples.
The Applicant hereby incorporates by reference the sequence listing contained in the ASCII text file titled SequenceListing.txt, created May 18, 2017 and having 8.60 KB of data.