Ribonucleic acid (RNA) is a complex biomolecule made from ribonucleotide building blocks. A ribonucleotide comprises a nucleobase, a five-carbon ribose sugar, and one phosphate group. RNA contains four building blocks: adenylate, guanylate, cytidylate, and uridylate. These four RNA nucleotides contain the four RNA nucleosides adenosine, guanosine, cytidine, and uridine, respectively. RNA transcripts can be found in many cellular forms, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), microRNAs (miRNAs), small interfering RNAs (siRNAs), and mitochondrial RNA. In cells, various RNA molecules play critical roles: for example, they control gene expression, sense and communicate responses to cellular signals, and catalyze biological reactions, among many other functions.
There has been an intense effort to decipher the structure, function, and regulatory networks of the human genome. After sequencing the human genome, scientists have undertaken the immense task of identifying the information present in the genome and, in particular, of identifying and characterizing the functional DNA sequences that are implicated in disease and genetic diversity. The project termed Encyclopedia of DNA Elements (ENCODE) has enlisted 32 groups around the world to identify regions of the human genome that are responsible for gene regulation. One of the valuable contributions of the ENCODE project will be to help make sense of Genome-Wide Association Studies (GWAS). Several well-documented GWAS have shown that specific genetic mutations are linked with disease risk. However, many of these mutations (90%) were found in non-protein-coding DNA regions, which, until the ENCODE project, left researchers guessing as to what might cause the disease or how the mutations could be counteracted. The ENCODE project has revealed that many of the disease-linked regions of the genome include enhancers and other functional sequences, and scientists are now beginning to understand the role of these enhancers and functional sequences in disease. Some of these important “non-coding” regions are ultimately transcribed into RNA, some of which are now known to be important regulators of gene expression. This regulation often occurs through structural elements that affect recognition by specific RNA binding proteins.
However, the cells used to gather results in the ENCODE project have come predominantly from a small, select number of cell lines. There are literally thousands of additional cell types that will need to be interrogated, and orders of magnitude more genetic sequences, particularly RNA, that will need to be examined once their significance in gene expression regulation has been determined. As yet, there are very few techniques to rapidly and sensitively map the topography of RNA structures for determination of function in gene regulation. The lag in RNA structure characterization techniques will further slow the discovery process that will lead to the understanding of RNA function and its regulatory elements impacting gene expression across the entire genome.
Protein-nucleic acid interactions are involved in many cellular functions, including transcription, RNA splicing, mRNA decay, and mRNA translation. Readily accessible synthetic molecules that can bind with high affinity to specific sequences of single- or double-stranded nucleic acids have the potential to interfere with these interactions in a controllable way, making them attractive tools for molecular biology and medicine. Successful approaches for blocking the function of target nucleic acids include the use of duplex-forming antisense oligonucleotides or chemically modified oligonucleotide-like derivatives. In addition to specific RNA structures, the accessibility of different regions of the RNA was recently shown to be important in several processes, such as the ability of microRNAs to bind their targets, control of translation speed, and control of translation initiation. Gaining knowledge and an appreciation of RNA structure in three dimensions may also be critical for the development and understanding of RNA-based molecules, which may find great utility in a wide range of biotechnological applications, including the rational design of biological and molecular sensors that may be useful in the treatment and monitoring of disease. Some of these applications may also provide a greater understanding of the interrelationship between nucleic acid structure and the effects of pH, analytes, and proteins.
Nuclear Magnetic Resonance (NMR) spectroscopy is a powerful analytical technique used to determine qualitative and quantitative information about organic molecules. NMR has been used to solve and provide valuable information about the structure of a variety of chemical and biological molecules, ranging from small organic compounds to complex polymers such as proteins and nucleic acids. In NMR, a sample is placed in a magnetic field and is subjected to radiofrequency (RF) excitation at a characteristic frequency called the Larmor frequency (f):
$$f = \frac{\gamma}{2\pi} B_0$$
where γ is the gyromagnetic ratio of the nuclei and B₀ is the magnetic field strength. The nuclei in the magnetic field absorb the energy provided and become energized. The frequency of the radiation necessary for absorption depends on the type of nuclei to be excited (e.g., ¹H, ¹³C, or ¹⁵N). The frequency will typically also depend on the chemical environment of the nucleus (e.g., the presence of various electronegative chemical groups, salts, the pH of the solution, and the presence of binding agents). Lastly, the frequency may also depend on the spatial location in the magnetic field if the magnetic field is not uniform, i.e., if the field is not homogeneous.
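As a rough numerical illustration of the Larmor relation f = (γ/2π)·B₀, the following minimal Python sketch evaluates the frequency for the three nuclei named above. The gyromagnetic ratios are the values quoted elsewhere in this background section; the function name `larmor_frequency_hz` and the 14.1 T field strength are assumptions chosen only for illustration.

```python
import math

# Gyromagnetic ratios in rad s^-1 T^-1, as quoted in this background section.
GYROMAGNETIC_RATIO = {
    "1H": 26.75e7,
    "13C": 6.73e7,
    "15N": -2.71e7,  # negative sign: precession in the opposite sense
}


def larmor_frequency_hz(nucleus: str, b0_tesla: float) -> float:
    """Return the magnitude of f = (gamma / 2*pi) * B0 in Hz.

    Hypothetical helper for illustration; the sign of gamma is dropped
    because only the magnitude of the frequency is of interest here.
    """
    gamma = GYROMAGNETIC_RATIO[nucleus]
    return abs(gamma) / (2.0 * math.pi) * b0_tesla


if __name__ == "__main__":
    b0 = 14.1  # tesla; illustrative value, not taken from the text
    for nucleus in GYROMAGNETIC_RATIO:
        f_mhz = larmor_frequency_hz(nucleus, b0) / 1e6
        print(f"{nucleus}: {f_mhz:.1f} MHz")
```

Under these assumptions, a 14.1 T magnet corresponds to a proton frequency of roughly 600 MHz, which is why spectrometers are commonly labeled by their ¹H frequency rather than by field strength.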
The use of chemical shifts as a new, abundant source of structure and dynamics information is arguably more important for nucleic acid structure determination than for proteins. NMR structure determination of nucleic acids traditionally suffers from a shortage of accessible inter-proton distance constraints derived from the nuclear Overhauser effect (NOE) that can be applied towards structure characterization. This problem is compounded by a high degree of flexibility, particularly in RNA, which can complicate the interpretation of NOE-derived distance constraints.
An inherent obstacle in NMR structure characterization of biomolecules is the relatively poor sensitivity of the NMR procedure. The NMR signal-to-noise (S/N) ratio of biomolecules is impacted by the relatively low natural abundance of ¹⁵N (0.365%) and ¹³C (1.108%) and by their gyromagnetic ratios (6.73 × 10⁷ and −2.71 × 10⁷ rad s⁻¹ T⁻¹ for ¹³C and ¹⁵N, respectively) being markedly lower than that of protons (26.75 × 10⁷ rad s⁻¹ T⁻¹). The S/N can be approximated by the equation:
$$S/N \propto n \, \gamma_e \sqrt{\gamma_d^{3} B_0^{3} t}$$
where n is the number of nuclear spins being observed, γe is the gyromagnetic ratio of the spin being excited, γd is the gyromagnetic ratio of the spin being detected, B₀ is the magnetic field strength, and t is the experiment acquisition time. Other factors that are involved in S/N are the probe filling factor (i.e., the fraction of the coil detection volume filled with sample) and various other probe and receiver factors that are typically approximately equivalent for equipment built in the same period of time. It follows that the highest-field instrument available provides the best sensitivity. For fixed t, obtaining an NMR spectrum with identical S/N would require about 20.5 times as much material on a 100 MHz NMR spectrometer as on a 750 MHz spectrometer: N100/N750 = (750/100)^(3/2) ≈ 20.5. In high-resolution (i.e., atomic resolution of approximately 1-5 Å) NMR mapping and structure characterization of biological molecules, such as RNA and DNA, the only feasible way to obtain a sufficiently resolved spectrum using chemical shift data is to increase the applied field (i.e., magnetic field strength and radiofrequency excitation). The NMR experiment consists of multiple cycles of pulsing, detection, and repetition delay. At high magnetic fields (600 MHz and higher), a repetition delay of a few seconds is necessary for typical biomolecules of interest to restore the perturbed nuclear magnetization to its initial state before the next cycle. Since pulsing and detection combined normally take 80-150 milliseconds, most of the NMR experiment time is spent on the repetition delay.
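The two pieces of arithmetic in the preceding paragraph can be checked with a short Python sketch. It assumes the same nuclei and the same acquisition time at both fields, so that the S/N relation reduces to S/N ∝ n·B₀^(3/2), and it uses the spectrometer's proton frequency in MHz as a stand-in for B₀ since the two are proportional. The function names and the 2 s repetition delay are illustrative assumptions; the text states only "a few seconds".

```python
def sample_ratio_for_equal_sn(low_field_mhz: float, high_field_mhz: float) -> float:
    """Return n_low / n_high needed for equal S/N at fixed acquisition time.

    With the nuclei and t held fixed, S/N scales as n * B0**1.5, so the
    lower-field instrument needs (B_high / B_low)**1.5 times more spins.
    """
    return (high_field_mhz / low_field_mhz) ** 1.5


def repetition_delay_fraction(pulse_and_detect_s: float, repetition_delay_s: float) -> float:
    """Return the fraction of each cycle spent waiting in the repetition delay."""
    return repetition_delay_s / (pulse_and_detect_s + repetition_delay_s)


if __name__ == "__main__":
    ratio = sample_ratio_for_equal_sn(100.0, 750.0)
    print(f"Material needed at 100 MHz relative to 750 MHz: {ratio:.1f}x")

    # 0.08-0.15 s of pulsing plus detection is taken from the text;
    # the 2.0 s delay is an assumed example of "a few seconds".
    for active_s in (0.08, 0.15):
        frac = repetition_delay_fraction(active_s, 2.0)
        print(f"active {active_s:.2f} s, delay 2.0 s -> {frac:.0%} of cycle idle")
```

Under these assumptions, more than 90% of each cycle is spent in the repetition delay, which is the inefficiency the paragraph above describes.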
The ENCODE project data to date indicate that a simple, high-throughput nucleic acid structure analysis method and device may help to alleviate the pressing need to link RNA structure to cellular function within the plethora of identified and as yet unidentified RNA molecules that may hold the key to resolving the pathogenesis of many important diseases. There remains a long-felt and unmet need to resolve these dynamic nucleic acid conformations as a means to yield structural information that may lead to the rational design of targeted, biologically active compounds. One of the barriers to rapid dissemination of RNA structure resides in the lack of customizable, relatively inexpensive, and high-throughput processes and devices for NMR analysis of RNA molecules. The understanding of the three-dimensional structure of RNA and DNA will certainly apply to drug discovery, but perhaps still more significant applications await, such as identifying the effects of nucleic acid mutations on structure, function, and downstream gene regulation.
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.