The teachings of all of the references cited herein are incorporated in their entirety by reference. An understanding of the biological role of any gene comes only after observing the phenotypic consequences of altering the function of that gene in a living cell or organism. RNA interference (RNAi) is a well-established experimental technology for silencing gene expression both in cultured eukaryotic cells and in living organisms. RNAi can also be used as gene therapy for treating viral infections, cancer, vascular diseases and other diseases in which the down-regulation of a polypeptide would ameliorate the disease. RNAi induces the sequence-specific degradation of a single mRNA species by short interfering RNA (siRNA, a double-stranded small interfering RNA), which is believed to be processed in vivo by the highly conserved Dicer family of RNase III enzymes. The process includes: 1) delivery of homologous double-stranded RNAs (dsRNAs) to the cytoplasm of a cell; 2) cleavage of the dsRNA by the RNase III-like enzyme Dicer into 21-23 bp siRNAs; 3) incorporation of the siRNA into a protein complex, the RNA-induced silencing complex (RISC); and 4) guidance of the RISC by the antisense strand of the duplex siRNA to the homologous mRNA, where the RISC-associated endoribonuclease cleaves the target mRNA, resulting in silencing of the target gene.
As used herein, an siRNA is also an interfering nucleic acid (iNA), as some of these interfering nucleic acids may contain a deoxyribonucleic acid placed in the iNA to inhibit RNA nucleases. The siRNA molecule of interest can be synthesized in vitro by chemical and enzymatic (for example, by using the enzyme Dicer) methods. siRNAs can also be synthesized in vivo. When a synthetic oligonucleotide is cloned into an siRNA expression vector with an RNA polymerase III promoter (including the U6, human H1, and tRNA promoters), or a polymerase II promoter with a minimal poly(A) signal sequence, the siRNA can be transcribed in vivo. Typically a single promoter is used to express a short hairpin RNA (shRNA) sequence, although two tandem polymerase III promoters have also been used to transcribe the sense and antisense siRNA sequences. In addition to plasmid-based systems, PCR-derived siRNA expression cassettes based on the single-promoter system are an alternative format for suppressing transfected gene activity.
The efficiency of siRNA-mediated gene silencing is affected by many parameters. An important limiting factor is that only about 25% of selected target siRNA sequences are functional, owing to factors such as secondary structures, non-gene-specific reactions and other unknown factors. Several synthetic siRNAs therefore need to be generated and tested for every target gene, which makes it very expensive and time-consuming to identify a suitable iNA construct. Another difficulty is that an iNA sequence may have enough complementarity to the sequence of a second, unintended RNA to affect that RNA as well. This is the so-called off-target effect. Furthermore, finding the best siRNA binding sites (RNAi drug targets) is very challenging. To resolve the off-target phenomenon, extensive studies have been done on selecting specific target sequences for siRNA. The current practice for designing effective siRNAs is to use algorithms based on sequence-efficacy correlations. Although these criteria significantly increase the chance of achieving gene silencing, there are many highly effective siRNA sequences that are not identified by the current algorithms because different genes have different sequence preferences. To ensure that the best iNAs are identified, an siRNA library constructed from cDNA or DNA offers a better way to search for sequences that have the best potential silencing effect. Moreover, such an siRNA library can be a useful research tool in functional genomics, and useful for screening RNAi therapeutic targets in a high-throughput manner.
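The cost implied by the approximately 25% success rate noted above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes that each candidate siRNA is functional independently with the same probability, which is a simplifying assumption and is not stated in the cited work, and the function names are hypothetical.

```python
import math


def prob_at_least_one_functional(n_candidates, p_functional=0.25):
    """Chance that at least one of n tested candidates is functional.

    Assumes independent candidates with equal success probability
    (an illustrative assumption, not a measured property).
    """
    if n_candidates < 0 or not 0.0 < p_functional < 1.0:
        raise ValueError("need n_candidates >= 0 and 0 < p_functional < 1")
    return 1.0 - (1.0 - p_functional) ** n_candidates


def candidates_needed(target_prob, p_functional=0.25):
    """Smallest number of candidates giving at least target_prob of one hit."""
    if not 0.0 < target_prob < 1.0 or not 0.0 < p_functional < 1.0:
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - p_functional))


if __name__ == "__main__":
    for n in (1, 4, 9):
        print(n, round(prob_at_least_one_functional(n), 3))
    print("candidates for a 90% chance:", candidates_needed(0.9))
```

Under these assumptions, several candidates must be synthesized and tested per target gene before a functional one is likely to be found, which is consistent with the expense described above.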
Recently, siRNA library approaches (such as whole genomic, gene-specific or domain-specific siRNA libraries) have become a powerful tool for screening RNAi therapeutic targets. In all those approaches, efforts to generate siRNA sequences having an appropriate length have used MmeI, a type II restriction enzyme, by which a maximum of 20 nucleotides can be generated. However, an siRNA having a maximum length of 20 bp cannot completely mimic the cleavage product, having a length of 21-23 bp, produced by the RNase III-like enzyme Dicer. This can result in the best siRNA target sites being underrepresented. In these approaches, the double-stranded (ds) cDNA is randomly cleaved into small fragments by DNase I (some use restriction enzymes for fragmentation, but representation is then an issue) and subsequently ligated to an artificial loop-anchor which contains an MmeI type II restriction site, then digested with the MmeI restriction enzyme, which cuts 18-20 nucleotides away from its recognition site. Through complex and multiple process steps including a second anchor ligation, loop extension and PAGE purifications, the ds-cDNA is then converted into a 20-nt palindromic structure with a loop (shRNA) and finally cloned into an siRNA expression vector with an RNA polymerase III promoter. An iNA of 20 nucleotides is the longest that can be generated using the type II restriction enzymes.
However, this approach has a number of drawbacks, including the following: 1) An shRNA library cannot be generated by PCR because of its palindromic structure. The complicated steps, together with heavy cDNA loss over multiple process steps, make this approach difficult, if not impossible, to develop into a high-throughput tool for functional genomics and for siRNA therapeutic target screening. 2) A palindromic structure is unstable during cloning in E. coli. This can lead to a reduction in library complexity and the potential loss of the best therapeutic target sites. 3) An iNA having a maximum of 20 bp cannot completely mimic the cleavage products having 21 to 23 bp produced by an RNase III enzyme like Dicer.
An attempt to construct an siRNA library from cDNA using PCR has recently been reported. In this system, the dsRNAs corresponding to the cDNA of interest are prepared by T7 RNA polymerase-mediated transcription from DNA templates flanked by a T7 RNA polymerase promoter, followed by annealing. The dsRNAs are then digested with cloned human Dicer in vitro, yielding 21-23 bp siRNAs. A modified bacterial RNase III can be used in place of Dicer, but the generated siRNA is 20-25 bp. Cleavage products are denatured, purified by PAGE and dephosphorylated. RNA adapters are subsequently attached to the 3′- and 5′-ends of the cleavage products by T4 RNA ligase. The RNAs are then converted into dsDNA by RT-PCR using primers complementary to the adapters. After digestion with appropriate restriction enzymes, the 21-23 bp siRNAs corresponding to the cDNA fragments are ligated into an siRNA expression vector having the dual RNA polymerase III promoters, U6 and H1. For a description of the U6 and H1 promoters see US patent application publication no. 20050064489. By taking advantage of PCR, this approach can tolerate a heavy loss of starting material over multiple process steps and still generate enough molecules for cloning. Another advantage is that, by using RNA fragmentation with Dicer, distinct random pools of iNAs 21-23 bp in length can be generated. However, the cDNA-RNA-cDNA conversion process steps are obviously complex, and even more complicated than the shRNA library construction described above. Furthermore, RNA degradation during the multiple process steps (e.g., T7 RNA polymerase-mediated DNA-to-RNA transcription, Dicer digestion, RNA PAGE purification, dephosphorylation and anchor ligation, as well as RT-PCR) is unavoidable, which may result in the loss of some of the best siRNA target sites.
Another attempt at siRNA library construction is based on DNase I digestion. In this approach, ds-cDNA is partially digested with DNase I, followed by PAGE gel purification to isolate DNA fragments that are 20-30 bp in length. These fragments are either directly blunt-end cloned into an siRNA expression vector or attached to a PCR anchor by ligation, followed by PCR amplification and subsequent cloning into an siRNA expression vector. This approach sounds much simpler and more straightforward. However, cutting a nucleic acid fragment that is 20-30 bp in length from a PAGE gel is very challenging. Contamination with smaller nucleic acid fragments having a length of less than 16 bp and with larger nucleic acid fragments having a length greater than 30 bp cannot be avoided. iNAs that are too short, having a length of less than 16 bp, do not efficiently downregulate the target RNA. An iNA that has a length greater than 30 bp cannot be transfected into mammalian cells because its introduction into the cells activates the interferon and protein kinase R (PKR) pathways, resulting in nonspecific gene silencing and apoptosis. Such an siRNA library may contain a high frequency of undesirable (“junk”) clones, which may not only drastically impair the overall efficiency of the approach but also seriously compromise the integrity of the data that are generated. Thus, this approach is not ideal for screening for the best siRNA sequence site for functional genomics and RNAi therapeutics.
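The effect of the length limits discussed above on library quality can be sketched as follows. The thresholds (less than 16 bp, greater than 30 bp) are taken from the discussion above; the function names and the example insert lengths are hypothetical and serve only as an illustration, not as measured data.

```python
def classify_fragment(length_bp, min_ok=16, max_ok=30):
    """Label a cloned fragment by length using the limits discussed above."""
    if length_bp < min_ok:
        return "too_short"   # does not efficiently downregulate the target
    if length_bp > max_ok:
        return "too_long"    # risks the interferon/PKR response
    return "usable"


def junk_fraction(lengths_bp):
    """Fraction of clones whose insert length falls outside the usable range."""
    if not lengths_bp:
        raise ValueError("at least one fragment length is required")
    junk = sum(1 for n in lengths_bp if classify_fragment(n) != "usable")
    return junk / len(lengths_bp)


if __name__ == "__main__":
    example_lengths = [12, 21, 22, 35]  # hypothetical insert lengths in bp
    print([classify_fragment(n) for n in example_lengths])
    print("junk fraction:", junk_fraction(example_lengths))
```

In practice such a tally would be made from sequenced clones; it is shown here only to make concrete how contaminating fragment sizes translate into "junk" clones.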
An ideal iNA library, especially a gene-specific library, should contain every site represented by multiple overlapping sequences; the individual sequences should have the widely accepted rational length of 19-23 bp; and the library should be easily and simply amplified by PCR to suit a high-throughput library construction format, thereby accelerating screening for the best siRNA sequence site for functional genomics and RNAi therapeutics. Thus, there is a need for a method to produce a library or pool of iNA constructs having a length of 19-23 bp, which can be produced in a high-throughput manner and which covers the target sequence of an RNA.
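The notion of a library in which every site is represented by multiple overlapping sequences of 19-23 bp can be illustrated computationally. The following is a minimal in silico sketch, not a construction method: it simply enumerates every window of each permitted length along a target sequence and counts how many windows cover each position. The function names, the `step` parameter and the example sequence are hypothetical.

```python
def tile_target(seq, min_len=19, max_len=23, step=1):
    """Enumerate all windows of min_len..max_len bases along seq.

    Returns a list of (start, length, subsequence) tuples.
    """
    windows = []
    for length in range(min_len, max_len + 1):
        for start in range(0, len(seq) - length + 1, step):
            windows.append((start, length, seq[start:start + length]))
    return windows


def coverage_depth(seq_len, windows):
    """Count how many windows overlap each position of the target."""
    depth = [0] * seq_len
    for start, length, _ in windows:
        for i in range(start, start + length):
            depth[i] += 1
    return depth


if __name__ == "__main__":
    target = "ATGGCTAGCTAGGATCCGATCGATTACGCA"  # hypothetical 30-base target
    windows = tile_target(target)
    depth = coverage_depth(len(target), windows)
    print("windows:", len(windows))
    print("minimum coverage per position:", min(depth))
```

With a step of one base, every position of the target is covered by several overlapping candidates, which is the property an ideal gene-specific library is described as having.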