The present invention relates generally to the fields of gene expression, gene therapy, and genetic immunization.
The expression of a protein gene product is influenced by many factors, including gene copy number, gene integration site or gene location in the genome, transcription factors, mRNA stability, and translation efficiency. For example, the expression of the human immunodeficiency virus-1 (HIV-1) structural genes gag, pol, and env is dependent on the Rev/Rev-responsive element (RRE) at a posttranscriptional level. This dependency on Rev is a limiting factor for gene expression. In addition, highly stable RNA secondary structures that form in various regions of the HIV RNA transcript can block or otherwise interfere with ribosome movement, and thus effectively limit translation. Formation of stable RNA secondary structures in gene transcripts is a general phenomenon that can limit the translational yield of many protein gene products for a wide variety of genes.
Kim et al., 1997, Gene, 199:293-301, which is incorporated herein by reference, optimized expression of human erythropoietin (EPO) in mammalian cells by replacing the codons encoding the leader sequence and the first 6 amino acids of the mature EPO protein with the most prevalently used yeast codons, and replacing the codons encoding the rest of the EPO protein with the most prevalently used human codons.
U.S. Pat. Nos. 5,972,596 and 5,965,726 (Pavlakis et al.), which are incorporated herein by reference, describe methods of locating an inhibitory/instability sequence or sequences (INS: sequences that render an mRNA unstable or poorly utilized/translated) within the coding region of an mRNA and modifying the gene encoding the mRNA to remove the inhibitory/instability sequences with clustered nucleotide substitutions.
There is a need for new methods of expressing proteins and methods of increasing the level of protein expression of therapeutic and immunogenic transgenes. There is a need for methods of increasing the translational yields of any protein gene product. There is a need for methods of overcoming the limitations imposed by RNA secondary structure in RNA transcripts upon the ultimate level of protein expression of any gene. The present invention is directed to addressing these and other needs.
The present invention provides methods of producing protein in a recombinant expression system that comprises translation of mRNA transcribed from a heterologous DNA sequence in the expression system, said method comprising the steps of predicting the secondary structure of mRNA transcribed from a native heterologous DNA sequence; modifying the native heterologous DNA sequence to produce a modified heterologous DNA sequence wherein mRNA transcribed from the modified heterologous DNA sequence has a secondary structure having increased free energy compared to that of the secondary structure of the mRNA transcribed from the native heterologous DNA sequence; and using the modified heterologous DNA sequence in the recombinant expression system for protein production. The recombinant expression system may be a cell-free in vitro transcription and translation system, an in vitro cell expression system, a DNA construct used in direct DNA injection, or a recombinant vector for delivery of DNA to an individual. The secondary structure of the mRNA transcribed from a native heterologous DNA sequence may be predicted using a computer and computer program. The native heterologous DNA sequence may be modified by increasing the AT content of the coding sequence, in particular, at the 5′ end of the coding sequence, or at the 5′ end of the coding sequence within 200, 150, or 100 nucleotides from the initiation codon.
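By way of illustration only, the modification step described above might be sketched as follows. The sketch assumes, as one possible approach not stated in this summary, that AT content is raised by synonymous codon substitution under the standard genetic code, so that the encoded protein is unchanged. The function names `enrich_at_5prime` and `translate` and the default `window` of 200 nucleotides are hypothetical. The sketch does not predict secondary structure or free energy; that step would be performed separately with a structure prediction program.

```python
# Illustrative sketch only: raise AT content near the 5' end of a coding
# sequence by synonymous codon substitution (an assumed approach).

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))  # standard genetic code

# Group codons by the amino acid (or stop signal) they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def at_count(codon):
    """Number of A or T bases in a codon."""
    return sum(base in "AT" for base in codon)


def translate(cds):
    """Translate a coding sequence, codon by codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def enrich_at_5prime(cds, window=200):
    """Return a coding sequence with AT-rich synonymous codons substituted
    within the first `window` nucleotides. The initiation codon is kept."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if i > 0 and i + 3 <= window:
            # Prefer the highest AT count; on a tie, keep the original codon.
            codon = max(
                SYNONYMS[CODON_TABLE[codon]],
                key=lambda c: (at_count(c), c == codon),
            )
        out.append(codon)
    return "".join(out)


if __name__ == "__main__":
    native = "ATGGCCCTGCGGTAA"  # made-up example encoding M-A-L-R-stop
    modified = enrich_at_5prime(native, window=200)
    print(modified)
    print(translate(native) == translate(modified))
```

Always selecting the most AT-rich synonym is a simplification. In practice, candidate sequences would be compared by the predicted free energy of their transcripts, and codon usage in the host could also be taken into account.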
The present invention also provides injectable pharmaceutical compositions comprising a nucleic acid molecule that includes a modified coding sequence encoding a protein operably linked to regulatory elements, wherein the modified coding sequence comprises a higher AT or AU content relative to the AT or AU content of the native coding sequence, and further comprising a pharmaceutically acceptable carrier. The encoded proteins may be immunogens or non-immunogenic therapeutic proteins. The modifications may be within the first 100 to 200 bases of the coding sequence, within stretches of sequences dispersed throughout the coding sequence, or within the last 100 to 200 bases.
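For illustration, the comparison of AT or AU content between a native and a modified coding sequence might be computed as in the following sketch. The function names `at_fraction` and `compare_at_content` and the region size `n` are hypothetical, and the example sequences are made up. The sketch accepts either DNA (T) or RNA (U) sequences.

```python
# Illustrative sketch only: compare AT/AU content of a native and a modified
# coding sequence over the first n bases, the last n bases, and the whole.

def at_fraction(seq):
    """Fraction of bases that are A, T, or U (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "ATU" for base in seq) / len(seq)


def compare_at_content(native, modified, n=200):
    """Return (native, modified) AT/AU fractions for each region."""
    regions = {
        "first": (native[:n], modified[:n]),
        "last": (native[-n:], modified[-n:]),
        "whole": (native, modified),
    }
    return {
        name: (at_fraction(nat), at_fraction(mod))
        for name, (nat, mod) in regions.items()
    }


if __name__ == "__main__":
    native = "ATGGCCCTGCGGTAA"    # made-up example sequence
    modified = "ATGGCTTTAAGATAA"  # same protein, AT-rich synonymous codons
    for region, (nat, mod) in compare_at_content(native, modified, n=6).items():
        print(region, round(nat, 3), round(mod, 3), mod > nat)
```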
The present invention also provides recombinant viral vectors comprising a nucleic acid molecule that includes a modified coding sequence encoding a protein operably linked to regulatory elements, wherein the modified coding sequence comprises a higher AT or AU content relative to the AT or AU content of the native coding sequence.