The present invention concerns, firstly, solid phase translation of mRNA in general to give the protein (polypeptide) encoded by the mRNA, and secondly, as a subaspect, region specific labelling of one or more predetermined regions (part sequences) of a polypeptide chain (protein) by solid phase translation of mRNA in vitro. The invention also encompasses novel region specific labelled proteins/polypeptides. The labelled polypeptides have their primary use in structural studies by NMR.
The inventive method generalises an earlier mRNA-analogue (poly(U)) column translation method (Belitsina et al, 1975; Belitsina and Spirin, 1979), refined by Baranov et al (1979), that was used to obtain ribosomes in pre- and post-translocational states. Poly(U) as used in this earlier technique is not mRNA, since it does not contain all the elements necessary for normal translation (stop codon, Shine-Dalgarno (SD) sequence, ribosome binding site etc). Homopolymers of amino acids are not proteins. In the context of the invention, homopolymers of amino acids are also excluded from the concept of polypeptides. In vitro translation has been described previously (e.g. Pavlov and Ehrenberg, 1996, and Ehrenberg et al, 1990).
NMR spectroscopy has over the past decade become a very powerful method to determine structures of small proteins in solution (Bax, 1989; Schwabe et al, 1990; Härd et al, 1990; Baumann et al, 1993; van Tilborg et al, 1995). NMR has the intrinsic limitation that, as the studied proteins get larger, there is a drastic reduction in the resolution of their NMR spectra (Bax, 1989). This drawback has been partially overcome by the application of various isotope labelling strategies (Muchmore et al, 1989; Ramesh et al, 1994). At the same time, there is still a pronounced upper limit around 30 kD for the determination of protein structures at high resolution using NMR spectroscopy (Bax, 1989).
One may identify three major types of isotope labelling strategies for NMR studies. The first is “uniform” labelling, where all the different amino acids in a polypeptide are labelled with, e.g., 15N, 13C or 2H isotopes. A combination of 15N, 13C and 2H “uniform” labelling recently made it possible to determine the structure in solution of a molecular complex as large as the trp repressor in complex with operator DNA (Zhang et al, 1994).
A second strategy is “selective” isotope labelling of proteins. This means that only a limited class of isotope labelled amino acids are built into the polypeptide, while the other amino acids are unlabelled. Selectively labelled proteins are obtained from over-producing bacterial strains, which grow in media where one or several types of amino acids are isotope labelled. This strategy has become very useful for structural analysis with NMR (Muchmore et al, 1989; Ramesh et al, 1994 and references therein).
A third strategy, which we denote “region specific” labelling, is more difficult to implement technically. This strategy means that labelled amino acids are incorporated only in one or more predetermined regions. One or more of the amino acid residues of a given peptide region may be labelled, while amino acids located outside the region may be unlabelled. We judge this third strategy to be potentially the most powerful way to extend the range of NMR spectroscopy beyond the 30 kD limit to larger protein structures and complexes.
One way to obtain region specific labelling of proteins is by chemical polypeptide synthesis (Boutillon et al, 1995). At present, this method can only be used for small proteins.
A first objective is to provide an improved general method for in vitro translation enabling direct production of proteins in almost pure form.
A second objective is to provide a general method for region specific labelling of proteins based on in vitro translation as described e.g. by Pavlov and Ehrenberg, 1996, and Ehrenberg et al, 1990.
A third objective is to apply the solid support translation technology of the invention for implementing synthesis of region labelled proteins.
A fourth objective is to provide proteins that are isotope labelled at one or more predetermined regions.
The Invention
These objectives can be accomplished by a method that contemplates translation of real mRNAs stably linked to a solid phase to give real proteins (polypeptides). These mRNAs contain in-frame codons to be translated to an amino acid sequence. For procaryotes they also contain a Shine-Dalgarno sequence, a ribosomal binding site, an initiation codon and a termination codon, but one or several of these additional features may be deleted. For eucaryotes there is normally a cap structure at the 5′-end of the mRNA. One major advantage of this technique is that the ribosomes can be stalled at the stop codon that signals that the protein is full length (i.e., a terminating stop codon) as long as release factor is not included in the translation mixture. Subsequently, all components and factors necessary for translation can be rinsed off the solid phase in a simple way. After this step the solid phase linked mRNAs hold the ribosomes, which in turn hold the peptidyl-tRNAs that contain the protein of interest. Addition of the appropriate release factor (RF1 for UAA or UAG, RF2 for UAA or UGA) hydrolyses the peptidyl-tRNA, removing the protein of interest from the ribosomes, which remain immobilized on the solid phase. Another rinsing step elutes the protein of interest together with catalytic amounts of RF1/2, making the final purification very simple.
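The stop codon specificity of the release factors stated above can be written down as a small lookup. The following Python sketch is illustrative only; the names RELEASE_FACTORS and release_factors_for are hypothetical and do not form part of the described method.

```python
# Stop-codon specificity as stated in the text:
# RF1 acts at UAA and UAG, RF2 acts at UAA and UGA.
RELEASE_FACTORS = {
    "RF1": {"UAA", "UAG"},
    "RF2": {"UAA", "UGA"},
}


def release_factors_for(stop_codon):
    """Return the release factors able to act at a terminating stop codon.

    Accepts RNA or DNA spelling in either case (T is read as U).
    """
    codon = stop_codon.upper().replace("T", "U")
    matches = sorted(rf for rf, codons in RELEASE_FACTORS.items() if codon in codons)
    if not matches:
        raise ValueError(f"{stop_codon!r} is not a stop codon")
    return matches


if __name__ == "__main__":
    for codon in ("UAA", "UAG", "UGA"):
        print(codon, release_factors_for(codon))
```

Under these assumptions, an mRNA terminating in UAG would be released with RF1 only, one terminating in UGA with RF2 only, and one terminating in UAA with either factor.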
Accordingly, the inventive method for production of proteins (polypeptides) is characterized in that mRNA encoding a protein of interest and bound to a solid phase is translated in vitro. In order to obtain a highly purified form of the protein directly from the column, translation is preferably done in two steps separated by a wash: first, translation with a mixture containing all components for translation up to a terminating stop codon but devoid of the appropriate release factor activity; then, after removal of the translation mixture, addition/introduction of the appropriate release factor activity.
A complete translation mixture is normally in the form of an aqueous buffer solution and allows for translation of the complete mRNA of interest and release of the so expressed protein from the tRNA-ribosome-mRNA complex. The mixture thus contains all ingredients necessary for translation, i.e. ribosomes, amino acids, amino acyl tRNA synthetases, tRNAs, initiation factors, elongation factors, energy giving system, buffering substances etc. See for instance the experimental part and references cited therein. The exact composition will depend on the origin of the various enzymes and factors utilized. For an E. coli origin, for instance, specific components, such as fmet-tRNAfmet, may have to be included.
By manipulating the translation mixture, translation may be paused and restarted at predetermined positions of the mRNA. The translation mixture may be added step-wise to the mRNA to be translated, such as a first portion (initiation mix) comprising ribosomes, initiation factors and the initiator tRNA (fmet-tRNAfmet for a procaryotic system, met-tRNAmet for a eucaryotic system), followed by a second portion (translation mix) comprising elongation factors, amino acyl tRNA synthetases, transfer RNAs, amino acids, energy giving system etc. In the preferred case the translation mixture is added stepwise, with mixes varying in composition so as to enable pausing and restarting of translation at predetermined and well-defined positions.
A pause in translation may be achieved by eliminating from a mixture the amino acyl-tRNA activity that reads the codon where the translating ribosome shall stop. This may be achieved by removing the corresponding amino acid from the translation mixture, preferably in conjunction with a defective tRNA synthetase activity for that amino acid. Alternatively, the pausing may be at an internal stop codon in the mRNA. Readthrough of the internal stop codon is achieved by introducing suppressor tRNA activity that is specific for the internal stop codon, i.e. by adding suppressor tRNA charged with the correct amino acid for that position in the polypeptide sequence. Still another alternative for creating a pause at a defined codon is to use a mix which lacks the isoacceptor tRNA activity for that codon. As indicated above, it is often preferred to use two alternatives in conjunction for pausing at a desired codon. Restart is accomplished by addition/introduction of the lacking activity (activities).
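As an illustration of the first pausing alternative, the following Python sketch walks along a coding sequence and reports where translation would pause when one or more amino acids are absent from the mix. It is a simplified model, not the described wet-lab procedure: it assumes the standard genetic code, assumes that omission of an amino acid completely removes the corresponding amino acyl-tRNA activity, and treats any stop codon as a stall point (no release factor or suppressor tRNA present). The function name translate_until_pause and the example sequence are hypothetical.

```python
# Standard genetic code, codons ordered by base U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_until_pause(mrna, missing):
    """Translate in frame from the first codon until the ribosome pauses.

    mrna    -- coding sequence (RNA letters), starting at the initiation codon
    missing -- set of one-letter amino acids whose activity is absent from the mix

    Returns (peptide_so_far, codon_index_of_pause); the index is None if the
    end of the sequence is reached without a pause.
    """
    peptide = []
    for start in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[start:start + 3]]
        # Stall at a stop codon, or at a codon whose amino acid is missing.
        if residue == "*" or residue in missing:
            return "".join(peptide), start // 3
        peptide.append(residue)
    return "".join(peptide), None


if __name__ == "__main__":
    # Hypothetical mRNA: AUG GCU AAA UGG GAA UAA
    mrna = "AUGGCUAAAUGGGAAUAA"
    print(translate_until_pause(mrna, {"W"}))
    print(translate_until_pause(mrna, set()))
```

In this model the pause falls at the first occurrence of the omitted amino acid, so the amino acid chosen for pausing should not occur earlier in the segment to be translated.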
By arranging for pausing at two or more internal codons, it is possible to define internal regions that can be translated specifically in the presence of a desired labelled amino acid mixture.
By running the translation before and after a pause with mixtures containing amino acids differing in labelling, region specific labelled proteins (polypeptides) will be obtained. A specifically labelled region may contain two or more amino acid residues in sequence. The maximal length of a region is the total length of the protein (polypeptide) produced minus one or two amino acid residues. Specifically labelled regions of one amino acid residue may be better obtained by methods not belonging to the invention.
The label may be of any type provided it is compatible with translation. This in most cases means that the label is an isotope of one or more of the basic elements normally occurring in native amino acids, i.e. an isotope of N, C, O, H and S, that may be either radioactive (e.g. 3H, 35S, 14C) or non-radioactive (e.g. 15N, 13C, 2H). Isotope labelling includes cases where a larger part of the protein (polypeptide) contains a “non-normal” isotope (e.g. 2H (deuterium)) while the region of interest contains the “normal” isotope (e.g. 1H (protons)). Labelling with isotopes means that the labelled amino acid carries more than normal amounts of the isotope concerned.
In one preferred mode of the “labelling” aspect of the invention, the mRNA encoding the polypeptide to be produced is contacted with a first translation mix A that permits translation but is deficient in at least one of the above-mentioned activities responsible for readthrough of a codon at which pausing is desired. In a subsequent step the nascent polypeptide-ribosome-mRNA-solid phase complex is contacted with a second translation mix B that permits readthrough of the codon at which translation had stopped for mix A. Readthrough can be accomplished by including in mix B the activity/activities whose absence caused the pausing for mix A. In analogy with mix A, mix B can be deficient in a component specific for a predetermined amino acid encoding codon or for an internal stop codon (downstream of the pausing position for mix A), which means that translation with mix B will be halted at this second predetermined codon. Optionally, mix B may in turn be replaced with further translation mixes (C, D, and so on), such that a subsequent mix permits restarting of translation from the predetermined codon at which translation is paused for the closest preceding mix. By allowing the amino acid compositions of consecutive translation mixes to differ with respect to labelling of one or more amino acids, region specific labelling may be achieved in any preselected region/regions of interest in a protein. By providing the release factor (RF1 or RF2) adapted to the terminating stop codon used (UAA, UAG, UGA), the completed polypeptide will be released from the tRNA-ribosome-mRNA complex attached to the solid phase.
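The succession of mixes A, B, C and so on can be pictured with a short Python sketch that assigns a label mark to each residue according to the mix present when that residue is incorporated. This is a simplified model under stated assumptions: each mix is described only by a label mark and by the set of amino acids whose activity it lacks, pausing is complete, and washing between mixes is perfect. The function name region_label, the marks "-" and "*", and the example sequence are hypothetical.

```python
def region_label(protein, mixes):
    """Return one label mark per residue of `protein`.

    protein -- amino acid sequence of the full length product (one-letter code)
    mixes   -- list of (mark, missing) tuples applied in order, where `mark`
               is a one-character label tag for residues incorporated from
               that mix and `missing` is the set of amino acids absent from it
    """
    marks = []
    position = 0
    for mark, missing in mixes:
        # Elongate until the next residue needs an activity this mix lacks.
        while position < len(protein) and protein[position] not in missing:
            marks.append(mark)
            position += 1
    if position < len(protein):
        raise ValueError(
            f"translation still paused at residue {position} ({protein[position]})"
        )
    return "".join(marks)


if __name__ == "__main__":
    protein = "MAKWEGHLS"          # hypothetical sequence
    mixes = [
        ("-", {"W"}),              # mix A: unlabelled, pauses before W
        ("*", {"H"}),              # mix B: labelled, restarts at W, pauses before H
        ("-", set()),              # mix C: unlabelled, runs to the end
    ]
    print(protein)
    print(region_label(protein, mixes))
```

With the example input, residues W, E and G are marked as labelled and all others as unlabelled. The error raised when the last mix cannot finish the chain reflects the requirement that each mix must restore the activity lacking in the preceding one.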
The temperature is selected according to rules known in the art, which normally means 10-40 °C for systems derived from eubacteria or eucaryotes, bearing in mind that too low a temperature will give slow translation and too high a temperature will result in denaturation and inactivation of the enzymes and other proteins involved. For systems derived from archaebacteria considerably higher temperatures may be used. The pH conditions are also selected as outlined in the art for translation in vitro.
The solid phases used (supports, carriers) are as a rule of the same type as those used as chromatographic adsorbents or supports for solid phase synthesis of oligo- and polynucleotides/peptides, and encompass particulate as well as monolithic material, all of which in the preferred form should exhibit a hydrophilic contact area towards the liquid medium used. The support may be porous or non-porous. The link between the solid support and the mRNA may be of any type as long as it can withstand the conditions applied during translation/initiation and is compatible with efficient translation. Preferably the link is of covalent nature, for instance by attachment to the support via a terminal part, preferably a non-coding spacer sequence, of the mRNA, with preference for the 3′-end of the mRNA. Potential alternative ways of linking mRNA to supports involve attachment of affinity ligands to terminal parts of mRNA in combination with complementary receptors attached to the support. Biotin-streptavidin and hapten/high-affinity anti-hapten antibody are interesting ligand-receptor pairs that can be used.
Solid phase bound mRNA may also be a consequence of transcribing solid phase bound DNA encoding the protein/polypeptide of interest (e.g. cDNA). In this type of binding one may start by amplifying target DNA by PCR using two primers, one of which is biotinylated. The amplified biotinylated DNA is then transferred to a streptavidin coupled solid phase, where it becomes firmly bound to the solid phase through a streptavidin-biotin complex. The DNA is then transcribed by an RNA polymerase to prepare mRNA. The mRNA is retained in complex with the RNA polymerase used and the DNA, and can be translated in the same manner as mRNA directly bound to the solid phase.
The main advantage of mRNA-linked solid supports is that a nascent or finished polypeptide, still linked to peptidyl-tRNA, ribosome and mRNA, can easily be separated from all other components in a translation mix by simple washing. This makes switching between labelled and unlabelled amino acids easy, and gives almost pure protein in the final elution step.
After the complete translation and release of the polypeptide from the ribosome, the polypeptide should be isolated and purified. This can be done by techniques well known in the art, bearing in mind that the use of solid supports carrying the mRNA to be translated is very advantageous for quickly obtaining a highly purified labelled polypeptide.
One aspect of the invention comprises a labelled polypeptide/protein characterized in that its amino acid sequence contains one or more regions of two, three or more labelled amino acid residues in sequence. The label is preferably an isotope as described above. The two or more labelled amino acids that are in sequence may be the same or different. By region is meant a part sequence of two or more amino acids situated at a chosen place in the sequence of the full length polypeptide/protein. The labelled part sequence may extend up to nearly the full length of the protein, although in order to be called a part sequence at least one amino acid must be missing. In a subaspect of this aspect, at least one amino acid occurs at least twice, with the proviso that the labelling of such an amino acid differs in at least two positions.