Cells contain organelles, macromolecules and a wide variety of small molecules. Except for water, the vast majority of the molecules and macromolecules can be classified as lipids, carbohydrates, proteins or nucleic acids. Proteins are the most abundant cellular components and facilitate many of the key cellular processes. They include enzymes, antibodies, hormones, transport molecules and components for the cytoskeleton of the cell.
Proteins are composed of amino acids arranged into linear polymers or polypeptides. In living systems, proteins comprise over twenty common amino acids. These twenty or so amino acids are generally termed the native amino acids. At the center of every amino acid is the alpha carbon atom (Cα), which forms four bonds or attachments with other atoms or groups (FIG. 1). One bond is a covalent linkage to an amino group (NH₂) and another to a carboxyl group (COOH), both of which participate in polypeptide formation. A third bond is nearly always linked to a hydrogen atom and the fourth to a side chain which imparts variability to the amino acid structure. For example, alanine is formed when the side chain is a methyl group (-CH₃) and valine is formed when the side chain is an isopropyl group (-CH(CH₃)₂). It is also possible to chemically synthesize amino acids containing different side chains; however, the cellular protein synthesis system, with rare exceptions, utilizes native amino acids. Other amino acids and structurally similar chemical compounds are termed non-native and are generally not found in most organisms.
A central feature of all living systems is the ability to produce proteins from amino acids. Basically, a protein is formed by the linkage of multiple amino acids via peptide bonds, as in the pentapeptide depicted in FIG. 1B. Key molecules involved in this process are messenger RNA (mRNA) molecules, transfer RNA (tRNA) molecules and ribosomes (rRNA-protein complexes). Protein translation normally occurs in living cells and in some cases can also be performed outside the cell in systems referred to as cell-free translation systems. In either system, the basic process of protein synthesis is identical. The extra-cellular or cell-free translation system comprises an extract prepared from the intracellular contents of cells. These preparations contain those molecules which support protein translation and, depending on the method of preparation, post-translational events such as glycosylation and cleavage as well. Typical cells from which cell-free extracts or in vitro extracts are made are Escherichia coli cells, wheat germ cells, rabbit reticulocytes, insect cells and frog oocytes.
Both in vivo and in vitro syntheses involve the reading of a sequence of bases on an mRNA molecule. The mRNA contains instructions for translation in the form of triplet codons. The genetic code specifies which amino acid is encoded by each triplet codon. For each codon which specifies an amino acid, there normally exists a cognate tRNA molecule which functions to transfer the correct amino acid onto the nascent polypeptide chain. For example, the amino acid tyrosine (Tyr) is coded by the base sequences UAU and UAC, while cysteine (Cys) is coded by UGU and UGC. Variability associated with the third base of the codon is common and is called wobble.
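By way of illustration only, the codon assignments named above can be sketched as a lookup table. The table below is deliberately partial and holds only the four codons mentioned in the text; the function names `decode` and `third_base_variants` are hypothetical and are introduced here solely for the example.

```python
# Partial codon table: only the four codons named in the text
# (assignments follow the standard genetic code).
CODON_TABLE = {
    "UAU": "Tyr", "UAC": "Tyr",
    "UGU": "Cys", "UGC": "Cys",
}


def decode(mrna):
    """Split an mRNA string into complete triplets and look each one up.

    Codons missing from the partial table are reported as "?".
    """
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE.get(codon, "?") for codon in codons]


def third_base_variants(amino_acid):
    """Group the codons for one amino acid by their first two bases,
    listing the third bases that are tolerated."""
    groups = {}
    for codon, residue in CODON_TABLE.items():
        if residue == amino_acid:
            groups.setdefault(codon[:2], []).append(codon[2])
    return groups


print(decode("UAUUGCUACUGU"))
print(third_base_variants("Tyr"))
```

In this sketch both tyrosine codons share the first two bases UA and differ only in the third position, which is the variability the text refers to as wobble.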
Translation begins with the binding of the ribosome to mRNA (FIG. 2). A number of protein factors associate with the ribosome during different phases of translation including initiation factors, elongation factors and termination factors. Formation of the initiation complex is the first step of translation. Initiation factors contribute to the initiation complex along with the mRNA and the initiator tRNA (fMet or Met), which recognizes the base sequence AUG. Elongation proceeds with charged tRNAs binding to ribosomes, translocation and release of the amino acid cargo into the peptide chain. Elongation factors assist with the binding of tRNAs and in elongation of the polypeptide chain with the help of enzymes like peptidyl transferase. Termination factors recognize a stop signal in the message, such as the base sequence UGA, terminating polypeptide synthesis and releasing the polypeptide chain and the mRNA from the ribosome.
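The three phases described above can be sketched in a few lines, purely as an illustration. The sketch assumes the standard start codon AUG and the three standard stop codons UAA, UAG and UGA, treats the first AUG in the message as the initiation site (a simplification of how ribosomes actually select a start site), and uses a small partial codon table. The name `translate` and the placeholder residue "Xaa" are hypothetical.

```python
# Partial codon table (standard genetic code); a complete table has 64 entries.
CODONS = {
    "AUG": "Met", "UAU": "Tyr", "UAC": "Tyr",
    "UGU": "Cys", "UGC": "Cys", "GCU": "Ala", "GUU": "Val",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna):
    """Return the residues read from the first AUG up to the first stop codon."""
    # Initiation: locate the start codon (simplified to the first AUG).
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    # Elongation: read one triplet at a time in the frame set by the AUG.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        # Termination: a stop codon ends synthesis and adds no residue.
        if codon in STOP_CODONS:
            break
        peptide.append(CODONS.get(codon, "Xaa"))  # "Xaa" = not in partial table
    return peptide


print(translate("GGAUGUAUGCUUGUUGACC"))
```

Here the message is read as AUG, UAU, GCU, UGU and then UGA, so the loop stops at UGA and the two trailing bases are never read.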
The structure of tRNA is often shown as a cloverleaf representation (FIG. 3A). Structural elements of a typical tRNA include an acceptor stem, a D-loop, an anticodon loop, a variable loop and a TΨC loop. Aminoacylation or charging of tRNA results in linking the carboxyl terminal of an amino acid to the 2'-(or 3'-) hydroxyl group of the terminal adenosine residue via an ester linkage. This process can be accomplished using either enzymatic or chemical methods. Normally a particular tRNA is charged by only one specific native amino acid. This selective charging, termed here enzymatic aminoacylation, is accomplished by aminoacyl tRNA synthetases. A tRNA which selectively incorporates a tyrosine residue into the nascent polypeptide chain by recognizing the tyrosine UAC codon will be charged with tyrosine by a tyrosine-aminoacyl tRNA synthetase, while a tRNA designed to read the UGU codon will be charged with cysteine by a cysteine-aminoacyl tRNA synthetase. These synthetases have evolved to be extremely accurate in charging a tRNA with the correct amino acid to maintain the fidelity of the translation process. Except in special cases where the non-native amino acid is very similar structurally to the native amino acid, it is necessary to use means other than enzymatic aminoacylation to charge a tRNA with a non-native amino acid.
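The selectivity of enzymatic aminoacylation can be pictured with a toy model, offered only as a sketch. It assumes strict Watson-Crick pairing between codon and anticodon (ignoring wobble and modified bases) and identifies each tRNA by its anticodon alone, whereas real synthetases recognize additional features of the tRNA. The names `anticodon`, `charge`, `TRNA_SYNTHETASE` and `SYNTHETASE_SUBSTRATE`, and the enzyme labels "TyrRS" and "CysRS", are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Return the strictly Watson-Crick anticodon, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


# Toy model: each tRNA (keyed by anticodon) is served by one synthetase,
# and each synthetase accepts exactly one native amino acid.
TRNA_SYNTHETASE = {anticodon("UAC"): "TyrRS", anticodon("UGU"): "CysRS"}
SYNTHETASE_SUBSTRATE = {"TyrRS": "Tyr", "CysRS": "Cys"}


def charge(trna_anticodon, amino_acid):
    """Return the charged (anticodon, amino acid) pair, or None if the
    synthetase in this toy model rejects the combination."""
    enzyme = TRNA_SYNTHETASE.get(trna_anticodon)
    if enzyme is None or SYNTHETASE_SUBSTRATE[enzyme] != amino_acid:
        return None
    return (trna_anticodon, amino_acid)


print(anticodon("UAC"))
print(charge(anticodon("UAC"), "Tyr"))
print(charge(anticodon("UAC"), "Cys"))
```

In this model the tRNA that reads UAC is charged with tyrosine but refuses cysteine, and any amino acid absent from the synthetase table, such as a non-native one, is likewise rejected, which is why other charging methods are needed.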
Molecular biologists routinely study the expression of proteins that are coded for by genes. A key step in research is to express the products of these genes either in intact cells or in cell-free extracts. Conventionally, molecular biologists use radioactively labeled amino acid residues such as ³⁵S-methionine as a means of detecting newly synthesized proteins or so-called nascent proteins. These nascent proteins can normally be distinguished from the many other proteins present in a cell or a cell-free extract by first separating the proteins by the standard technique of gel electrophoresis and determining if the proteins contained in the gel possess the specific radioactively labeled amino acids. This method is simple and relies on gel electrophoresis, a widely available and practiced method. It does not require prior knowledge of the expressed protein and in general does not require the protein to have any special properties. In addition, the protein can exist in a denatured or unfolded form for detection by gel electrophoresis. Furthermore, more specialized techniques such as blotting to membranes and coupled enzymatic assays are not needed. Radioactive assays also have the advantage that the structure of the nascent protein is not altered or can be restored, and thus, proteins can be isolated in a functional form for subsequent biochemical and biophysical studies.
Radioactive methods suffer from many drawbacks related to the utilization of radioactively labeled amino acids. Handling radioactive compounds in the laboratory always involves a health risk and requires special laboratory safety procedures, facilities and detailed record keeping as well as special training of laboratory personnel. Disposal of radioactive waste is also of increasing concern both because of the potential risk to the public and the lack of radioactive waste disposal sites. In addition, the use of radioactive labeling is time consuming, in some cases requiring as much as several days for detection of the radioactive label. The long time needed for such experiments is a key consideration and can seriously impede research productivity. While faster methods of radioactive detection are available, they are expensive and often require complex image enhancement devices.
The use of radioactively labeled amino acids also does not allow for a simple and rapid means to monitor the production of nascent proteins inside a cell-free extract without prior separation of nascent from preexisting proteins. However, a separation step does not allow for the optimization of cell-free activity. Variables including the concentration of ions and metabolites, the temperature and the time of protein synthesis cannot be adjusted.
Radioactive labeling methods also do not provide a means of isolating nascent proteins in a form which can be further utilized. The presence of radioactivity compromises this utility for further biochemical or biophysical procedures in the laboratory and in animals. This is clear in the case of in vitro expression when proteins cannot be readily produced in vivo because the protein has properties which are toxic to the cell. A simple and convenient method for the detection and isolation of nascent proteins in a functional form could be important in the biomedical field if such proteins possessed diagnostic or therapeutic properties. Recent research has met with some success, but these methods have had numerous drawbacks.
Radioactive labeling methods also do not provide a simple and rapid means of detecting changes in the sequence of a nascent protein which can indicate the presence of potential disease-causing mutations in the DNA which codes for these proteins or fragments of these proteins. Current methods of analysis at the protein level rely on the use of gel electrophoresis and radioactive detection, which are slow and not amenable to high throughput analysis and automation. Such mutations can also be detected by performing DNA sequence analysis on the gene coding for a particular protein or protein fragment. However, this requires large regions of DNA to be sequenced, which is time-consuming and expensive. The development of a general method which allows mutations to be detected at the nascent protein level is potentially very important for the biomedical field.
Radioactive labeling methods also do not provide a simple and rapid means of studying the interaction of nascent proteins with other molecules, including compounds which might have importance as potential drugs. If such an approach were available, it could be extremely useful for screening large numbers of compounds against the nascent proteins coded for by specific genes, even in cases where the gene or protein has not yet been characterized. In current technology, which is based on affinity electrophoresis for screening of potential drug candidates, both in natural samples and synthetic libraries, proteins must first be labeled uniformly with a specific marker, which often requires specialized techniques including isolation of the protein and the design of special ligand markers or protein engineering.
Special tRNAs, such as suppressor tRNAs (tRNAs which have suppressor properties), have been used in the process of site-directed non-native amino acid replacement (SNAAR) (C. Noren et al., Science 244:182-188, 1989). In SNAAR, a unique codon is required on the mRNA, and the suppressor tRNA acts to target a non-native amino acid to that unique site during protein synthesis (PCT WO90/05785). However, the suppressor tRNA must not be recognizable by the aminoacyl tRNA synthetases present in the protein translation system (Bain et al., Biochemistry 30:5411-21, 1991). Furthermore, site-specific incorporation of non-native amino acids is not suitable in general for detection of nascent proteins in a cellular or cell-free protein synthesis system due to the necessity of incorporating nonsense codons into the coding regions of the template DNA or the mRNA.
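The principle of suppression can be sketched as follows, by way of example only. The sketch assumes that the unique codon is the amber nonsense codon UAG, that the suppressor tRNA reads it with complete efficiency, and that reading begins at the first base of the message; none of these simplifications is taken from the cited references. The function name `translate`, the `suppressor` parameter and the label "nnAA" for the non-native residue are hypothetical.

```python
# Partial codon table (standard genetic code) and the three stop codons.
CODONS = {"AUG": "Met", "UAU": "Tyr", "UGU": "Cys", "GCU": "Ala"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna, suppressor=None):
    """Read an mRNA in frame from its first base.

    suppressor: optional dict mapping a nonsense codon to the residue
    carried by a suppressor tRNA that reads that codon.
    """
    suppressor = suppressor or {}
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in suppressor:
            # The suppressor tRNA inserts its residue instead of terminating.
            peptide.append(suppressor[codon])
        elif codon in STOP_CODONS:
            break
        else:
            peptide.append(CODONS.get(codon, "Xaa"))
    return peptide


message = "AUGUAUUAGGCUUGA"  # AUG UAU UAG GCU UGA
print(translate(message))                   # no suppressor: stops at UAG
print(translate(message, {"UAG": "nnAA"}))  # amber codon read through
```

Without the suppressor the chain ends at the nonsense codon; with it, the non-native residue is placed at that single site and synthesis continues to the next stop codon. The example also shows why the template must carry a nonsense codon inside its coding region for this approach to work.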
Products of protein synthesis may also be detected by using antibody-based assays. This method is of limited use because it requires that the protein be folded into a native form and that antibodies have previously been produced against the nascent protein or against a known protein which is fused to the unknown nascent protein. Such procedures are time consuming and again require identification and characterization of the protein. In addition, the production of antibodies and amino acid sequencing both require a high level of protein purity.
In certain cases, a non-native amino acid can be formed after the tRNA molecule is aminoacylated using chemical reactions which specifically modify the native amino acid and do not significantly alter the functional activity of the aminoacylated tRNA (Promega Technical Bulletin No. 182; tRNA^nscend™: Non-radioactive Translation Detection System, September 1993). These reactions are referred to as post-aminoacylation modifications. For example, the ε-amino group of the lysine linked to its cognate tRNA (tRNA^Lys) could be modified with an amine-specific photoaffinity label (U. C. Krieg et al., Proc. Natl. Acad. Sci. USA 83:8604-08, 1986). These types of post-aminoacylation modifications, although useful, do not provide a general means of incorporating non-native amino acids into the nascent proteins. The disadvantage is that only those non-native amino acids that are derivatives of normal amino acids can be incorporated, and only a few amino acid residues have side chains amenable to chemical modification. More often, post-aminoacylation modifications can result in the tRNA being altered and produce a non-specific modification of the α-amino group of the amino acid (e.g. in addition to the ε-amino group) linked to the tRNA. This factor can lower the efficiency of incorporation of the non-native amino acid linked to the tRNA. Non-specific, post-aminoacylation modifications of tRNA structure could also compromise its participation in protein synthesis. Incomplete chain formation could also occur when the α-amino group of the amino acid is modified.
In certain other cases, a nascent protein can be detected because of its special and unique properties such as specific enzymatic activity, absorption or fluorescence. This approach is of limited use since most proteins do not have special properties with which they can be easily detected. Moreover, in many cases the expressed protein may not have been previously characterized or even identified, and thus, its characteristic properties are unknown.