Proteins are long-chain molecules, necklaces of amino acids. They are the most important molecules inside every living cell, tissue, and organ within the human body, and they are involved in virtually all aspects of life. Proteins control thinking, regulate all physiological reactions, metabolize the carbohydrates and fats that bodies use, defend bodies against bacteria and viruses, and work as enzymes, hormones, antibodies, cytokines and signaling molecules that transmit information into cells. As enzymes, they are the driving force behind all biochemical reactions. As structural elements, they are the main constituent of bones, muscles, hair, skin and blood vessels. As antibodies, they recognize invading elements and allow the immune system to eliminate the unwanted invaders. While scientists have sequenced the human genome, how proteins work largely remains a mystery. This is because, in order to function (e.g. as an enzyme or antibody), a protein must take on a particular shape, also known as a “fold”. If the protein does not fold correctly, disease and dysfunction occur. Examples include, but are not limited to, Alzheimer's disease, Huntington's disease, cystic fibrosis, BSE (Mad Cow disease), an inherited form of emphysema, and even many cancers. When proteins misfold, they can clump together (“aggregate”). These clumps often gather in the brain, where they are believed to cause the symptoms of Mad Cow or Alzheimer's disease.
When proteins fold inside a cell, they are frequently subjected to varying degrees of spatial confinement. In many cases, proteins are folded in the endoplasmic reticulum (“ER”), a membrane-bound cellular compartment that contains many proteins at specific concentrations. Proteins can be encapsulated inside helper molecules, called chaperones and folding enzymes, which help proteins fold inside cells. Protein folding inside the cell is therefore quite different from folding in the test tube. However, current studies of protein folding are carried out mainly in the test tube. Although significant advances have been achieved, the results obtained in the test tube have to be verified in living cells, and a technique for studying protein folding in the living cell is lacking.
The study of protein structure has received a major boost recently from the increasing number of structures deposited in the Protein Data Bank (PDB) on a daily basis. These structures, however, are typically determined not in vivo but in artificial crystals and solutions. Over the past five decades, X-ray crystallography and the resulting atomic models of proteins and nucleic acids have contributed greatly to an understanding of structural, molecular, and chemical aspects of biological phenomena. Currently, X-ray crystallography is a mature high-resolution structural biology tool that can be used to quickly determine protein structures. However, X-ray diffraction requires high-quality single crystals of the protein, and these are not always achievable. In contrast, another high-resolution structural biology tool, nuclear magnetic resonance (NMR), has been developed since the 1980s. This technique only requires protein in solution at 50 μM to 1 mM concentrations, and it also provides information on protein dynamics via NMR relaxation measurements. Although NMR is a less mature tool, it provides an alternative high-resolution structural biology technique that allows protein structures to be determined at atomic resolution.
When these methods cannot be used, computer-based protein modeling techniques have been used with some success. These modeling techniques use the known three-dimensional structure of a homologous protein to approximate the structure of another protein. Such a method is not accurate, because the actual structure is approximated rather than determined.
Fluorescence spectroscopy is another technique that can be used to obtain structural information. The well-developed Förster resonance energy transfer (FRET) technique can be used to measure the distance between a fluorescence donor and an acceptor, and can thus provide important information about protein folding and structure. However, FRET provides only one distance, from one donor-acceptor pair, per measurement. Determining a protein structure at atomic resolution requires hundreds to thousands of distances within the protein. This demands an enormous amount of work, including mutagenesis, protein production, fluorescence labeling and FRET measurement for every donor-acceptor pair. Since the introduction of the Green/Red Fluorescent Protein (GFP/RFP) technique, FRET measurements can also be obtained in living cells. In vivo FRET measurement is widely used to study protein-protein and domain-domain interactions. However, the distance measured between GFP and RFP carries little structural meaning, since both GFP and RFP are themselves proteins of 25-28 kDa. Thus, current in vivo FRET measurement cannot be used to obtain accurate information about protein structure and folding, and fluorescence spectroscopy is not considered a viable high-resolution structural biology tool.
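The distance readout described above rests on the Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor separation. The following is a minimal Python sketch of that relation, not a description of any particular instrument or software. The function names are hypothetical, and the Förster radius of 5.0 nm is an assumed, illustrative value (the text gives none; real values depend on the dye pair). It shows why one donor-acceptor pair yields exactly one distance.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Transfer efficiency E = 1 / (1 + (r / R0)**6) for a single donor-acceptor pair."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def fret_distance(efficiency: float, forster_radius_nm: float) -> float:
    """Invert the relation: r = R0 * (1/E - 1)**(1/6). Defined only for 0 < E < 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    if forster_radius_nm <= 0:
        raise ValueError("Forster radius must be > 0")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    R0_NM = 5.0  # assumed Forster radius, illustrative only

    # At r = R0 the efficiency is exactly one half by definition.
    print(fret_efficiency(5.0, R0_NM))

    # One measured efficiency gives back one distance, nothing more.
    measured_e = fret_efficiency(7.0, R0_NM)
    print(fret_distance(measured_e, R0_NM))
```

Because of the sixth-power dependence, the efficiency changes appreciably only for separations near the Förster radius. A structure determination would therefore need a separately labeled sample for each of the many distances required.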
Fluorescence imaging is a technique that is routinely used to study protein location and trafficking in living cells. Currently, this technique relies extensively on GFP fusion, in which GFP is fused to either the N- or the C-terminal end of a protein. Using a confocal microscope, the location and trafficking of the GFP-labeled protein inside the cells can be visualized. However, it is unknown whether the GFP fusion changes the location of the protein of interest inside the cells. Thus, extensive control experiments have to be carried out. Even with these control experiments, the situation inside the cells is sometimes so complex that no definite conclusion can be drawn from fluorescence imaging.
Currently, there is no available means of determining high-resolution protein structure in living cells. However, it is critical for scientists to verify whether protein structures determined in vitro are the same as the structures of these proteins in living cells. An in-cell structural biology technique is necessary to push current cell biology to atomic resolution, and no such technology is currently available. In addition, in-cell structural biology would allow cell biology techniques to be combined with high-resolution structural biology techniques, so that protein structural information can be accurately correlated with cellular functions.
Using bacteria to produce recombinant proteins opened the door to modern molecular biology. Indeed, bacterial expression makes it possible to use the recombinant DNA technique to produce large quantities of recombinant proteins. When bacterial cells are used to overexpress exogenous proteins, the recombinant protein is often sequestered in bacterial inclusion bodies. For the recombinant proteins to be useful, they must be purified from the inclusion bodies. During the purification process, the recombinant proteins are denatured and must then be renatured. Denatured protein is commonly refolded in vitro by diluting the denaturant away. Protein unfolding normally induces a hydrophobic collapse that may cause protein aggregation. In vitro refolding results in the protein shielding its hydrophobic patches in the core of the molecule. Unfortunately, during in vitro refolding, proteins do not always form the native bioactive conformation. Two competing processes occur: refolding and aggregation. It has been suggested that aggregation is driven by hydrophobic amino acids exposed at the surface. Aggregation is undesirable and reduces the yield of functional, native protein.
Bacterial cells cannot be used to produce many proteins because these proteins misfold in the bacterial cells. In addition, bacterial cells do not contain the complex machinery for protein post-translational modification, and such modifications are critical for the biological functions of many proteins. Thus, the bacterially expressed recombinant proteins are not the same as the native proteins and are not functional. To produce native proteins, mammalian cells must be used. Compared with bacterial expression, however, mammalian cell expression gives a much lower yield and is more costly. A new technology for producing large quantities of properly folded, post-translationally modified proteins is clearly needed for modern biology and medical sciences.
The impermeability of the cell membrane to peptides, proteins, DNA and RNA limits the therapeutic potential of these “information-rich” biological molecules and prevents cells from taking up in vitro labeled macromolecules for structural biology studies in the living cell at atomic resolution. However, a new, non-invasive protein transduction technology is emerging, following the discovery of the cell penetrating peptide (CPP), which has been used successfully to transport heterogeneous bioactive cargo into the cell efficiently and in an unconventional way. In vivo use of protein transduction technology to deliver bioactive cargo into various tissues of living animals has been reported. This novel technology opens up many new possibilities for intracellular delivery of therapeutic macromolecules for treatment of human diseases, or for intracellular transduction of labeled macromolecules for structural biology studies in living cells, thus potentially pushing cell biology to atomic resolution.
Despite these notable successes, the use of protein transduction technology has yet to become commonplace in cell biology and in therapeutic applications. Several major challenges lie ahead of this new technology and prevent it from being widely used in many fields of biomedical sciences. The first challenge is the fate, or secretion pathway, of exogenous proteins delivered by protein transduction technology. It is still unknown how exogenous proteins traffic inside cells after being delivered. Blobel's famous “Signal Theory” describes how endogenous proteins traffic inside cells, and thus dictates their subcellular locations. Questions have arisen as to whether exogenous proteins follow the same secretion pathway as endogenous proteins. These questions have to be addressed to establish the physiological and pathological relevance of protein transduction technology. The second major challenge is the lack of delivery specificity of current protein transduction agents, specifically in terms of targeting specific cell types and specific cell compartments. Indeed, current protein transduction reagents are not “smart” enough to deliver exogenous proteins specifically into a target tissue type or cellular compartment.
Most human diseases are related to the malfunctioning of particular proteins, either systemically or locally. Therapeutic proteins, including native and engineered proteins, can be used as highly effective medical treatments (protein therapy) for a wide array of diseases in which a protein is either lacking or deficient (e.g. growth hormone and insulin), or in which the therapeutic protein is used to inhibit a biological process (e.g. antibodies that block blood supply to tumors). In contrast to gene therapy, protein therapy uses well-defined, precisely structured proteins, with previously defined optimal doses of the individual protein for disease states, and with well-known biological effects. However, one obstacle currently hinders protein therapy as a treatment of human diseases: the mode of delivery. Oral, intravenous, intra-arterial, or intramuscular routes of administration are not always as effective as desired. In most cases, the therapeutic protein is metabolized or cleared before it even reaches the target tissue. To make protein therapy possible, an efficient protein delivery system is required to ensure that the therapeutic protein remains stable and can be delivered to the target tissues for treatment of the diseases.