1. Field of the Invention
This invention is in the area of modified biomolecules and methods of making such modified biomolecules. More particularly, this invention relates to protein engineering by chemical means to produce modified proteins in which one or more peptide bonds are replaced by non-peptide linkages and one or more encoded amino acids may be replaced by unnatural amino acids, amino acid analogs, or any other non-coded structure.
2. Related Art
Numerous attempts have been made to develop a successful methodology for synthesizing modified biomolecules such as proteins, glycoproteins, nucleotides, polysaccharides, and other biopolymers. Such modified biomolecules are invaluable for the study of structure-activity relationships of native biomolecules, and there is a growing number of commercial applications of these molecules for diagnostic or therapeutic purposes.
Structural modification of proteins and peptides, normally referred to as “protein engineering,” involves the rationally designed alteration of structure with the aim of understanding protein structure and function and of creating a protein with new desirable properties. In the past, this has been principally carried out by site-directed mutagenesis or other techniques involving genetic manipulation. The major drawback of these prior art approaches is that the amino acids replacing native amino acids must be genetically coded. As a result, other structural variants such as unnatural amino acids or amino acid analogs cannot be introduced in the protein backbone. However, recent findings (Ellman, et al., Science, 255:197, 1992; Noren, et al., Science, 244:182, 1989) allow unnatural amino acids or amino acid analogs to be incorporated into proteins in a site-specific manner. In this approach, a codon encoding an amino acid to be replaced is substituted by the nonsense codon TAG by means of oligonucleotide-directed mutagenesis. A suppressor tRNA directed against this codon is then chemically aminoacylated with the desired unnatural amino acid. Addition of the aminoacylated tRNA to an in vitro protein synthesizing system programmed with the mutagenized DNA directs the insertion of the prescribed amino acid into the protein at the target site. Using the enzyme T4 lysozyme, the above authors incorporated a wide variety of amino acid analogs into the enzyme at the alanine 82 position, with a few exceptions; D-alanine, for example, was not incorporated.
While Schultz's approach partially solves problems posed by biosynthetic protein engineering, it does not allow the alteration of the protein backbone at more than one target site to incorporate two or more different non-coded structural units. Also, by the very nature of the system, i.e., the fact that it relies upon a living system to produce the engineered protein, many substitutions or alterations, such as those which would result in a lethal mutation, cannot be made. Chemical synthesis would overcome the shortcomings left by the Schultz techniques (reviewed by R. E. Offord, Protein Eng., 1:151, 1987). However, chemical synthesis is fraught with many difficulties, such as the need to protect reactive groups that are not intended to take part in the reaction.
Overall, there is a definite need for a simple and efficient method for making a modified protein which possesses desired properties. The present invention addresses this need and provides novel modified proteins.
This invention provides new and useful modified biomolecules. It also provides a new process for producing such modified biomolecules. In general, the modified biomolecules of this invention comprise two molecular segments, each selected from peptides, pseudopeptides, or non-peptide linear molecules, linked through a non-amido linkage to form a peptide or pseudopeptide backbone, wherein one of the segments contains at least one non-coded structural unit and the non-coded structural unit does not form a part of the non-amido linkage. The chemical bonding of the two segments is by means of terminal reactive groups on one segment which react with reactive groups of the other segment.
The process of this invention provides a directed ligation of the two molecular segments to create a desired bond at the ligation point(s) and comprises the steps of:
a. providing a first segment having at least one non-coded structural unit and attaching a first chemoselective synthon to the first segment at the terminal position thereof;
b. providing a second segment optionally containing at least one non-coded structural unit, and attaching a second chemoselective synthon to the second segment at the terminal position thereof, the second chemoselective synthon being complementary to the first chemoselective synthon of the first segment; and
c. ligating the first segment and the second segment, whereby the first synthon of the first segment and the second synthon of the second segment form a non-peptide linkage, wherein the first segment and the second segment are each selected from peptides, pseudopeptides, or non-peptide linear molecules, provided that both segments are not non-peptide linear molecules at the same time.
The above sequence a-c can be repeated by using a first modified biomolecule as the first segment to which a second segment or a second biomolecule is ligated. The present process also may include the step of ligating additional segments with the first and second segments which have been provided with additional terminal synthons that are compatible with the first and second synthons and chemoselective to synthons of the additional segments.
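The constraints of steps a-c can be summarized in a small bookkeeping model. The following Python sketch is illustrative only: the class `Segment`, the function `ligate`, the synthon labels, and the pairing table `COMPLEMENTARY` are hypothetical names introduced here, and treating the ligated product as a pseudopeptide is a modeling assumption. It models only the rules stated above, not any chemistry.

```python
from dataclasses import dataclass
from typing import Optional

PEPTIDE = "peptide"
PSEUDOPEPTIDE = "pseudopeptide"
NON_PEPTIDE = "non-peptide linear molecule"

# Hypothetical table of complementary synthon pairs (illustrative labels only).
COMPLEMENTARY = {("haloacyl", "thiol"), ("haloacyl", "selenol")}


@dataclass(frozen=True)
class Segment:
    name: str
    kind: str                       # PEPTIDE, PSEUDOPEPTIDE, or NON_PEPTIDE
    has_non_coded_unit: bool = False
    synthon: Optional[str] = None   # terminal chemoselective synthon, if attached


def are_complementary(first: Optional[str], second: Optional[str]) -> bool:
    """Return True if the two synthon labels form a listed complementary pair."""
    return (first, second) in COMPLEMENTARY or (second, first) in COMPLEMENTARY


def ligate(first: Segment, second: Segment) -> Segment:
    """Apply the checks of steps a-c and return the ligated product."""
    # Step a: the first segment carries at least one non-coded structural unit.
    if not first.has_non_coded_unit:
        raise ValueError("first segment needs at least one non-coded structural unit")
    # Proviso of step c: both segments may not be non-peptide linear molecules.
    if first.kind == NON_PEPTIDE and second.kind == NON_PEPTIDE:
        raise ValueError("both segments are non-peptide linear molecules")
    # Steps a-b: both terminal synthons must be present and complementary.
    if not are_complementary(first.synthon, second.synthon):
        raise ValueError("terminal synthons are missing or not complementary")
    # Modeling assumption: the product is recorded as a pseudopeptide because
    # its backbone now contains a non-peptide linkage. Its synthon is cleared,
    # so a new synthon must be attached before the product is ligated again.
    return Segment(
        name=f"{first.name}-L-{second.name}",
        kind=PSEUDOPEPTIDE,
        has_non_coded_unit=True,
        synthon=None,
    )
```

To model a repeated a-c sequence, the returned product would be given a new terminal synthon (for example with `dataclasses.replace`) and passed to `ligate` again as the first segment.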
The present invention is therefore applicable in the chemical synthesis of various protein conjugates, such as proteins with reporter molecules, radionuclides, cytotoxic agents, nucleotides, antibodies, and non-protein micromolecules.
Preferably, the process of this invention involves a series of steps comprising:
a. sequentially coupling selected amino acids or amino acid analogs to a terminal amino acid or amino acid analog bound to a first resin support to form a first peptide segment-resin, the first peptide segment having about two to about one hundred amino acid residues;
b. covalently attaching a haloacyl moiety to the N-terminus of the first peptide segment-resin to form a haloacylpeptide segment bound to the first resin support;
c. cleaving the haloacylpeptide segment from the first resin support;
d. sequentially coupling selected amino acids or amino acid analogs to a terminal amino acid or amino acid analog bound to a second resin support through a sulfur or selenium-containing bond to form a second peptide segment-resin, the second peptide segment having about two to about one hundred amino acid residues;
e. cleaving the second peptide segment-resin to form a second peptide segment having a thiol- or selenol-containing group at the C-terminus thereof; and
f. coupling the haloacylpeptide segment and the second peptide segment to form a modified polypeptide.
The order of the sequence of steps a-b-c-d-e is not critical to this invention. The sequence of steps a-b-c and the sequence of steps d-e may be conducted successively or separately. The entire sequence can be repeated in a chain-reaction manner.
Optionally, any reactive groups such as thiol that may be present in the peptide segments can be protected prior to step (f) and deprotected after step (f) is complete.
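The ordering freedom among steps a-f can be expressed as a dependency graph: a precedes b, b precedes c, d precedes e, and the coupling step f requires the products of both c and e. The sketch below is an illustration under that reading of the text. The names `DEPENDS_ON` and `is_valid_order` are hypothetical, and the optional protection and deprotection around step f are not modeled.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must be completed before it.
# This encoding is an interpretation of the text, not part of the original.
DEPENDS_ON = {
    "a": set(),       # assemble first peptide segment on resin
    "b": {"a"},       # attach haloacyl moiety to the N-terminus
    "c": {"b"},       # cleave haloacylpeptide segment from resin
    "d": set(),       # assemble second peptide segment on resin
    "e": {"d"},       # cleave second segment from resin
    "f": {"c", "e"},  # couple the two segments
}


def is_valid_order(order):
    """Return True if every step appears once and after all of its prerequisites."""
    if sorted(order) != sorted(DEPENDS_ON):
        return False
    position = {step: index for index, step in enumerate(order)}
    return all(
        position[dep] < position[step]
        for step, deps in DEPENDS_ON.items()
        for dep in deps
    )


# One admissible schedule, produced by the standard-library topological sorter.
one_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

Under this encoding, `is_valid_order(list("abcdef"))` and `is_valid_order(list("deabcf"))` both hold, while any order that places f before c or e is rejected.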
This invention, in its broadest sense, encompasses a biologically active protein comprising two molecular segments, each selected from peptides, pseudopeptides, or non-peptide linear molecules linked through a non-amido linkage to form a peptide or pseudopeptide backbone, wherein one of the segments contains at least one non-coded structural unit and the non-coded structural unit does not form a part of the non-amido linkage, provided that both segments are not a non-peptide linear molecule at the same time.
This invention further provides a modified protein represented by the formula:
R-L-R′
wherein R and R′ are the same or different and are each a residue of a peptide or pseudopeptide; and L represents a thiol ester or selenol ester linkage. Preferably, both R and R′ comprise from about two to about one hundred amino acid residues.