The present invention relates to the field of molecular biology, and more particularly to genetic sequences encoding peptide display scaffolds capable of emitting light, and to peptide display libraries based on these scaffolds.
Proteins can bind to numerous chemical species, or ligands, including small organic molecules, nucleic acids, peptides, metal ions, and other proteins. Indeed, to carry out a biological function, a protein must interact with another entity. The capacity of amino acid polymers to participate in chemical interactions is one of the major reasons for their ascendancy in the biological world. Much as the AND gate is the basic component of binary computers, individual proteins and their cognate ligands are the fundamental mechanism upon which cells and organisms are built.
One of the most significant areas of research and development in the pharmaceutical industry involves methods to better design or screen for ligands that interact specifically with defined protein targets. Discovery of such ligands is the engine that drives development of new pharmaceutical compounds. Typically, efforts to find ligands focus on small molecules, antibodies, peptides, or RNA and DNA aptamers. Depending on the particular application, such ligands may provide lead compounds for drug development or probes for further research into biological processes.
A flurry of recent experiments has explored the utility of peptide binding assays for discovery of peptide-based ligands that bind specific protein targets in vitro. One of the most popular methods involves phage display, i.e., the presentation of peptide sequences on the surface of phage particles (Cwirla S. E., Peters E. A., et al. Proc Natl Acad Sci USA August 1990; 87(16):6378-6382 and Cortese R., Monaci P., et al. Curr Opin Biotechnol December 1996; 7(6):616-621). Filamentous phage such as M13 and f1 have been engineered to express and present foreign peptide sequences. Two different approaches have been of primary interest; both involve incorporation during phage particle assembly of chimeric coat proteins that include segments of foreign sequence. The first involves the phage coat protein gp3, which is normally present on the phage coat in only a few copies per virus. Sequences that might be toxic at higher concentration on the viral coat, including relatively large protein domains, can be presented effectively using gp3 fusions. The second approach involves gp8, which is the major coat protein present in thousands of copies per virus. gp8 fusions have the advantage that they may reside on the virus in large amounts, thus increasing the avidity of the interaction between the virus and potential receptors. But as a consequence of this increased amount of fusion protein, the virus is more selective about which sequences can be displayed using gp8 (Makowski, L. Gene Jun. 15, 1993; 128(1):5-11).
Other modes of surface display have also been considered. Larger, more complex viruses including lambda and T4 have been exploited for surface display (Mikawa Y. G., Maruyama I. N. et al. J Mol Biol Sep. 13, 1996; 262(1):21-30 and Efimov V. P., Nepluev I. V., et al. Virus Genes 1995; 10(2):173-177). The basic approach is similar to that used for filamentous phages; that is, viruses are assembled in bacterial host cells which incorporate chimeric coat or tail fiber proteins that bear the foreign sequences. In contrast to filamentous phages, however, these viruses assemble completely inside the cytoplasm and are released through cell lysis; thus, coat proteins are cytoplasmic proteins as opposed to membrane proteins, a feature that may increase the flexibility of the display mechanism.
Bacterial cells have also been examined as vehicles for surface display. The general approach is to use a membrane protein (e.g., OmpA in E. coli) to display protein or peptide epitopes in an accessible manner on the cell surface (Georgiou G., Stephens D. L., et al. Protein Eng February 1996; 9(2):239-247). Even mammalian cells have been employed as vehicles for surface display. For example, membrane proteins such as CD4 and CD8 were first cloned by expression and ligand-based selection in mammalian cells (Maddon P. J., Littman D. R., et al. Cell August 1985; 42(1):92-104 and Littman D. R., Thomas Y., et al. Cell February 1985; 40(2):237-246).
One of the most appealing aspects of surface or phage display is the ability to screen complex peptide libraries for rare sequences that bind selectively to defined protein targets. The combinatorial chemistry required to generate a diverse population of peptides involves oligonucleotide synthesis. Furthermore, the twenty amino acids, with their wide spectrum of chemical properties (e.g., hydrophobicity, charge, acidity, and size), can create substantial chemical complexity, more so than, for example, nucleotides. However, like nucleic acids, peptide libraries displayed on phage can be reproduced with relative ease. The replication requires nucleic acid intermediates, but the advantages of amplification are the same; namely, the capacity for biochemical enrichment without substantial loss of starting material, and the ability to perform genetic experiments.
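The difference in combinatorial complexity between peptides and nucleotides noted above can be illustrated with simple arithmetic. The following is a minimal sketch, not part of the described methods; the function names are hypothetical and chosen only for illustration. It assumes a library in which every position may be any of the twenty standard amino acids (or any of the four nucleotides), and it ignores practical limits such as synthesis bias and transformation efficiency.

```python
# Illustrative arithmetic only: sizes of sequence spaces for short polymers.
# All names below are hypothetical and do not come from the source text.

AMINO_ACIDS = 20  # standard amino acids available at each peptide position
NUCLEOTIDES = 4   # A, C, G, T available at each nucleotide position


def peptide_space(length: int) -> int:
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** length


def nucleotide_space(length: int) -> int:
    """Number of distinct oligonucleotides of the given length."""
    return NUCLEOTIDES ** length


def coding_dna_space(peptide_length: int) -> int:
    """Number of distinct DNA sequences with three nucleotides per residue.

    This count includes synonymous codons and stop codons, so it is larger
    than the number of distinct peptides those sequences encode.
    """
    return NUCLEOTIDES ** (3 * peptide_length)


if __name__ == "__main__":
    for n in (6, 9, 12):
        print(n, peptide_space(n), nucleotide_space(n), coding_dna_space(n))
```

For a polymer of a given length, the peptide space grows as a power of 20 whereas the nucleotide space grows as a power of 4, which is one way to quantify the greater chemical complexity attributed to peptides.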
Although surface display of peptides or proteins is useful for selecting ligands in vitro, it is less appropriate for selections that involve intracellular processes. For this application, expression systems inside the cell must be employed. Intracellular ectopic expression of antibody libraries is one mode of expression (Sawyer C., Embleton J., et al. J Immunol Methods May 26, 1997; 204(2):193-203); a second involves expression of peptide libraries generated as fusions to cytoplasmic proteins such as thioredoxin and GAL4 from yeast (Colas P., Cohen B., et al. Nature Apr. 11, 1996; 380(6574):548-550 and Fields S., Song O. Nature Jul. 20, 1989; 340(6230):245-246).
Although for certain applications (e.g., construction of an interaction or proteome map), proteins or relatively large protein fragments are superior to peptides for display, for other applications, it is advantageous not to be constrained by natural protein sequences. To identify or devise novel proteinaceous ligands and/or inhibitors of specific targets, it may be simpler to generate and examine a chemically diverse library of relatively low molecular weight compounds based on peptides. In addition, peptide libraries can be used in genetic selections and screens to pinpoint peptide ligands that bind important intracellular targets, similar to selections employed in, e.g., the yeast two-hybrid system (Fields S., Song O. Nature Jul. 20, 1989; 340(6230):245-246).
Though a potentially powerful tool, intracellular display of peptide libraries by the methods mentioned above suffers from several limitations. First, it is often difficult to know what the expression level of specific peptides or peptide fusions is; in many cases, even an average measure of expression level is difficult to obtain. Second, the diversity of the library is not easily estimated. It may be, for example, that only a small subset of possible peptide sequences is presented efficiently by a particular expression system. Third, it is not always easy to follow the expression of peptides in particular cells; for example, to know whether or not a specific cell is expressing a member of the library. Fourth, it is not generally possible to manipulate the library to alter its average properties once the library has been generated; for example, to isolate library sequences compatible with high expression. Fifth, efforts to restrict conformational freedom (in order to promote higher binding energies), e.g., by inserting the peptides into the interior of protein sequences, may compound the problems discussed above. Such inserted libraries are likely to perturb the function and stability of the fusion partners in ways difficult to predict and measure. A method is therefore needed to overcome these limitations associated with peptide or protein fragment display libraries.
The present invention overcomes the above-mentioned limitations by providing methods and compositions for peptides or protein fragments displayed on scaffolds and libraries of sequences encoding peptides or protein fragments displayed on scaffolds that permit the properties of the library to be easily and quantitatively monitored. The scaffold is a protein that is capable of emitting light. Thus, analysis of the expression of individual members of the library when they are expressed in cells may be carried out using instruments that can analyze the emitted light, such as a flow sorter (FACS), a spectrophotometer, a microtitre plate reader, a CCD, a fluorescence microscope, or other similar device. This permits screening of the expression library in host cells on a cell-by-cell basis, and enrichment of the library for sequences that have predetermined characteristics.
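The cell-by-cell screening and enrichment described above can be pictured as a threshold gate applied to per-cell light measurements. The following is a minimal sketch under stated assumptions, not the instrument's actual behavior: each cell is represented by a hypothetical `CellEvent` record carrying its library insert and a single emitted-light intensity in arbitrary units, and the gate value is an arbitrary illustrative choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CellEvent:
    """One measured cell (hypothetical record, for illustration only)."""
    insert_dna: str      # library insert carried by this cell
    fluorescence: float  # emitted-light intensity, arbitrary units


def sort_gate(events: List[CellEvent], min_signal: float) -> List[CellEvent]:
    """Keep only the cells whose signal is at or above the gate."""
    return [e for e in events if e.fluorescence >= min_signal]


def fraction_passing(events: List[CellEvent], min_signal: float) -> float:
    """Fraction of measured cells that pass the gate (0.0 for no events)."""
    if not events:
        return 0.0
    return len(sort_gate(events, min_signal)) / len(events)


if __name__ == "__main__":
    # Placeholder inserts and intensities; these are not experimental data.
    measured = [
        CellEvent("GGGCCC", 850.0),
        CellEvent("AAATTT", 12.0),
        CellEvent("CCCGGG", 430.0),
        CellEvent("TTTAAA", 3.0),
    ]
    kept = sort_gate(measured, min_signal=400.0)
    print([e.insert_dna for e in kept])
    print(fraction_passing(measured, min_signal=400.0))
```

Because each retained record still carries its insert, the members passing the gate define an enriched sub-library whose properties, here the fraction of light-emitting members, can be stated quantitatively.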
A genetic sequence encoding a peptide display scaffold is used to create the libraries of the present invention. This scaffold sequence comprises a first sequence that encodes a molecule capable of emitting light. The first sequence contains a site, the location of which allows a second sequence to be inserted at the site while maintaining the ability of the molecule encoded by the first and second sequences to emit light.
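At the level of the DNA sequence, inserting a second sequence at a site within the first can be sketched as a string operation. The following is an illustrative sketch only; the function name and the placeholder sequences are hypothetical and do not represent any actual scaffold. It assumes that the site falls on a codon boundary and that the insert length is a multiple of three so that the downstream reading frame is preserved. Preserving the frame is necessary but not sufficient for the encoded molecule to retain its ability to emit light, which must be determined experimentally.

```python
def insert_in_frame(scaffold_cds: str, insert: str, site: int) -> str:
    """Insert `insert` into `scaffold_cds` at nucleotide offset `site`.

    Assumptions (illustrative, not taken from the source text):
    - `site` is a 0-based offset lying on a codon boundary;
    - `insert` has a length that is a multiple of three, so the
      reading frame downstream of the site is unchanged.
    """
    if not 0 <= site <= len(scaffold_cds):
        raise ValueError("site lies outside the scaffold sequence")
    if site % 3 != 0:
        raise ValueError("site is not on a codon boundary")
    if len(insert) % 3 != 0:
        raise ValueError("insert length is not a multiple of three")
    return scaffold_cds[:site] + insert + scaffold_cds[site:]


if __name__ == "__main__":
    # Placeholder sequences only; not a real light-emitting scaffold.
    print(insert_in_frame("ATGAAATAA", "GGGCCC", 3))
```

The checks reject insertions that would shift the reading frame, since a frameshift would alter every residue encoded downstream of the site.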
These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings.