With the completion of the sequencing of the human genome and the genomes of other organisms, including a wide and rapidly expanding number of prokaryotes as well as yeast, rice, rat, and dog, increasing attention has focused on the characterization and function of proteins, the products of genes. See, for example, Celestino et al., Gen Mol. Res. 3:421-431, 2004; Nature 436:793-800, 2005; Toh et al., Nature 438:803-819, 2005; Collins et al., Nature 422:835-847, 2003; and Cherry et al., Nature 387(6632 Suppl):67-73, 1997. The availability of sequence data and the growing impact of structural biology on biomedical research have prompted international efforts to determine protein structures on a large scale. Structural genomics (also referred to as “SG”) is a worldwide initiative aimed at determining a large number of protein structures in a high-throughput mode (see, for example, Rost, Structure 6:259-63, 1998; and Stevens et al., Science 294:89-92, 2001).
One such effort is the National Institutes of Health's Protein Structure Initiative, a large-scale, high-throughput (also referred to as “HTP”) effort to determine the three-dimensional atomic-level structures of a broad range of proteins. These structures will be made widely available for clinical and basic studies that will expand the knowledge of the role of proteins both in normal biological processes and in disease. Initiatives such as the Protein Structure Initiative focus on an important aspect of proteins: their three-dimensional structures. While gene sequencing projects identify and arrange all the nucleotide bases in an organism's genetic material, efforts such as the Protein Structure Initiative will harness this genetic information to help identify and group into “families” all the natural shapes that proteins can form. To examine a protein's role in health and disease, and to explore ways to control its action, researchers seek to decipher the protein's shape, or structure. This structure reveals the physical, chemical and electrical properties of the protein and provides clues about its role in the body. See, for example, Norvell and Machalek, Nat Struct Biol 7 Suppl:931, 2000; the World Wide Web at nigms.nih.gov/psi/ and rcsb.org/pdb/strucgen.html#Worldwide; and “From Genes to Proteins: NIGMS Catalogs the Shapes of Life,” NIH Record, February 2001.
In structural genomics-type high-throughput projects, thousands of genes must be inserted into expression vectors, and it has become clear that protein expression and protein purification are limiting steps and a major expense. Traditional technologies for manipulating genes are too cumbersome and inefficient when dealing with more than a few genes at a time. See, for example, Rual et al., Curr Opin Chem Biol. 8(1):20-5, 2004.
While success rates for gene cloning are close to one hundred percent, only about twenty percent of targeted genes are successfully expressed and purified, and an accurate crystal structure is obtained for only a fraction of those polypeptides that are expressed and purified. See, for example, Adams et al., Acc Chem Res 36:191-8, 2003; Brenner, Nat Struct Biol 7 Suppl:967-9, 2000; Brenner and Levitt, Protein Sci 9:197-200, 2000; Burley, Nat Struct Biol 7 Suppl:932-4, 2000; Chance et al., Biophysical Journal 82:454a-454a, 2002; Chayen, J Struct Funct Genomics 4:115-20, 2003; Lesley et al., Proc Natl Acad Sci USA 99:11664-9, 2002; and Christendat et al., Nat Struct Biol 7:903-9, 2000.
Current methodologies for determining protein structures are difficult and time-consuming. Thus, there is a need for products and methods that allow for the determination of protein structures in a low-cost and high-throughput manner.