Proteins are biomolecules made of amino acids linked together by peptide bonds to form polypeptide chains. Proteins perform a wide range of important functions in nature. A protein's function is governed by its amino acid sequence and its structure.
Proteins fold into complicated three-dimensional structures, which are described at four structural levels. The primary structure of a protein is the linear representation of the protein's amino acid sequence. The secondary structure is the three-dimensional form of local segments of the protein, such as alpha-helices and beta-sheets. The tertiary structure is the protein's overall three-dimensional shape, fold, or architecture. The quaternary structure is the arrangement of multiple polypeptide chains into a larger complex, sometimes referred to as an oligomeric assembly.
The tertiary structure forms through a process called “protein folding” in which some of the protein's amino acids interact with each other to cause the protein to fold into its three-dimensional conformation. Although the structure of a folded protein is complex, it is often symmetric to some degree. Therefore, in a symmetric protein, the tertiary structure can be simplified as a series of structural regions that appear multiple times in the protein.
Amino acid sequence segments that play a key role in folding a protein form what is called a “folding nucleus.” Studies have shown that, in single-domain globular proteins, the folding nucleus typically includes one-third to one-half of the overall polypeptide chain. Folding nuclei may be difficult to identify because they are not always defined by exon boundaries or contained neatly within an apparent structural repeating motif; a folding nucleus is therefore considered to be a “cryptic” region within a protein. It appears that the presence of a folding nucleus is a protein design requirement, but there is no clear recipe for using a folding nucleus in protein design or for completing the design of the remaining segments of the polypeptide to produce a robustly foldable protein.
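To give a sense of scale for the one-third to one-half figure mentioned above, the following minimal Python sketch converts a chain length into an approximate residue-count range for a folding nucleus. The function name folding_nucleus_size_range is hypothetical and is introduced only for illustration. The calculation assumes a single-domain globular protein and uses simple integer division. It estimates only how many residues such a region might span, and it does not identify which residues belong to the folding nucleus.

```python
def folding_nucleus_size_range(chain_length: int) -> tuple[int, int]:
    """Approximate residue-count range for a folding nucleus.

    Illustrative assumption: the nucleus spans roughly one-third to
    one-half of the polypeptide chain, as stated for single-domain
    globular proteins. This is arithmetic only, not a prediction method.
    """
    if chain_length <= 0:
        raise ValueError("chain_length must be a positive integer")
    lower = chain_length // 3  # about one-third of the chain, rounded down
    upper = chain_length // 2  # about one-half of the chain, rounded down
    return lower, upper


if __name__ == "__main__":
    # A hypothetical 120-residue single-domain protein.
    low, high = folding_nucleus_size_range(120)
    print(f"Estimated folding nucleus size: {low} to {high} residues")
```

For a hypothetical 120-residue chain, this estimate gives a range of 40 to 60 residues.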