Proteins are large organic compounds made of amino acids arranged in a linear chain and joined together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues. The sequence of amino acids in a protein is defined by a gene and encoded in the genetic code. Although this genetic code specifies 20 “standard” amino acids plus selenocysteine and—in certain archaea—pyrrolysine, the residues in a protein are sometimes chemically altered by post-translational modification: either before the protein can function in the cell, or as part of control mechanisms. Proteins can also work together to achieve a particular function, and they often associate to form stable complexes. Like other biological macromolecules such as polysaccharides and nucleic acids, proteins are essential parts of organisms and participate in every process within cells. Many proteins are enzymes that catalyze biochemical reactions and are vital to metabolism. Proteins also have structural or mechanical functions, such as actin and myosin in muscle and the proteins in the cytoskeleton, which form a system of scaffolding that maintains cell shape. Other proteins are important in cell signaling, immune responses, cell adhesion, and the cell cycle. Proteins are also necessary in animals' diets, since animals cannot synthesize all the amino acids they need and must obtain essential amino acids from food. Through the process of digestion, animals break down ingested protein into free amino acids that are then used in metabolism.
Proteins are linear polymers built from 20 different L-α-amino acids. All amino acids possess common structural features, including an α carbon to which an amino group, a carboxyl group, and a variable side chain are bonded. The side chains of the standard amino acids have different chemical properties that produce three-dimensional protein structure and are therefore critical to protein function. The amino acids in a polypeptide chain are linked by peptide bonds formed in a dehydration reaction. Once linked in the protein chain, an individual amino acid is called a residue, and the linked series of carbon, nitrogen, and oxygen atoms are known as the main chain or protein backbone. The peptide bond has two resonance forms that contribute some double-bond character and inhibit rotation around its axis, so that the alpha carbons are roughly coplanar. The other two dihedral angles in the peptide bond determine the local shape assumed by the protein backbone. Due to the chemical structure of the individual amino acids, the protein chain has directionality. The end of the protein with a free carboxyl group is known as the C-terminus or carboxy terminus, whereas the end with a free amino group is known as the N-terminus or amino terminus.
Proteins are assembled from amino acids using information encoded in genes. Each protein has its own unique amino acid sequence that is specified by the nucleotide sequence of the gene encoding this protein. The genetic code is a set of three-nucleotide units called codons, each of which stands for an amino acid; for example, AUG stands for methionine. Because nucleic acids contain four different nucleotides, the total number of possible codons is 4 × 4 × 4 = 64. Since this exceeds the number of amino acids to be specified, there is some redundancy in the genetic code, with some amino acids specified by more than one codon. Genes encoded in DNA are first transcribed into pre-messenger RNA (pre-mRNA) by proteins such as RNA polymerase. Most organisms then process the pre-mRNA (also known as a primary transcript) using various forms of post-transcriptional modification to form the mature mRNA, which is then used as a template for protein synthesis by the ribosome.
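The codon count can be checked with a short enumeration. The following is only an illustrative sketch: the variable names are hypothetical, and the three stop codons listed are those of the standard genetic code, which the text above does not itself enumerate.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # RNA alphabet; DNA uses T in place of U

# Every ordered triple of nucleotides is one codon: 4 ** 3 = 64.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# Assumption: standard genetic code, in which three codons signal "stop"
# rather than an amino acid.
STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))        # 64
print(len(sense_codons))  # 61
```

With 61 amino-acid-specifying codons and only 20 standard amino acids, several codons must map to the same amino acid, which is the redundancy described above.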
The process of synthesizing a protein from an mRNA template is known as translation. The mRNA is loaded onto the ribosome and is read three nucleotides at a time by matching each codon to its base pairing anticodon located on a transfer RNA molecule, which carries the amino acid corresponding to the codon it recognizes. The enzyme aminoacyl tRNA synthetase “charges” the tRNA molecules with the correct amino acids. The growing polypeptide is often termed the nascent chain. Proteins are always biosynthesized from N-terminus to C-terminus. The size of a synthesized protein can be measured by the number of amino acids it contains and by its total molecular mass.
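The codon-by-codon reading described above can be sketched in a few lines. This is a simplified illustration, not a model of the ribosome: the function name `translate` and the example sequence are hypothetical, the table is the standard genetic code written in one-letter amino acid symbols, and translation is assumed to start at the first AUG and end at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Return the one-letter protein sequence, written N-terminus first."""
    start = mrna.find("AUG")  # simplification: begin at the first AUG
    if start == -1:
        return ""
    residues = []
    # Read three nucleotides at a time, ignoring any incomplete final codon.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the chain
            break
        residues.append(amino_acid)
    return "".join(residues)


protein = translate("GGAUGGCCAAAUGGUAAGC")
print(protein)       # MAKW
print(len(protein))  # 4 residues
```

Because residues are appended in the order the codons are read, the resulting string runs from N-terminus to C-terminus, and its length gives the size of the protein in amino acids.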
Most proteins fold into unique 3-dimensional structures. The shape into which a protein naturally folds is known as its native state. Although many proteins can fold unassisted, simply through the chemical properties of their amino acids, others require the aid of molecular chaperones to fold into their native states. There are four distinct aspects of a protein's structure:
Primary structure: the amino acid sequence.
Secondary structure: regularly repeating local structures stabilized by hydrogen bonds. Because secondary structures are local, many regions of different secondary structure can be present in the same protein molecule.
Tertiary structure: the overall shape of a single protein molecule; the spatial relationship of the secondary structures to one another.
Quaternary structure: the shape or structure that results from the interaction of more than one protein molecule, usually called protein subunits in this context, which function as part of the larger assembly or protein complex.
Proteins are not entirely rigid molecules. In addition to these levels of structure, proteins may shift between several related structures while they perform their biological function. In the context of these functional rearrangements, these tertiary or quaternary structures are usually referred to as “conformations,” and transitions between them are called conformational changes.
Protein aggregation, the assembly of misfolded proteins into rigid groupings, is a prevalent phenomenon throughout industrial bioprocessing. Aggregation is considered a primary mode of protein degradation, often leading to immunogenicity of the protein and a loss of bioactivity. Protein aggregation is of critical importance in a wide variety of biomedical situations, ranging from abnormal disease states, such as Alzheimer's and Parkinson's disease, to the production, stability and delivery of protein drugs. As shown in FIG. 1, protein aggregation, which can be amorphous or fibrillar in nature, starts by one of two different mechanisms: A) self-aggregation, in which partially folded intermediates are the immediate precursors for aggregation, and B) hetero-aggregation, in which the aggregation of one protein is mediated by another protein.
The formation of protein aggregates is a critical concern in industrial applications because it can strongly affect the production of protein-based drugs or commercial enzymes, greatly lowering production yields. For this reason, the detection and determination of protein aggregates is a key point in the biopharmaceutical industry as well as in scientific research. Several methods (some of them patented) have been proposed in the past for the determination of aggregates in mixtures. These prior art methods are designed for a particular protein or peptide and/or require the addition of a foreign probe, and thus do not represent a generalized method with universal application to a class of biological molecules. Several spectroscopic techniques have been used, such as UV-Vis spectroscopy with the aid of probes, fluorescence using internal or exogenous probes, and near-UV circular dichroism (CD), each of which limits detection of the aggregate to the immediate vicinity of the probe. Nuclear magnetic resonance (NMR) can also be used to detect protein aggregation through the appearance of band broadening. Sedimentation analysis can be used to identify the extent of oligomerization, as long as the protein of interest has a large enough molar extinction coefficient. Chromatographic techniques such as size exclusion can also detect the presence of protein aggregates. However, these techniques may require exogenous probes or large amounts of protein, are time-consuming, and none of them allows the mechanism of aggregation to be determined.