1. Field of the Invention
The present invention generally relates to a computer-implemented system for analyzing generic bond-bending networks in three dimensions, and more particularly to a computer-implemented system and corresponding method for analyzing the rigidity of substructures within a macromolecule.
2. Discussion of the Related Art
The study of interactions that determine the rigidity and flexibility of macromolecules has been used to probe problems in biochemistry relating to proteins. It is helpful to be able to determine the mechanical properties of a protein molecule in order to analyze and solve problems such as protein folding. The ability to map out the rigid clusters of a protein molecule provides invaluable insight into the structure of the protein that is useful in understanding the molecule's functionality. For example, this information can be used to help assess whether a protein can bind with a ligand. In addition, the ability to acquire this information quickly is important because it can be used as a precursor for numerical methods that make use of predefined mechanical properties.
In the past, attempts have been made to identify rigid regions within proteins. Known procedures, however, are either imprecise or computationally prohibitive for a protein of typical size. For example, to count the number of internal degrees of freedom within a macromolecule, the rank of a dynamical matrix may be determined using a matrix diagonalization method. In order to identify rigid clusters, the rank must be recalculated a number of times proportional to the square of the number of atoms in the molecule, making this procedure very time-consuming.
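The rank-based counting described above can be illustrated with a minimal sketch. It is not the diagonalization procedure referred to in the text: as an assumption made for illustration, it builds the rigidity matrix of a small bar-joint framework (one row per distance constraint) and finds its rank by exact Gaussian elimination over rational numbers. The function names `rigidity_matrix`, `matrix_rank`, and `internal_dof`, and the tetrahedron coordinates, are hypothetical. The count assumes the sites are not all confined to a lower-dimensional subspace.

```python
from fractions import Fraction
from itertools import combinations


def rigidity_matrix(coords, bars):
    """One row per bar (i, j); entries are coordinate differences p_i - p_j."""
    dim = len(coords[0])
    rows = []
    for i, j in bars:
        row = [Fraction(0)] * (dim * len(coords))
        for k in range(dim):
            d = Fraction(coords[i][k] - coords[j][k])
            row[dim * i + k] = d
            row[dim * j + k] = -d
        rows.append(row)
    return rows


def matrix_rank(rows):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [r[:] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        if rank == len(m):
            break
        pivot = next((r for r in range(rank, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][c] != 0:
                f = m[r][c] / m[rank][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def internal_dof(coords, bars):
    """Total coordinates minus independent constraints minus rigid-body motions."""
    dim = len(coords[0])
    rigid_body = dim * (dim + 1) // 2  # 6 in three dimensions, 3 in two
    return dim * len(coords) - matrix_rank(rigidity_matrix(coords, bars)) - rigid_body


# Hypothetical example: four sites at the corners of a tetrahedron.
tetra = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
all_bars = list(combinations(range(4), 2))   # six bars, fully braced
hinged_bars = all_bars[:-1]                  # bar (2, 3) removed

rigid_count = internal_dof(tetra, all_bars)
hinged_count = internal_dof(tetra, hinged_bars)
```

With all six bars the tetrahedron has no internal degrees of freedom; removing one bar leaves a single hinge motion about the shared edge. Locating rigid clusters this way would require repeating the rank calculation for many pairs of sites, which is the cost the passage above describes.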
The present techniques used to infer the location of rigid and flexible regions in a protein molecule include using molecular graphics to visually analyze experimental structural data obtained from x-ray crystallography and nuclear magnetic resonance (NMR), and analyzing limited proteolysis experiments. Other, more objective computational methods use experimentally determined protein structural data, such as that archived in the Brookhaven National Protein Data Bank (PDB). Large domains or flexible linkages are found based on empirical criteria, such as the degree of packing or the presence of protrusions. Additional methods compare different experimentally observed conformations of the same protein, and therefore depend on the availability of multiple observed conformations.
Other direct numerical methods, such as molecular dynamics (MD) and Monte Carlo (MC) simulation, are also used to identify the essential degrees of freedom governing the low-energy conformational changes in proteins, which correspond to flexible or floppy regions. These methods are also very time-consuming and subject to numerical inaccuracy. Thus, the presently available methods share the fundamental problem of being unable to quickly and accurately identify the floppy inclusions and rigid substructures in proteins and other macromolecules.
Combinatorial algorithms currently exist which can be used to obtain information for generic two-dimensional central-force networks, such as the number of degrees of freedom, independent and redundant constraints, rigid clusters, collective floppy motions, and overconstrained regions. Combinatorial algorithms employ integer arithmetic, rather than floating point arithmetic, to solve a problem. These algorithms, however, are only applicable to two-dimensional systems and, therefore, are of little practical value for macromolecules, which are three-dimensional.
The two-dimensional combinatorial algorithms are based on Laman's theorem, which provides a complete graph-theoretic characterization of generic rigidity. The theorem generally states that, in two dimensions, generic rigidity within bar-joint networks may be determined by applying constraint counting to all possible combinations of sites. Generic rigidity, synonymous with graph rigidity, involves only the network connectivity. By applying Laman's theorem recursively, redundant and independent constraints can be identified, as well as rigid and overconstrained regions. The theorem, however, fails in higher dimensions. There thus exists a need in the art for a computationally efficient and more precise method of determining the rigid substructures of large macromolecules in three dimensions.
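The constraint counting behind Laman's theorem can be sketched as follows. In its standard form, the theorem says that a two-dimensional bar-joint graph with n sites and 2n - 3 bars is generically minimally rigid exactly when every subset of k sites (k at least 2) spans no more than 2k - 3 bars. The sketch below is a brute-force check over all subsets, written only to make the counting concrete; it is exponential in the number of sites and is not the efficient combinatorial algorithm referred to in the text. The function name `is_laman` and the example graphs are hypothetical.

```python
from itertools import combinations


def is_laman(n, bars):
    """Brute-force Laman count for a 2D bar-joint graph on sites 0..n-1.

    Uses only integer arithmetic on the connectivity; no coordinates needed.
    """
    if len(bars) != 2 * n - 3:
        return False
    for k in range(2, n + 1):
        for subset in combinations(range(n), k):
            chosen = set(subset)
            spanned = sum(1 for a, b in bars if a in chosen and b in chosen)
            if spanned > 2 * k - 3:
                # This subset carries a redundant constraint.
                return False
    return True


# Hypothetical examples.
triangle = [(0, 1), (1, 2), (0, 2)]
braced_square = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
# Fully connected four-site cluster plus a dangling fifth site: the total
# count is 2n - 3 = 7, but the four-site cluster is overconstrained.
overbraced = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]

triangle_ok = is_laman(3, triangle)
square_ok = is_laman(4, braced_square)
overbraced_ok = is_laman(5, overbraced)
```

The last example shows why counting over the whole network is not enough: the total number of bars matches 2n - 3, yet one subset holds a redundant bar while the fifth site remains free to rotate.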
The present invention provides computer-implemented systems and methods that are applicable in three-dimensional systems and may be used to determine various mechanical properties of large macromolecules, such as independent degrees of freedom, independent and redundant constraints, rigid clusters, collective floppy motions, overconstrained regions, and hierarchical characterization. The information obtained from these applications may be applied to determine what forces stabilize or destabilize protein structures under various conditions, which substructures of a protein are rigid or flexible when the protein is in solution or a crystalline lattice, or which substructures of a protein are rigid or flexible when the protein interacts with another molecule, such as a ligand. The present invention provides a means of evaluating protein domains and conformational flexibility for drug design and protein engineering by applying concepts from graph theory to protein structural analysis.