Until recently, it was widely believed that a protein must have a well-defined three-dimensional (3D) structure to support its function. For example, the specific binding between an enzyme and its substrate has traditionally been explained using a “lock and key” analogy, in which only the correctly sized “key” (substrate) fits into the “keyhole” (active site) of the lock (enzyme). More recently, a growing number of studies have found that many proteins or regions of proteins are inherently flexible and do not conform to the traditional views of protein structure and function. Such an intrinsically disordered protein (IDP) or region of a protein (IDR) does not necessarily have to assume a unique structure to be biologically active, although some become structured when bound to an interaction partner. In fact, the occurrence of unstructured regions under physiological conditions is common in biologically active proteins. Some estimates suggest that as many as half of all eukaryotic proteins contain long (≥40 residues) disordered regions. IDPs have important biological functions and are involved in numerous processes, including regulation of transcription and translation, cellular signal transduction, molecular recognition, and cell cycle regulation, and many have been associated with a wide range of diseases, such as cancer, neurodegeneration, and diabetes. For example, it has been shown that about 80% of human cancer-associated proteins contain predicted regions of disorder of 30 residues or longer.
Because of their plasticity and conformational adaptability, many IDRs/IDPs are capable of interacting with specific proteins or nucleic acids through sequence motifs. The protein interactions of IDRs, which are often rich in serine, threonine, tyrosine, and lysine residues, can be tightly and reversibly regulated by covalent posttranslational modifications, such as phosphorylation, acetylation, methylation, and ubiquitination.
The mechanism of IDR-mediated protein-protein interactions is poorly understood because of a lack of information on their structural preferences and dynamic properties at atomic resolution. Due to their inherent dynamics, IDRs are not amenable to structure determination by techniques such as X-ray crystallography and conventional NMR, which require a uniquely folded conformation. These regions are often either removed from expression constructs or proteolytically cleaved prior to generation of crystals for structure determination. Therefore, the binding motifs in their sequences are usually not characterized by X-ray crystallography. NMR spectroscopy is an alternative way to study the structure and function of proteins with residue-specific resolution. Conventional NMR techniques used in structural biology are based on observation of protons (1H) or on indirect detection, via protons, of the weaker signals from carbon (13C) or nitrogen (15N) isotopes, because protons can be detected with much higher sensitivity than hetero-nuclei.
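The sensitivity advantage of protons can be illustrated with a rough calculation. The sketch below assumes the common textbook scaling in which the signal-to-noise ratio is proportional to the gyromagnetic ratio of the excited nucleus times the gyromagnetic ratio of the detected nucleus raised to the power 3/2; it ignores relaxation, isotopic abundance, and probe design. The function name and the rounded gyromagnetic ratios are illustrative choices, not part of the original text.

```python
# Gyromagnetic ratios in 10^6 rad s^-1 T^-1 (rounded tabulated values; assumption).
GAMMA = {"1H": 267.522, "13C": 67.283, "15N": -27.116}


def relative_sensitivity(excited, detected, reference="1H"):
    """Approximate S/N relative to a reference-excited, reference-detected experiment.

    Assumes S/N ~ |gamma_excited| * |gamma_detected|**1.5 and nothing else.
    """
    ref = abs(GAMMA[reference]) ** 2.5
    return abs(GAMMA[excited]) * abs(GAMMA[detected]) ** 1.5 / ref


if __name__ == "__main__":
    # Compare direct hetero-nuclear detection with proton-excited detection.
    for excited, detected in [("1H", "1H"), ("13C", "13C"), ("15N", "15N"),
                              ("1H", "13C"), ("1H", "15N")]:
        value = relative_sensitivity(excited, detected)
        print(f"excite {excited:>3}, detect {detected:>3}: {value:.4f}")
```

Under this simplified scaling, directly excited and detected 13C or 15N signals are weaker than proton signals by well over an order of magnitude, which is why indirect detection via protons became the conventional approach.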
The quality of NMR spectra of IDRs/IDPs is often poor because they exist as dynamic ensembles of conformations under physiological conditions of pH and salinity, in the presence or absence of binding partners. This is schematically illustrated in FIGS. 1A and 1B, which show the effect of conformational exchange between different states, such as bound/closed (panel a) and unbound/open (panel b) states, on a simple NMR spectrum (see FIG. 1A). In the presence of conformational exchange, the linewidth of an NMR signal depends not only on the transverse relaxation rates (R2) of a nucleus in each state, but also on the relative magnitudes of the exchange rate (kex) and the frequency separation (chemical shift difference) between the NMR signals of the two states (see FIG. 1B).
The NMR signal will be broad and difficult to detect when the exchange rate equals or is close to the frequency difference (Δν) between the states (intermediate exchange) (see FIG. 1B). The line-broadening effect is more severe in proton-detected NMR spectroscopy because disordered protein segments often sample multiple conformations on a time scale from micro- to milliseconds. One approach to minimize line broadening is to alter the conformational exchange rate by changing the sample temperature; however, the improvement in the resulting spectrum is often limited under conditions close to physiological ones. Another challenge is that proton-detected NMR spectra are often severely overlapped because of the reduced complexity in the amino acid sequences of IDPs and the small chemical shift dispersion in the proton dimension. Finally, rapid solvent exchange of the labile amide protons at physiological pH and temperature can further contribute to line broadening and signal loss.
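The dependence of the lineshape on the exchange rate can be illustrated numerically. The following is a minimal sketch, assuming a simple two-site exchange model (the Bloch-McConnell equations) with the same R2 in both states; the function name, the 100 Hz frequency separation, the R2 of 10 s^-1, and the three kex values are illustrative assumptions rather than values taken from the figures.

```python
import math


def two_site_spectrum(freqs_hz, delta_nu_hz, kex, p_a=0.5, r2=10.0):
    """Absorption-mode lineshape for two-site exchange A <-> B.

    freqs_hz    : frequencies (Hz) at which to evaluate the spectrum
    delta_nu_hz : frequency separation between the two states (Hz)
    kex         : exchange rate k_AB + k_BA (s^-1)
    p_a         : population of state A (state B has 1 - p_a)
    r2          : transverse relaxation rate, assumed equal in both states (s^-1)
    """
    p_b = 1.0 - p_a
    k_ab, k_ba = p_b * kex, p_a * kex      # forward and reverse rates
    omega_a = -math.pi * delta_nu_hz       # state A at -delta_nu/2 (rad/s)
    omega_b = +math.pi * delta_nu_hz       # state B at +delta_nu/2 (rad/s)
    spectrum = []
    for nu in freqs_hz:
        w = 2.0 * math.pi * nu
        # Diagonal elements of the 2x2 matrix (i*w*I - L) of the exchange model.
        a = complex(r2 + k_ab, w - omega_a)
        d = complex(r2 + k_ba, w - omega_b)
        det = a * d - k_ab * k_ba
        # Closed-form 2x2 inverse applied to the equilibrium populations.
        signal = (p_a * (d + k_ab) + p_b * (a + k_ba)) / det
        spectrum.append(signal.real)
    return spectrum


# Frequency axis from -150 Hz to +150 Hz in 0.5 Hz steps.
freqs = [0.5 * i - 150.0 for i in range(601)]
delta_nu = 100.0

# Maximum peak height in slow, intermediate, and fast exchange.
peak_height = {
    label: max(two_site_spectrum(freqs, delta_nu, kex))
    for label, kex in [("slow", 10.0), ("intermediate", 500.0), ("fast", 1.0e5)]
}

if __name__ == "__main__":
    for label, height in peak_height.items():
        print(f"{label:>12}: peak height {height:.4f}")
```

With these assumed parameters, the intermediate-exchange case gives the lowest peak height of the three, reflecting the broadening described above: two sharp signals in slow exchange merge into a single broad signal near coalescence and sharpen again into one averaged signal in fast exchange.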
Together, sequence redundancies, limited spectral dispersion, and unfavorable dynamic properties of IDPs can pose severe challenges for using standard proton-detected NMR techniques to study their roles in mediating protein interactions. Consequently, conventional proton-detected NMR spectra of IDPs/IDRs, such as 1H-15N HSQC, are often uninterpretable due to spectral overlap and/or line broadening. Thus, there is a need for improved NMR techniques capable of probing the highly flexible regions of proteins and their roles in mediating protein-protein interactions.