In recent new drug development, computer-based virtual screening technology is becoming established in the related fields as a means of reducing time and effort. The background circumstances include marked improvements in computing performance, a significant increase in the amount of structure-activity relationship information accumulated through high-throughput screening and combinatorial synthesis, and a considerable increase in structural information on target proteins owing to advances in genome-associated studies.
Examples of such virtual screening technologies include ligand-based virtual screening, which relies on structural similarities among compounds already known to be active against a target protein, i.e., on existing structure-activity relationship information; and structure-based virtual screening, such as protein-ligand docking, which uses the three-dimensional structure of the target protein.
The structure-based screening method is based on the concept that, where multiple drugs bind in the vicinity of the active site of a target protein, each drug is in a complementary relation with the protein, and the change in free energy in each binding process reflects the strength of its pharmacological activity. The method has advantages in that it can estimate, by computer, the binding state between a target protein and a ligand and the corresponding pharmacological activity value, and can predict activity values with high accuracy without requiring structure-activity relationship information. However, although the method can distinguish true ligands from non-ligands, it is almost impossible to rank the ligands quantitatively, and most virtual screening/docking programs cannot take the flexibility of proteins into account. Additionally, a receptor structure (binding model) is essential, and the accuracy of prediction depends on the accuracy of that structure. Furthermore, with these programs, the accumulation of structure-activity relationship information does not lead to improved prediction accuracy.
Meanwhile, the ligand-based virtual screening method is based on the finding that similarity (homology) can be observed among the physicochemical parameters of drugs that bind to a common site. Unlike the structure-based screening method, it has the advantage of not requiring a receptor structure (binding model). However, the ligand-based method requires prior information on pharmacological activities, and its prediction accuracy depends on the quality and amount of that information. It therefore has limitations in that prediction beyond the already-known information is impossible, or in that the accuracy of activity value prediction is low.
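By way of illustration only, the ligand-based concept described above can be sketched as a similarity ranking against a known active compound. The sketch below assumes that each compound is represented as a set of "on" bits of a structural fingerprint and that similarity is measured with the Tanimoto coefficient; the fingerprint representation, the function names, and the compound names are hypothetical and are not taken from any of the existing methods discussed here.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_by_similarity(known_active: frozenset, library: dict) -> list:
    """Return (name, similarity) pairs, most similar to the known active first."""
    scored = [(name, tanimoto(known_active, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: each integer is the index of a set bit.
known_active = frozenset({1, 4, 7, 9})
library = {
    "cand_A": frozenset({1, 4, 7, 8}),
    "cand_B": frozenset({2, 3, 5}),
    "cand_C": frozenset({1, 4, 7, 9, 12}),
}

ranking = rank_by_similarity(known_active, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Because the ranking is driven entirely by overlap with a compound that is already known to be active, a candidate with an unrelated backbone (here `cand_B`) scores zero regardless of its actual activity, which illustrates the limitation noted above.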
The present inventors have made extensive efforts to overcome the limitations of the existing virtual screening technologies described above. As a result, they have confirmed that a virtual screening method can be provided which, unlike the existing methods, does not use information on the structures of target proteins or compounds, or on structural attributes thereof, but instead performs virtual screening based on various biological activities extracted from multiple drug screening data. Such a method is not only capable of providing high prediction accuracy but is also capable of screening compounds whose backbones are entirely different from those of existing compounds with known activities. The present invention has been completed on this basis.
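The idea of screening on biological activities rather than on structure may be illustrated, purely as a minimal sketch, by comparing activity profiles across several assays. The sketch assumes that each compound has one activity value per assay and that profiles are compared by Pearson correlation; this choice of measure, the function names, and the data are hypothetical and are not asserted to be the method of the present invention.

```python
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation between two equal-length activity profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / sqrt(var_x * var_y)


def rank_by_profile(reference: list, candidates: dict) -> list:
    """Return (name, correlation) pairs, best match to the reference first."""
    scored = [(name, pearson(reference, prof)) for name, prof in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical activity values, one per assay, for a compound with known activity.
reference_profile = [0.9, 0.1, 0.8, 0.2]
candidate_profiles = {
    "cand_X": [0.8, 0.2, 0.7, 0.1],
    "cand_Y": [0.1, 0.9, 0.2, 0.8],
    "cand_Z": [0.5, 0.5, 0.5, 0.5],
}

profile_ranking = rank_by_profile(reference_profile, candidate_profiles)
for name, score in profile_ranking:
    print(f"{name}: {score:+.3f}")
```

No structural descriptor enters the comparison, so under these assumptions a candidate can rank highly even when its backbone is entirely different from that of the reference compound.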
(Patent Document 1) U.S. Pat. No. 6,421,612
(Patent Document 2) U.S. Pat. No. 6,994,473
(Patent Document 3) U.S. Pat. No. 7,416,524