1. Field
The present disclosure relates to computational chemistry, and more particularly to in silico prediction of chemical reactions by designing and processing reaction rule pipelines.
2. Description of the Related Art
Identifying novel pathways for the synthesis or degradation of molecules remains a challenge in synthetic chemistry, drug discovery, and biotechnology. Doing so requires assessing all potential precursors and the associated chemical transformations that lead from a starting molecule to a target molecule. However, exhaustive experimental screening of all possible chemical transformations is intractable, and thus computational chemistry, which investigates the behavior of atoms and molecules through computer simulations, has been studied as an alternative.
Identification of novel chemical pathways using a computer is called in silico identification and requires two components: a reaction rule library and a knowledge-based reaction prediction system.
The reaction rule library may include rules for chemical conversion. Each rule may represent a chemical transformation and may include the information necessary to describe the conversion of a reactant into a product by that transformation. These rules may be derived from a set of known chemical reactions or constructed from basic chemical principles.
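By way of a non-limiting illustration, a rule of the kind described above might be sketched as follows. The sketch assumes a deliberately simplified toy model in which a molecule is represented only by the set of functional groups it carries; the names `ReactionRule`, `consumes`, `produces`, and `RULE_LIBRARY` are hypothetical and are not drawn from any particular system.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Toy model (assumption): a molecule is only the set of functional groups it carries.
Molecule = FrozenSet[str]


@dataclass(frozen=True)
class ReactionRule:
    name: str      # human-readable label for the transformation
    consumes: str  # functional group that must be present on the reactant
    produces: str  # functional group that replaces it in the product

    def apply(self, reactant: Molecule) -> Optional[Molecule]:
        """Return the product, or None if the rule does not match the reactant."""
        if self.consumes not in reactant:
            return None
        return (reactant - {self.consumes}) | {self.produces}


# Hypothetical two-rule library based on textbook oxidation steps.
RULE_LIBRARY = [
    ReactionRule("oxidize_alcohol", "primary_alcohol", "aldehyde"),
    ReactionRule("oxidize_aldehyde", "aldehyde", "carboxylic_acid"),
]

reactant = frozenset({"primary_alcohol", "alkene"})
product = RULE_LIBRARY[0].apply(reactant)  # alcohol group becomes an aldehyde group
```

In this sketch each rule carries both the matching condition and the transformation, mirroring the description of a rule as the information needed to convert a reactant into a product. A practical library would operate on molecular graphs rather than label sets.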
A knowledge-based reaction prediction system applies rules from the reaction rule library to an input molecule and predicts a set of products or precursors. To generate multi-step pathways, the rules may be iteratively applied to the predicted products and/or precursors. In order to select an experimentally tractable synthetic pathway, appropriate start and end molecules may be obtained via the knowledge-based reaction prediction system.
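The iterative application of rules to predicted products may be illustrated, purely as a minimal sketch, by the following breadth-first expansion. It assumes the same kind of toy representation (a molecule as a set of functional-group labels, a rule as a tuple of name, consumed group, and produced group); `apply_rule` and `predict_pathways` are hypothetical helper names introduced only for this example.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Molecule = FrozenSet[str]    # toy molecule: set of functional-group labels (assumption)
Rule = Tuple[str, str, str]  # (name, group consumed, group produced)

RULES: List[Rule] = [
    ("oxidize_alcohol", "primary_alcohol", "aldehyde"),
    ("oxidize_aldehyde", "aldehyde", "carboxylic_acid"),
]


def apply_rule(rule: Rule, mol: Molecule) -> Optional[Molecule]:
    """Return the product of applying the rule, or None if it does not match."""
    _, consumed, produced = rule
    if consumed not in mol:
        return None
    return (mol - {consumed}) | {produced}


def predict_pathways(start: Molecule, rules: List[Rule],
                     max_steps: int) -> Dict[Molecule, List[str]]:
    """Map each reachable molecule to the first rule sequence found that yields it."""
    pathways: Dict[Molecule, List[str]] = {start: []}
    frontier = [start]
    for _ in range(max_steps):
        next_frontier = []
        for mol in frontier:
            for rule in rules:  # every rule is tried on every input
                product = apply_rule(rule, mol)
                if product is not None and product not in pathways:
                    pathways[product] = pathways[mol] + [rule[0]]
                    next_frontier.append(product)
        frontier = next_frontier  # products become the inputs of the next iteration
    return pathways


start = frozenset({"primary_alcohol"})
target = frozenset({"carboxylic_acid"})
routes = predict_pathways(start, RULES, max_steps=2)
print(routes[target])  # ['oxidize_alcohol', 'oxidize_aldehyde']
```

Applying the rules in the reverse direction, from a target toward its precursors, would follow the same loop with the consumed and produced groups swapped.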
Generally, the knowledge-based reaction prediction system involves a sub-graph alignment to identify a functional group pattern on input molecules, which is computationally intensive. Various physicochemical properties may also be computed to ascertain the possibility of an input molecule undergoing a reaction transformation. These computations are performed once for each rule on each input and account for most of the computational time in a reaction prediction process. Thus, in the case of a large set of reaction rules and iterative computations, the system may take a long time to predict pathways. Also, because the products of one iteration become the inputs of the next, the computational cost grows exponentially with the number of iterations, which restricts higher-order simulations.
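The growth in cost may be made concrete with a short counting sketch. It assumes, hypothetically, that every input molecule yields a fixed number of new molecules per iteration (`branching`) and that each rule is evaluated once on each input; the figures used are illustrative only and are not measurements.

```python
from typing import List


def rule_evaluations(n_rules: int, branching: int, iterations: int) -> List[int]:
    """Count rule-on-input evaluations at each iteration.

    Assumption: every input molecule yields `branching` new molecules,
    all of which are fed into the next iteration.
    """
    counts = []
    frontier = 1  # start from a single input molecule
    for _ in range(iterations):
        counts.append(n_rules * frontier)  # one evaluation per rule per input
        frontier *= branching              # inputs multiply for the next iteration
    return counts


counts = rule_evaluations(n_rules=1000, branching=10, iterations=4)
print(counts)       # [1000, 10000, 100000, 1000000]
print(sum(counts))  # 1111000
```

Under these assumptions the fourth iteration alone requires a thousand times as many evaluations as the first, which is why the per-evaluation cost of pattern matching and property computation dominates the overall run time.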