Bayesian networks are defined as directed acyclic graphs whose structures encode the conditional independence and causal relationship between variables (nodes/objects). Bayesian networks are a class of probabilistic models that can be used in multivariate statistical analysis. Bayesian networks have been studied intensively in the past decade to learn casual models under uncertainty and have been applied successfully to different areas of science and technology for knowledge discovery. Bioinformatics, finance, signal processing, and computer vision are a few examples. Discovering unknown Bayesian network structures from data is a challenging problem and is generally considered to be computationally infeasible, except for networks with few variables. Therefore, many methods trade exactness and correctness for faster computation. Moreover, many learning algorithms have been developed based on the Von Neumann computing paradigm and rely on General Purpose Processors (GPP) for execution, and hence are limited in performance by the inherent inefficiencies of GPPs, such as sequential execution and memory hierarchy bottleneck.