Embodiments of the invention relate to data flow optimization and, in particular, to global data flow optimization for machine learning programs.
Compilers for large-scale machine learning rely on a local optimization scope limited to individual basic blocks. This keeps compilation simple and efficient. Many real-world machine learning programs, however, exhibit deep control flow structures. For such programs, local optimization misses major optimization potential, because information from outside each basic block is unavailable to the optimizer and its freedom to optimize across block boundaries is limited.