The present invention discloses a system and an associated method for automatic and optimal recursive Parallel Stage-Decoupled Software Pipelining of loop nests in sequential C/C++ programs. Conventional parallelization methods are not effective for optimal decoupled software pipeline parallelization because they lack at least one of the following capabilities: automatic refactoring of sequential source code, automatic identification of pipeline stages, and static pipeline analysis based on a program dependence graph that takes all replications and coalescings of all pipeline stages into account.
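By way of a non-limiting illustration, the general idea of decoupled software pipelining may be sketched as follows. The sketch is not the disclosed method: the refactoring is written by hand rather than performed automatically, and the names `stage1_transform`, `stage2_accumulate`, `Channel`, `run_sequential`, and `run_pipelined` are hypothetical names introduced only for this example. It assumes a loop body that separates into a first stage with no loop-carried dependence (and which could therefore, in principle, be replicated as a parallel stage) and a second stage that carries a dependence on an accumulator and must remain sequential.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stage bodies (assumptions made for this illustration only).
// Stage 1 has no loop-carried dependence; stage 2 depends on the running value acc.
int stage1_transform(int x) { return x * x; }
long stage2_accumulate(long acc, int y) { return acc + y; }

// Original sequential loop: both stages execute in every iteration.
long run_sequential(const std::vector<int>& in) {
    long acc = 0;
    for (int x : in) {
        int y = stage1_transform(x);
        acc = stage2_accumulate(acc, y);
    }
    return acc;
}

// Minimal blocking queue that decouples the two stages.
class Channel {
public:
    void push(int v) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push(v);
        }
        cv_.notify_one();
    }
    // Signals that no further values will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Blocks until a value is available; returns false once the channel
    // is closed and fully drained.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool closed_ = false;
};

// Hand-written pipelined form: each stage runs in its own thread, and values
// flow in one direction only, from stage 1 to stage 2.
long run_pipelined(const std::vector<int>& in) {
    Channel ch;
    long acc = 0;
    std::thread first([&] {
        for (int x : in) ch.push(stage1_transform(x));
        ch.close();
    });
    std::thread second([&] {
        int y = 0;
        while (ch.pop(y)) acc = stage2_accumulate(acc, y);
    });
    first.join();
    second.join();
    return acc;
}
```

In this sketch, `run_pipelined` yields the same result as `run_sequential` because the single queue preserves iteration order between the two stages. Replicating the first stage across several threads would additionally require the outputs to be reordered or the second stage to be order-insensitive, which the sketch does not address.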