Superscalar processors exploit instruction-level parallelism (ILP) to increase sequential performance by concurrently executing several independent instructions from the same control flow. However, ILP is limited by control and data dependencies. Control dependencies have long been handled by speculating on conditional (and even indirect) branch instructions, predicting the direction that is likely to be taken upon execution. Conventional branch prediction algorithms typically rely on a limited dynamic history of past taken and not-taken branch outcomes and use it to generate predictions. These branch prediction units are coupled to the front end and drive the instruction fetch logic. Although conventional branch predictors can be effective in some instances, they remain limited by the costly area required to store history and by the length of history available for correlating branches.
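To make the history-based scheme concrete, the following is a minimal sketch of one well-known design of this kind, a gshare-style predictor, in which a global history register is XORed with the branch address to index a table of 2-bit saturating counters. It is an illustration under stated assumptions rather than the predictor of any particular processor: the names `GsharePredictor`, `history_bits`, and `accuracy` are hypothetical, and the table size is tied to the history length purely for simplicity.

```python
class GsharePredictor:
    """Toy gshare-style predictor (illustrative assumption, not a real design)."""

    def __init__(self, history_bits=8):
        self.mask = (1 << history_bits) - 1
        self.history = 0  # global history register; 1 = taken, 0 = not taken
        # One 2-bit saturating counter per table entry, starting at
        # "weakly not-taken" (1). Values 2 and 3 predict taken.
        self.counters = [1] * (1 << history_bits)

    def _index(self, pc):
        # Hash the branch address with the recent outcome history.
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the resolved outcome into the history, dropping the oldest bit.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly for a single branch at `pc`."""
    correct = 0
    for taken in outcomes:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(outcomes)


if __name__ == "__main__":
    # A loop branch: taken six times, then not taken once, repeated.
    pattern = ([True] * 6 + [False]) * 100
    for bits in (4, 8):
        print(bits, accuracy(GsharePredictor(history_bits=bits), 0x40, pattern))
```

In this sketch, a 4-bit history cannot tell the final loop iteration apart from the two before it, so the predictor keeps mispredicting the loop exit, whereas an 8-bit history covers the whole period and learns the pattern after a short warm-up. The cost is a table that doubles with every added history bit, which mirrors the area and history-length limits noted above.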