Arithmetic coding processes such as those used in JPEG2000, JPEG, On2, or H.264 often use Context-based Adaptive Binary Arithmetic Coding (CABAC). The original principle of binary arithmetic coding is based on recursive subdivision of the interval width Range. [For a full description of the H.264 CABAC standard and details, see ITU-T Series H: Audiovisual and Multimedia Systems, Infrastructure of audiovisual services, Coding of moving video.] Given the estimated probability pLPS of the Least Probable Symbol (LPS), the interval is subdivided into two subintervals: one of width rLPS = Range × pLPS, which is associated with the LPS, and the other of width rMPS = Range − rLPS, which is assigned to the Most Probable Symbol (MPS). Depending on whether the observed bit to be encoded is the MPS or the LPS, the corresponding subinterval is chosen as the new interval. The binary arithmetic coding process keeps updating the interval width register Range, which marks the range of the interval, and the code register Value, which marks the lower bound of the interval. According to the H.264 CABAC process, the product Range × pLPS required to perform the interval subdivision is approximated using a pre-stored 4×64 two-dimensional table. The Range value is approximated by four quantized values (2 bits) using an equal partition of the whole range 2^8 ≤ Range < 2^9, and the value of pLPS is approximated by 64 quantized values indexed by a 6-bit MPS or LPS state. During decoding, Range is first reduced by rLPS so that it holds rMPS. If the code offset (Value) is less than this current Range, the MPS path is taken: the MPS is designated as the next output bit, and the state transition is performed based on the MPS look-up table. If Value is greater than or equal to the current Range, the LPS path is taken: the inverted MPS bit is designated as the next output bit, the new Value is obtained by subtracting the current Range from the previous Value, and Range then becomes rLPS.
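As a rough illustration of the interval subdivision and the 2-bit Range quantization described above, the following Python sketch may help. The function names `quantize_range` and `subdivide` are hypothetical, and `subdivide` uses an exact multiplication with a floating-point `p_lps` purely for clarity; it is an assumption made for illustration and does not reproduce the pre-stored table of the standard.

```python
def quantize_range(rng):
    """Map a renormalized Range in [256, 511] to one of four equal-width bins.

    The two bits just below the most significant bit of the 9-bit Range
    select the bin, giving a 2-bit index in 0..3.
    """
    if not 256 <= rng <= 511:
        raise ValueError("Range must lie in [256, 511]")
    return (rng >> 6) & 3


def subdivide(rng, p_lps):
    """Split Range into (rLPS, rMPS) for an LPS probability estimate p_lps.

    Illustrative only: a table-driven coder would replace this multiplication
    with a lookup indexed by the probability state and quantize_range(rng).
    """
    r_lps = max(1, int(rng * p_lps))  # keep the LPS subinterval non-empty
    r_mps = rng - r_lps               # remainder is assigned to the MPS
    return r_lps, r_mps
```

For example, `subdivide(400, 0.25)` splits a Range of 400 into an LPS subinterval of width 100 and an MPS subinterval of width 300, while `quantize_range(400)` selects bin 2.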
On the LPS path, if the current LPS state equals zero, the MPS is inverted. The state transition is then performed based on the least probable symbol (LPS) look-up table, followed by the renormalization process, in which Range and Value are renormalized. Range is renormalized to the interval [256, 511] by left-shifting it by the required number of bits; Value is scaled up accordingly, and its lower bits are filled from the incoming bit stream. One approach, suggested in co-pending U.S. patent application Ser. No. 11/527,001, filed Sep. 26, 2006, entitled “Iterative Process with Rotated Architecture for Reduced Pipeline Dependency” (AD-473), and co-pending U.S. patent application Ser. No. 11/788,094, filed on Apr. 19, 2007, entitled “A Programmable Compute System for Executing an H.264 Binary Decode Symbol Instruction” (AD-505J), each of which is incorporated by reference herein, uses three compute units to solve the algorithm in a single instruction, or two compute units with two instructions. While that was a significant improvement, it still required significant power and area if three compute units are used, or twice as many MIPS (Million Instructions Per Second) if two compute units are used.
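The binary decode step described above (table lookup of rLPS, the MPS/LPS decision, the state transition, and renormalization) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the referenced applications: the function name `decode_bin` is hypothetical, the look-up tables are passed in as parameters rather than reproduced from the standard, and `read_bit` is an assumed callable that returns the next bit of the incoming bit stream.

```python
def decode_bin(value, rng, state, mps,
               range_tab_lps, next_state_mps, next_state_lps, read_bit):
    """Decode one binary symbol and return (bit, value, rng, state, mps).

    range_tab_lps[state][q] -- caller-supplied rLPS table (q is the 2-bit Range index)
    next_state_mps / next_state_lps -- caller-supplied state transition tables
    read_bit -- hypothetical callable returning the next input bit (0 or 1)
    """
    r_lps = range_tab_lps[state][(rng >> 6) & 3]  # approximate Range * pLPS
    rng -= r_lps                                  # rng now holds rMPS

    if value < rng:
        # MPS path: output the MPS and advance along the MPS transition table.
        bit = mps
        state = next_state_mps[state]
    else:
        # LPS path: output the inverted MPS, rebase Value, Range becomes rLPS.
        bit = mps ^ 1
        value -= rng
        rng = r_lps
        if state == 0:
            mps ^= 1                              # MPS is inverted at state zero
        state = next_state_lps[state]

    # Renormalization: shift Range back into [256, 511] and refill Value.
    while rng < 256:
        rng <<= 1
        value = (value << 1) | read_bit()

    return bit, value, rng, state, mps
```

Passing the tables as arguments keeps the sketch independent of any particular table contents; a real decoder would supply the tables defined by the standard and would typically bound the renormalization shift count in hardware rather than loop bit by bit.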