An encryption engine for performing the American National Standards Institute (ANSI) data encryption standard (DES) algorithm enciphers and deciphers blocks of data, typically of 64 bits (a bit packet), using a key. Deciphering is accomplished using the same key that was used for enciphering, but with the schedule of addressing the key bits altered so that deciphering is the reverse of the enciphering process. A block to be encrypted is subjected to an initial permutation, IP, then to a complex key-dependent computation, and finally to a permutation IP⁻¹ that is the inverse of the initial permutation. The key-dependent computation can be simply defined in terms of a function, f, called the cipher function. For example, after the initial permutation IP, the 64 bit data block is split into two 32 bit data blocks, L0 and R0. The right half is then input to the cipher function f, which operates on two blocks, one of 32 bits and one of 48 bits. In performing the f function, R0 is subjected to expansion permutation E, resulting in a 48 bit block which is X-ORed with a 48 bit key; the result is condensed from 48 bits back to 32 bits using eight selection functions S1-S8, then subjected to permutation P, which provides the 32 bit cipher output. The output from the last cipher function is submitted to the inverse initial permutation IP⁻¹. These functions and permutations are normally done in hardware such as application specific integrated circuits (ASICs), which are inflexible: they are dedicated to the specific functions and permutations designed into them. A software implementation would be advantageous because it would allow easy adaptation to emerging standards. However, in software the processing load, measured in mega instructions per second [Mips], is prohibitive.
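The statement that deciphering reuses the same key with a reversed schedule can be illustrated with a minimal sketch of the sixteen-round structure. This is not DES itself: IP and IP⁻¹ are omitted, and `toy_f` is a hypothetical stand-in for the real cipher function (expansion E, selection functions S1-S8, permutation P). The round-key values and all function names are assumptions made for illustration only.

```python
MASK32 = 0xFFFFFFFF


def toy_f(right, round_key):
    """Hypothetical stand-in for the cipher function f.

    The real f applies E, X-ORs a 48 bit key, then S1-S8 and P.
    Any deterministic function works for showing reversibility.
    """
    x = (right ^ round_key) & MASK32
    x = ((x << 3) | (x >> 29)) & MASK32      # rotate left by 3 bits
    return (x * 0x9E3779B1 + round_key) & MASK32


def run_rounds(block64, round_keys):
    """Apply one pass of the round structure to a 64 bit block."""
    left = (block64 >> 32) & MASK32
    right = block64 & MASK32
    for key in round_keys:
        # Each round: new L = old R, new R = old L XOR f(old R, key)
        left, right = right, left ^ toy_f(right, key)
    # Swap the halves on output so the same routine undoes itself
    # when it is given the round keys in reverse order.
    return (right << 32) | left


def encipher(block64, round_keys):
    return run_rounds(block64, round_keys)


def decipher(block64, round_keys):
    # Same routine, same keys, reversed schedule.
    return run_rounds(block64, list(reversed(round_keys)))


# Sixteen arbitrary round keys (assumed values, not a DES key schedule).
demo_keys = [(i * 0x01234567 + 0x89ABCDEF) & MASK32 for i in range(16)]
```

Because each round only X-ORs the output of f into one half, f never has to be inverted; running the rounds again with the keys in the opposite order cancels each round in turn.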
To permute a single bit in a conventional controller or digital signal processor (DSP), three instructions are needed: extracting the bit (AND), shifting the bit to the correct position, and depositing it (OR). Thus, just accomplishing permutation E (48 bits) and permutation P (32 bits) requires (48+32)×3=240 cycles, plus at least three instructions per look-up in the eight selection functions (S1-S8), which requires an additional 24 cycles, for a total of 264 cycles to process one cipher function. In DES there are sixteen f functions to be performed, i.e. 16×264=4,224 [cycles/bit packet], and in triple DES (3DES) there are forty-eight, i.e. 48×264=12,672 [cycles/bit packet]. A 10 megabit/second data stream coming over the internet or from another data source results in 10×10⁶/64=156,250 [bit packets/second]. Thus, 4,224×156,250=660 [Mips] are required for DES, and 12,672×156,250=1,980 [Mips] for 3DES. For faster data input systems, e.g., modems at 40 megabit/second, 3DES requires 7,920 [Mips], all well beyond current processor capabilities.
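The cycle estimate above can be reproduced with a short sketch, assuming one cycle per instruction. The function names and the permutation-table convention (entry i gives the source bit position of output bit i, with bit 0 the least significant) are assumptions for illustration and do not follow the DES table numbering.

```python
INSTR_PER_BIT = 3        # extract (AND), shift, deposit (OR)
INSTR_PER_LOOKUP = 3     # minimum assumed per selection-function look-up
E_BITS, P_BITS = 48, 32  # output widths of permutations E and P
S_LOOKUPS = 8            # selection functions S1-S8


def permute(value, table):
    """Permute bits one at a time, counting instructions used."""
    out = 0
    instructions = 0
    for dst, src in enumerate(table):
        bit = value & (1 << src)          # extract the bit (AND)
        if dst >= src:                    # shift it to its new position
            bit <<= dst - src
        else:
            bit >>= src - dst
        out |= bit                        # deposit it (OR)
        instructions += INSTR_PER_BIT
    return out, instructions


def cycles_per_f():
    """Cycles for one cipher function f: E, P and eight look-ups."""
    return (E_BITS + P_BITS) * INSTR_PER_BIT + S_LOOKUPS * INSTR_PER_LOOKUP


def required_mips(bits_per_second, f_count, block_bits=64):
    """Processing load in Mips for a given data rate and number of f rounds."""
    blocks_per_second = bits_per_second / block_bits
    return f_count * cycles_per_f() * blocks_per_second / 1e6


if __name__ == "__main__":
    print(cycles_per_f())              # cycles for one f function
    print(required_mips(10e6, 16))     # DES at 10 megabit/second
    print(required_mips(10e6, 48))     # 3DES at 10 megabit/second
    print(required_mips(40e6, 48))     # 3DES at 40 megabit/second
```

Under these assumptions the load scales linearly with both the data rate and the number of f functions, which is why moving from DES at 10 megabit/second to 3DES at 40 megabit/second multiplies the requirement by twelve.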