The rapid growth in Internet usage has increased dependency on the information that businesses and individuals store and communicate. In particular, growth in DSL and cable modem usage by consumers and businesses, together with increased business-to-business Internet activity, has contributed to this dependency. As the desire for confidentiality, authenticity, and integrity grows, an increasing proportion of this information is sent in secure or encrypted form. In addition, an increasing proportion of electronic communication occurs at ever higher speeds.
Secure communications are desirable for sensitive activities such as on-line financial transactions or the transmission of personal medical information, but can significantly increase processing demands at both ends of a communications session. This processing demand grows further as communication bandwidth improves and increases the volume of data requiring security processing. As the demand for secure Internet communication increases, security processing consumes an ever increasing proportion of the available central processing capability of communications network servers.
In secure Internet communication, for example, Internet Protocol (IP) communication servers encrypt, decrypt, sign and authenticate inbound and outbound data packets to accomplish typical IP communication. Cryptographic processors and other devices accomplish or share some of the cryptographic processing load such as the encrypting, decrypting and authenticating of data packets.
Many encryption algorithms utilize the ability to permute or mix incoming data by remapping the data to the output, thereby changing the data orientation. Used to scramble data, permutation takes an input data pattern of 1's and 0's, for example, and switches the position of those bits of information for output. The permutation can map a single input signal to a single output, multiple outputs or no outputs, and the output location(s) of this mapping can be any location(s) within the output data word.
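The mapping described above can be illustrated with a minimal software sketch. It is an illustration only, not an implementation taken from this document: the function name `permute` and the output-indexed `mapping` list are hypothetical choices, and a data word is modeled as a list of bits.

```python
from typing import List, Sequence


def permute(source: Sequence[int], mapping: Sequence[int]) -> List[int]:
    """Build an output word from a source word.

    mapping[k] is the index of the source bit routed to output position k
    (a hypothetical, output-indexed encoding chosen for this sketch).
    A source index may appear once, several times, or not at all.
    """
    return [source[src] for src in mapping]


source_word = [1, 0, 0, 1]      # bits at input positions 0..3

# Input 3 drives output 0 (single output), input 0 drives outputs 1 and 2
# (multiple outputs), input 2 drives output 3, and input 1 drives no output.
mapping = [3, 0, 0, 2]

output_word = permute(source_word, mapping)
print(output_word)
```

Indexing the mapping by output position is one possible design choice; it guarantees that every output is driven by exactly one input while still allowing an input to fan out to several outputs or to none.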
FIG. 1 depicts an example of a data mapping of a permutation. As shown in the Figure, a source data register 102 includes incoming input signals 104 to be permuted, and the permutation 106 maps the signals of the source data register 102 to different outputs 110 in the output data register 112. Any one of the input signals 104 can be mapped to any one or more, or none, of the outputs 110 on the output data register 112. For simplicity, numerals on the Figure are shown on only one of each component, e.g., inputs 104, outputs 110.
One conventional implementation utilizes a multiplexer architecture, i.e., an array of multiplexers, that allows any input bit location to map to any other output bit location as well as to multiple output bit locations. While direct, the multiplexer approach results in a routing-intensive implementation and shifts the requirement for flexibility and reconfiguration to the control logic that controls the multiplexers. Additionally, implementations using multiplexers grow quickly as the input source data width increases. As the size of the array grows to support a wider input source data width and more input bits, the number of signal lines and control lines increases. Furthermore, the capacitive and resistive loading on both input bit signal lines and control lines increases in order to support the larger interconnect structure of the multiplexer inputs and outputs. Increased capacitive and resistive loading reduces the overall speed of the multiplexer permuter, i.e., it lengthens the delay from source data to output data. The increase in the physical size of the multiplexer implementation consumes silicon area and complicates the control logic needed to implement the permutation. Routing of the various control lines and signal lines increases the complexity of routing the permutation block, increasing the overall silicon area consumed and thereby the overall cost of the implementation.
FIG. 2 depicts an exemplary conventional multiplexer. As shown on the Figure, the multiplexer has input lines A, B, C, and D, each of which inputs a single bit of data, and output lines W, X, Y, and Z, each of which outputs a single bit of data. It also has a control line for each input-to-output line pairing: A2W, A2X, A2Y, A2Z, B2W, B2X, B2Y, B2Z, C2W, C2X, C2Y, C2Z, D2W, D2X, D2Y, and D2Z. As can be seen, an increase in the number of input and output lines greatly increases the number of control lines. In this implementation, in addition to providing the control lines A2W-D2Z, control logic (not shown) is needed to govern the control lines to steer the data bits appropriately. One drawback is that a control line and associated control logic are needed for every possible combination of input and output lines that data can be steered to and from, a number that grows quickly as the input data word width grows. On FIG. 2, the addition of a single input line and output line increases the number of control lines by 9, from 16 to 25. The addition of yet another single input line and output line adds another 11 lines, for a total increase of 20 lines. Furthermore, changing the control logic to map the input bits to output bits becomes increasingly complicated.
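The control-line structure described for FIG. 2 can be modeled in a short sketch. It is illustrative only and assumes one control line per input-to-output pairing, named in the A2W style; the function names `control_line_names` and `steer`, the reading of an undriven output as 0, and the rejection of two inputs driving the same output are assumptions made for the sketch, not details of the Figure.

```python
from itertools import product


def control_line_names(inputs, outputs):
    """One control line per input-to-output pairing, named like A2W."""
    return [f"{i}2{o}" for i, o in product(inputs, outputs)]


def steer(data, asserted, inputs, outputs):
    """Drive each output from the input whose control line is asserted.

    data maps each input line name to its bit; asserted is the set of
    control lines held active. Undriven outputs read as 0 and conflicting
    drivers raise an error (both are assumptions of this sketch).
    """
    result = {}
    for o in outputs:
        drivers = [i for i in inputs if f"{i}2{o}" in asserted]
        if len(drivers) > 1:
            raise ValueError(f"output {o} driven by several inputs: {drivers}")
        result[o] = data[drivers[0]] if drivers else 0
    return result


inputs, outputs = "ABCD", "WXYZ"
lines = control_line_names(inputs, outputs)
print(len(lines), lines[0], lines[-1])

# Input A fans out to W and X, C goes to Y, B goes to Z, D is unused.
routed = steer({"A": 1, "B": 0, "C": 1, "D": 0},
               {"A2W", "A2X", "C2Y", "B2Z"}, inputs, outputs)
print(routed)

# Control-line count for n inputs and n outputs under this model is n * n.
counts = [n * n for n in (4, 5, 6)]
print(counts)
```

Under this one-line-per-pairing model the number of control lines grows with the product of the input and output counts, which is why each added input and output line costs more control lines than the one before it.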
Accordingly, there is a desire to avoid the complexity, size, inflexibility and reduced speed of conventional permutation implementations. Methods and systems are desired to avoid these and other related problems.