The present invention relates to an apparatus for efficiently and securely transferring blocks of program information between a secure circuit and an external storage device. The program information is communicated in block chains for more robust encryption, execution obfuscation, and to reduce authentication data overhead.
In one embodiment, the program information is encrypted and optionally authenticated in cipher block chains.
In another embodiment, the program information is authenticated and optionally encrypted in block chains. Block chains greatly reduce authentication data overhead. Address scrambling may be used for heightened security.
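By way of illustration only, the overhead reduction from authenticating a whole chain rather than each block can be sketched as follows. The use of HMAC-SHA256 truncated to seven bytes, and the names `chain_tag` and `verify_chain`, are assumptions made for this sketch and are not taken from the embodiments described herein.

```python
import hashlib
import hmac

TAG_BYTES = 7  # assumed tag size (56 bits), chosen only for illustration


def chain_tag(key, blocks):
    """Compute one authentication tag covering every block of a chain."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for block in blocks:
        mac.update(block)
    return mac.digest()[:TAG_BYTES]


def verify_chain(key, blocks, tag):
    """Return True only if no block of the chain has been altered."""
    return hmac.compare_digest(chain_tag(key, blocks), tag)


# Hypothetical chain of sixteen 8-byte blocks protected by a single tag.
key = b"hypothetical-chain-key"
blocks = [bytes([i]) * 8 for i in range(16)]
tag = chain_tag(key, blocks)
```

In this sketch a single seven-byte tag accompanies 128 bytes of program information, and altering any one block causes verification of the entire chain to fail.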
Re-ordering of fields such as blocks or bytes within each chain, as well as among entire chains, may further be used to provide even more security.
In another embodiment, blocks of program information are provided to the secure circuit to generate a key. The key may be used to decrypt a data transmission.
The invention is particularly suitable for deterring the copying and reverse engineering of proprietary software algorithms, and for securing cryptographic applications such as the descrambling of pay television programs or the like.
The following definitions are provided:
Secure Circuit:
A secure circuit may be a cryptographic integrated circuit (IC) in which no one, not even the owner, has access to the internal buses, registers, and other circuitry contained within the IC. The IC may hold sensitive key, identification, and other data. However, the secure circuit does not have to be defined by the perimeter of an IC. It could be a Personal Computer (PC), for instance, in a network computer executing a program from a shared storage device accessed over a network. The network computer could be accessing a server for running applications real-time. Portions of the applications are communicated piece-meal to the network computers. The network can allow multiple computers to access the same application at the same time. With a PC, the owner might have access to the decrypted and/or authenticated and/or re-ordered program information received. Moreover, a secure circuit may process unencrypted but authenticated data.
Storage Device:
A storage device is a discrete memory component, such as an IC, of various types. However, as in the PC example described above, the storage device could be a mass storage device such as a hard disk drive located locally or remotely. If remotely located, data could be communicated between that storage device and the secure circuit over an Ethernet-like network, or for example, according to the IEEE 1394 standard. Local access to the mass storage device, for example, may be over the PC's ISA, VESA, or PCI data bus or it could even be through a SCSI, serial, or parallel interface. The mass storage device may be accessed by other network computers, or secure circuits. The storage device could also be a Jazz™ drive, tape, CD-ROM, DVD, Personal Computer Memory Card International Association (PCMCIA) card, smart card, or any other type of mass storage device.
It is possible, for instance, in the case of the network computer, that program information that is read-only is accessed over the network. A local storage device, e.g., memory, that allows read/write capability may be used that is secure for external storage purposes. Therefore, the storage device may be any combination of device types. And, in the case of a networked storage device, the program information may be copied piece-meal to a faster local memory which may be synchronous dynamic memory.
Program Information:
Program information refers generically to any information that is used by the secure circuit in the execution of a program. This may include instructions such as operational codes (op-codes) in machine code, or pseudo code or interpreted code, such as Java™. It may include look-up tables, stored keys, and various temporary data such as intermediate calculations and the state of the secure circuit.
It may even include some or all of the initialization vectors and keys used to encrypt/decrypt or verify/authenticate the rest of the program information in block chains. This can allow the same vector or key information to be encrypted under different keys so that different secure circuits individually or as select groups may gain access to the same program information, and have derived or been delivered different keys.
The information could include key information and data having to do with the nature of how the bytes of a block, blocks of a chain, and chains are stored in the storage device. This might include the order permutation information of the various fields of a chain or chain sequences described in more detail later.
Hash:
Hash does not strictly denote a one-way function. Although a strict one-way function is a possibility, the function may be reversible under a secret key, or a trap-door one-way function, or be a very simple function such as an XOR operation.
Data Transmission and Cryptographic Processing:
Data transmission is used for text, messages, video, and audio signals of all types. These include but are not limited to text, messages, video, and audio from broadcast and interactive television and radio, program guides, news services, and interactive message traffic over communication channels. The scrambled data transmission may be sent various ways, e.g., via a broadcast, satellite, cable, telephone, or other link, or from a removable mass storage medium such as a Digital Video Disk, tape, Compact Disk (CD), floppy-disk, or other secure circuit, and received by a descrambling receiver, e.g., a decoder such as a set-top box, player or a personal computer in a consumer's home.
The data transmission could simply be a response to a challenge. The challenge causes the secure circuit to transform the challenge information with some type of cryptographic processing to create an output that verifies that the secure circuit indeed holds certain secret or private keys.
Internal registers in the secure circuit may be incremented or decremented. These values may be computed along with the secret or private keys to calculate the value to output. Such challenge and response techniques are typically used to authenticate the presence of a valid secure circuit before a service is granted.
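For illustration, a challenge and response exchange of the kind described above might be sketched as follows. HMAC-SHA256, the 8-byte counter encoding, and the names `respond` and `verify` are assumptions of this sketch, not a statement of how any particular secure circuit computes its response.

```python
import hashlib
import hmac


def respond(secret_key, challenge, counter):
    """Transform the challenge using the secret key and an internal counter."""
    message = counter.to_bytes(8, "big") + challenge
    return hmac.new(secret_key, message, hashlib.sha256).digest()


def verify(secret_key, challenge, counter, response):
    """Host-side check that the responder holds the expected secret key."""
    expected = respond(secret_key, challenge, counter)
    return hmac.compare_digest(expected, response)


# Hypothetical values; the counter stands in for an internal register
# that is incremented for each challenge.
key = b"hypothetical-secret-key"
challenge = b"\x01\x02\x03\x04"
response = respond(key, challenge, counter=1)
```

Because the counter is mixed into the response, a response captured for one counter value does not verify against a later counter value in this sketch.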
Cryptographic Processing:
This is processing performed by a secure circuit which typically results in the generation of a key. The key may then be used for many things: scrambling and descrambling a data transmission, identity verification by a client or host, etc. The key does not have to always be self contained within the secure circuit. For example, it may be sent out of the secure circuit for verification reasons.
Various problems with prior art schemes are now addressed.
Problem: Various Proprietary Algorithms can be Stolen
Software painstakingly developed at great expense may be trivially copied from external storage devices. The problem is exacerbated by open networks such as the Internet which can allow rapid and far flung distribution of the pirated code.
With the increasing speeds of general purpose processor chips, there is a trend to perform in software many processing tasks that were once done in hardware. The software is communicated through the use of discrete memory components and/or storage devices including mass storage devices. This can allow for quick reconfiguration of the processing system for different applications by simply executing different software. But that trend is hampered by the fact that the software can be easily copied, disassembled, reverse-engineered, and subsequently distributed, thereby depriving the developer and/or inventor of the benefit of this intellectual property.
Also, with increasing speed and reliability of networks, e.g. Ethernet going from 10 megabits per second, to 100 megabits per second and so on, it is realistic to implement systems whereby software can be executed real-time over a network. So-called network computers would always be accessing the latest revision of an application loaded on a network based server. Any application in the archives of this server could be accessed quickly. But such servers may be susceptible to someone downloading and storing the entire application, thereby depriving the service provider of on-going revenue. Once downloaded, the software could be easily shared with others.
It would therefore be desirable to make software analysis and reverse engineering, as well as software copying and re-use by general purpose processors more difficult.
Problem: Cryptographic Key Generator
Cryptographic applications typically involve the generation/derivation of a key based on secret or private key information.
A typical cryptographic key generator performs cryptographic processing on data transmissions. Scrambling of data transmissions has become increasingly important due to the need to deter unauthorized persons (e.g., pirates) from gaining access to data transmissions. No matter how the data is transmitted or delivered, the cryptographic processing is present to ensure that providers of the data, e.g., the scrambling senders, get paid for the intellectual property they are transmitting. In the case of a communications network, messages may be scrambled to ensure the privacy of messages, and to authenticate both the sender and recipient. It can allow for non-repudiation, to prevent a recipient from later claiming that they did not order the data. Non-repudiation is important to providers because it gives them a higher expectation of getting paid. No one other than the bona fide buyer has the cryptographic keys necessary to authenticate messages. The data transmission is cryptographically processed, e.g., scrambled, prior to transmission under one or more secret scrambling keys. The cryptographically processed data transmission is received by a cryptographic de-processor (descrambling receiver) such as a set-top box, media player, or a personal computer in a consumer's home.
Typically, cryptographic processing such as that done by a descrambling receiver is performed in a secure circuit. The secure circuit is provided with the required keys at the time of manufacture or application installation and initialization, and performs a type of processing to grant access to the data transmission. If access is allowed, then the decryption key is derived. When the decryption key is used in conjunction with an associated hardware or software decryption module, the data transmission is descrambled, e.g., made viewable or otherwise suitable for the user.
The descrambling hardware or software may be included in a secure circuit such as an application-specific IC (ASIC).
Likewise, the scrambling sender, e.g., a PC in someone's home scrambling information such as credit card numbers for delivery to a merchant over the Internet, uses the required keys loaded at the time of manufacture or application installation and initialization, to derive a key to scramble the sensitive data for transmission.
In the PC example, the scrambling can be done in a software module, but the scrambling may not actually take place in what is considered the secure circuit. The key derived in either case (for scrambling and descrambling) may be output from the secure circuit to the hardware or software scrambling/descrambling module, or it may hold the key internally to the secure circuit--with the decryption module internal to the secure circuit. Preferably, the key is held and the scrambling/descrambling is performed internally to the secure circuit.
If the key is output from the secure circuit, it can be changed very quickly, even several times a second, thereby making knowledge of it useful for only a short time. The scrambling/descrambling hardware or software module may be located remotely from the secure circuit which derived the key to scramble/descramble the data transmission.
For a PC executing instructions over a network, the secure circuit may be the PC itself, and the descrambling unit could simply be a software module that receives a length and pointer to, for example, a message in internal or external memory, along with the appropriate key, and cryptographic function identifier.
The function performed by the cryptographic processing in the secure circuit could entail message hashing, signing, and signature authentication using publicly known hashing algorithms and public key cryptography.
In both the ASIC case and the PC case above, a microprocessor is typically used for implementing access control, performing hashing, signature verification, signing and authentication functions. This processing verifies that the secure circuit is indeed authorized to decrypt the data transmission. If authorized, the microprocessor then derives the descrambling key for the data transmission. The secure circuit typically has an internal storage device, e.g., memory, for storing descrambling program information for use by the microprocessor, storage for storing the descrambling key data and state of the decoder, and a scratch-pad memory for storing intermediate calculations and temporary data. The state of the descrambling receiver, e.g., decoder, may indicate, for example, whether the decoder is tuned to a particular channel and the channel identifier. The state of the descrambling receiver may also store whether it is authorized to receive the channel, and whether a program tuned, for example, is subscription, pay-per-view, or video-on-demand.
It would therefore be desirable to make pirate attacks against cryptographic key generators executing with external memory more difficult.
Problem: Inflexibility of Using Internal ROM, and RAM Capacity Issues
For an ASIC, the internal memory used by the IC to store program information may be created from read-only memory (ROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), Flash memory, or a battery-backed random access memory. Typically, the foundry processes for manufacturing ASICs with the smallest geometries and fastest circuits are developed and characterized for ROM- and RAM-based technology initially. EEPROM and Flash capability come at a later time. Therefore, a performance advantage over other technologies may be obtained by designing the ASIC to use ROM- and RAM-based technology. Also, it is easier for VLSI foundries to build devices with ROM and RAM than with EEPROM and Flash because of their simpler design. Therefore, the designer may realize a lower manufacturing cost with ROM- and RAM-based designs.
Creating an internal memory entirely out of battery-backed RAM is generally impractical because a RAM cell, with its ability to allow reading and writing of data, contains many more gates and is typically a much larger structure than a ROM cell, which only allows reading of data. Therefore, such a RAM memory stores far less programming information than a ROM memory of equal physical size.
However, there are drawbacks to storing the programming information in internal ROM since the entire ASIC must be replaced to change the program information. This may be necessary or desirable, for example, to fix a software problem (e.g., bug), or to provide new or customized features for different customers. To achieve this, a new chip must be manufactured with the change in program information. This can be very costly and time-consuming.
Also, no matter how much storage of any type is built into the secure circuit, e.g., an ASIC, it may be too much or too little for any given application. If the storage is larger than required, the price of the secure circuit is higher than necessary. If the storage is smaller than required, then it is either inadequate for the task, or features must be omitted to make the software fit. Rarely is the size of the storage just right.
Accordingly, it would be desirable to provide a scheme for modifying the capacity of a storage device, e.g., the amount of memory, and for easily and inexpensively updating the program information of a secure circuit such as a cryptographic chip. The system should store the program information in a storage device which is external to the secure circuit and provide for efficient and secure transfer of the program information between the storage device and the secure circuit. The transfer of program information should be fast enough, even over a network, to meet code execution requirements. Moreover, the amount of internal storage, e.g., memory, required to make the secure circuit operate should be limited. The system may use a limited amount of quickly accessible internal program information which could boot the secure circuit, monitor error conditions, interpret pseudo-code, or handle real-time processing events. However, this internal program information, if stored in an inflexible form, e.g., ROM or read-only CD-ROM, cannot be changed as easily as externally stored program information.
Problem: Securing External Storage--Authentication Overhead
In the past, various encryption techniques have been used on bytes and blocks. But pirates have employed a variety of "attacks" to break the security of the system. One attack attempts to get the secure circuit to read the encrypted memory and write it out to a clear area where the program information may be captured and then analyzed. An attack of this type actually employs the decryption circuitry itself to decrypt the program information precluding the need to do more extensive analysis.
Another attack tries to break the security of the application itself, by changing the execution of the application in order to make the secure circuit, in this case, in the descrambling receiver, descramble premium services without paying the appropriate subscription fees. To accomplish these and other attacks, the pirates attempt to modify the contents of the external storage device, e.g., memory. And to accomplish this, one technique used is "trialing," where program information in the external storage device is manipulated in a trial and error approach. The pirate does not know which secret key or keys were used to encrypt the program information, but attempts to manipulate the program information in the external storage device until a useful outcome is obtained.
To prevent these and other attacks from being successful, either authentication, stronger encryption, re-ordering of chain fields, or any combination of the above, may be used.
Authentication may be used to verify the origin of the program information. In a system using authentication, the secure circuit will not process program information which is not accompanied by the correct authentication information. Strong prior art authentication is expensive. However, the amount of authentication information must be sufficiently large to provide an adequate level of security. In conventional memory encryption schemes using byte encryption or block encryption, authentication information would be needed with each byte or block which the chip fetches from the external storage device. For a single byte of program information, several bytes of authentication information would be needed to prevent trialing. In other words, the byte would need to be widened to include the additional authentication information. If an eight bit byte of program information were widened to include only 8 additional bits of authentication information, the authentication information could easily be determined by trialing since, with eight bits per byte, there are only 2^8 = 256 possible trialing combinations. To provide a security level comparable to the Data Encryption Standard (DES), 56 bits (seven bytes) might be used to provide 2^56 ≈ 7.2 × 10^16 possible combinations of authentication information. The authentication information would thus represent (7/(1+7)) or 87.5% of the overall storage. This amount of overhead data is very inefficient.
With block encryption, several bytes of data are grouped and authenticated in a block. For example, a block size of 8 data bytes may be used. Then, with seven bytes of authentication information, the overhead is still very high at (7/(7+8)) or 47% of the overall storage. This excessive overhead data can severely affect the cost of the overall system by requiring a significantly larger storage device just to handle the authentication information. This is unacceptable with consumer electronic devices such as hand held games, cellular phones, and television decoders which must be manufactured at the lowest possible cost. In particular, the cost of the storage devices is usually a significant limiting factor. Thus, the amount of authentication information overhead is unacceptably large with existing data authentication schemes.
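The overhead figures above follow from a single ratio, which can be checked with a short calculation. The function name `auth_overhead` and the chain of sixteen 8-byte blocks are assumptions introduced here purely for comparison; the seven-byte authentication field is the 56-bit figure from the text.

```python
def auth_overhead(data_bytes, auth_bytes):
    """Fraction of total storage consumed by authentication information."""
    return auth_bytes / (data_bytes + auth_bytes)


AUTH_BYTES = 7  # 56 bits of authentication information, as in the text

byte_case = auth_overhead(1, AUTH_BYTES)        # one data byte per tag
block_case = auth_overhead(8, AUTH_BYTES)       # one 8-byte block per tag
chain_case = auth_overhead(16 * 8, AUTH_BYTES)  # assumed chain of 16 blocks

trialing_space = 2 ** (8 * AUTH_BYTES)  # number of possible 56-bit tags
```

Under these assumptions the per-byte case gives 87.5%, the per-block case about 46.7%, and the hypothetical 16-block chain about 5.2%, which shows how amortizing one tag over more data shrinks the overhead.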
Accordingly, it would be desirable to have a system which minimizes the amount of authentication information (e.g., check bits) which is required to securely communicate program information.
Problem: Encryption of Program Information Inadequate
A trialing attack on a single encrypted byte of program information is trivial to perform. Assuming an 8 bit byte again, this requires the trialing of only 2^8 = 256 possibilities for the program information to obtain an exact result. For some pirate attacks, however, the goal is simply the ability to change program information to something different. In this example, then, the mere ability to trial a single byte value without influencing other bytes would result in a successful pirate attack.
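The small size of this search space can be illustrated with a toy model. The keyed byte substitution below is not a real cipher, and the names `make_byte_cipher` and `trial_attack` are hypothetical; the decryption callback merely stands in for a pirate observing how the device behaves after each trial value is written to storage.

```python
import random


def make_byte_cipher(key):
    """Build a toy keyed byte substitution and its inverse (NOT secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    inverse = [0] * 256
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    return table, inverse


def trial_attack(device_decrypt, wanted_plain):
    """Try every stored byte value; return (stored value, trials used)."""
    for trials, candidate in enumerate(range(256), start=1):
        if device_decrypt(candidate) == wanted_plain:
            return candidate, trials
    raise AssertionError("unreachable for a bijective byte substitution")


# The pirate never learns the key, yet finds a stored value that
# decrypts to the wanted byte in at most 256 trials.
_, inverse = make_byte_cipher(key=0xC0FFEE)
stored, trials = trial_attack(lambda c: inverse[c], wanted_plain=0x90)
```

Because each byte is processed independently in this model, the search for one location never disturbs any other byte, which is the weakness the paragraph above describes.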
A trialing attack on a single encrypted block of program information is somewhat more difficult but still manageable. Large general purpose Reduced Instruction Set Computing (RISC) processors, for example, have instructions that are 64 bits long. Assuming an 8 byte block and 8 bits per byte, it is relatively easy for a pirate to alter a block of program information and affect only one instruction.
Even with instruction widths half that size, e.g., 32 bits, only two instructions are affected. So-called Complex Instruction Set Computing (CISC) processors are equally at risk of attack. Moreover, CISC processors described as "8 bit processors" do not really have 8 bit instructions, because they typically require the fetching of one, two, or three operands of program information, which gives any instruction between 8 and 32 bits, with an average of about 20 bits, depending on the choice of instructions used by the program. Therefore, trialing an 8 byte block of encrypted values for so-called "8 bit" instructions might affect only three instructions.
Accordingly, it would be desirable to have a more robust encryption algorithm to securely communicate program information.
Problem: Execution, Even Encrypted, is Observable
Even though blocks of program information may be encrypted or authenticated, someone observing the traffic of data on a communications means, e.g. bus or network, can learn about the function and design of the program information. The more information that a pirate might learn about the program information, the more ways that he might have to alter program execution. An internal storage circuit such as a cache may obfuscate some of the function and design by referencing data that was either only decrypted, decrypted and authenticated, or simply authenticated from the internal storage circuit, rather than have to fetch the program information again externally.
A problem arises, however, because the original communication sequence, that which loaded the program information into the cache in the first place, may be observed. A system without a cache is even easier to analyze because recursive code, e.g., loops, can be seen on the external interface. It would be easy to see the same encrypted, encrypted and authenticated, or simply authenticated program information being communicated over and over again. A cache will blind this operation by making the communication internal to the cache and not visible on the communication means. However, a clever pirate might notice that no external communication was occurring and conclude that therefore some sort of internal operation was occurring. In principle, it is not desirable to have a pirate learn anything about the algorithm being executed. This includes the overall structure such as byte to block, block to chain or chain to program information sequence association, sequence of processing such as always executing particular program information on boot-up, and the organization of the program information such as data table organization.
It would therefore be desirable to have techniques for obfuscating the execution of encrypted, authenticated, or any chain of program information. It would be desirable to communicate the program information in a manner which is out-of-sequence from the true execution sequence by the secure circuit. The sequence may be obfuscated within a block, chain, or program information sequence.
That is, it would be desirable to obfuscate the sequencing of the bytes that make up a block, the blocks that make up a chain, and the chains that make up a program information sequence. The sequence permutation may be fixed and yet be different on a byte by byte, block by block, chain by chain, or program information sequence basis. It would be desirable to spread the sequence obfuscation to be of greater depth, that is, greater than a block, for instance, over two blocks or for that matter an entire chain. The same would be desirable for all of the other fields.
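As a rough illustration of re-ordering the blocks of a chain, the following sketch stores blocks in a permuted order derived from a key and a chain address, and restores the execution order on loading. The seeding of Python's `random` module from a SHA-256 digest is an assumption made for brevity and is not a cryptographically sound permutation generator; the function names are likewise hypothetical.

```python
import hashlib
import random


def permutation(key, chain_address, n):
    """Derive a repeatable ordering of n positions from key and address."""
    digest = hashlib.sha256(key + chain_address.to_bytes(4, "big")).digest()
    order = list(range(n))
    random.Random(int.from_bytes(digest, "big")).shuffle(order)
    return order


def store(blocks, key, chain_address):
    """Return the blocks in the out-of-sequence order used for storage."""
    order = permutation(key, chain_address, len(blocks))
    return [blocks[i] for i in order]


def load(stored, key, chain_address):
    """Undo the permutation to recover the true execution sequence."""
    order = permutation(key, chain_address, len(stored))
    blocks = [None] * len(stored)
    for position, original_index in enumerate(order):
        blocks[original_index] = stored[position]
    return blocks
```

The same approach could in principle be applied to the bytes within a block or to whole chains within a program information sequence by changing what the list elements represent.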
Problem: Sequence Permutation Algorithm may be Discovered
Any sequence permutation algorithm implemented in hardware may be discoverable by a pirate probing the VLSI or other analysis. The permutation function may be keyed and be both address and unit dependent. However, this does not preclude a determined pirate from discovering what the key and dependencies are.
It would therefore also be desirable to have a way of making analysis and reverse engineering of the sequence permutation more difficult.
Problem: Underlying Sequence Does not Change--Address Location Always the Same
Even with the sequence permutation, a pirate may observe every communication between the storage device and know which bytes belong to which blocks, and which blocks belong to which chains. That is, a particular address location in the storage device is associated with a particular byte, block, or chain sequence. The address location will always contain the same information. The pirate may not know what the exact positional information is because of the sequence obfuscation, but he knows that its association with the other bytes, blocks or chains is fixed. The pirate does not need to know what the value of the program information stored at a particular location is. The pirate can trial a value at that storage location. The pirate can do this systematically going through all values even though the storage location is accessed at varying times due to the sequence permutation techniques.
It would therefore be desirable to have a scheme for dynamically changing the address location in the storage device where data representing a particular byte, block, or chain sequence is located, to prevent someone from systematically trialing code.
Problem: Every Communication is Pertinent
A pirate may observe every communication of program information between the storage device and know that it is encrypted, authenticated, sequence permuted or all of the above.
For additional obfuscation, it would be desirable to communicate "dummy" or not necessarily needed data with the program information communicated.
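One way to picture the use of dummy data is the sketch below, in which dummy blocks are interleaved at positions derived from a key and a chain identifier, and discarded again by the receiver. The position derivation, the function names, and the use of `os.urandom` for dummy content are all assumptions of this sketch rather than features drawn from the text.

```python
import hashlib
import os
import random


def dummy_positions(key, chain_id, total, n_dummy):
    """Derive which stream positions carry dummy blocks."""
    digest = hashlib.sha256(key + chain_id.to_bytes(4, "big")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return set(rng.sample(range(total), n_dummy))


def add_dummies(blocks, key, chain_id, n_dummy):
    """Interleave n_dummy random blocks among the real (non-empty) blocks."""
    total = len(blocks) + n_dummy
    dummies = dummy_positions(key, chain_id, total, n_dummy)
    real = iter(blocks)
    size = len(blocks[0])
    return [os.urandom(size) if i in dummies else next(real)
            for i in range(total)]


def strip_dummies(stream, key, chain_id, n_dummy):
    """Discard the dummy blocks, keeping real blocks in their order."""
    dummies = dummy_positions(key, chain_id, len(stream), n_dummy)
    return [b for i, b in enumerate(stream) if i not in dummies]
```

An observer of the communicated stream sees more blocks than are actually needed and, without the key, cannot tell from position alone which ones are pertinent.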
Problem: Bi-directional Write and Read Required
The storage device can be read-only, but there are many reasons why the storage device should also be write-able. Different cryptographic and non-cryptographic yet proprietary applications have varying requirements for data storage.
Modern cryptographic applications often employ public key cryptography, which generally requires larger keys than secret key cryptography. The scrambling sender or descrambling receiver may perform some type of cryptographic application which may interface on an open network such as the Internet, which may require the storing of a number of various public keys, e.g., from a Root Authority, or Certificate Authority. Also, with pay television decoders, there are public keys for the access control system and/or the decoder manufacturer. Over time, many more public keys may need to be stored as a result of interacting on the network. Some of these keys are meant to be long lived, and, for example, the public keys may be 2048 bits, 4096 bits, or larger. Consequently, a large capacity storage device, e.g., a large amount of read/write storage, may be required for storage of keys and other related information to effect a viable cryptographic application.
The same can be said for many proprietary applications. The trend is to process more and more data. It is desirable to have as much flexibility in the type and amount of storage for writing and later retrieving program information as there is for just reading program information.
Accordingly, it would be desirable to have a secure bi-directional communication between an external storage device and a secure circuit, where this has the flexibility to accommodate growing requirements for additional program information storage without requiring a design change of the secure circuit. Also the security of the overall implementation cannot be diminished.
Problem: Communication with Non-Secure Outside World and Alternative Security Modes
The secure circuit may have to interface with display devices, peripherals or computers which do not have a decryption means. This is important where interactivity with a human is involved. For example, if a customer enters a Personal Identification Number (PIN) code incorrectly, it may be necessary for the secure circuit to inform the customer of the problem so that the PIN may be re-entered. This may require communicating to the host device an error condition or an error message which may be displayed appropriately on a screen. There may be a shortage of pins, communication ports, or buses which may be dedicated to external communication.
The execution of some program information may have reduced execution latency requirements requiring an alternate communication mode other than by chains. Also, the secure circuit may need to inter-operate with other devices which have different security schemes.
It would also be desirable to provide a conditional clear mode whereby no encryption/decryption, authentication generation/verification, or sequence permutation of the program information is performed. This conditional clear mode would not only allow a possible chip debug facility, but also allow the secure circuit to interface, send and receive clear data, with the world at large, such as display devices, other computers, and the like, thereby allowing the communications means to be used for more than the conveyance of program information. This would reduce the number of separate pins, communication ports, and buses used for external communication.
It would also be desirable to switch off the chain encryption/decryption, authentication generation/verification, or sequence permutation of the program information, in favor of a different type of encryption/decryption, authentication/verification, or sequence permutation that is not based on chains. For example, instead of a chain, byte or block processing may be used.
Problem: Detection of Chain Lengths
A pirate may be able to analyze the execution of the program information to determine what program information belongs with a particular chain. That knowledge could allow a pirate to trial program information in a more selective fashion. In principle, it is a good idea to prevent a potential pirate from learning anything about how the program information is executing.
It would therefore be desirable to communicate blocks of program information with variable chain lengths in random sequence from one chain to the next with no particular consideration being given to the program information being executed.
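A minimal sketch of dividing program information into chains of varying length is given below. The function name `split_into_chains`, the length bounds, and the choice of random source are assumptions; a practical design would presumably draw lengths from an unpredictable source rather than a seeded generator.

```python
import random


def split_into_chains(blocks, min_len, max_len, rng):
    """Cut a block sequence into chains whose lengths vary at random.

    Chain boundaries ignore the meaning of the blocks. The final chain
    may be shorter than min_len if too few blocks remain.
    """
    chains = []
    start = 0
    while start < len(blocks):
        length = rng.randint(min_len, max_len)
        chains.append(blocks[start:start + length])
        start += length
    return chains


# Hypothetical program of 50 blocks, cut into chains of 2 to 8 blocks.
blocks = list(range(50))
chains = split_into_chains(blocks, 2, 8, random.Random(1))
```

Because the boundaries are chosen without regard to routine or table boundaries, observing where one chain ends reveals nothing in this sketch about the structure of the program information being executed.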
Problem: Different Latency Requirements
Real-time interrupt subroutines have different execution latency requirements than background or maintenance routines. There is a natural tendency for a designer to make shorter chains for all of the program information to simply handle the faster execution requirements of real-time interrupt subroutines. But reducing chain lengths for all of the program information may unnecessarily increase the storage capacity of the storage device to accommodate the increased amount of authentication information.
It would therefore be desirable to communicate blocks of program information and associated authentication information in block chains, where different chain lengths may be used for communicating different types of program information with different latency requirements. Routines placed in lower address locations could have lower latency, while those in a higher address location of a storage device could have higher latency requirements.
Problem: General Communication/Storage Latency Requirements
While certain routines may have special execution latency considerations, the latency may still be too much for certain applications. Consequently, means must be explored to allow for more efficient communication and storage of program information.
It would be desirable to design certain features into the architecture of the communication means, and secure circuit in order to help reduce program information latency to help speed up execution.
Problem: Authentication/Verification Latency Requirements
While certain routines may have special execution latency considerations, the latency due to authentication/verification may still be too much for certain applications. Consequently, means must be explored to allow for more efficient authentication/verification.
It would therefore be desirable to design certain features into the authentication/verification function to help reduce program information execution latency.
Problem: Encryption/Decryption Latency Requirements
While certain routines may have special execution latency considerations, the latency due to encryption/decryption may still be too much for certain applications. Consequently, means must be explored to allow for more efficient encryption/decryption.
It would therefore be desirable to design certain features into the encryption/decryption function to help reduce program information execution latency.
The present invention provides a system having the above and other advantages.