Various serial links for transmitting encrypted or non-encrypted data are well known. One conventional serial link, used primarily in consumer electronics (e.g., for high-speed transmission of video data from a set-top box to a television set) or for high-speed transmission of video data from a host processor (e.g., a personal computer) to a monitor, is known as a transition minimized differential signaling interface (“TMDS” link). The characteristics of a TMDS link include the following:
1. video data are encoded and then transmitted as encoded words (each 8-bit word of digital video data is converted to an encoded 10-bit word before transmission);                a. the encoding determines a set of “in-band” words and a set of “out-of-band” words (the encoder can generate only “in-band” words in response to video data, although it can generate “out-of-band” words in response to control or sync signals. Each in-band word is an encoded word resulting from encoding of one input video data word. All words transmitted over the link that are not in-band words are “out-of-band” words);        b. the encoding of video data is performed such that the in-band words are transition minimized (a sequence of in-band words has a reduced or minimized number of transitions);        c. the encoding of video data is performed such that the in-band words are DC balanced (the encoding prevents each transmitted voltage waveform that is employed to transmit a sequence of in-band words from deviating by more than a predetermined threshold value from a reference potential. Specifically, the tenth bit of each “in-band” word indicates whether eight of the other nine bits thereof have been inverted during the encoding process to correct for an imbalance between running counts of ones and zeroes in the stream of previously encoded data bits);        
2. the encoded video data and a video clock signal are transmitted as differential signals (the video clock and encoded video data are transmitted as differential signals over conductor pairs);
3. three conductor pairs are employed to transmit the encoded video, and a fourth conductor pair is employed to transmit the video clock signal; and
4. signal transmission occurs in one direction, from a transmitter (typically associated with a desktop or portable computer, or other host) to a receiver (typically an element of a monitor or other display device).
A use of the TMDS serial link is the “Digital Visual Interface” interface (“DVI” link) adopted by the Digital Display Working Group. It will be described with reference to FIG. 1. A DVI link can be implemented to include two TMDS links (which share a common conductor pair for transmitting a video clock signal) or one TMDS link, as well as additional control lines between the transmitter and receiver. The DVI link of FIG. 1 includes transmitter 1, receiver 3, and the following conductors between the transmitter and receiver: four conductor pairs (Channel 0, Channel 1, and Channel 2 for video data, and Channel C for a video clock signal), Display Data Channel (“DDC”) lines for bidirectional communication between the transmitter and a monitor associated with the receiver in accordance with the conventional Display Data Channel standard (the Video Electronics Standard Association's “Display Data Channel Standard,” Version 2, Rev. 0, dated Apr. 9, 1996), a Hot Plug Detect (HPD) line (on which the monitor transmits a signal that enables a processor associated with the transmitter to identify the monitor's presence), Analog lines (for transmitting analog video to the receiver), and Power lines (for providing DC power to the receiver and a monitor associated with the receiver). The Display Data Channel standard specifies a protocol for bidirectional communication between a transmitter and a monitor associated with a receiver, including transmission by the monitor of an Extended Display Identification (“EDID”) message that specifies various characteristics of the monitor, and transmission by the transmitter of control signals for the monitor. Transmitter 1 includes three identical encoder/serializer units (units 2, 4, and 6) and additional circuitry (not shown). Receiver 3 includes three identical recovery/decoder units (units 8, 10, and 12) and inter-channel alignment circuitry 14 connected as shown, and additional circuitry (not shown).
As shown in FIG. 1, circuit 2 encodes the data to be transmitted over Channel 0, and serializes the encoded bits. Similarly, circuit 4 encodes the data to be transmitted over Channel 1 (and serializes the encoded bits), and circuit 6 encodes the data to be transmitted over Channel 2 (and serializes the encoded bits). Each of circuits 2, 4, and 6 responds to a control signal (an active high binary control signal referred to as a “data enable” or “DE” signal) by selectively encoding either digital video words (in response to DE having a high value) or a control or synchronization signal pair (in response to DE having a low value). Each of encoders 2, 4, and 6 receives a different pair of control or synchronization signals: encoder 2 receives horizontal and vertical synchronization signals (HSYNC and VSYNC); encoder 4 receives control bits CTL0 and CTL1; and encoder 6 receives control bits CTL2 and CTL3. Thus, each of encoders 2, 4, and 6 generates in-band words indicative of video data (in response to DE having a high value), encoder 2 generates out-of-band words indicative of the values of HSYNC and VSYNC (in response to DE having a low value), encoder 4 generates out-of-band words indicative of the values of CTL0 and CTL1 (in response to DE having a low value), and encoder 6 generates out-of-band words indicative of the values of CTL2 and CTL3 (in response to DE having a low value). In response to DE having a low value, each of encoders 4 and 6 generates one of four specific out-of-band words indicative of the values 00, 01, 10, or 11, respectively, of control bits CTL0 and CTL1 (or CTL2 and CTL3).
It has been proposed to use the cryptographic protocol known as the “High-bandwidth Digital Content Protection” (“HDCP”) protocol to encrypt digital video to be transmitted over a DVI link and to decrypt the data at the DVI receiver. The HDCP protocol is described in the document “High-bandwidth Digital Content Protection System,” Revision 1.0, dated February 17, 2000, by Intel Corporation, and the document “High-bandwidth Digital Content Protection System Revision 1.0 Erratum,” dated Mar. 19, 2001, by Intel Corporation. The full text of both of these documents is incorporated herein by reference.
Another serial link is the proposed “High Definition Multimedia Interface” interface (“HDMI” link) being developed Silicon Image, Inc., Matsushita Electric, Royal Philips Electronics, Sony Corporation, Thomson Multimedia, Toshiba Corporation, and Hitachi.
A DVI (or HDMI-compliant) transmitter implementing the HDCP protocol asserts a stream of pseudo-randomly generated 24-bit words, known as cout[23:0], during each active period (i.e. when DE is high). In a DVI-compliant system, each active period is an active video period. In an HDMI-compliant system, each active period is a period in which video, audio, or other data are transmitted. Each 24-bit word of the cout data is “Exclusive Ored” (in logic circuitry in the transmitter) with a 24-bit word of RGB video data input to the transmitter, in order to encrypt the video data. The encrypted data are then encoded (according to the TMDS standard) for transmission. The same sequence of cout words is also generated in the receiver. After the encoded and encrypted data received at the receiver undergo TMDS decoding, the cout data are processed together with the decoded video in logic circuitry in order to decrypt the decoded data and recover the original input video data.
Before the transmitter begins to transmit HDCP encrypted, encoded video data, the transmitter and receiver communicate bidirectionally with each other to execute an authentication protocol (to verify that the receiver is authorized to receive protected content, and to establish shared secret values for use in encryption of input data and decryption of transmitted encrypted data). More specifically, each of the transmitter and the receiver is preprogrammed (e.g., at the factory) with a 40-bit word known as a key selection vector, and an array of forty 56-bit private keys. To initiate the first part of an authentication exchange between the transmitter and receiver, the transmitter asserts its key selection vector (known as “AKSV”), and a pseudo-randomly generated session value (“An”) to the receiver. In response, the receiver sends its key selection vector (known as “BKSV”) and a repeater bit (indicating whether the receiver is a repeater) to the transmitter, and the receiver also implements a predetermined algorithm using “AKSV” and the receiver's array of forty private keys to calculate a secret value (“Km”). In response to the value “BKSV” from the receiver, the transmitter implements the same algorithm using the value “BKSV” and the transmitter's array of forty private keys to calculate the same secret value (“Km”) as does the receiver.
Each of the transmitter and the receiver then uses the shared value “Km,” the session value “An,” and the repeater bit to calculate a shared secret value (the session key “Ks”), a value (“R0”) for use in determining whether the authentication is successful, and a value (“M0”) for use during a second part of the authentication exchange. The second part of the authentication exchange is performed only if the repeater bit indicates that the receiver is a repeater, to determine whether the status of one or more downstream devices coupled to the repeater requires revocation of the receiver's authentication.
After the first part of the authentication exchange, and (if the second part of the authentication exchange is performed) if the receiver's key selection vector is not revoked as a result of the second part of the authentication exchange, each of the transmitter and the receiver generates a 56-bit frame key Ki (for initiating the encryption or decrypting a frame of video data), an initialization value Mi, and a value Ri used for link integrity verification. The Ki, Mi, and Ri values are generated in response to a control signal (identified as “ctl3” in FIG. 2), which is received at the appropriate circuitry in the transmitter, and is also sent by the transmitter to the receiver, during each vertical blanking period, when DE is low. As shown in the timing diagram of FIG. 2, the control signal “ctl3” is a single high-going pulse. In response to the Ki, Mi, and Ri values, each of the transmitter and receiver generates a sequence of pseudo-randomly generated 24-bit words cout[23:0]. Each 24-bit word of the cout data generated by the transmitter is “Exclusive Ored” (in logic circuitry in the transmitter) with a 24-bit word of a frame of video data (to encrypt the video data). Each 24-bit word of the cout data generated by the receiver is “Exclusive Ored” (in logic circuitry in the receiver) with a 24-bit word of the first received frame of encrypted video data (to decrypt this encrypted video data). The 24-bit words cout[23:0] generated by the transmitter are content encryption keys (for encrypting a line of input video data), and the 24-bit words cout[23:0] generated by the receiver are content decryption keys (for decrypting a received and decoded line of encrypted video data).
During each horizontal blanking interval (in response to each falling edge of the data enable signal DE) following assertion of the control signal ctl3, the transmitter performs a rekeying operation and the receiver performs the same rekeying operation to change (in a predetermined manner) the cout data words to be asserted during the next active video period. This continues until the next vertical blanking period, when the control signal ctl3 is again asserted to cause each of the transmitter and the receiver to calculate a new set of Ki and Mi values (with the index “i” being incremented in response to each assertion of the control signal ctl3). The Ri value is updated once every 128 frames. Actual encryption of input video data or decryption of received, decoded video data (or encryption of input video, audio, or other data, or decryption of received, decoded video, audio, or other data, in the case of an HDMI-compliant system) is performed, using the cout data words generated in response to the latest set of Ks, Ki and Mi values, only when DE is high (not during vertical or horizontal blanking intervals).
Each of the transmitter and receiver includes an HDCP cipher circuit (sometimes referred to herein as an “HDCP cipher”) of the type shown in FIG. 3. The HDCP cipher includes linear feedback shift register (LFSR) module 80, block module 81 coupled to the output of LFSR module 80, and output module 82 coupled to an output of block module 81. LFSR module 80 is employed to re-key block module 81 in response to each assertion of an enable signal (the signal “ReKey” shown in FIG. 3), using the session key (Ks) and the current frame key (Ki). Block module 81 generates (and provides to module 80) the key Ks at the start of a session and generates (and applies to module 80) a new value of key Ki at the start of each frame of video data (in response to a rising edge of the control signal “ctl3,” which occurs in the first vertical blanking interval of a frame). The signal “ReKey” is asserted to the FIG. 3 circuit at each falling edge of the DE signal (i.e., at the start of each vertical and each horizontal blanking interval), and at the end of a brief initialization period (during which module 81 generates an updated value of the frame key Ki) after each rising edge of signal “ctl3.”
Module 80 consists of four linear feedback shift registers (having different lengths) and combining circuitry coupled to the shift registers and configured to assert a single output bit per clock interval to block module 81 during each of a fixed number of clock cycles (e.g., 56 cycles) commencing on each assertion of the signal “ReKey” when DE is low (i.e., in the horizontal blanking interval of each line of video data). This output bit stream is employed by block module 81 to re-key itself just prior to the start of transmission or reception of each line of video data.
Block module 81 comprises two halves, “Round Function K” and “Round Function B,” as shown in FIG. 4. Round Function K includes 28-bit registers Kx, Ky, and Kz, seven S-Boxes (each a 4 input bit by 4 output bit S-Box including a look-up table) collectively labeled “S-Box K” in FIG. 4, and linear transformation unit K, connected as shown. Round Function B includes 28-bit registers Bx, By, and Bz, seven S-Boxes (each a 4 input bit by 4 output bit S-Box including a look-up table) collectively labeled “S-Box B” in FIG. 4, and linear transformation unit B, connected as shown. Round Function K and Round Function B are similar in design, but Round Function K performs one round of a block cipher per clock cycle to assert a different pair of 28-bit round keys (Ky and Kz) each clock cycle in response to the output of LFSR module 80, and Round Function B performs one round of a block cipher per clock cycle, in response to each 28-bit round key Ky from Round Function K and the output of LFSR module 80, to assert a different pair of 28-bit round keys (By and Bz) each clock cycle. The transmitter generates value An at the start of the authentication protocol and the receiver responds to it during the authentication procedure. The value An is used to randomize the session key. Block module 81 operates in response to the authentication value (An), and the initialization value (Mi) which is updated by output module 82 at the start of each frame (at each rising edge of the control signal “ctl3”).
Each of linear transformation units K and B outputs 56 bits per clock cycle. These output bits are the combined outputs of eight diffusion networks in each transformation unit. Each diffusion network of linear transformation unit K produces seven output bits in response to seven of the current output bits of registers Ky and Kz. Each of four of the diffusion networks of linear transformation unit B produces seven output bits in response to seven of the current output bits of registers By, Bz, and Ky, and each of the four other diffusion networks of linear transformation unit B produces seven output bits in response to seven of the current output bits of registers By and Bz.
In Round Function K, one bit of register Ky takes its input from the bit stream asserted by module 80 when the ReKey signal is asserted. In Round Function B, one bit of register By takes its input from the bit stream asserted by module 80 when the ReKey signal is asserted.
Output module 82 performs a compression operation on the 28-bit keys (By, Bz, Ky and Kz) asserted to it (a total of 112 bits) by module 81 during each clock cycle, to generate one 24-bit block of pseudo-random bits cout[23:0] per clock cycle. Each of the 24 output bits of module 82 consists of the exclusive OR (“XOR”) of nine terms as follows: (B0*K0)+(B1*K1)+(B2*K2)+(B3*K3)+(B4*K4)+(B5*K5)+(B6*K6)+(B7)+(K7), where “*” denotes a logical AND operation and “+” denotes a logical XOR operation.
In the transmitter, logic circuitry 83 (shown in FIG. 3) receives each 24-bit word of cout data and each input 24-bit RGB video data word, and performs a bitwise XOR operation thereon in order to encrypt the video data, thereby generating a word of the “data_encrypted” data indicated in FIG. 3. Typically, the encrypted data subsequently undergoes TMDS encoding before it is transmitted to a receiver. In the receiver, logic circuitry 83 (shown in FIG. 3) receives each 24-bit block of cout data and each recovered 24-bit RGB video data word (after the recovered data has undergone TMDS decoding), and performs a bitwise XOR operation thereon in order to decrypt the recovered video data.
Throughout the specification the expression “TMDS-like link” will be used to denote a serial link capable of transmitting encoded data (e.g., encoded digital video data) and a clock for the encoded data, from a transmitter to a receiver, and also capable of transmitting (bidirectionally or unidirectionally) one or more additional signals (e.g., encoded digital audio data or other encoded data) between the transmitter and receiver, that is or includes either a TMDS link or a link having some but not all of the characteristics of a TMDS link. Examples of TMDS-like links include links that differ from TMDS links only by encoding data as N-bit code words (where N 10, and thus are not 10-bit TMDS code words) and links that differ from TMDS links only by transmitting encoded video over more than three or less than three conductor pairs. Some TMDS-like links encode input video data (and other data) to be transmitted into encoded words comprising more bits than the incoming data using a coding algorithm other than the specific algorithm used in a TMDS link, and transmit the encoded video data as in-band characters and the other encoded data as out-of-band characters (HDMI-compliant systems encode audio data for transmission according to an encoding scheme that differs from the encoding scheme employed for video data). The characters need not be classified as in-band or out-of-band characters based according to whether they satisfy transition minimization and DC balance criteria. Rather, other classification criteria could be used. An example of an encoding algorithm, other than that used in a TMDS link but which could be used in a TMDS-like link, is IBM 8b10b coding. The classification (between in-band and out-of-band characters) need not be based on just a high or low number of transitions. For example, the number of transitions of each of the in-band and out-of-band characters could (in some embodiments) be in a single range (e.g., a middle range defined by a minimum and a maximum number of transitions).
The expression “cryptographic device” is used herein to denote a device that includes a cipher engine, where the cipher engine is configured to use at least one key to encrypt content or to decrypt encrypted content. Typical cryptographic devices that embody the invention are transmitters and receivers. The term “transmitter” is used herein in a broad sense to denote any unit capable of transmitting data over a link and optionally also encoding and/or encrypting the data to be transmitted. The term “receiver” is used herein in a broad sense to denote any unit capable of receiving data that has been transmitted over a link (and optionally also decoding and/or decrypting the received data). Unless otherwise specified, a link can but need not be a TMDS-like link or other serial link. The term transmitter can denote a transceiver that performs the functions of a receiver as well as the functions of a transmitter.
The expression “content key” is used herein to denote data (sometimes referred to herein as a “content encryption” key) that can be used by a cryptographic device to encrypt content (e.g., video, audio, or other content), or to denote data (sometimes referred to herein as a “content decryption” key) that can be used by a cryptographic device to decrypt encrypted content.
The term “key” is used herein to denote a content key, or data that can be used by a cryptographic device to generate or otherwise obtain (in accordance with a content protection protocol) a content key. The expressions “key” and “key data” are used interchangeably herein. The term “key set” denotes one key or more than one key.
The term “stream” of data, as used herein, denotes that all the data are of the same type and are transmitted with the same clock frequency. The term “channel,” as used herein, refers to that portion of a link that is employed to transmit data (e.g., a particular conductor or conductor pair between the transmitter and receiver over which the data are transmitted, and specific circuitry within the transmitter and/or receiver used for transmitting and/or recovery of the data) and to the technique employed to transmit the data over the link.
Until the present invention, cryptographic devices that implement the conventional HDCP protocol (and other cryptographic devices that use “unique” key sets, in the sense that each device uses a key set that is used by no other device, or only by a small number of devices) have been difficult to test. The results of successfully testing each such device (e.g., by encrypting or decrypting known data using the device's key set and/or performing an authentication exchange with the device) will vary from device to device. Also, cryptographic devices that use unique key sets are often implemented as integrated circuits in which the key sets are stored in non-volatile memory (which adds greatly to the cost of manufacturing the devices). Another difficulty in testing an integrated circuit implementation of a cryptographic device (of the type that uses a unique key set) arises because some (or some aspects) of the device's internal states must be kept confidential to avoid releasing cryptographically sensitive information. This limits visibility and thus testability.
Also until the present invention, authentication exchanges between cryptographic devices that implement the conventional HDCP protocol (and other cryptographic devices that use unique key sets) had been time-consuming or had required expensive random access memory (RAM) to store a key set. In a typical integrated circuit implementation of such a cryptographic device, a key set (e.g., a key selection vector to be sent to another device during an authentication exchange, and a private key array to be employed with a key selection vector received from another device to calculate a secret value “Km”) is stored in non-volatile semiconductor memory (e.g., EEPROM or flash memory). During each authentication exchange, the stored key set must be read from the non-volatile memory and stored in RAM within the chip, or the key set read from the non-volatile memory must be employed to perform the necessary calculations on the fly (without storing the key set in RAM). When the key set read from the non-volatile memory was stored in RAM, a large RAM had typically been required which added significantly to the cost and complexity of the device. When the key set read from non-volatile memory was used to perform the calculations on the fly, a read cycle (which is typically very slow) had to be performed every time an authentication was performed (to read the key set from the non-volatile memory). One of the most time-consuming steps of the latter type of conventionally implemented authentication exchange had been the step of reading the required key set from non-volatile semiconductor memory.
It has been proposed to implement a cryptographic device to reduce the time required for HDCP authentication exchanges by reading the required key set from non-volatile semiconductor memory just once (e.g., when the device performs its first authentication exchange) and caching the key set in RAM. Subsequent authentication exchanges employ the cached key set instead of retrieving the key set from non-volatile memory. However, such a technique has the disadvantage of allocating a significant amount of RAM for use only as a key set cache memory, thus reducing the amount of RAM available for other purposes or requiring use of RAM having significantly larger capacity than would otherwise be needed.
The present invention (to be described below) greatly reduces or eliminates the above-noted limitations and disadvantages of the designs and methods of operating conventional cryptographic devices.
The term “HDCP protocol” is used herein in a broad sense to denote both the conventional HDCP protocol and modified HDCP protocols that closely resemble the conventional HDCP protocol but differ therefrom in one or more respects. Some but not all embodiments of the invention implement an HDCP protocol. The conventional HDCP protocol encrypts (or decrypts) data during active video periods but not during blanking intervals between active video periods. An example of a modified HDCP protocol is a content protection protocol that differs from the conventional HDCP protocol only to the extent needed to accomplish decryption of data transmitted between active video periods (as well as decryption of video data transmitted during active video periods) or to accomplish encryption of data to be transmitted between active video periods (as well as encryption of video data to be transmitted during active video periods).