There are various well-known serial links for transmitting video data and other data. One conventional serial link is known as a transition minimized differential signaling interface (“TMDS” link). This link is used primarily for high-speed transmission of video data from a set-top box to a television, and also for high-speed transmission of video data from a host processor (e.g., a personal computer) to a monitor. Among the characteristics of a TMDS link are the following:
1. video data are encoded and then transmitted as encoded words (each 8-bit word of digital video data is converted to an encoded 10-bit word before transmission);
    a. the encoding determines a set of “in-band” words and a set of “out-of-band” words (the encoder can generate only “in-band” words in response to video data, although it can generate “out-of-band” words in response to control or sync signals. Each in-band word is an encoded word resulting from encoding of one input video data word. All words transmitted over the link that are not in-band words are “out-of-band” words);
    b. the encoding of video data is performed such that the in-band words are transition minimized (a sequence of in-band words has a reduced or minimized number of transitions);
    c. the encoding of video data is performed such that the in-band words are DC balanced (the encoding prevents each transmitted voltage waveform that is employed to transmit a sequence of in-band words from deviating by more than a predetermined threshold value from a reference potential. Specifically, the tenth bit of each “in-band” word indicates whether eight of the other nine bits thereof have been inverted during the encoding process to correct for an imbalance between running counts of ones and zeroes in the stream of previously encoded data bits);
2. the encoded video data and a video clock signal are transmitted as differential signals (the video clock and encoded video data are transmitted as differential signals over conductor pairs without the presence of a ground line);
3. three conductor pairs are employed to transmit the encoded video, and a fourth conductor pair is employed to transmit the video clock signal; and
4. signal transmission occurs in one direction, from a transmitter (typically associated with a desktop or portable computer, or other host) to a receiver (typically an element of a monitor or other display device).
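The two-stage encoding of characteristic 1 (transition minimization followed by DC balancing) can be sketched as follows. This is a minimal sketch that follows the general shape of the published DVI encoding algorithm; it has not been checked against that specification bit for bit, and the function names (ones, encode_word, decode_word) and the running-disparity bookkeeping are illustrative assumptions.

```python
def ones(value):
    """Count the 1 bits in a non-negative integer."""
    return bin(value).count("1")


def encode_word(d, cnt):
    """Encode one 8-bit word; return (10-bit word, updated running disparity)."""
    # Stage 1: transition minimization, chaining bits with XOR or XNOR.
    use_xnor = ones(d) > 4 or (ones(d) == 4 and (d & 1) == 0)
    low = d & 1
    for i in range(1, 8):
        bit = ((low >> (i - 1)) & 1) ^ ((d >> i) & 1)
        if use_xnor:
            bit ^= 1
        low |= bit << i
    bit8 = 0 if use_xnor else 1  # ninth bit records which chaining was used
    n1 = ones(low)
    n0 = 8 - n1

    # Stage 2: DC balance, optionally inverting the eight low bits.
    if cnt == 0 or n1 == n0:
        invert = bit8 == 0
        cnt += (n0 - n1) if invert else (n1 - n0)
    elif (cnt > 0 and n1 > n0) or (cnt < 0 and n0 > n1):
        invert = True
        cnt += 2 * bit8 + (n0 - n1)
    else:
        invert = False
        cnt += -2 * (1 - bit8) + (n1 - n0)
    if invert:
        low ^= 0xFF
    # The tenth bit flags whether the eight low bits were inverted.
    return (int(invert) << 9) | (bit8 << 8) | low, cnt


def decode_word(q):
    """Recover the 8-bit input word from a 10-bit in-band word."""
    low = q & 0xFF
    if (q >> 9) & 1:
        low ^= 0xFF
    use_xnor = ((q >> 8) & 1) == 0
    d = low & 1
    for i in range(1, 8):
        bit = ((low >> i) & 1) ^ ((low >> (i - 1)) & 1)
        if use_xnor:
            bit ^= 1
        d |= bit << i
    return d


# Encode every possible 8-bit value in sequence, carrying the disparity along.
cnt = 0
words = []
for value in range(256):
    word, cnt = encode_word(value, cnt)
    words.append(word)
decoded = [decode_word(w) for w in words]
```

The decoder needs no running state: the ninth and tenth bits of each word tell it how to undo both stages.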
It is foreseeable that a common use for encryption will be to encrypt video data for transmission over a serial link from a set-top box to a television, and it has been proposed to encrypt video data for transmission over a TMDS serial link (e.g., from a set-top box to a television). For example, it has been proposed to use the cryptographic protocol known as the “High-bandwidth Digital Content Protection” (“HDCP”) protocol to encrypt digital video data to be transmitted over the “Digital Video Interface” (“DVI” link) adopted by the Digital Display Working Group, and to decrypt the encrypted video data at the DVI receiver.
A DVI link can be implemented to include two TMDS links (which share a common conductor pair for transmitting a video clock signal) or one TMDS link, as well as additional control lines between the transmitter and receiver. We shall describe a DVI link (that includes one TMDS link) with reference to FIG. 1. The DVI link of FIG. 1 includes transmitter 1, receiver 3, and the following conductors between the transmitter and receiver: four conductor pairs (Channel 0, Channel 1, and Channel 2 for video data, and Channel C for a video clock signal), Display Data Channel (“DDC”) lines for bidirectional communication between the transmitter and a monitor associated with the receiver in accordance with the conventional Display Data Channel standard (the Video Electronics Standards Association's “Display Data Channel Standard,” Version 2, Rev. 0, dated Apr. 9, 1996), a Hot Plug Detect (HPD) line (on which the monitor transmits a signal that enables a processor associated with the transmitter to identify the monitor's presence), Analog lines (for transmitting analog video to the receiver), and Power lines (for providing DC power to the receiver and a monitor associated with the receiver). The Display Data Channel standard specifies a protocol for bidirectional communication between a transmitter and a monitor associated with a receiver, including transmission by the monitor of an Extended Display Identification (“EDID”) message that specifies various characteristics of the monitor, and transmission by the transmitter of control signals for the monitor. Transmitter 1 includes three identical encoder/serializer units (units 2, 4, and 6) and additional circuitry (not shown). Receiver 3 includes three identical recovery/decoder units (units 8, 10, and 12) and inter-channel alignment circuitry 14 connected as shown, and additional circuitry (not shown).
As shown in FIG. 1, circuit 2 encodes the data to be transmitted over Channel 0, and serializes the encoded bits. Similarly, circuit 4 encodes the data to be transmitted over Channel 1 (and serializes the encoded bits), and circuit 6 encodes the data to be transmitted over Channel 2 (and serializes the encoded bits). Each of circuits 2, 4, and 6 responds to a control signal (an active high binary control signal referred to as a “data enable” or “DE” signal) by selectively encoding either digital video words (in response to DE having a high value) or a control or synchronization signal pair (in response to DE having a low value). Each of encoders 2, 4, and 6 receives a different pair of control or synchronization signals: encoder 2 receives horizontal and vertical synchronization signals (HSYNC and VSYNC); encoder 4 receives control bits CTL0 and CTL1; and encoder 6 receives control bits CTL2 and CTL3. Thus, each of encoders 2, 4, and 6 generates in-band words indicative of video data (in response to DE having a high value), encoder 2 generates out-of-band words indicative of the values of HSYNC and VSYNC (in response to DE having a low value), encoder 4 generates out-of-band words indicative of the values of CTL0 and CTL1 (in response to DE having a low value), and encoder 6 generates out-of-band words indicative of the values of CTL2 and CTL3 (in response to DE having a low value). In response to DE having a low value, each of encoders 4 and 6 generates one of four specific out-of-band words indicative of the values 00, 01, 10, or 11, respectively, of control bits CTL0 and CTL1 (or CTL2 and CTL3).
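The selection made by encoders 2, 4, and 6 in response to DE can be sketched as follows. This is an illustrative sketch only: the function name channel_inputs, the assignment of the three 8-bit color components to Channels 0 through 2, and the bit order within each control pair are assumptions not stated above.

```python
def channel_inputs(de, pixel, hsync, vsync, ctl):
    """Return what encoders 2, 4, and 6 are asked to encode on this clock.

    pixel is a 24-bit word; ctl is a sequence (CTL0, CTL1, CTL2, CTL3).
    """
    if de:
        # DE high: each channel encodes one 8-bit component as an in-band word.
        # Which component goes to which channel is an assumption here.
        return [("video", (pixel >> shift) & 0xFF) for shift in (0, 8, 16)]
    # DE low: each channel encodes its own pair of control or sync bits
    # as one of four out-of-band words (pair values 0 through 3).
    return [
        ("control", (vsync << 1) | hsync),    # encoder 2: HSYNC and VSYNC
        ("control", (ctl[1] << 1) | ctl[0]),  # encoder 4: CTL0 and CTL1
        ("control", (ctl[3] << 1) | ctl[2]),  # encoder 6: CTL2 and CTL3
    ]


active = channel_inputs(1, 0x112233, 0, 0, (0, 0, 0, 0))
blanking = channel_inputs(0, 0, 1, 0, (0, 1, 1, 1))
```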
As noted above, it has been proposed to use the cryptographic protocol known as the “High-bandwidth Digital Content Protection” (“HDCP”) protocol to encrypt digital video to be transmitted over a DVI link and to decrypt the data at the DVI receiver. The HDCP protocol is described in the document “High-bandwidth Digital Content Protection System,” Revision 1.0, dated Feb. 17, 2000, by Intel Corporation, and the document “High-bandwidth Digital Content Protection System Revision 1.0 Erratum,” dated Mar. 19, 2001, by Intel Corporation. The full text of both of these documents is incorporated herein by reference.
A DVI transmitter implementing the HDCP protocol asserts a stream of pseudo-randomly generated 24-bit words, known as cout[23:0], during the video active period (i.e. when DE is high). Each 24-bit word of the cout data is “Exclusive Ored” (in logic circuitry in the transmitter) with a 24-bit word of RGB video data input to the transmitter, in order to encrypt the video data. The encrypted data are then encoded (according to the TMDS standard) for transmission. The same sequence of cout words is also generated in the receiver. After the encoded and encrypted data received at the receiver undergo TMDS decoding, the cout data are processed together with the decoded video in logic circuitry in order to decrypt the decoded data and recover the original input video data.
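The symmetry of the XOR operation in the transmitter and receiver can be illustrated with a short sketch. The generator cout_stream below is a stand-in built on Python's general-purpose random module and is not the HDCP cipher; the seed value and pixel words are arbitrary assumptions chosen for illustration.

```python
import random


def cout_stream(seed, count):
    """Stand-in source of 24-bit pseudo-random words (NOT the HDCP cipher)."""
    rng = random.Random(seed)
    return [rng.getrandbits(24) for _ in range(count)]


def xor_words(words, cout):
    """Bitwise XOR of each 24-bit data word with the matching cout word."""
    return [w ^ c for w, c in zip(words, cout)]


pixels = [0xFF0000, 0x00FF00, 0x0000FF, 0x123456]

# Transmitter side: encrypt the RGB words before TMDS encoding.
tx_cout = cout_stream(seed=2000, count=len(pixels))
encrypted = xor_words(pixels, tx_cout)

# Receiver side: the same cout sequence undoes the encryption after decoding.
rx_cout = cout_stream(seed=2000, count=len(pixels))
recovered = xor_words(encrypted, rx_cout)
```

Because XOR is its own inverse, the same circuit serves for encryption and decryption provided both sides generate identical cout words in the same order.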
Before the transmitter begins to transmit HDCP encrypted, encoded video data, the transmitter and receiver communicate bidirectionally with each other to execute an authentication protocol (to verify that the receiver is authorized to receive protected content, and to establish shared secret values for use in encryption of input data and decryption of transmitted encrypted data). More specifically, each of the transmitter and the receiver is preprogrammed (e.g., at the factory) with a 40-bit word known as a key selection vector and an array of forty 56-bit private keys. To initiate the first part of an authentication exchange between the transmitter and receiver, the transmitter asserts its key selection vector (known as “AKSV”), and a pseudo-randomly generated session value (“An”) to the receiver. In response, the receiver sends its key selection vector (known as “BKSV”) and a repeater bit (indicating whether the receiver is a repeater) to the transmitter, and the receiver also implements a predetermined algorithm using “AKSV” and the receiver's array of forty private keys to calculate a secret value (“Km”). In response to the value “BKSV” from the receiver, the transmitter implements the same algorithm using the value “BKSV” and the transmitter's array of forty private keys to calculate the same secret value (“Km”) as does the receiver.
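The text above does not state the “predetermined algorithm.” One way two devices can arrive at the same Km from each other's key selection vectors is sketched below, under the assumption that Km is the sum, modulo 2^56, of the private keys selected by the 1 bits of the other side's vector, and that each device's private keys are derived from a symmetric master matrix held by a licensing authority. The master matrix, the helper names, and the random vectors are all hypothetical.

```python
import random

N = 40                   # keys per device and bits per key selection vector
MASK56 = (1 << 56) - 1   # keys and sums are kept to 56 bits


def make_master(rng):
    """Hypothetical symmetric N x N matrix of 56-bit values."""
    m = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i, N):
            m[i][j] = m[j][i] = rng.getrandbits(56)
    return m


def make_device_keys(master, ksv):
    """Derive a device's forty private keys from the master matrix and its KSV."""
    return [
        sum(master[i][j] for j in range(N) if (ksv >> j) & 1) & MASK56
        for i in range(N)
    ]


def compute_km(own_keys, peer_ksv):
    """Sum the own private keys selected by the 1 bits of the peer's KSV."""
    return sum(k for i, k in enumerate(own_keys) if (peer_ksv >> i) & 1) & MASK56


rng = random.Random(1)
master = make_master(rng)
aksv, bksv = rng.getrandbits(N), rng.getrandbits(N)
tx_keys = make_device_keys(master, aksv)
rx_keys = make_device_keys(master, bksv)

km_tx = compute_km(tx_keys, bksv)  # transmitter uses BKSV and its own keys
km_rx = compute_km(rx_keys, aksv)  # receiver uses AKSV and its own keys
```

The two sums agree because the master matrix is symmetric: each side computes the same double sum over matrix entries, only in a different order.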
Each of the transmitter and the receiver then uses the shared secret value “Km,” the session value “An,” and the repeater bit to calculate a session key (“Ks”) and two values (“M0” and “R0”) for use during a second part of the authentication exchange. The second part of the authentication exchange is performed only if the repeater bit indicates that the receiver is a repeater, to determine whether the status of one or more downstream devices coupled to the repeater requires revocation of the receiver's authentication.
After the first part of the authentication exchange, and (if the second part of the authentication exchange is performed) if the receiver's authentication is not revoked as a result of the second part of the authentication exchange, each of the transmitter and the receiver generates a 56-bit frame key Ki (for initiating the encryption or decryption of a frame of video data), an initialization value Mi, and a value Ri used for link integrity verification. The Ki, Mi, and Ri values are generated in response to a control signal (identified as “ctl3” in FIG. 2), which is received at the appropriate circuitry in the transmitter, and is also sent by the transmitter to the receiver, during each vertical blanking period, when DE is low. As shown in the timing diagram of FIG. 2, the control signal “ctl3” is a single high-going pulse. In response to the Ki, Mi, and Ri values, each of the transmitter and receiver generates a sequence of pseudo-randomly generated 24-bit words cout[23:0]. Each 24-bit word of the cout data generated by the transmitter is “Exclusive Ored” (in logic circuitry in the transmitter) with a 24-bit word of a frame of video data (to encrypt the video data). Each 24-bit word of the cout data generated by the receiver is “Exclusive Ored” (in logic circuitry in the receiver) with a 24-bit word of the first received frame of encrypted video data (to decrypt this encrypted video data). The 24-bit words cout[23:0] generated by the transmitter are encryption keys (for encrypting a line of input video data), and the 24-bit words cout[23:0] generated by the receiver are decryption keys (for decrypting a received and decoded line of encrypted video data).
During each horizontal blanking interval (in response to each falling edge of the data enable signal DE) following assertion of the control signal ctl3, the transmitter performs a rekeying operation and the receiver performs the same rekeying operation to change (in a predetermined manner) the cout data words to be asserted during the next active video period. This continues until the next vertical blanking period, when the control signal ctl3 is again asserted to cause each of the transmitter and the receiver to calculate a new set of Ki and Mi values (with the index “i” being incremented in response to each assertion of the control signal ctl3). The Ri value is updated once every 128 frames. Actual encryption of input video data (or decryption of received, decoded video data) is performed, using the cout data words generated in response to the latest set of Ks, Ki and Mi values, only when DE is high (not during vertical or horizontal blanking intervals).
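The relationship between the DE and ctl3 signals and the cipher's activity can be summarized in a small sketch. The function name schedule and the action labels are illustrative assumptions, and the sketch is simplified: it treats each clock sample independently and omits the initialization period that follows a ctl3 pulse.

```python
def schedule(samples):
    """Map a sequence of (de, ctl3) samples to the cipher action on each clock."""
    prev_de, prev_ctl3 = 0, 0
    actions = []
    for de, ctl3 in samples:
        if ctl3 and not prev_ctl3:
            # Rising edge of ctl3 (vertical blanking): new Ki and Mi values.
            actions.append("new_frame_key")
        elif prev_de and not de:
            # Falling edge of DE: rekey for the next active video period.
            actions.append("rekey")
        elif de:
            # Active video: XOR data words with cout words.
            actions.append("xor_with_cout")
        else:
            # Remainder of a blanking interval: no encryption or decryption.
            actions.append("idle")
        prev_de, prev_ctl3 = de, ctl3
    return actions


samples = [(0, 0), (0, 1), (0, 0), (1, 0), (1, 0), (0, 0), (0, 0), (1, 0), (0, 0)]
actions = schedule(samples)
```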
Each of the transmitter and receiver includes an HDCP cipher circuit (sometimes referred to herein as an “HDCP cipher”) of the type shown in FIG. 3. The HDCP cipher includes linear feedback shift register (LFSR) module 80, block module 81 coupled to the output of LFSR module 80, and output module 82 coupled to an output of block module 81. LFSR module 80 is employed to re-key block module 81 in response to each assertion of an enable signal (the signal “ReKey” shown in FIG. 3), using the session key (Ks) and the current frame key (Ki). Block module 81 generates (and provides to module 80) the key Ks at the start of a session and generates (and applies to module 80) a new value of key Ki at the start of each frame of video data (in response to a rising edge of the control signal “ctl3,” which occurs in the first vertical blanking interval of a frame). The signal “ReKey” is asserted to the FIG. 3 circuit at each falling edge of the DE signal (i.e., at the start of each vertical and each horizontal blanking interval), and at the end of a brief initialization period (during which module 81 generates an updated value of the frame key Ki) after each rising edge of signal “ctl3.”
Module 80 consists of four linear feedback shift registers (having different lengths) and combining circuitry coupled to the shift registers and configured to assert a single output bit per clock interval to block module 81 during each of a fixed number of clock cycles (e.g., 56 cycles) commencing on each assertion of the signal “ReKey” when DE is low (i.e., in the horizontal blanking interval of each line of video data). This output bit stream is employed by block module 81 to re-key itself just prior to the start of transmission or reception of each line of video data.
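The general structure of module 80 (four shift registers of different lengths whose outputs are combined into one bit per clock) can be sketched as follows. The register lengths, tap positions, seeding, and the use of a plain XOR as the combining circuitry are placeholders chosen for illustration; none of them is taken from the HDCP specification or from FIG. 3.

```python
class Lfsr:
    """Fibonacci LFSR; `taps` are the bit positions XORed to form the feedback."""

    def __init__(self, length, taps, state):
        self.length = length
        self.taps = taps
        self.state = state & ((1 << length) - 1)

    def step(self):
        """Shift once and return the bit shifted out."""
        out = self.state & 1
        feedback = 0
        for t in self.taps:
            feedback ^= (self.state >> t) & 1
        self.state = (self.state >> 1) | (feedback << (self.length - 1))
        return out


def make_registers(seed):
    """Build four registers of different (placeholder) lengths from one seed."""
    specs = [(13, (0, 2, 5)), (14, (0, 3, 7)), (16, (0, 4, 9)), (17, (0, 6, 11))]
    registers = []
    for n, (length, taps) in enumerate(specs):
        state = ((seed >> (4 * n)) | 1) & ((1 << length) - 1)  # never all zero
        registers.append(Lfsr(length, taps, state))
    return registers


def rekey_bits(registers, cycles=56):
    """Produce one combined output bit per clock for a fixed number of clocks."""
    bits = []
    for _ in range(cycles):
        bit = 0
        for register in registers:
            bit ^= register.step()  # placeholder combiner: plain XOR
        bits.append(bit)
    return bits


tx_bits = rekey_bits(make_registers(0xABCDE))
rx_bits = rekey_bits(make_registers(0xABCDE))
```

Registers loaded with the same state on both sides of the link produce the same 56-bit stream, which is what allows the transmitter and receiver to rekey identically.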
Block module 81 comprises two halves, “Round Function K” and “Round Function B,” as shown in FIG. 4. Round Function K includes 28-bit registers Kx, Ky, and Kz, seven S-Boxes (each a 4 input bit by 4 output bit S-Box including a look-up table) collectively labeled “S-Box K” in FIG. 4, and linear transformation unit K, connected as shown. Round Function B includes 28-bit registers Bx, By, and Bz, seven S-Boxes (each a 4 input bit by 4 output bit S-Box including a look-up table) collectively labeled “S-Box B” in FIG. 4, and linear transformation unit B, connected as shown. Round Function K and Round Function B are similar in design, but Round Function K performs one round of a block cipher per clock cycle to assert a different pair of 28-bit round keys (Ky and Kz) each clock cycle in response to the output of LFSR module 80, and Round Function B performs one round of a block cipher per clock cycle, in response to each 28-bit round key Ky from Round Function K and the output of LFSR module 80, to assert a different pair of 28-bit round keys (By and Bz) each clock cycle. The transmitter generates value An at the start of the authentication protocol and the receiver responds to it during the authentication procedure. The value An is used to randomize the session key. Block module 81 operates in response to the authentication value (An), and the initialization value (Mi) which is updated by output module 82 at the start of each frame (at each rising edge of the control signal “ctl3”).
Each of linear transformation units K and B outputs 56 bits per clock cycle. These output bits are the combined outputs of eight diffusion networks in each transformation unit. Each diffusion network of linear transformation unit K produces seven output bits in response to seven of the current output bits of registers Ky and Kz. Each of four of the diffusion networks of linear transformation unit B produces seven output bits in response to seven of the current output bits of registers By, Bz, and Ky, and each of the four other diffusion networks of linear transformation unit B produces seven output bits in response to seven of the current output bits of registers By and Bz.
In Round Function K, one bit of register Ky takes its input from the bit stream asserted by module 80 when the ReKey signal is asserted. In Round Function B, one bit of register By takes its input from the bit stream asserted by module 80 when the ReKey signal is asserted.
Output module 82 performs a compression operation on the 28-bit keys (By, Bz, Ky and Kz) asserted to it (a total of 112 bits) by module 81 during each clock cycle, to generate one 24-bit block of pseudo-random bits cout[23:0] per clock cycle. Each of the 24 output bits of module 82 consists of the exclusive OR (“XOR”) of nine terms as follows: (B0*K0)+(B1*K1)+(B2*K2)+(B3*K3)+(B4*K4)+(B5*K5)+(B6*K6)+(B7)+(K7), where “*” denotes a logical AND operation and “+” denotes a logical XOR operation. FIG. 5 specifies the input values B0-B7 and K0-K7 in the preceding expression for generating each of the 24 output bits of module 82. For example, FIG. 5 indicates that in order to generate output bit 0 (i.e., cout(0)), B0 is the seventeenth bit of register Bz, K0 is the third bit of register Kz, B1 is the twenty-sixth bit of register Bz, and so on.
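The nine-term expression for one output bit can be written directly in code. In this sketch the bit positions that feed B0-B7 and K0-K7 are supplied by the caller, since the assignments of FIG. 5 are not reproduced here; the helper names cout_bit and select, and the example register values and positions, are illustrative assumptions.

```python
def select(register, positions):
    """Pick the listed bit positions out of a 28-bit register value."""
    return [(register >> p) & 1 for p in positions]


def cout_bit(b, k):
    """XOR of (B0*K0) through (B6*K6) with the lone terms B7 and K7.

    b and k are sequences of eight bits: B0..B7 and K0..K7.
    """
    bit = b[7] ^ k[7]
    for b_i, k_i in zip(b[:7], k[:7]):
        bit ^= b_i & k_i  # "*" is AND, "+" is XOR
    return bit


# Hypothetical register contents and tap positions, for illustration only.
bz = 0x0F0F0F0
kz = 0x3333333
b_inputs = select(bz, [16, 25, 3, 9, 20, 1, 12, 7])
k_inputs = select(kz, [2, 11, 5, 18, 23, 8, 14, 0])
example_bit = cout_bit(b_inputs, k_inputs)
```

Repeating this with 24 different sets of positions yields the 24 bits of cout[23:0] for one clock cycle.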
In the transmitter, logic circuitry 83 (shown in FIG. 3) receives each 24-bit word of cout data and each input 24-bit RGB video data word, and performs a bitwise XOR operation thereon in order to encrypt the video data, thereby generating a word of the “data_encrypted” data indicated in FIG. 3. Typically, the encrypted data subsequently undergoes TMDS encoding before it is transmitted to a receiver. In the receiver, logic circuitry 83 (shown in FIG. 3) receives each 24-bit block of cout data and each recovered 24-bit RGB video data word (after the recovered data has undergone TMDS decoding), and performs a bitwise XOR operation thereon in order to decrypt the recovered video data.
Throughout the specification and in the claims the expression “TMDS-like link” will be used to denote a serial link, capable of transmitting digital video data (and a clock for the digital video data) from a transmitter to a receiver, and optionally also transmitting one or more additional signals (bidirectionally or unidirectionally) between the transmitter and receiver, that is or includes either a TMDS link or a link having some but not all of the characteristics of a TMDS link.
There are several conventional TMDS-like links. One type of TMDS-like link is the set of serial links known as Low Voltage Differential Signaling (“LVDS”) links (e.g., “LDI,” the LVDS Display Interface), each of which satisfies the TIA/EIA-644 standard or the IEEE-1596.3 standard. In each system including an LVDS link, the data are sent on a high-speed differential link with a synchronous clock. There is a single clock line with a four to three duty cycle and several different combinations of data lines depending on the data rate and bit depth. An LVDS link is a serial and differential video link, but the video data transmitted over an LVDS link is not encoded.
Other TMDS-like links encode input video data and other data to be transmitted into encoded words comprising more bits than the incoming data using a coding algorithm other than the specific algorithm used in a TMDS link, and transmit the encoded video data as in-band characters and the other encoded data as out-of-band characters. The characters need not be classified as in-band or out-of-band characters according to whether they satisfy transition minimization and DC balance criteria. Rather, other classification criteria could be used. An example of an encoding algorithm, other than that used in a TMDS link but which could be used in a TMDS-like link, is IBM 8b10b coding. The classification (between in-band and out-of-band characters) need not be based on just a high or low number of transitions. For example, the number of transitions of each of the in-band and out-of-band characters could (in some embodiments) be in a single range (e.g., a middle range defined by a minimum and a maximum number of transitions).
The data transmitted between the transmitter and receiver of a TMDS-like link can, but need not, be transmitted differentially (over a pair of conductors). Although the differential nature of TMDS is important in some applications, it is contemplated that some TMDS-like links will transmit data other than differential data. Also, although a TMDS link has four differential pairs (in the single pixel version), three for video data and the other for a video clock, a TMDS-like link could have a different number of conductors or conductor pairs.
The primary data transmitted by a TMDS link are video data. What is often significant about this is that the video data are not continuous, and instead have blanking intervals. However, many TMDS-like serial links do not transmit data having blanking intervals, and thus do not encode input data (for transmission) in response to a data enable signal. For example, the audio serial links known as I2S and S/PDIF transmit continuous data.
We shall refer to content protection protocols other than the HDCP protocol as “non-HDCP” protocols. Not only content protection protocols that closely resemble the HDCP protocol (but differ therefrom in one or more respects), but also content protection protocols very different from the HDCP protocol, shall be referred to as “non-HDCP protocols.”
Aspects of the present invention are useful in encrypting and/or decrypting data in accordance with the HDCP protocol. Some such aspects are also useful in encrypting and/or decrypting data in accordance with non-HDCP protocols. Other aspects of the present invention are useful in encrypting and/or decrypting data in accordance with non-HDCP protocols but not in accordance with the HDCP protocol.
FIG. 6 is a block diagram of the functions performed to encrypt (or decrypt) data in a class of typical non-HDCP protocols that can be implemented in accordance with the present invention. The key exchange function of FIG. 6 is responsible for producing a “key” (which is effectively a large, unique number) for use in generating pseudo-random numbers. Generally speaking, it is necessary to produce the same key at both sides of the link (e.g., at both a transmitter which encrypts data and at a receiver which decrypts the encrypted data), so that pseudo-random number generator (PRNG) functions on both sides of the link can have the same seed, and thus produce the same pseudo-random value stream in response to the same seed. There are many ways to perform a key exchange function. A small number of keys can be pre-programmed for later use, but this is cumbersome and severely limits flexibility (and/or security). Keys can be generated (e.g., in both a transmitter and a receiver) in accordance with a predetermined algorithm, with or without external “seeds,” but security will be compromised if the details of the algorithm become publicly known or are reverse-engineered.
Alternatively, a key (or sequence of keys) can be generated elsewhere and delivered to both sides of the link. If each key can be delivered securely to both sides of the link, the protocol will closely approximate a “one-time-pad”, which is the only truly perfect cipher. There are many ways to deliver keys: one employs an external data entry system such as a password, bar code, or smart card; another employs a private and very secure channel between the transmitter and recipient of the key; and another employs a trusted third party (such as a “certificate authority”).
It should be appreciated that when the expression “certificate authority” is used herein, it is used in a broad sense to denote a trusted third party having the capability to perform a particular function, where such function is described in the context in which the expression “certificate authority” is used. The expression “certificate authority” is more commonly used in a narrower sense to denote a trusted third-party agent that issues digital certificates that are used to create or verify digital signatures and public-private key pairs, where the role of such trusted third-party agent is to guarantee that the individual granted the unique certificate is, in fact, who he or she claims to be. Usually, this means that the certificate authority (where this expression is used in its narrower sense) has access to some specific information (delivered or maintained separately) that allows it to confirm an individual's claimed identity.
The key exchange function of FIG. 6 can implement combinations of two or more key exchange mechanisms. The key exchange mechanism used could vary with the application, or the type of content, or the desired use. Or multiple key exchange methods could be used in concert, each delivering some portion of the key. This latter approach has a number of advantages. For example, if one portion of the system fails and a corresponding portion of the key is compromised, overall system security can still be maintained.
Once each side of the link has the key, the key is used to seed a PRNG function (shown in FIG. 6). The PRNG function can be implemented in any of many different ways, most of which fall into two classes: “stream” ciphers; and “block” ciphers. The best choice for implementing the PRNG function depends on the kind of data, how the data is organized, and how much of it there is. A stream cipher is essentially designed to handle streams of bits (or words). Stream ciphers are fast and efficient, but also generally less secure. Block ciphers are designed to handle blocks of data (data organized into large chunks). They tend to be slower and more compute-intensive, but are generally more secure.
The reversible function (indicated as a separate block of FIG. 6) combines the input data values with the pseudo-random numbers generated by the PRNG function, and is reversible in the sense that data encrypted by a first pass through the reversible function will be decrypted (restored to its original state) by undergoing a second pass through the reversible function. It is not strictly necessary to implement the reversible function as a separate block, because either a stream or a block cipher (implementing the PRNG function, and coupled to receive the input data) will often incorporate the functionality of the reversible function block of FIG. 6. The PRNG and reversible function blocks are separated in FIG. 6 for clarity, and because implementing them separately allows the system to have greater flexibility in attaining the desired level of security.
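The PRNG and reversible function blocks of FIG. 6 can be sketched together as follows. This is a minimal sketch under stated assumptions: the PRNG is a SHA-256 hash of the key and a counter (chosen only because it is available in the Python standard library, not because any protocol described herein uses it), the reversible function is a bytewise XOR, and the key is assumed to have already been delivered to both sides by the key exchange function.

```python
import hashlib


def prng_bytes(key, count):
    """Stand-in PRNG block: hash the key with a running counter."""
    out = b""
    counter = 0
    while len(out) < count:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:count]


def reversible(data, key):
    """Reversible function block: XOR the data with the pseudo-random stream."""
    stream = prng_bytes(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


shared_key = b"key delivered by the key exchange function"
message = b"one line of video data" * 4

ciphertext = reversible(message, shared_key)      # first pass encrypts
plaintext = reversible(ciphertext, shared_key)    # second pass decrypts
wrong_side = reversible(ciphertext, b"some other key")
```

A second pass restores the data only when both sides seed the PRNG with the same key, which is why the key exchange function must produce the same key at both ends of the link.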
There are many choices for the cipher algorithm implemented by the reversible function, both standardized and proprietary. Examples include DES, Triple DES, and the Advanced Encryption Standard (AES) or any of the AES candidate algorithms. The exact choice will depend on the costs and goals.
The synchronization function is implemented in a manner that depends on whether the two sides of the link work in unison (or, in contrast, whether encryption and decryption occur at different times). It determines when the reversible function begins to encrypt (or decrypt) data, and it typically also monitors whether the encryption (or decryption) process is working properly.
In cryptography, encryption and decryption most often occur at different times (we will call this “asynchronous” operation). Asynchronous operation requires that some kind of synchronization information is included in the message sent over the link. This can be as simple as designation of where the message starts, or can include special characters or sub-messages. “Instantaneous” links (or those with known or predictable delays) can operate asynchronously, or they can be implemented so that the encryption and decryption occur at the same time in the logical sense (we will call this “synchronous” operation). Synchronous operation uses some mutual, external time reference that signals both sides when to start. DVI links (and TMDS links) are capable of either synchronous or asynchronous operation. Synchronous timing can work in any of several ways, including by prior arrangement (e.g., both sides of the link can use the first full vertical sync pulse as the reference) or by a handshake of some sort (e.g., the link can exchange information and agree that the next sync pulse will be the reference). The handshake can be accomplished by exchanging signals as simple as a single dedicated pulse, a predefined bit location, or other electrical signal. Or it can be accomplished by exchanging signals that are more accurately distinguishable, such as actual messages (collection of bits), pulse-trains or other more complex electrical signals, or combinations of pulses or signals (e.g., on different wires or time slots).
Typically, some re-synchronization method (e.g., periodic re-synchronization) will be necessary and useful. There are at least two different ways to re-synchronize. One includes delivery of new key material to each side, and then reaching a new agreement about when it will be used. Another involves saving “checkpoints” (copies of the system state when the synchronization is known or assumed to be good) and then going back to the last checkpoint when necessary. These and other re-synchronization methods are not mutually exclusive. Indeed, often a combination of approaches or capabilities gives the best performance.
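The checkpoint approach to re-synchronization can be sketched as follows. The class name SyncedPrng is hypothetical, and Python's general-purpose random module stands in for whatever PRNG function the protocol actually uses; the point of the sketch is only that both sides save the generator state at an agreed moment and return to it after synchronization is lost.

```python
import random


class SyncedPrng:
    """Pseudo-random word source with a single saved checkpoint."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._checkpoint = self._rng.getstate()

    def next_word(self):
        return self._rng.getrandbits(24)

    def save_checkpoint(self):
        """Copy the generator state while synchronization is known to be good."""
        self._checkpoint = self._rng.getstate()

    def rollback(self):
        """Return to the last saved state."""
        self._rng.setstate(self._checkpoint)


tx, rx = SyncedPrng(7), SyncedPrng(7)
for _ in range(3):
    tx.next_word()
    rx.next_word()
tx.save_checkpoint()
rx.save_checkpoint()

rx.next_word()  # the receiver slips by one word and loses synchronization

tx.rollback()
rx.rollback()
resynced = [tx.next_word() for _ in range(4)] == [rx.next_word() for _ in range(4)]
```

In practice the two sides would also need to agree on when to roll back, for example by one of the handshake mechanisms described above.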
A typical non-HDCP protocol also implements link verification, typically as part of the synchronizing function. Link verification is typically done both to verify that the link is operating properly (and does not need some special intervention), and because some outside agent wishes to verify compliance with some set of rules. Link verification preferably occurs on a continual basis. A compromise is to perform link verification at regular intervals. If the latter approach is used, the interval must be short to limit the inconvenience a link break would cause, and to narrow the window of opportunity for any “hacker.”