The expression “use restriction set” (or “Use Restriction Set”) herein denotes the set of all use restrictions to which content (of a specific type) is subject. The use restriction set for specific content can include any number of use restrictions (e.g., one use restriction or many use restrictions). For example, a use restriction set for video and audio data that define a movie could prohibit transmission of the data outside a specified location (e.g., a single device or network) without proscribing any use of the data within the location. For another example, a use restriction set for video and audio data that define a movie could prohibit all uses of the data except for a single viewing of the movie (a single viewing of the video data and playback of the corresponding audio data) at a specified location (e.g., a single viewing and playback by a specific device or a device set of specified type, or a single viewing and playback by any device of a specified network).
The invention pertains to methods and apparatus for content protection in a personal digital network environment, where “personal digital network environment” (“PDNE”) denotes the environment defined by a “personal digital network.” The expression “personal digital network” (“PDN”) herein denotes a network (of components, each comprised of some combination of hardware and optionally also software or firmware) capable of receiving content (e.g., digital image data, video data, or audio data) subject to a Use Restriction Set and configured to use the content in at least one way (and optionally, in many or all ways) not forbidden by the Use Restriction Set. An example of a PDN is a network installed in the network user's home, which includes digital video (and audio) storage, rendering (i.e., display or playback), and processing devices, and a personal computer (or other computing system having open architecture) that is capable of communicating with or controlling such devices. An example of a simple PDN is a computing system having open architecture (e.g., a personal computer with peripheral devices) that is configured to receive encrypted video and audio content (e.g., by reading the content from a high definition DVD or other disc) and to display a video portion of the content and play back an audio portion of the content. The content entering a PDNE can but need not be video or audio data, and can be or include data indicative of any information that can be stored digitally (such as but not limited to pictures, text, games, financial records, and personal information).
A PDN can (but need not) be or include a home entertainment network. For example, a PDN could be implemented in a business environment or elsewhere, to protect financial data or other content that is neither digital video nor digital audio.
While a PDN may include a personal computer, one is not required. For example, a PDN can be a collection of devices that are not personal computers but are essentially peer-level consumer appliances (e.g., audio/video receivers, disc players, and/or recording/playback units), and the network management functions can be distributed among such devices without the need for a centralized master controller. Distribution of network management functions is often desirable, for example in cases in which it will be necessary or desirable to perform essential network management functions from any device (or any of many devices) of a PDN.
Computing systems having open architecture (sometimes referred to herein as “open architecture systems” or “open systems”) are computing systems configured to allow end users to add or remove hardware components and/or software modules conveniently. It should be noted that consumer appliances can share design and implementation features with a personal computer, with the distinction between the two classes of devices being defined by their user-visible interfaces and functionality.
The expression “audiovisual subsystem” (or “audiovisual system”) is sometimes used herein to denote a system or device capable of displaying images in response to video data and/or emitting sound in response to audio data. Audiovisual subsystems are commonly coupled to a PDN by some form of serial link. Examples of audiovisual systems include: an HDTV monitor (including an HDMI receiver capable of decrypting HDCP-encrypted video and audio data received over an HDMI link), loudspeakers, a Digital Video Recorder (DVR), and an audio/video processor.
In typical embodiments of the invention, content that enters a PDN can be used within the PDN in any way that does not conflict with the restrictions placed on it by the owner (or licensor) of the intellectual property that pertains to the content (e.g., in any way that does not violate the terms of agreement under which the content was legally acquired by the user or owner of the PDN). For example, a PDN might be capable of receiving a satellite transmission of encrypted video and audio data that define a movie, and the Use Restriction Set for the data might prohibit all uses of the data except for decryption of the data, and any number of viewings of the movie (i.e., any number of playbacks of the video data and/or the corresponding audio data) by any device or devices of the PDN within a specified period (e.g., a specific day or week) or any number of viewings of the movie (by any device or devices of the PDN) up to a maximum allowable number of viewings. Preferred embodiments of the invention permit content that enters a PDNE to be decrypted, copied, stored, displayed and/or played back by devices of the PDN, and transmitted between devices of the PDN, provided that the Use Restriction Set for the content does not proscribe such uses.
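The satellite-movie example above can be sketched in Python as a small data structure that is consulted before each use. This is only an illustrative model under stated assumptions: the class name, field names, and use labels ("decrypt", "view", "copy") are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class UseRestrictionSet:
    """Hypothetical model: a use is forbidden unless explicitly allowed."""
    allowed_uses: frozenset                   # e.g. {"decrypt", "view"}
    window_start: Optional[datetime] = None   # start of viewing period, if any
    window_end: Optional[datetime] = None     # end of viewing period, if any
    max_viewings: Optional[int] = None        # None means no viewing cap
    viewings_so_far: int = 0

    def permits(self, use: str, when: datetime) -> bool:
        """Return True if the requested use is not proscribed at time `when`."""
        if use not in self.allowed_uses:
            return False
        if use != "view":
            return True
        if self.window_start is not None and when < self.window_start:
            return False
        if self.window_end is not None and when > self.window_end:
            return False
        if self.max_viewings is not None and self.viewings_so_far >= self.max_viewings:
            return False
        return True

    def record_viewing(self, when: datetime) -> None:
        """Count one viewing, refusing it if the restriction set forbids it."""
        if not self.permits("view", when):
            raise PermissionError("viewing is proscribed by the Use Restriction Set")
        self.viewings_so_far += 1


# Any number of viewings within one specified week.
one_week = UseRestrictionSet(
    allowed_uses=frozenset({"decrypt", "view"}),
    window_start=datetime(2004, 1, 5),
    window_end=datetime(2004, 1, 11, 23, 59, 59),
)

# Any time, but at most two viewings.
two_views = UseRestrictionSet(
    allowed_uses=frozenset({"decrypt", "view"}),
    max_viewings=2,
)
```

In this sketch the check is made by whichever device of the PDN is about to use the content; how the restriction data are bound to the content is outside the scope of the example.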
In accordance with typical embodiments of the invention, the Use Restriction Set for content received by a PDN is indicated by data (sometimes referred to herein as “rights data” or “permitted use data” or a “permitted use flag”) that is associated with the content on entry into the PDNE, and this association is securely maintained throughout the content's existence within the PDNE in accordance with a basic set of rules that map to the Use Restriction Set.
The expression “transcryption” of encrypted data (data encrypted in accordance with a first protocol) herein denotes decryption of the encrypted data followed by re-encryption of the decrypted data in accordance with a second protocol, all performed within a physically secure device or system (e.g., a physically secure subsystem of a PDN) so that the data are never accessible outside the device in unencrypted form. The second protocol is typically different than the first protocol but could be the same as the first protocol (e.g., if different keys are used to perform the re-encryption than were used to perform the original encryption). Transcryption is performed, in accordance with the invention, whenever encrypted content enters a PDNE from another domain (e.g., from a secure transport domain such as a cable or satellite delivery system, or from a DVD-like disc distribution mechanism), unless the content is already encrypted in a desired format upon entering the PDNE.
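The data flow of transcryption can be sketched as follows. The sketch is a software model only: a closure stands in for the physically secure device, and a toy XOR stream cipher (hypothetical, not secure, and not any protocol named in this document) stands in for both the first and the second protocol. Python itself offers no physical security, so the example shows only that the interface of the device accepts ciphertext and returns ciphertext.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream by hashing the key with a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, _keystream(key, len(data))))


def make_transcryptor(first_key: bytes, second_key: bytes):
    """Model a secure device: keys and plaintext exist only inside the closure."""
    def transcrypt(ciphertext_in: bytes) -> bytes:
        plaintext = toy_cipher(first_key, ciphertext_in)   # decrypt, first protocol
        return toy_cipher(second_key, plaintext)           # re-encrypt, second protocol
    return transcrypt


# Content arrives encrypted under the delivery key and leaves under the PDN key.
delivery_key, pdn_key = b"delivery-domain-key", b"pdn-internal-key"
content = b"one frame of video"
incoming = toy_cipher(delivery_key, content)
transcrypt = make_transcryptor(delivery_key, pdn_key)
outgoing = transcrypt(incoming)
```

Only a holder of the second key can recover the content from `outgoing`; the caller of `transcrypt` never handles the plaintext.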
Modern Personal Computers (PCs) have evolved from strictly computing devices into communication and entertainment devices. As a result, users expect to be able to view prerecorded video entertainment, including feature length movies, on their PC. In addition, the increased performance of processors makes it appear advantageous to use software on a PC's processor to, for example, decode and play DVD movies. However, the owners of entertainment intellectual property (e.g., copyrights in movies) rightly are concerned about unauthorized use and copying of their property when the relevant content enters such a PC.
It is contemplated that consumers of content will assemble PDNs (each of which could include, but often will not include, at least one PC), and that content providers will provide content to PDNs with the understanding that content entering each PDN will be used within the PDN in any way that is not proscribed by the owner (or licensor) of the intellectual property in the content. However, the owners of such intellectual property rightly are concerned about unauthorized use and copying of their property when the relevant content enters a PDN. This is because the open-systems nature of the PC makes it trivial to take highly valuable content (such as music or films) and distribute copies to untold millions of users who do not have the permission of the owner(s) of the relevant, highly valuable intellectual property to access this content.
Unfortunately, due to the very nature of software decode (in either open or closed system device implementations), content cannot be effectively protected in a conventional PDNE that employs software to decrypt content. At some point during the software decode process, both the keys and the decrypted content (e.g., plaintext video and audio data) are available within the registers and/or memory of the device, and therefore unauthorized copies of the keys or content can be made and distributed without permission of the owner(s) of the relevant intellectual property.
If high quality copies of movies or other works can be made and distributed widely, e.g. via the Internet, then the intellectual property in such content quickly loses its value to the owner. In order to protect some such content, the Content Scrambling System (CSS) was created to encrypt video content for DVDs. CSS is a cryptographic scrambling mechanism used on top of an MPEG compressed version of the original, raw video data. Each device that can play DVD content must have one or more cryptographic keys that allow the content to be descrambled (i.e. decrypted).
A closed system (e.g., a standalone DVD player or other standalone consumer electronic gear) can provide considerable content protection if it is configured so that keys and decrypted content stay within the closed system. If both the keys and decrypted content stay within the closed system, there is no simple method for “cracking” the content protection method. A “closed” system (e.g., a standalone DVD player) does not provide a way for a user to add or remove hardware or software. Thus, it is relatively simple to ensure that keys are stored and used within the closed system in a way that does not reveal them outside the closed system. It is worth noting that even an intended closed system can suffer from the same vulnerabilities as an open system. For example, if a cable or satellite Set Top Box (STB) is implemented using an architecture similar to that of a PC, where software handles the secret keys, it is possible for the software to be modified so that this secret material is compromised.
However, protection of content within a closed system presents other problems. For example, how are keys and content delivered securely to a closed system? If both keys and content follow the same path, then there is an inherent unidirectional information flow to a closed system that precludes use of good authentication methods. An important aspect of preferred embodiments of the present invention is that such embodiments allow (but do not require) keys and content to follow different paths within a PDN and even within a content-handling integrated circuit (e.g., an integrated circuit embodiment of the inventive Ingress or Egress Node) within a PDN. These embodiments of the invention can make key distribution and management much more secure than it is in either conventional closed or open systems by ensuring that secret keying material is never directly visible to software. This is due to the fact that integrated circuits provide a much higher degree of security than is achievable in software implementations, owing to the physical security inherent in their packaging, the much larger investment in rare and expensive equipment needed to extract information from them, and the measures that can be taken to protect secret information. In addition, this approach is more secure because it enables the implementation of better methods to verify that a device (e.g., a closed subsystem of a PDN) is properly licensed and allowed to use content (subject to a Use Restriction Set for the content). The present invention improves the current state of the art for content protection in both closed and open systems.
Current standard definition DVD content can be decoded in software on PCs, which are open systems rather than closed systems. At some point during the software decode process, both the CSS keys and the decrypted video content are available within the registers and/or memory of the PC. Since, in a PC, a user can either intentionally or unintentionally load malicious programs or drivers, and such modules can gain access to the keys and/or content, the CSS protection is easily circumvented. In fact, two widely published attacks have been made. First, the CSS key for the Xing software decoder was found by reverse engineering the software module, and this key was traded among hackers. In addition, a CSS decryption program called DeCSS was created and distributed.
So far, the economic damage of these breaches of the content protection system has been limited because the image quality of standard definition video is much lower than theatrical quality. That is, much of the intrinsic value of the original movie is lost in the conversion from the higher definition original to standard TV definition. In addition, until recently it has been impractical to transfer large files, like decrypted movies, between users.
Today, High Definition TV (HDTV) is becoming more popular, and is expected to supplant standard definition TV in a few years. In order to provide consumers with prerecorded material of sufficient quality, HDTV DVDs (HD-DVDs) are being designed. As in the case of standard DVD players, standalone players for HD-DVDs with something similar to CSS should provide strong content protection.
However, decoding content (e.g., HD-DVD content) within a conventional open system or other conventional PDN creates a vulnerability. This vulnerability is often referred to as the “software hole” in content protection systems. The essence of “software hole” vulnerability is that if software within an open system (or other element of a PDN) manipulates either unencrypted keys or plaintext content, the keys or content are easily revealed for unauthorized uses. For example, if an open computing system programmed with software is employed to decrypt content, both the keys and the decryption program must be visible to the processor and, therefore, visible to other, potentially malicious, software that is loaded within the system. The software hole is a serious problem because, if unauthorized copies of binary data (indicative of audiovisual content) can be made, the copies will allow display and playback of the content with essentially the same quality as the original theatrical release. In addition, modern network technology will easily enable a Napster-like trading of copies of movies. As a result, the owner of the intellectual property will quickly find that the property has become worthless.
When software decryption of standard DVDs was initially deployed, the “software hole” was not completely understood. Keys within decryption software were obscured and thought to be secure. This “security through obscurity” was quickly shown to be illusory when the Xing key was extracted. Since then, much of the effort of the computer industry has gone into secure methods of storing the decryption key (e.g., the Microsoft Palladium Initiative, later renamed as the Next Generation Secure Computing Base). However, although this would make stealing the keys more challenging, it does not substantially improve security of the keys and does nothing to protect the content. Note that if the authorized player can obtain the key without manual intervention (e.g. the user entering a password needed to decrypt the content protection key), then any other program using the same procedure or algorithm can also obtain the key. If such a program was written in a malicious manner, the key could, for example, be sent over the Internet to millions of others in a few seconds. Similarly, since a software decoder requires that the key and decryption process or algorithm be visible to the processor, it can be observed and emulated by the attacker, resulting in unauthorized decryption of the content.
Above-referenced U.S. patent application Ser. No. 10/679,055 describes methods and apparatus for avoiding the software hole problem (in an open system) by protecting both content and keys within a closed subsystem within the open system, where “closed subsystem” denotes a subsystem (e.g., a single integrated circuit) that does not provide users a convenient way to add hardware or software thereto or remove hardware or software therefrom. U.S. patent application Ser. No. 10/679,055 teaches that the closed subsystem should be designed to prevent key data (used by the closed subsystem) and unencrypted content in the closed subsystem from being disclosed outside the closed subsystem.
The closed subsystem of U.S. patent application Ser. No. 10/679,055 could be referred to as being “embedded” in an open system, and is typically configured to generate the protected content by decrypting incoming content in hardware to generate raw content and then re-encrypting the raw content using a different content protection protocol (also in hardware, and in the same chip in which the raw content is generated) without revealing the raw content to any element of the open system outside the closed subsystem. Neither the raw content nor key data used for generating or re-encrypting the raw content is revealed to any element of the open system outside the closed subsystem. The closed subsystem can be configured to assert the re-encrypted content directly to an external system (a system external to the open system). The external system can include a cryptographic device, and the closed subsystem can be configured to disclose key data to the cryptographic device (e.g., as part of a verification operation) as necessary to enable the cryptographic device to decrypt the re-encrypted content. Alternatively, the re-encrypted content is asserted from the closed subsystem through at least one other element of the open system to an external system (e.g., the re-encrypted content is “tunneled” through the open system to the external system).
The trend in the industry for sending video content to display devices is to deliver the content in digital form over serial links.
Various serial links for transmitting encrypted or non-encrypted data are well known. One conventional serial link, used primarily in consumer electronics (e.g., for high-speed transmission of video data from a set-top box to a television set) or for high-speed transmission of video data from a host processor (e.g., a personal computer) to a monitor, is known as a transition minimized differential signaling interface (“TMDS” link). The characteristics of a TMDS link include the following:
1. video data are encoded and then transmitted as encoded words (each 8-bit word of digital video data is converted to an encoded 10-bit word before transmission);
a. the encoding determines a set of “in-band” words and a set of “out-of-band” words (the encoder can generate only “in-band” words in response to video data, although it can generate “out-of-band” words in response to control or sync signals. Each in-band word is an encoded word resulting from encoding of one input video data word. All words transmitted over the link that are not in-band words are “out-of-band” words);
b. the encoding of video data is performed such that the in-band words are transition minimized (a sequence of in-band words has a reduced or minimized number of transitions);
c. the encoding of video data is performed such that the in-band words are DC balanced (the encoding prevents each transmitted voltage waveform that is employed to transmit a sequence of in-band words from deviating by more than a predetermined threshold value from a reference potential. Specifically, the tenth bit of each “in-band” word indicates whether eight of the other nine bits thereof have been inverted during the encoding process to correct for an imbalance between running counts of ones and zeroes in the stream of previously encoded data bits);
2. the encoded video data and a video clock signal are transmitted as differential signals (the video clock and encoded video data are transmitted as differential signals over conductor pairs);
3. three conductor pairs are employed to transmit the encoded video, and a fourth conductor pair is employed to transmit the video clock signal; and
4. signal transmission occurs in one direction, from a transmitter (typically associated with a desktop or portable computer, or other host) to a receiver (typically an element of a monitor or other display device).
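The encoding properties listed above can be illustrated with a short Python sketch. It follows the two-stage encoder flow commonly described for DVI-style TMDS links: a transition-minimizing XOR/XNOR stage, followed by a conditional inversion stage that tracks the running disparity between ones and zeros. That flow is an assumption here, since the list above does not spell it out, and the function names are hypothetical. Only in-band words are modeled.

```python
def popcount(x: int) -> int:
    """Number of one bits in x."""
    return bin(x).count("1")


def tmds_encode(d: int, cnt: int):
    """Encode one 8-bit data word into a 10-bit in-band word.

    `cnt` is the running disparity (ones minus zeros) of the words sent so
    far; the updated value is returned alongside the encoded word.
    """
    # Stage 1: transition minimization. Chain the bits with XOR or XNOR,
    # whichever yields fewer transitions; bit 8 records the choice.
    n1 = popcount(d)
    use_xnor = n1 > 4 or (n1 == 4 and (d & 1) == 0)
    q_m = d & 1
    for i in range(1, 8):
        bit = ((q_m >> (i - 1)) & 1) ^ ((d >> i) & 1)
        if use_xnor:
            bit ^= 1
        q_m |= bit << i
    q8 = 0 if use_xnor else 1

    # Stage 2: DC balancing. Invert the eight data bits when doing so pulls
    # the running disparity back toward zero; bit 9 records the inversion.
    ones = popcount(q_m)
    zeros = 8 - ones
    if cnt == 0 or ones == zeros:
        invert = q8 == 0
    else:
        invert = (cnt > 0 and ones > zeros) or (cnt < 0 and zeros > ones)
    low = q_m ^ 0xFF if invert else q_m
    word = low | (q8 << 8) | (int(invert) << 9)
    cnt += 2 * popcount(word) - 10   # ones minus zeros of this 10-bit word
    return word, cnt


def tmds_decode(word: int) -> int:
    """Recover the 8-bit data word from a 10-bit in-band word."""
    low = word & 0xFF
    if (word >> 9) & 1:               # tenth bit set: data bits were inverted
        low ^= 0xFF
    xor_chain = (word >> 8) & 1       # ninth bit: 1 for XOR, 0 for XNOR
    d = low & 1
    for i in range(1, 8):
        bit = ((low >> i) & 1) ^ ((low >> (i - 1)) & 1)
        if not xor_chain:
            bit ^= 1
        d |= bit << i
    return d
```

A transmitter in this sketch would start with `cnt = 0` and thread the returned disparity through successive calls, so that the inversion decision for each word depends on the words sent before it.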
A use of the TMDS serial link is the “Digital Visual Interface” interface (“DVI” link) adopted by the Digital Display Working Group. A DVI link can be implemented to include two TMDS links (which share a common conductor pair for transmitting a video clock signal) or one TMDS link, as well as additional control lines between the transmitter and receiver. A DVI link includes a transmitter, a receiver, and the following conductors between the transmitter and receiver: four conductor pairs (Channel 0, Channel 1, and Channel 2 for video data, and Channel C for a video clock signal), Display Data Channel (“DDC”) lines for bidirectional communication between the transmitter and a monitor associated with the receiver in accordance with the conventional Display Data Channel standard (the Video Electronics Standards Association's “Display Data Channel Standard,” Version 2, Rev. 0, dated Apr. 9, 1996), a Hot Plug Detect (HPD) line (on which the monitor transmits a signal that enables a processor associated with the transmitter to identify the monitor's presence), Analog lines (for transmitting analog video to the receiver), and Power lines (for providing DC power to the receiver and a monitor associated with the receiver). The Display Data Channel standard specifies a protocol for bidirectional communication between a transmitter and a monitor associated with a receiver, including transmission by the monitor of an Extended Display Identification (“EDID”) message that specifies various characteristics of the monitor, and transmission by the transmitter of control signals for the monitor.
Another serial link is the “High Definition Multimedia Interface” interface (sometimes referred to as an “HDMI” link or interface) developed by Silicon Image, Inc., Matsushita Electric, Royal Philips Electronics, Sony Corporation, Thomson Multimedia, Toshiba Corporation, and Hitachi.
It is common practice today to use the cryptographic protocol known as the “High-bandwidth Digital Content Protection” (“HDCP”) protocol to encrypt digital video to be transmitted over a DVI or HDMI link and to decrypt the data at the DVI (or HDMI) receiver. The HDCP protocol is described in the document “High-bandwidth Digital Content Protection System,” Revision 1.0, dated Feb. 17, 2000, by Intel Corporation, and the document “High-bandwidth Digital Content Protection System Revision 1.0 Erratum,” dated Mar. 19, 2001, by Intel Corporation. The full text of both of these documents is incorporated herein by reference.
A DVI-compliant (or HDMI-compliant) transmitter implementing the HDCP protocol asserts a stream of pseudo-randomly generated 24-bit words, known as cout[23:0], during each active period (i.e. when DE is high). In a DVI-compliant system, each active period is an active video period. In an HDMI-compliant system, each active period is a period in which video, audio, or other data are transmitted. Each 24-bit word of the cout data is “Exclusive Or'ed” (in logic circuitry in the transmitter) with a 24-bit word of RGB video data input to the transmitter, in order to encrypt the video data. The encrypted data are then encoded (according to the TMDS standard) for transmission. The same sequence of cout words is also generated in the receiver. After the encoded and encrypted data received at the receiver undergo TMDS decoding, the cout data are processed together with the decoded video in logic circuitry in order to decrypt the decoded data and recover the original input video data.
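The XOR step described above can be sketched in a few lines of Python. The generator used here is a seeded pseudo-random number generator standing in for the cout sequence; it is a placeholder assumption and is not the HDCP cipher. The sketch shows only why the receiver recovers the input when it generates the same sequence of 24-bit words as the transmitter.

```python
import random


def cout_stream(shared_state: int, n_words: int):
    """Placeholder for the cout[23:0] generator (NOT the HDCP cipher).

    Both ends produce the same words when started from the same state.
    """
    rng = random.Random(shared_state)
    return [rng.getrandbits(24) for _ in range(n_words)]


def xor_words(words, cout):
    """Bitwise XOR of each 24-bit data word with the matching cout word."""
    return [w ^ c for w, c in zip(words, cout)]


# Transmitter side: encrypt three 24-bit RGB words during an active period.
rgb_in = [0xFF0000, 0x00FF00, 0x123456]
encrypted = xor_words(rgb_in, cout_stream(1234, len(rgb_in)))

# Receiver side: the same cout sequence undoes the XOR.
rgb_out = xor_words(encrypted, cout_stream(1234, len(encrypted)))
```

Because XOR is its own inverse, the same logic circuit design can serve in both the transmitter and the receiver.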
Before the transmitter begins to transmit HDCP encrypted, encoded video data, the transmitter and receiver communicate bidirectionally with each other to execute an authentication protocol (to verify that the receiver is authorized to receive protected content, and to establish shared secret values for use in encryption of input data and decryption of transmitted encrypted data). More specifically, each of the transmitter and the receiver is preprogrammed (e.g., at the factory) with a 40-bit word known as a key selection vector, and an array of forty 56-bit private keys. To initiate the first part of an authentication exchange between the transmitter and receiver, the transmitter asserts its key selection vector (known as “AKSV”), and a pseudo-randomly generated session value (“An”) to the receiver. In response, the receiver sends its key selection vector (known as “BKSV”) and a repeater bit (indicating whether the receiver is a repeater) to the transmitter, and the receiver also implements a predetermined algorithm using “AKSV” and the receiver's array of forty private keys to calculate a secret value (“Km”). In response to the value “BKSV” from the receiver, the transmitter implements the same algorithm using the value “BKSV” and the transmitter's array of forty private keys to calculate the same secret value (“Km”) as does the receiver.
Each of the transmitter and the receiver then uses the shared value “Km,” the session value “An,” and the repeater bit to calculate a shared secret value (the session key “Ks”), a value (“R0”) for use in determining whether the authentication is successful, and a value (“M0”) for use during a second part of the authentication exchange. The second part of the authentication exchange is performed only if the repeater bit indicates that the receiver is a repeater, to determine whether the status of one or more downstream devices coupled to the repeater requires revocation of the receiver's authentication.
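The text above refers only to a “predetermined algorithm” for calculating Km. As an assumption drawn from the published HDCP 1.x specification, the following Python sketch takes that algorithm to be a sum, modulo 2^56, of the device's own private keys selected by the one bits of the other device's key selection vector. The key-issuing model (a secret symmetric matrix held by a licensing authority, and key selection vectors with twenty one bits) is likewise an illustrative assumption, and all function names are hypothetical. The sketch shows why both ends can arrive at the same Km without ever exchanging a private key.

```python
import random

N_KEYS = 40
MASK56 = (1 << 56) - 1


def make_authority_matrix(rng):
    """Hypothetical licensing-authority secret: a symmetric 40x40 matrix."""
    s = [[0] * N_KEYS for _ in range(N_KEYS)]
    for i in range(N_KEYS):
        for j in range(i, N_KEYS):
            s[i][j] = s[j][i] = rng.getrandbits(56)
    return s


def make_ksv(rng) -> int:
    """A 40-bit key selection vector with twenty one bits (assumed format)."""
    return sum(1 << b for b in rng.sample(range(N_KEYS), 20))


def issue_private_keys(s, ksv: int):
    """Derive a device's forty 56-bit private keys from its own KSV."""
    return [
        sum(s[i][j] for j in range(N_KEYS) if (ksv >> j) & 1) & MASK56
        for i in range(N_KEYS)
    ]


def compute_km(own_keys, other_ksv: int) -> int:
    """Sum the private keys selected by the other side's KSV, modulo 2**56."""
    return sum(own_keys[i] for i in range(N_KEYS) if (other_ksv >> i) & 1) & MASK56


rng = random.Random(2004)
secret = make_authority_matrix(rng)
aksv, bksv = make_ksv(rng), make_ksv(rng)
tx_keys = issue_private_keys(secret, aksv)
rx_keys = issue_private_keys(secret, bksv)

km_tx = compute_km(tx_keys, bksv)   # transmitter uses BKSV and its own keys
km_rx = compute_km(rx_keys, aksv)   # receiver uses AKSV and its own keys
```

In this model the two sums agree because the authority's matrix is symmetric; only the key selection vectors cross the link.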
After the first part of the authentication exchange, and (if the second part of the authentication exchange is performed) if the receiver's key selection vector is not revoked as a result of the second part of the authentication exchange, each of the transmitter and the receiver generates a 56-bit frame key Ki (for initiating the encryption or decryption of a frame of video data), an initialization value Mi, and a value Ri used for link integrity verification. The Ki, Mi, and Ri values are generated in response to a control signal (identified as “ctl3” in FIG. 1), which is received at the appropriate circuitry in the transmitter, and is also sent by the transmitter to the receiver, during each vertical blanking period, when DE is low. As shown in the timing diagram of FIG. 1, the control signal “ctl3” is a single high-going pulse. In response to the Ki, Mi, and Ri values, each of the transmitter and receiver generates a sequence of pseudo-randomly generated 24-bit words cout[23:0]. Each 24-bit word of the cout data generated by the transmitter is “Exclusive Or'ed” (in logic circuitry in the transmitter) with a 24-bit word of a frame of video data (to encrypt the video data). Each 24-bit word of the cout data generated by the receiver is “Exclusive Or'ed” (in logic circuitry in the receiver) with a 24-bit word of the first received frame of encrypted video data (to decrypt this encrypted video data). The 24-bit words cout[23:0] generated by the transmitter are content encryption keys (for encrypting a line of input video data), and the 24-bit words cout[23:0] generated by the receiver are content decryption keys (for decrypting a received and decoded line of encrypted video data).
During each horizontal blanking interval (in response to each falling edge of the data enable signal DE) following assertion of the control signal ctl3, the transmitter performs a rekeying operation and the receiver performs the same rekeying operation to change (in a predetermined manner) the cout data words to be asserted during the next active video period. This continues until the next vertical blanking period, when the control signal ctl3 is again asserted to cause each of the transmitter and the receiver to calculate a new set of Ki and Mi values (with the index “i” being incremented in response to each assertion of the control signal ctl3). The Ri value is updated once every 128 frames. Actual encryption of input video data or decryption of received, decoded video data (or encryption of input video, audio, or other data, or decryption of received, decoded video, audio, or other data, in the case of an HDMI-compliant system) is performed, using the cout data words generated in response to the latest set of Ks, Ki and Mi values, only when DE is high (not during vertical or horizontal blanking intervals).
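The timing relationships described above (a new frame key on each assertion of ctl3, a rekeying operation on each falling edge of DE, and encryption only while DE is high) can be sketched as a simple edge detector over sampled control signals. The function name and event labels are hypothetical, and the sketch models only the sequencing, not the key calculations themselves.

```python
def schedule_events(samples):
    """Walk (de, ctl3) samples, one per pixel clock, and list the actions taken.

    "frame_key": rising edge of ctl3 (new Ki and Mi are calculated)
    "rekey":     falling edge of DE (cout sequence is changed for the next line)
    "xor":       DE high (one data word is encrypted or decrypted)
    """
    events = []
    prev_de, prev_ctl3 = 0, 0
    for de, ctl3 in samples:
        if ctl3 and not prev_ctl3:
            events.append("frame_key")
        if prev_de and not de:
            events.append("rekey")
        if de:
            events.append("xor")
        prev_de, prev_ctl3 = de, ctl3
    return events


# A toy frame: vertical blanking with one ctl3 pulse, then two three-pixel
# lines, each followed by a short blanking interval.
toy_frame = (
    [(0, 0), (0, 1), (0, 0)]      # vertical blanking, ctl3 pulse
    + [(1, 0)] * 3 + [(0, 0)] * 2  # line 1, then blanking
    + [(1, 0)] * 3 + [(0, 0)] * 2  # line 2, then blanking
)
events = schedule_events(toy_frame)
```

Running the transmitter and the receiver through the same sample sequence yields the same event list at both ends, which is what keeps their cout sequences aligned.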
Each of the transmitter and receiver includes an HDCP cipher circuit (sometimes referred to herein as an “HDCP cipher”) of the type shown in FIG. 2. The HDCP cipher includes linear feedback shift register (LFSR) module 80, block module 81 coupled to the output of LFSR module 80, and output module 82 coupled to an output of block module 81. LFSR module 80 is employed to re-key block module 81 in response to each assertion of an enable signal (the signal “ReKey” shown in FIG. 2), using the session key (Ks) and the current frame key (Ki). Block module 81 generates (and provides to module 80) the key Ks at the start of a session and generates (and applies to module 80) a new value of key Ki at the start of each frame of video data (in response to a rising edge of the control signal “ctl3,” which occurs in the first vertical blanking interval of a frame). The signal “ReKey” is asserted to the FIG. 2 circuit at each falling edge of the DE signal (i.e., at the start of each vertical and each horizontal blanking interval), and at the end of a brief initialization period (during which module 81 generates an updated value of the frame key Ki) after each rising edge of signal “ctl3.”
Module 80 consists of four linear feedback shift registers (having different lengths) and combining circuitry coupled to the shift registers and configured to assert a single output bit per clock interval to block module 81 during each of a fixed number of clock cycles (e.g., 56 cycles) commencing on each assertion of the signal “ReKey” when DE is low (i.e., in the horizontal blanking interval of each line of video data). This output bit stream is employed by block module 81 to re-key itself just prior to the start of transmission or reception of each line of video data.
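The general structure of such a module (several shift registers of different lengths feeding a combiner that emits one bit per clock) can be sketched as follows. The register lengths, tap positions, and the plain XOR combiner used here are hypothetical placeholders chosen for illustration; they are not the values or the combining circuitry of module 80.

```python
class Lfsr:
    """A simple right-shifting linear feedback shift register."""

    def __init__(self, length: int, taps, state: int):
        self.length = length
        self.taps = taps                          # bit positions XORed for feedback
        self.state = state & ((1 << length) - 1)
        if self.state == 0:
            raise ValueError("an all-zero LFSR state never changes")

    def step(self) -> int:
        """Advance one clock and return the bit shifted out."""
        out = self.state & 1
        feedback = 0
        for t in self.taps:
            feedback ^= (self.state >> t) & 1
        self.state = (self.state >> 1) | (feedback << (self.length - 1))
        return out


def rekey_bits(seeds, n_cycles: int = 56):
    """Emit one combined bit per clock for a fixed number of clock cycles."""
    # Hypothetical (length, taps) pairs; four registers of different lengths.
    specs = [(13, (0, 3)), (14, (0, 5)), (16, (0, 7)), (17, (0, 9))]
    registers = [Lfsr(length, taps, seed) for (length, taps), seed in zip(specs, seeds)]
    bits = []
    for _ in range(n_cycles):
        bit = 0
        for reg in registers:
            bit ^= reg.step()                     # placeholder combiner: plain XOR
        bits.append(bit)
    return bits


stream = rekey_bits([1, 2, 3, 4])
```

In the arrangement described above, a bit stream of this kind is what block module 81 consumes during the horizontal blanking interval in order to re-key itself.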
Block module 81 comprises two halves, “Round Function K” and “Round Function B,” as shown in FIG. 3. Round Function K includes 28-bit registers Kx, Ky, and Kz, seven S-Boxes (each a 4 input bit by 4 output bit S-Box including a look-up table) collectively labeled “S-Box K” in FIG. 3, and linear transformation unit K, connected as shown. Round Function B includes 28-bit registers Bx, By, and Bz, seven S-Boxes (each a 4 input bit by 4 output bit S-Box including a look-up table) collectively labeled “S-Box B” in FIG. 3, and linear transformation unit B, connected as shown. Round Function K and Round Function B are similar in design, but Round Function K performs one round of a block cipher per clock cycle to assert a different pair of 28-bit round keys (Ky and Kz) each clock cycle in response to the output of LFSR module 80, and Round Function B performs one round of a block cipher per clock cycle, in response to each 28-bit round key Ky from Round Function K and the output of LFSR module 80, to assert a different pair of 28-bit round keys (By and Bz) each clock cycle. The transmitter generates value An at the start of the authentication protocol and the receiver responds to it during the authentication procedure. The value An is used to randomize the session key. Block module 81 operates in response to the authentication value (An), and the initialization value (Mi) that is updated by output module 82 at the start of each frame (at each rising edge of the control signal “ctl3”).
Each of linear transformation units K and B outputs 56 bits per clock cycle. These output bits are the combined outputs of eight diffusion networks in each transformation unit. Each diffusion network of linear transformation unit K produces seven output bits in response to seven of the current output bits of registers Ky and Kz. Each of four of the diffusion networks of linear transformation unit B produces seven output bits in response to seven of the current output bits of registers By, Bz, and Ky, and each of the four other diffusion networks of linear transformation unit B produces seven output bits in response to seven of the current output bits of registers By and Bz.
In Round Function K, one bit of register Ky takes its input from the bit stream asserted by module 80 when the ReKey signal is asserted. In Round Function B, one bit of register By takes its input from the bit stream asserted by module 80 when the ReKey signal is asserted.
Output module 82 performs a compression operation on the 28-bit keys (By, Bz, Ky and Kz) asserted to it (a total of 112 bits) by module 81 during each clock cycle, to generate one 24-bit block of pseudo-random bits cout[23:0] per clock cycle. Each of the 24 output bits of module 82 consists of the exclusive OR (“XOR”) of nine terms as follows: (B0*K0)+(B1*K1)+(B2*K2)+(B3*K3)+(B4*K4)+(B5*K5)+(B6*K6)+(B7)+(K7), where “*” denotes a logical AND operation and “+” denotes a logical XOR operation.
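The nine-term expression for a single output bit can be written as a short function. This is a sketch only: the text does not say which bits of By, Bz, Ky, and Kz supply B0 through B7 and K0 through K7 for each of the 24 output bits, so the function simply takes the eight B bits and eight K bits as arguments, and the name `output_bit` is hypothetical.

```python
def output_bit(b, k):
    """XOR of nine terms: seven AND products B_i*K_i (i = 0..6), plus B7 and K7."""
    if len(b) != 8 or len(k) != 8:
        raise ValueError("expected eight B bits and eight K bits")
    acc = b[7] ^ k[7]          # the two unpaired terms
    for i in range(7):
        acc ^= b[i] & k[i]     # "*" is AND, "+" is XOR
    return acc
```

For example, with all sixteen input bits set to 1, the seven products XOR to 1 and the B7 and K7 terms cancel, so the output bit is 1.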
In the transmitter, logic circuitry 83 (shown in FIG. 2) receives each 24-bit word of cout data and each input 24-bit RGB video data word, and performs a bitwise XOR operation thereon in order to encrypt the video data, thereby generating a word of the “data_encrypted” data indicated in FIG. 2. Typically, the encrypted data subsequently undergoes TMDS encoding before it is transmitted to a receiver. In the receiver, logic circuitry 83 (shown in FIG. 2) receives each 24-bit block of cout data and each recovered 24-bit RGB video data word (after the recovered data has undergone TMDS decoding), and performs a bitwise XOR operation thereon in order to decrypt the recovered video data.
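Because the same bitwise XOR is applied in the transmitter and in the receiver, applying it twice with the same cout word recovers the original video word. A minimal sketch follows, with hypothetical names and sample values, and with TMDS encoding and decoding omitted:

```python
MASK24 = 0xFFFFFF  # one 24-bit RGB word


def xor_word(data_word, cout_word):
    """Bitwise XOR of a 24-bit data word with a 24-bit cout word."""
    return (data_word ^ cout_word) & MASK24


rgb = 0x12AB34                            # sample input video word
cout = 0xF0F0F0                           # sample pseudo-random word
data_encrypted = xor_word(rgb, cout)      # transmitter side
recovered = xor_word(data_encrypted, cout)  # receiver side
```

Decryption succeeds only if the receiver generates the same cout word as the transmitter for each video word, which is why both sides must derive their cipher state from the same values.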
Throughout the specification the expression “TMDS-like link” will be used to denote a serial link capable of transmitting encoded data (e.g., encoded digital video data), and optionally also a clock for the encoded data, from a transmitter to a receiver, and optionally also capable of transmitting (bidirectionally or unidirectionally) one or more additional signals (e.g., encoded digital audio data or other encoded data) between the transmitter and receiver, that is or includes either a TMDS link or a link having some but not all of the characteristics of a TMDS link. Examples of TMDS-like links include links that differ from TMDS links only by encoding data as N-bit code words (where N is not equal to 10, and thus the code words are not 10-bit TMDS code words) and links that differ from TMDS links only by transmitting encoded video over more than three or fewer than three conductor pairs. Some TMDS-like links encode input video data (and other data) to be transmitted into encoded words comprising more bits than the incoming data using a coding algorithm other than the specific algorithm used in a TMDS link, and transmit the encoded video data as in-band characters and the other encoded data as out-of-band characters (HDMI-compliant systems encode audio data for transmission according to an encoding scheme that differs from the encoding scheme employed for video data). The characters need not be classified as in-band or out-of-band characters according to whether they satisfy transition minimization and DC balance criteria. Rather, other classification criteria could be used. An example of an encoding algorithm, other than that used in a TMDS link but which could be used in a TMDS-like link, is IBM 8b10b coding. The classification (between in-band and out-of-band characters) need not be based on just a high or low number of transitions. For example, the number of transitions of each of the in-band and out-of-band characters could (in some embodiments) be in a single range (e.g., a middle range defined by a minimum and a maximum number of transitions).
The term “transmitter” is used herein in a broad sense to denote any unit capable of transmitting data over a link and optionally also encoding and/or encrypting the data to be transmitted. The term “receiver” is used herein in a broad sense to denote any unit capable of receiving data that has been transmitted over a link (and optionally also decoding and/or decrypting the received data). Unless otherwise specified, a link can but need not be a TMDS-like link or other serial link. The term transmitter can denote a transceiver that performs the functions of a receiver as well as the functions of a transmitter.
The expression “content key” herein denotes data that can be used by a cryptographic device to encrypt content (e.g., video, audio, or other content), or data that can be used by a cryptographic device to decrypt encrypted content.
The term “key” is used herein to denote a content key, or data that can be used by a cryptographic device to generate or otherwise obtain (in accordance with a content protection protocol) a content key. The expressions “key” and “key data” are used interchangeably herein.
The term “stream” of data as used herein denotes that all the data are of the same type and are transmitted from a source to a destination device. All or some of the data of a “stream” of data together may constitute a single logical entity (e.g., a movie or song, or portion thereof).
The term “HDCP protocol” is used herein in a broad sense to denote both the conventional HDCP protocol and modified HDCP protocols that closely resemble the conventional HDCP protocol but differ therefrom in one or more respects. Some but not all embodiments of the invention implement an HDCP protocol. The conventional HDCP protocol encrypts (or decrypts) data during active video periods but not during blanking intervals between active video periods. An example of a modified HDCP protocol is a content protection protocol that differs from the conventional HDCP protocol only to the extent needed to accomplish decryption of data transmitted between active video periods (as well as decryption of video data transmitted during active video periods) or to accomplish encryption of data to be transmitted between active video periods (as well as encryption of video data to be transmitted during active video periods).
An example of an HDCP protocol that is a modified version of the conventional HDCP protocol is an “upstream” variation on the conventional HDCP protocol (to be referred to as an “upstream” protocol). A version of the upstream protocol is described in the Upstream Link for High-bandwidth Digital Content Protection, Revision 1.00, by Intel Corporation, Jan. 26, 2001 (referred to hereinafter as the “Upstream Specification”). In the upstream protocol, the “transmitter” is a processor programmed with software for implementing the upstream protocol to communicate with a graphics controller (with the graphics controller functioning as a “receiver”). Such a processor can send video data to the graphics controller after executing an authentication exchange in accordance with the “upstream” protocol. The processor and graphics controller can be elements of a personal computer configured to send encrypted video data from the graphics controller to a display device. The graphics controller and display device can be configured to execute another encryption protocol (e.g., the above-mentioned conventional HDCP protocol, which can be referred to in this context as the “downstream” HDCP protocol) to allow the graphics controller (this time functioning as a “transmitter”) to encrypt video data and send the encrypted video to the display device, and to allow the display device (functioning as a “receiver”) to decrypt the encrypted video.
However, in contrast to the present invention, the upstream protocol would not provide adequate protection to raw content that is present in a processor of a personal computer or PDN where the processor is programmed with software for implementing the upstream protocol (with the processor functioning as a “transmitter”) to communicate with (and send the raw content to) a graphics controller functioning as a “receiver,” to allow the graphics controller (this time functioning as a “transmitter”) to encrypt the raw content and transmit the resulting encrypted content (in accordance with the “downstream” HDCP protocol) to a device (e.g., a display device) external to the open system.
There are a number of structural flaws in the upstream protocol, and a personal computer or PDN that implements the upstream protocol would be subject to at least one attack in which the attacker could access the raw content present within the personal computer or PDN. An example of such an attack is a “man-in-the-middle” attack, in which the upstream authentication requests (from the graphics controller) are intercepted and the corresponding responses (to the graphics controller) are forged. A personal computer that implements the upstream protocol is easily attacked for one fundamental reason: at least two of the system elements (the application and the video driver) are in software. They can be debugged, de-compiled, altered, and copied, with any resulting “hack” potentially distributed quickly and easily across the Internet. Thus, the upstream protocol is fundamentally flawed and will allow people of ordinary skill (and with no special hardware or tools) to bypass the intended HDCP protections. Furthermore, this can happen on a large scale, and cannot readily be detected or counteracted.
Aspects of the present invention are generalizations of the teachings of above-referenced U.S. patent application Ser. No. 10/679,055. These and some other aspects of the present invention are methods and apparatus for protecting content in a PDN, including by avoiding the above-described software hole problem. In accordance with some aspects of the present invention, plaintext content and secrets used to accomplish decryption of the content are protected within hardware (e.g., one or more integrated circuits) in a PDN, and are encrypted whenever present outside such hardware in the PDN.