In the modern world, communications are passed among parties in a variety of different ways, utilizing many different communications media. Electronic communication is becoming increasingly popular as an efficient manner of transferring information, and electronic mail in particular is proliferating due to the immediacy of the medium.
Unfortunately, drawbacks accompany the benefits provided by electronic communication, particularly in the area of privacy. Electronic communications might be intercepted by unintended and unauthorized recipients. Wireless transmissions, such as voice communication by cellular telephone, and electronic mail, are especially susceptible to such interception.
The problem of electronic communication privacy has been addressed, and solutions to the problem have been put in place. One form of solution uses cryptography to provide privacy for electronic communication. Cryptography involves the encrypting or encoding of a transmitted or stored message, followed by the decryption or decoding of a received or retrieved message. The message usually takes the form of a digital signal, or a digitized analog signal. If the communication is intercepted during transmission or is extracted from storage by an unauthorized entity, the message is worthless to the interloper, who does not possess the means to decrypt the encrypted message.
In a system utilizing cryptography, the encrypting side of the communication incorporates an encoding device or encrypting engine. The encoding device accepts the plaintext (unencrypted) message and a cryptographic key, and encrypts the plaintext message with the key according to an encrypt relation that is predetermined for the plaintext communication and the key. That is, the message is manipulated with the key in a predetermined manner set forth by the text/key relation to produce a ciphertext (encrypted) message.
Likewise, the decrypting side of the communication incorporates a decoding device or decrypting engine. The decoding device accepts the ciphertext message and a cryptographic key, and decrypts the ciphertext message with the key according to a decrypt relation that is predetermined for the ciphertext message and the key. That is, the message is manipulated with the key in a predetermined manner set forth by the text/key relation to produce a new plaintext message that corresponds with the original plaintext message.
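By way of illustration only, the encrypt and decrypt relations described above can be sketched with a toy text/key relation. The sketch assumes a keystream derived from the key with SHA-256 and combined with the message by XOR; the function names `keystream`, `encrypt`, and `decrypt` are hypothetical, and the construction is not a vetted cipher or the relation used by any particular system.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Expand a key into `length` bytes by hashing the key with a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Toy encrypt relation: XOR each plaintext byte with a keystream byte."""
    stream = keystream(key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Toy decrypt relation: XOR is its own inverse, so the same step applies."""
    return encrypt(ciphertext, key)
```

With the same key on both sides, `decrypt(encrypt(message, key), key)` returns the original message; with a different key, the result is unintelligible.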
The manner in which the key and the relation are applied in the communication process, and the manner in which keys are managed, define a cryptographic scheme. There are many conventional cryptographic schemes in use today. Probably the most popular of these is the public-key cryptographic scheme. According to a scheme of this type, the keys used are actually combinations of a public key component that is available to anyone or to a large group of entities, and a private key component that is specific to the particular communication.
An important consideration in determining whether a particular cryptographic scheme is adequate for the application is the degree of difficulty necessary to defeat the cryptography, that is, the amount of effort required for an unauthorized person to decrypt the encrypted message. One way to improve the security of the cryptographic scheme is to minimize the likelihood that a valid key can be stolen, calculated, or discovered. The more difficult it is for an unauthorized person to obtain a valid key, the more secure communications will be under a particular scheme.
Typically, there are two types of data exchange when dealing with business-to-business (“B2B”) e-commerce. One type involves a data stream being sent from one company to another, performed effectively in real time. The other type deals with files being transferred from one company to another, either singly or in a batch. This latter type is where encryption might be especially important, since these files can be transferred using a non-secure protocol, such as FTP. Additionally, these files can be located outside of a company's firewall(s), and might only be protected from unauthorized access via a user name and password. Thus, encrypting stored files or files in transit will protect against disclosure of the information within the file to an unauthorized entity that has acquired possession of the file. However, this scheme will not work if authorized entities cannot reasonably access files to which they have authorization. They must still possess an appropriate key, and secure transportation of the cryptographic key can be problematic.
Use of a cryptographic key split combiner, such as that described in U.S. patent application Ser. No. 09/023,672, filed on Feb. 13, 1998, the entirety of which is incorporated herein by reference, provides a solution to the key transportation problem. Utilizing this combiner, a working key, or cryptographic session key, is generated by combining cryptographic key splits. This working key is then used to cryptographically secure a subject data element. Accordingly, decryption under this system requires possession of, or means to produce, the requisite key splits by which the working key can be formed.
For example, U.S. Pat. No. 5,369,702 to Shanton (and its progeny: U.S. Pat. Nos. 5,680,452; 5,717,755; and 5,898,781) describes a method for providing multi-level multimedia security in a data network. According to this method, an object-oriented key manager is accessed to begin the encryption process. From the viewpoint of data files, the “object” that is the subject of the encryption process can be a single file. It can also be a group of files, or a portion of a file. For example, the object can be a single document created using a word processing application, a group of such documents stored in a directory, or a paragraph within a document. The object can be a single Web page, or a series of linked pages, or a particular image file present on a Web page. The object can be a single e-mail message, an entire thread of e-mail messages, or a sentence, or header information, within the e-mail message. As explained in the Shanton patent, and as used herein, an object can be any data instance, whether a complete file, a group of files, or a portion of a file. Different portions of a file (overlapping or non-overlapping) can be separate objects within that file, which can itself be a separately identified object.
According to the system described in the Shanton patent, once an object is selected for encryption, a label is selected for the object, as well as a text/key relation for the encryption session. The object is then encrypted according to the text/key relation, and labeled according to the selected label. Labeled, encrypted objects can be embedded within other objects, in which case the container object can be encrypted or plaintext. To decrypt, the label is read, and access authorization is determined according to the label. The encrypted object can then be decrypted if the label indicates that access authorization is granted. Encrypted objects embedded within container objects are similarly decrypted.
Thus, objects can be encrypted with a specific granularity, and broadcast to a large group of recipients, with confidence that selected objects will only be accessed by designated recipients. For example, a simple text document created using a word processor can include several sections, each intended to be read by a different group within an organization. Each section can be identified as a separate object, and encrypted using a label that restricts decryption of that section to an identified group. Sections can overlap, in cases where more than one group is intended to have access to the same material within sections of the document. That document, with the encrypted object sections, can be broadcast to everyone within the organization, and only those persons having roles within the organization identified by the labels can access the plaintext versions of those sections. If the document has sections of general interest to everyone in the organization which are not restricted for access, the overall document can be transmitted or stored unencrypted, with only appropriate sections encrypted as described. If security considerations dictate that everything in the document should be restricted such that it can be accessed only by members of the organization, the document can be identified as a container object that is itself encrypted and labeled such that the document itself cannot be accessed by those outside the organization. Each person within the organization would be able to “unwrap” the overall document, but would only be able to access those sections to which he or she has access as determined by the labels on the objects within.
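The label-driven flow described above (label an object, encrypt it, then read the label to decide whether decryption is permitted) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Shanton implementation: the names `LabeledObject`, `label_key`, `encrypt_object`, and `read_document` are hypothetical, each label is assumed to map to a key derived from an organization-wide secret, and the XOR keystream stands in for a real cipher.

```python
import hashlib
from dataclasses import dataclass


def _xor_stream(data: bytes, key: bytes) -> bytes:
    """Toy cipher (assumption): XOR data with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


@dataclass
class LabeledObject:
    label: str          # names the group allowed to decrypt this object
    ciphertext: bytes   # the encrypted section


def label_key(master: bytes, label: str) -> bytes:
    """Derive a per-label key from a hypothetical organization secret."""
    return hashlib.sha256(master + b"|" + label.encode()).digest()


def encrypt_object(text: str, label: str, master: bytes) -> LabeledObject:
    """Encrypt one section and attach its label."""
    return LabeledObject(label, _xor_stream(text.encode(), label_key(master, label)))


def read_document(sections, held_keys: dict) -> list:
    """Return what a reader sees: plaintext sections pass through, labeled
    sections are decrypted only if the reader holds the key for that label."""
    visible = []
    for section in sections:
        if isinstance(section, str):
            visible.append(section)
        elif section.label in held_keys:
            key = held_keys[section.label]
            visible.append(_xor_stream(section.ciphertext, key).decode())
        else:
            visible.append(f"[restricted: {section.label}]")
    return visible
```

In this sketch, a document built as a list of one plaintext section and two labeled sections can be broadcast to everyone; a reader holding only the key for one label recovers that section and sees a placeholder for the other.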
The Shanton patent disclosed applicability of this cryptographic system to objects of all media types. Thus, the system can be applied to, for example, text files, sound files, image files, and objects that are combinations of one or more of these media types. However, the Shanton patent only contemplated an object that was created in an application, to which the decrypted object is returned. The system was not disclosed as applicable to an object created as an XML file that need not be returned to a creating application, and that would have “tags” identifying portions of the document by type.
Shanton described the use of labels, but did not specify the nature of the label, or how the label should be applied. Co-pending U.S. patent application Ser. No. 09/023,672, filed on Feb. 13, 1998, the disclosure of which is incorporated herein by reference in its entirety, describes a system for formulating cryptographic keys from various respective sources of seed data. The seed data are provided to respective key split generators, which generate key splits, or components, based on the seed data. The key splits are then randomized together to form the cryptographic key used to encrypt the subject data. The seed data used to form the key splits can come from any of a number of sources. One of those sources can be label data, so that a label split is generated as a component of the cryptographic key, such that the key carries with it a label component. The label seed data can be provided in any of a number of ways. For example, the seed can be something as simple as an alphanumeric code keyed in by a user, where the code is a password or some other identifier of the user or the intended recipient that relates to labels used by the system. Alternatively, the label seed can be provided on a physical mechanism, such as a token, which can be read by the system to provide the required label data.
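The formation of a key from seed-derived splits can be sketched as follows. This is an illustration under stated assumptions, not the combiner of the referenced application: the source does not specify how splits are generated or randomized together, so the sketch simply hashes each seed into a split and hashes the splits together. The names `make_split` and `combine_splits` are hypothetical.

```python
import hashlib
from typing import Sequence


def make_split(seed: bytes, purpose: str) -> bytes:
    """Turn seed data into a fixed-size key split (assumed: SHA-256 of the seed,
    tagged with the split's purpose, e.g. "label" or "random")."""
    return hashlib.sha256(purpose.encode() + b"\x00" + seed).digest()


def combine_splits(splits: Sequence[bytes]) -> bytes:
    """Mix all splits into one working key (assumed: hash of the
    length-prefixed splits, so split boundaries cannot be confused)."""
    h = hashlib.sha256()
    for split in splits:
        h.update(len(split).to_bytes(4, "big"))
        h.update(split)
    return h.digest()
```

Under these assumptions, a party that can reproduce every split (for example, by keying in the same label code) arrives at the same working key, while changing any one seed yields a different key.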
Thus, an object can be selected for encryption using the completed key, which can include label information imparted by the key split. The object (for example, a text file, Web page, or multimedia file) is encrypted using the key to restrict access to those with keys formed using proper splits. If portions of a document are separate objects, each object is encrypted individually, and the document can then be wrapped (or not) for broadcast to an entire recipient list.
The use of encryption for Web site content is important, because it is well known that Web sites on the Internet, and other files accessible by request to a server on a network, risk unauthorized access (for example, by hacking). If adequate precautions are not taken, server-accessible content (for example, Web pages and the like) can be replaced by a hacker, with embarrassing results. Further, sensitive data, such as price lists, salary information, and other information of a private nature, can be obtained and adversely used or disseminated to competitors, news media, or the general public. Encryption can provide a solution to both problems. By encrypting served content, the encrypted served content is unreadable to one without cryptographic authorization. Further, the encrypted served content cannot be readily modified or replaced, because a Web server providing encrypted content could ensure that the content is encrypted before it is sent to a requesting recipient. In fact, recognition that a page had been altered or replaced would be a relatively uncomplicated matter for the Web server, using such known mechanisms as digital seeds and checksums. If modification is the only concern, use of digital seeds such as signatures can also provide a solution to this problem. If both content and modification are important, then encryption of the served content would be appropriate. A solution incorporating a cryptographic key split combiner such as that described above can provide multiple Web servers with keys for encryption of sensitive content, while allowing for authorized entities to decrypt appropriate files or other content at any time.
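As a rough illustration of how a server might recognize that a page has been altered or replaced, the following sketch uses a keyed checksum (an HMAC), which is a swapped-in technique and not a digital signature or any mechanism specified above. The function names `seal` and `is_unmodified` and the example key are hypothetical.

```python
import hashlib
import hmac


def seal(content: bytes, key: bytes) -> bytes:
    """Compute a keyed checksum over the served content."""
    return hmac.new(key, content, hashlib.sha256).digest()


def is_unmodified(content: bytes, key: bytes, expected: bytes) -> bool:
    """Recompute the checksum and compare it in constant time."""
    return hmac.compare_digest(seal(content, key), expected)
```

A server would record `seal(page, key)` when the page is published and call `is_unmodified` before serving it; a replaced page fails the check because the attacker does not hold the key.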
Various organizations have addressed the subject of cryptography in general, and also as particularly applied to specific forms of communication or data at rest. For example, a draft standard has been submitted to ANSI, X9.73-199x: Cryptographic Message Syntax, which specifies a cryptographic message syntax (“CMS”) that can be used to protect financial transactions and other documents from unauthorized disclosure and modification. This syntax is described fully in a working draft, which is herein incorporated by reference in its entirety.
Further complicating the situation is the proliferation of computer languages, such as the Extensible Markup Language (“XML”), that store data as plaintext, which is readily accessible by any party having access to a stored file, and not just to someone having and running a particular software application. A benefit to using this language to create documents is that storage of data as plaintext allows programmers to more easily debug applications, and in emergencies, to correct corrupted or invalid data (for example, fix a “broken” XML file) with a simple text editor. However, this flexibility also creates new opportunities for unauthorized access to and use of the data in the file.
XML, and other languages having its capabilities, is especially problematic due to its highly descriptive nature. XML is a markup language that is designed to allow an XML designer to describe stored data via custom-defined tags. An XML instance generally includes one or more data elements. XML can also provide a number of element types, such as, for example, root elements, child elements, element attributes, comments, plural elements, etc. Thus, each data element is provided with at least one respective tag that specifically describes the particular data element or group of elements. For example, a data element can be “1234123412341234123003”, and its respective tag can be “<credit card data>.” When stored as plaintext, the data element alone might not be readily identifiable or usable by an unauthorized party. However, when viewed with its descriptive tag, the data element's defined nature is known, and the risk of an undesirable disclosure or use is significantly increased compared with that of untagged data elements.
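The exposure created by descriptive tags can be seen in a short sketch using Python's standard XML parser. The document below is hypothetical; the element name `creditCardData` is an assumed well-formed stand-in for the credit card tag in the example above, since XML element names cannot contain spaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical plaintext XML instance: a root element with two child elements.
ORDER_XML = """<order>
  <customerName>A. Example</customerName>
  <creditCardData>1234123412341234123003</creditCardData>
</order>"""


def tagged_elements(xml_text: str) -> dict:
    """Map each child element's tag to its text, as any reader of the file could."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

Anyone holding the file can run `tagged_elements(ORDER_XML)` and learn not only the digit string but also, from its tag, what that string represents.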
There are different views as to how encryption processes should be implemented, both in general and with regard to XML. For example, see “Specification of Element-wise XML Encryption,” Takeshi Imamura and Hiroshi Maruyama, August 2000, IBM Research document, Tokyo Research Laboratory. Also see “XML Encryption Syntax and Processing,” Aug. 10, 2000, from the W3C public XML Encryption list; and “Design and Implementation of an Access Control Processor for XML Documents”, Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, and Pierangela Samarati, Aug. 10, 2000. However, no conventional approach establishes a linkage among objects, labels, and encryption to tagged data elements; or applies role-based or multiple-level object encryption methods, or any encryption scheme involving asymmetrical parameters, to tagged data elements.