Computers and other digital devices create and use data or information in many ways. The Microsoft® Press Computer Dictionary, 3d Edition (1997) defines the term data as the “ . . . [p]lural of the Latin datum, meaning an item of information. In practice, data is often used for the singular as well as the plural form of the noun. Compare information.” The term information is, in contradistinction, defined by the Microsoft® Press Computer Dictionary, 3d Edition (1997) as “ . . . [t]he meaning of data as it is intended to be interpreted by people. Data consists of facts, which become information when they are seen in context and convey meaning to people. Computers process data without any understanding of what the data represents.”
Without manifestly excluding or restricting the broadest definitional scope entitled to such terms, the following are non-limiting examples of data and information, which will be readily apparent to those of ordinary skill in the art and are intended to illustrate no clear disavowal of their ordinary meaning.
Data often refers to distinct pieces of information, usually but not always formatted in a special way. Collections of information or data may be kept in files. The Microsoft® Press Computer Dictionary, 3d Edition (1997) defines the term file as “ . . . [a] complete, named collection of information, such as a program, a set of data used by a program, or a user-created document. A file is the basic unit of storage that enables a computer to distinguish one set of information from another. A file is the “glue” that binds a conglomeration of instructions, numbers, words, or images into a coherent unit that a user can retrieve, change, delete, save, or send to an output device.”
Again, without manifestly excluding or restricting the broadest definitional scope entitled to such term, the following are non-limiting examples of files, which will be readily apparent to those of ordinary skill in the art and are intended to illustrate no clear disavowal of its ordinary meaning. Almost all information in computers and other digital devices may be stored in a file. There are many different types of files: data files, directory files, executable files, program files, text files, etc. Different types of files usually store different types of information. For example, a program file stores a program, whereas text files store text.
In database management systems, for example, data files are usually the files that store the database information, whereas other files, such as index files and data dictionaries, store administrative information, known as metadata. Executable files, on the other hand, are files in a format that the computer can directly execute. Unlike source files, executable files usually cannot be read by humans. One typically passes a source file through a compiler or assembler in order to transform it into an executable file. Nevertheless, such differing types of files are deemed to be data or information, which may be processed within the scope of various embodiments of the present invention.
The term data is often used to distinguish binary machine-readable information from textual human-readable information. For example, some applications make a distinction between data files (i.e., files that contain binary data) and text files (i.e., files that contain ASCII data). Text files stored in ASCII format are sometimes called ASCII files. Text editors and word processors are usually capable of storing data in ASCII format, although ASCII format is not always the default storage format. Most data files, particularly if they contain numeric data, are usually not stored in ASCII format. Executable programs are almost never stored in ASCII format.
Therefore, it should be understood that data as used herein may comprise information encoded by means for representing characters as numbers, such as the ASCII, extended ASCII or high ASCII formats, the ISO Latin 1 set of characters, which is used by many operating systems, as well as Web browsers, EBCDIC, and the like. Methods, apparatus and computer program products according to various embodiments of the present invention may work with any such data and information, including executable files.
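By way of a non-limiting illustration, the following minimal sketch shows the same text represented as different numbers under different character encodings. It relies on Python's built-in codecs; the choice of `cp037` as a representative EBCDIC code page and the helper name `as_numbers` are assumptions made only for this example.

```python
# Illustrative only: encode the same text under three character encodings.
text = "Data"

ascii_bytes = text.encode("ascii")      # 7-bit ASCII
latin1_bytes = text.encode("latin-1")   # ISO Latin 1 (ISO 8859-1)
ebcdic_bytes = text.encode("cp037")     # one EBCDIC code page (assumed for illustration)


def as_numbers(raw: bytes) -> list:
    """Return the numeric value of each encoded character."""
    return list(raw)


print(as_numbers(ascii_bytes))   # the ASCII code of each character
print(as_numbers(latin1_bytes))  # identical to ASCII for these characters
print(as_numbers(ebcdic_bytes))  # different numbers for the same characters
```

For characters in the basic ASCII range, the ASCII and ISO Latin 1 encodings yield the same numbers, whereas the EBCDIC code page assigns different numbers to the same characters.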
Data and information as used herein may also comprise a bitstream. As is known, a bitstream is a series of binary digits representing a flow of information transferred through a given medium. Such sequences of bits are transmitted across an electronic link, and the software controlling the link is typically unaware of any structure inherent in the bitstream data. In synchronous communications, bitstreams comprise a continuous flow of data in which characters within the bitstream are separated from one another by the receiving station rather than by markers, such as start and stop bits, inserted into the data.
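As a minimal sketch of the foregoing, the following example flattens a byte string into a series of binary digits and recovers it again. The receiving side separates characters purely by position (a fixed width of eight bits is assumed here), rather than by markers inserted into the data. The function names are hypothetical.

```python
def to_bitstream(data: bytes) -> str:
    """Flatten bytes into a string of binary digits with no separators."""
    return "".join(f"{byte:08b}" for byte in data)


def from_bitstream(bits: str) -> bytes:
    """Recover the bytes by cutting the stream every eight bits."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


stream = to_bitstream(b"Hi")
print(stream)                  # one unbroken run of 16 binary digits
print(from_bitstream(stream))  # the original two characters
```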
Data and information as used herein may also comprise an ASCII string, a bit string, whether contiguous or non-contiguous with other bit strings, a byte string, a character string, data elements, or data sets as those terms may be used in forms of digital imaging (e.g., digital radiography, radiotherapy, x-ray, positron emission tomography, ultrasound, and magnetic resonance imaging) according to the joint work of the American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA), published in the Digital Imaging and Communications in Medicine PS 3-1998 (DICOM Standard).
Data and information as used herein may also comprise streams/streaming, which is generally known as the transferring of data in a manner that allows it to be processed (e.g., displayed) as the data is transferred, rather than requiring all the data to be transferred before it can be used. Streaming is often useful in accelerating access to large audio or video files, or where the stream is ongoing. Thus, data streaming—commonly used in the terms “audio streaming” or “video streaming”—is when data moves from one computer to another and does not have to be completely downloaded for the receiving computer to do something with it.
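The streaming behavior described above may be sketched as follows. The example reads a source a few bytes at a time and processes each piece as it arrives, so the complete payload is never required before work begins. The in-memory source, the chunk size, and the function names are assumptions chosen only for illustration.

```python
import io


def stream_chunks(source, chunk_size=4):
    """Yield successive chunks from a file-like object as they are read."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def count_bytes_streaming(source, chunk_size=4):
    """Process each chunk on arrival; the full payload is never held at once."""
    total = 0
    for chunk in stream_chunks(source, chunk_size):
        total += len(chunk)  # stand-in for displaying or playing the chunk
    return total


payload = io.BytesIO(b"audio-or-video-bytes")
print(count_bytes_streaming(payload))
```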
As is typically used in computing in regard to the organization of data within databases or information systems, the term entity refers to a piece of data—an object or concept about which data is stored. A relationship, on the other hand, is how the data is shared between entities.
Those of ordinary skill in the art would readily appreciate that there are three types of relationships between entities: one-to-one, one-to-many, and many-to-many. An example of a one-to-one relationship occurs where one instance of an entity (A) is associated with one other instance of another entity (B). For example, in a database of employees, each employee name (A) is associated with only one social security number (B).
An example of a one-to-many relationship occurs where one instance of an entity (A) is associated with zero, one or many instances of another entity (B), but for one instance of entity B there is only one instance of entity A. For example, for a company with all employees working in one building, the building name (A) is associated with many different employees (B), but those employees all share the same singular association with entity A.
Finally, a many-to-many relationship occurs where one instance of an entity (A) is associated with one, zero or many instances of another entity (B), and one instance of entity B is associated with one, zero or many instances of entity A. For example, for a company in which all of its employees work on multiple projects, each instance of an employee (A) is associated with many instances of a project (B), and at the same time, each instance of a project (B) has multiple employees (A) associated with it.
It should be appreciated, therefore, that data and information as used herein may also comprise entities, instances, and objects.
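The three relationship types described above may be sketched with ordinary in-memory structures. The employee names, identifiers, building, and project names below are hypothetical sample data, and the helper functions are illustrative rather than part of any particular database system.

```python
# One-to-one: each employee name maps to exactly one (hypothetical) ID number.
employee_id = {"Ada": "001", "Grace": "002"}

# One-to-many: one building is associated with many employees,
# but each employee is associated with only that one building.
building_employees = {"Main": ["Ada", "Grace"]}

# Many-to-many: stored as (employee, project) pairs.
assignments = {("Ada", "Apollo"), ("Ada", "Gemini"), ("Grace", "Apollo")}


def is_one_to_one(mapping):
    """True when no two keys share the same value."""
    return len(set(mapping.values())) == len(mapping)


def projects_of(employee):
    """All projects associated with one employee."""
    return sorted(p for e, p in assignments if e == employee)


def employees_on(project):
    """All employees associated with one project."""
    return sorted(e for e, p in assignments if p == project)


print(is_one_to_one(employee_id))
print(projects_of("Ada"))
print(employees_on("Apollo"))
```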
As is known by those of ordinary skill in the art, a database may be stored in data chunks within some data storage environments. The data chunks may be separated from each other physically, through the use of file structure, or they may be abstractions in a contiguously stored database. For example, a database may be stored using multiple compressed files, each representing a data chunk, which may reside on the same physical computer-readable medium, such as, for example, a single hard drive, or multiple computer-readable mediums connected by a network, such as, for example, multiple hard drives in a server farm. Or, a database may be stored using multiple backup tapes, with each backup tape representing a data chunk. It may also be possible to combine physical and file structure separation of the data chunks, for example, by storing a database in multiple compressed files spread across multiple backup tapes, where each compressed file may represent a data chunk.
Therefore, it should be appreciated that data and information as used herein may also comprise data chunks.
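As one hedged illustration of storing a database in data chunks, the following sketch splits a byte string into fixed-size pieces, compresses each piece separately, and reassembles the original. The use of `zlib`, the chunk size, and the sample contents are assumptions made for the example; an actual storage environment may separate chunks quite differently.

```python
import zlib


def split_into_chunks(data: bytes, chunk_size: int) -> list:
    """Split data into compressed chunks of at most chunk_size raw bytes each."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def reassemble(chunks) -> bytes:
    """Decompress each chunk and concatenate the results in order."""
    return b"".join(zlib.decompress(c) for c in chunks)


database = b"row1;row2;row3;row4;row5;"
chunks = split_into_chunks(database, chunk_size=10)
print(len(chunks))
print(reassemble(chunks) == database)
```

Each compressed chunk could then be written to its own file or backup tape, since every chunk decompresses independently of the others.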
Computers and other digital devices often work together in “networks.” A network is a group of two or more digital devices linked together (e.g., a computer network). There are many types of computer networks, including: local-area networks (LANs), where the computers are geographically close together (e.g., in the same building); and wide-area networks (WANs), where the computers are farther apart and are connected by telephone lines, fiber-optic cable, radio waves and the like.
In addition to the above types of networks, certain characteristics of topology, protocol, and architecture are also used to categorize different types of networks. Topology refers to the geometric arrangement of a computer system. Common topologies include a bus, star, and ring. Protocol defines a common set of rules and signals that computers on a network use to communicate. One of the most popular protocols for LANs is called Ethernet. Another popular LAN protocol for personal computers is the IBM token-ring network. Architecture generally refers to a system design. Networks today are often broadly classified as using either a client/server architecture or a peer-to-peer architecture.
The client/server model is an architecture that divides processing between clients and servers that can run on the same computer or, more commonly, on different computers on the same network. It is a major element of modern operating system and network design.
A server is a program, or the computer on which that program runs, that provides a specific kind of service to clients. A major feature of servers is that they can provide their services to large numbers of clients simultaneously. A server may thus be a computer or device on a network that manages network resources (e.g., a file server, a print server, a network server, or a database server). For example, a file server is a computer and storage device dedicated to storing files. Any user on the network can store files on the server. A print server is a computer that manages one or more printers, and a network server is a computer that manages network traffic. A database server is a computer system that processes database queries.
Servers are often dedicated, meaning that they perform no other tasks besides their server tasks. On multiprocessing operating systems, however, a single computer can execute several programs at once. A server in this case could refer to the program that is managing resources rather than the entire computer.
The client is usually a program that provides the user interface, also referred to as the front end, typically a graphical user interface or “GUI”, and performs some or all of the processing on requests it makes to the server, which maintains the data and processes the requests.
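The division of work between a client and a server may be sketched as follows, with both running on the same computer and communicating over the loopback address. The upper-casing service, the class name `UppercaseHandler`, and the functions `request` and `demo` are hypothetical and serve only to show a client making a request that a server processes.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Hypothetical service: return each request line in upper case."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line.upper())


def request(address, message: bytes) -> bytes:
    """Client side: send one line to the server and return its reply."""
    with socket.create_connection(address) as conn:
        conn.sendall(message + b"\n")
        return conn.makefile("rb").readline().rstrip(b"\n")


def demo() -> bytes:
    """Start a server, make one client request, and shut the server down."""
    # Port 0 asks the operating system for any free port.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), UppercaseHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        reply = request(server.server_address, b"hello")
        server.shutdown()
        return reply


if __name__ == "__main__":
    print(demo())
```

A threading server is used in this sketch because it can serve several clients at once, which reflects the ability of a server to provide its services to many clients simultaneously.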
The client/server model has some important advantages that have resulted in it becoming the dominant type of network architecture. One advantage is that it is highly efficient in that it allows many users at dispersed locations to share resources, such as a web site, a database, files or a printer. Another advantage is that it is highly scalable, from a single computer to thousands of computers.
An example is a web server, which stores files related to web sites and serves (i.e., sends) them across the Internet to clients (i.e., web browsers) when requested by users. By far the most popular web server is Apache, which is claimed by many to host more than two-thirds of all web sites on the Internet.
The X Window System, the dominant system for managing GUIs on Linux and other Unix-like operating systems, is unusual in that the server resides on the local computer (i.e., on the computer used directly by the human user) instead of on a remote machine (i.e., a separate computer anywhere on the network) while the client can be on either the local machine or a remote machine. However, as is always true with the client/server model, the ordinary human user does not interact directly with the server, but in this case interacts directly with the desktop environments (e.g., KDE and Gnome) that run on top of the X server and other clients.
The client/server model is most often referred to as a two-tiered architecture. Three-tiered architectures, which are widely employed by enterprises and other large organizations, add an additional layer, known as a database server. Even more complex multi-tier architectures can be designed which include additional distinct services.
Other network models include master/slave and peer-to-peer. In the former, one program is in charge of all the other programs. In the latter, each instance of a program is both a client and a server, and each has equivalent functionality and responsibilities, including the ability to initiate transactions. That is, peer-to-peer architectures involve networks in which each workstation has equivalent capabilities and responsibilities. This differs from client/server architectures, in which some computers are dedicated to serving the others. Peer-to-peer networks are generally simpler and less expensive, but they usually do not offer the same performance under heavy loads.
Computers and other digital devices on networks are sometimes also called nodes. Each node has a unique network address, and comprises a processing location.
The term “user” as used herein may typically refer to a person (i.e., a human being) using a computer or other digital device on the network. However, since the verb “use” is ordinarily defined (see, e.g., Webster's Ninth New Collegiate Dictionary 1299 (1985)) as “to put into action or service, avail oneself of, employ,” clients and servers in networks according to known client/server architectures, peers in networks according to known peer-to-peer architectures, and nodes in general may—without human intervention—“put into action or service, avail themselves of, and employ” methods according to embodiments of the present invention.
Without manifestly excluding or restricting the broadest definitional scope entitled to such terms, the following are non-limiting examples of a “user,” which will be readily apparent to those of ordinary skill in the art and are intended to illustrate no clear disavowal of their ordinary meaning: a person (i.e., a human being) using a computer or other digital device, in a standalone environment or on the network; a client installed within a computer or digital device on the network, a server installed within a computer or digital device on the network, or a node installed within a computer or digital device on the network.
In the following description and claims, the terms “append”, “attach”, “couple” and “connect,” along with their derivatives, may be used. It should be readily appreciated to those of ordinary skill in the art that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “append” may be used to indicate the addition of one element as a supplement to another element, whether physically or logically. “Attach” may mean that two or more elements are in direct physical contact. However, “attach” may also mean that two or more elements are not in direct contact with each other, but may associate especially as a property or an attribute of each other.
Likewise, in particular embodiments, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements are not in direct contact with each other, but yet still cooperate or interact with each other.
It will become readily apparent to those of ordinary skill in the art after reading the following that none of the aforementioned computers or other digital devices, processing data and information formats currently provide means for proving, with certainty, the dates and times associated with access, creation, modification, receipt, or transmission of such data or information. This is due not only to the variety of application programs which are available for data access, creation, modification, receipt, and transmission, but also to the much more varied “standards” and protocols put forth in the vain attempt to provide uniformity worldwide.
Counterfeit and Altered Realities
The production and propagation of, and eventual reliance on, digital information have become commonplace. Nearly every aspect of our lives is recorded digitally, stored digitally, transmitted, and accessed digitally. At each step of the process of accessing, creating, modifying, transmitting and receiving this digital information, there is the chance that the information can be altered, copied, forged, or otherwise tampered with. Often the tampering is not detectable, and in many cases damage is caused.
For instance, digital data presents many options for its handling, which allow for several types of forgeries to be created. A minor re-touching of an image or the re-formatting of a text document still results in changes to the information being presented, may alter the perceived reality, and one may unwittingly verify such a re-touched or altered document because the changes go unnoticed or unchallenged. Further, even computer enhanced data (such as digital images) is still computer generated data, and subject to all the vulnerabilities native to digital data as described herein. In the circumstances surrounding the admissibility and reliance on electronic documents in courts of law, even the most minor alterations (and even enhancements) can have grave consequences, particularly when there is no way to track the changes or alterations to specific and certain dates and times.
In the various environments described below, the need for verification of data is universal. These environments do not themselves, however, necessarily require such verification. Instead, it is the implementation of devices that do not consider or account for the commonplace reliance on digitally processed information that requires such verification.
Illustrative of the enormity and ubiquity of the problem are the following operating environments, within which the systems and methods according to the present invention can provide the time certainty, which is presently ignored in each environment.
Digital Document Processing
“Processing” may be viewed as the manipulation of data within a computer system. Since virtually all computer systems today process digital data, processing is the vital step between receiving the data in binary format (i.e., input), and producing results (i.e., output)—the task for which computers are designed.
The Microsoft® Press Computer Dictionary, 3d Edition (1997) defines the term document as “ . . . any self-contained piece of work created with an application program and, if saved on disk, given a unique filename by which it can be retrieved.” Most people think of documents as material done by word processors alone. To the typical computer, however, data is little more than a collection of characters. Therefore, a database, a graphic, or a spreadsheet can all be considered as much a document as is a letter or a report. In the Macintosh environment in particular, a document is any user-created work named and saved as a separate file.
Accordingly, for the purpose of the invention described herein, digital document processing shall be interpreted to mean the manipulation of digital (i.e., binary) data within a computer system to create or modify any self-contained piece of work with an application program and, if saved on a disk or any other memory means, given a unique filename by which it can be retrieved. Examples of such application programs with which the present invention may be used to assist in such digital document processing are Microsoft® Access 97, Microsoft® Excel 97, and Microsoft® Word 97, each available from Microsoft Corporation, Redmond, Wash. U.S.A.
Digital Communications
“Communications” may be broadly defined as the vast discipline encompassing the methods, mechanisms, and media involved in information transfer. In computer-related areas, communications usually involve data transfer from one computer to another through a communications medium, such as a telephone, microwave relay, satellite link, or physical cable.
Two primary methods of digital communications among computers presently exist. One method temporarily connects two computers through a switched network, such as the public telephone system. The other method permanently or semi-permanently links multiple workstations or computers in a network. In reality, neither method is distinguishable from the other, because a computer can be equipped with a modem, which is often used to access both privately owned and public access network computers.
More particular forms of digital communications (i.e., exchange of communications in which all of the information is transmitted in binary-encoded, digital format) include electronic mail (or less formally “e-mail”), facsimile, voicemail, and multimedia communications.
E-mail may be broadly defined as the exchange of text messages/computer files over a communications network, such as a local area network (LAN) or the Internet, usually between computers or terminals. Facsimile (or, again, less formally “fax”) comprises the transmission and reception of text or graphics over telephone lines in digitized form. Conventional fax machines scan an original document, transmit an image of the document as a bit map, and reproduce the received image on a printer. Resolution and encoding of such fax messages are standardized in the CCITT Groups 1-4 recommendations. Fax images can likewise be sent and received by computers equipped with fax hardware and software.
The CCITT Groups 1-4 recommendations make up a set of standards recommended by the Comité Consultatif International Télégraphique et Téléphonique (now known as the International Telecommunication Union) for encoding and transmitting images over fax machines. Groups 1 and 2 relate to analog devices, which are generally out of use. Groups 3 and 4 deal with digital devices, and are outlined below.
Group 3 is a widespread standard that supports “standard” images of 203 horizontal dots per inch (dpi) by 98 vertical dpi, and “fine” images of 203 horizontal dpi by 198 vertical dpi. Group 3 devices support two methods of data compression. One is based on the Huffman code, and reduces an image to 10 to 20 percent of the original. The other, known as “READ” (for “relative element address designate”), compresses an image to about six to twelve percent (˜6%-12%) of its original. Additionally, the READ method provides for password protection as well as polling, so that a receiving machine can request transmission as appropriate.
Group 4 is a newer standard, which supports images of up to 400 dpi. Its method of data compression is based on a beginning row of white pixels, or “dots”, with each succeeding line encoded as a series of changes from the line before. Images are compressed to about three to ten percent (˜3%-10%) of the original. Group 4 devices do not include error-correction information in their transmission. Moreover, they require an Integrated Services Digital Network (ISDN) phone line rather than a traditional dial-up line.
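The general idea of encoding each line as a series of changes from the line before, starting from a row of white pixels, may be sketched as follows. This is a simplified illustration only: it records the column positions at which a line differs from its predecessor and does not reproduce the actual code tables or bit-level format of the CCITT Group 3 or Group 4 recommendations. The function names and the sample image are hypothetical.

```python
WHITE, BLACK = 0, 1


def encode_lines(lines, width):
    """Encode each line as the column positions where it differs from the line before.

    The first line is compared against an imaginary all-white reference line.
    """
    previous = [WHITE] * width
    encoded = []
    for line in lines:
        changes = [i for i in range(width) if line[i] != previous[i]]
        encoded.append(changes)
        previous = line
    return encoded


def decode_lines(encoded, width):
    """Rebuild the image by flipping the recorded positions line by line."""
    previous = [WHITE] * width
    lines = []
    for changes in encoded:
        line = list(previous)
        for i in changes:
            line[i] = BLACK if line[i] == WHITE else WHITE
        lines.append(line)
        previous = line
    return lines


image = [
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
]
encoded = encode_lines(image, width=6)
print(encoded)
print(decode_lines(encoded, width=6) == image)
```

A line identical to the one before it encodes to an empty list of changes, which suggests why documents with large uniform regions compress well under such a scheme.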
Fax modems may also be used to send and receive digital data encoded in known fax formats (e.g., one of the CCITT groups noted above). Such data is either sent or received by a fax machine or another modem, which then decodes the data and converts it to an image. If the data was initially sent by fax modem, the image must previously have been encoded on the computer hosting such fax modem. Text and graphic documents can be converted into fax format by special software that is usually provided with the fax modem. Paper documents must first be scanned in. As is well known, fax modems may be internal or external and may combine fax and conventional modem capabilities.
Voicemail generally comprises a system that records and stores telephone messages in a computer's memory. Unlike a simple answering machine, voicemail systems include separate mailboxes for multiple users, each of whom can copy, store, or redistribute messages. Another type of digital communications involving voice is “voice messaging”, a term which generally refers to a system that sends and receives messages in the form of sound recordings. Typical voice messaging systems may employ “voice modems”, which are modulation/demodulation devices that support a switch to facilitate changes between telephony and data transmission modes. Such a device might contain a built-in loudspeaker and microphone for voice communication, but more often it uses the computer's sound card.
Still another form of digital communications includes multimedia communications in the style of “video teleconferencing”, as defined by the International Telecommunication Union (formerly CCITT) in “Visual Telephone Systems and Equipment for Local Area Networks Which Provide a Non-Guaranteed Quality of Service,” (Recommendation H.323, Telecommunication Standardization Sector of ITU, Geneva, Switzerland, May 1996) and other similar such standards.
Digital Imaging
“Digital imaging” encompasses those known processes involved in the capture, storage, display, and printing of graphical images. They may involve devices known as a “digital camera”, which broadly refers to a camera that stores photographed images electronically instead of on traditional film. Digital cameras typically use charge-coupled device (CCD) elements to capture the image through the lens when the operator releases the shutter in the camera. Circuits within the camera cause the image captured by the CCD to be stored in a storage medium, such as solid-state memory or a hard disk. After the image has been captured, it is downloaded by cable to the computer using software supplied with the camera. Once stored in the computer, the image can be manipulated and processed much like the image from a scanner or related input devices. Digital cameras come in the form of still cameras and full-motion video recorders.
Other forms of digital imaging include digitizing systems, such as the “PhotoCD®” system from Eastman Kodak Company, Rochester, N.Y. That system allows 35 mm film pictures, negatives, slides, and scanned images to be stored on a compact disc. Images are then stored in a file format known as the Kodak PhotoCD Image Pac File Format, or PCD. Many photography and film development businesses offer this service. Any computer with CD-ROM capabilities and the software required to read PCD can usually view images stored on a PhotoCD. Additionally, such images can be viewed by any one of a variety of players that are specifically designed to display images stored on CDs. Another photographic form of digital imaging is defined by the “Flashpix” specification, the cooperative endeavor of the Digital Imaging Group, Microsoft, the Hewlett-Packard Company, and Live Picture, Inc. The Flashpix format builds on the best features of existing formats (e.g., Kodak Image Pac, Live Picture IVUE, Hewlett-Packard JPEG, TIFF, TIFF/EP, etc.), and combines these features with an object-oriented approach.
Still other forms of digital imaging include digital radiography, radiotherapy, x-ray, positron emission tomography, ultrasound, and magnetic resonance imaging according to the joint work of the American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA), published in the Digital Imaging and Communications in Medicine PS 3-1998 (DICOM Standard).
Digital Commerce
An enormous amount of commercial activity now takes place by means of connected computers. Such commercial activity has been variously termed digital commerce, electronic commerce, or just plain E-commerce. Regardless of the particular moniker, these activities generically involve a commercial transaction between a user and a vendor through an online information service, the Internet, or a BBS, or between vendor and customer computers through a specialized form of E-commerce known as electronic data interchange (EDI).
EDI is a set of standards for controlling the transfer of business documents (e.g., purchase orders and invoices) between computers. The ultimate goal of EDI is the elimination of paperwork and reduced response time. For EDI to be most effective, users must agree on certain standards for formatting and exchanging information, such as the X.400 protocol and CCITT X series.
Other known forms of E-commerce include digital banking, web-front stores, and online trading of bonds, equities, and other securities. Digital banking can take the form of access to a user's account, payment of bills electronically, or transfer of funds between a user's accounts. Web-front stores (e.g., amazon.com) usually comprise a collection of web pages in the form of an electronic catalog, which offers any number of products for sale. More often than not, transactions at such web-front stores are consummated when a purchaser enters his credit card number, and the issuing bank approves the purchase. These transactions may or may not be over secure lines, such as those designated “TRUSTe” participant web sites. Further details regarding known processes for establishing and maintaining secure E-commerce connections may be found in the SET Secure Electronic Transaction Specification, Book 1: Business Description (version 1.0), May 31, 1997, the contents of which are incorporated herein by reference. See also Book 2 (Programmer's Guide) and Book 3 (Formal Protocol Definition) of the SET Secure Electronic Transaction Specification, as well as the External Interface Guide to SET Secure Electronic Transaction, Sep. 24, 1997, each of which is incorporated herein by reference.
One burgeoning form of E-commerce that has arisen in the past few years is that which involves dealing in securities online. “Day traders” watch impatiently as ticker symbols speed across their computer screens. When the price is right, they electronically whisk their order off to a distant securities dealer, often buying and selling the same stock or bond in a fifteen-minute span of time. One can only imagine the potential problems associated with the purchase or sale of securities when price-per-share movements on the order of a few cents make the difference to these day traders. Fortunately, the National Association of Securities Dealers (NASD) has come up with its Order Audit Trail System (OATS) to track all stock transactions. NASD Rule 6953 also requires all member firms that have an obligation to record order, transaction, or related data under the NASD Rules or Bylaws to synchronize the business clocks that are used for recording the date and time of any market event. Computer system and mechanical clocks must be synchronized every business day before market open, at a minimum, in order to ensure that recorded order event timestamps are accurate.
Digital Justice
Even legal scholars and systems around the world have been unable to escape the problems of an online world. Utah became the first jurisdiction in the United States of America to enact legislation creating “cybernotaries”. Similar laws in Georgia, Florida, and Massachusetts quickly followed Utah. In Riverside, California in 2003, individuals were found to have altered computerized court records to create dismissals.
In August 1996, the American Bar Association (through its Information Security Committee of the Electronic Commerce and Information Technology Division, Section of Science and Technology) published the Digital Signature Guidelines—Legal Infrastructure for Certification Authorities and Secure Electronic Commerce. The European Union, as well, in a final report on the Legal Issues Of Evidence And Liability In The Provision Of Trusted Services (CA and TTP Services), let its position be known in October 1998.
Each of the environments noted above is fraught with potential fraud. Any reliance they may have on dates and times is merely for the purpose of determining whether the transaction is valid (i.e., authorized within a specified range of time), or what specific time delays occur in the transmission of data between the computer systems communicating with one another. However, none of those environments currently provide means for proving—with certainty—dates and times associated with access, creation, modification, receipt, or transmission of digital data files, which may be used therein.
Attempts to Solve the Problem
Many and varied computing means pervade today's society. PCs, web browsers, e-mail clients, e-mail servers, network file servers, network messaging servers, mainframes, Internet appliances, wireless telephones, pagers, PDAs, fax machines, fax modems, digital still cameras, video cameras, voice recorders, video recorders, copiers, scanners, and virtually any other device using digital data files are fast becoming ubiquitous.
Digital data is easy to modify. As a result, it has been nearly impossible in the prior art to establish with certainty the date and time a particular digital data file in a given computing means was accessed, created, modified, received, or transmitted. It should be understood that, by use of the term “computing means”, the present invention is directed to general purpose computers, PCs, web browsers, e-mail clients/servers, network file/messaging servers, mainframes, Internet appliances, wireless telephones, pagers, PDAs, fax machines, digital still/video cameras, digital voice/video recorders, digital copiers/scanners, interactive television, hybrid combinations of any of the above-noted computing means and an interactive television (e.g., set-top boxes), and any other apparatus, which generally comprises a processor, memory, the capability to receive input, and the capability to generate output.
Such computing means typically include a real time clock (“RTC”) for keeping track of the time and date. Likewise, operating systems and/or applications programs used in such computing means usually stamp the time and date (as derived from the RTC) that each of the digital data files is accessed, created, modified, received, or transmitted. Such stamping of digital data files with times and dates (collectively referred to as “time-stamping”) has, thus, become an integral part of all of the above known computing environments.
Although the existing framework of time-stamping can be used to catalogue and sort one's own files, for other critical needs it suffers from two fatal flaws. Files are typically “time-stamped” with a value read from the RTC. There is no simple way of determining whether the RTC is set to the correct date and time. Indeed, it is quite trivial for a user to reset the RTC to any desirable date and time. Even if the computing means' RTC had been correctly set, nothing would prevent a user from arbitrarily changing the “time-stamps” themselves. This is readily accomplished through the direct manipulation of the digital data where the time-stamp is stored. As a consequence, changing such time stamps results in the creation of counterfeit data, because the time representation and the file content are not strongly bound to each other.
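By way of non-limiting illustration only, the second flaw noted above can be demonstrated with a few lines of Python. The sketch below assumes an ordinary file system and uses only standard library calls; the file contents and the forged date are hypothetical values chosen for the example.

```python
import os
import tempfile
from datetime import datetime

# Create a throwaway file; its time-stamp initially reflects the system clock (RTC).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"contract text")
    path = handle.name

original_mtime = os.stat(path).st_mtime

# Overwrite the access and modification times with an arbitrary (hypothetical) past date.
forged_time = datetime(1995, 1, 1, 12, 0).timestamp()
os.utime(path, (forged_time, forged_time))

forged_mtime = os.stat(path).st_mtime

# The file content is untouched; only the time-stamp has changed.
with open(path, "rb") as handle:
    content_unchanged = handle.read() == b"contract text"

os.remove(path)
```

Nothing in the file itself records that its time-stamp was altered, which is the sense in which the time representation and the file content are not strongly bound to each other.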
Thus, the known time-stamping framework is useless for any situation where the accuracy of the date or time of a digital data file is critical. Court filings, medical records, files presented as incriminating or exculpatory evidence in court cases, legal documents such as wills, billing records, patent, trademark, and copyright claims, and insurance documents are only a few of the areas where the date and time that is associated with the file is critical. Conventional systems and methods that time-stamp digital data files fail to meet this need. Furthermore, there is no “open”, cross-platform, interoperable global standard in place to create trusted time-stamps.
Cryptographic Systems and Keys
One approach that has been used in the past to provide some level of security in digital data files is the use of cryptographic systems and keys. In general, cryptographic systems are used to encrypt or “lock” a digital data file. A key is used, conversely, to decrypt or “unlock” an encrypted digital data file. Digital data files are merely bits of data in memory or on a network. If this data is viewed as the mere representation of large numbers, then mathematical functions or algorithms can be easily applied to the data.
For example, where a particular digital data file is a text file, its unencrypted or “cleartext” version can be viewed as the variable x. The resulting function of this variable x, when encrypted by its associated cryptographic algorithm and coupled with its key k, will be f(k, x). Accordingly, the encrypted text or “cyphertext” can be defined by the equation: y = f(k, x).
By choosing the cryptographic algorithm carefully—such that there is no easily discovered inverse mapping (i.e., for any given y, it will be extremely difficult to calculate x without knowing k, while at the same time, with knowledge of k it will be possible)—the data may be encrypted.
Symmetric Cryptography
If the key for encryption and decryption is the same shared secret, then the cryptographic system and associated algorithm will be referred to as “symmetric”. Both the sender and the receiver must share the key in such symmetric cryptographic systems. A sender first applies the encryption function using the key to the cleartext to produce the cyphertext, which is then sent to a receiver. The receiver applies the decryption function using the same shared key. Since the cleartext cannot be derived from the cyphertext without knowledge of the key, the cyphertext can be sent over public networks such as the Internet.
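The relationship y = f(k, x) with a shared key may be sketched as follows. This is a toy construction offered for illustration only: it is neither DES nor IDEA, the keystream derivation from SHA-256 is an assumption of the example, and it should not be relied upon for actual security. The key and cleartext values are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the shared key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])

def f(k: bytes, x: bytes) -> bytes:
    """y = f(k, x): XOR the input with a keystream derived from the key k."""
    return bytes(a ^ b for a, b in zip(x, keystream(k, len(x))))

shared_key = b"secret shared by sender and receiver"
cleartext = b"Transfer 100 shares"

cyphertext = f(shared_key, cleartext)           # sender encrypts
recovered = f(shared_key, cyphertext)           # receiver decrypts with the same key
wrong_guess = f(b"some other key", cyphertext)  # an outsider without the key
```

Because the same function and the same key serve for both directions, anyone who learns the key can read every message, which is why key distribution is the central difficulty of symmetric systems.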
The current United States standard for symmetric cryptography, in which the same key is used for both encryption and decryption, is the Data Encryption Standard (DES), which is based upon a combination and permutation of shifts and exclusive ors. This approach can be fast, whether implemented directly on hardware (e.g., 1 GByte/sec throughput or better) or in general purpose processors. The current key size of 56 bits (plus 8 parity bits) is adequate for many purposes, although somewhat small, and the growing use of larger effective keys with “triple DES” provides much greater security. Since the implementation of DES is fast, it can easily be pipelined with software codecs and not impact system performance.
An alternative and yet stronger form of symmetric block encryption is IDEA. Its security is based upon combining exclusive ors with addition modulo 2^16 and multiplication modulo 2^16 + 1. The IDEA approach is also fast on general purpose processors. It is comparable in speed to known DES implementations. One major advantage of IDEA is its keys, which are 128 bits and are, thus, much stronger (i.e., harder to break) than standard 56-bit DES keys.
One particular problem with the use of such symmetric systems is the problem of getting the sender and the receiver to agree on the key without anyone else finding out. Moreover, the problem becomes greatly complicated when additional users (i.e., potential senders and receivers) are added to the system. Such symmetric cryptographic systems, nevertheless, are by far easier to implement and deploy than their asymmetric counterparts since they require far less infrastructure. Sometimes with a symmetric cryptographic system, however, keys are submitted over the network. Avoidance of this security risk would be desirable.
Asymmetric Cryptography
Systems that generate and employ a secure key pair (i.e., a “private key” for creating the “digital signature” and a “public key” to verify that digital signature) are typically known as asymmetric cryptographic systems. There are many known cryptographic algorithms (e.g., RSA, DSA, and Diffie-Hellman) that involve a key pair. In such asymmetric cryptographic systems, the private key and the public key are mathematically linked. Anything that is encrypted with the public key can be decrypted only with the private key. Conversely, the public key can verify only what has been signed with the private key. Asymmetric cryptographic systems are, thus, inherently more secure than symmetric or shared secret systems. The sensitive private key need exist in only one place. No form of the private key is ever transmitted over the network. Typical asymmetric cryptographic systems also scale to many users more easily than shared secret systems. However, the infrastructure that is necessary to field systems of this type, commonly called a “Public Key Infrastructure” (PKI), is non-trivial to implement. See, e.g., RFC 1422, Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management (February 1993), the contents of which are incorporated herein by reference.
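The mathematical link between a public key and a private key may be illustrated with textbook RSA on deliberately tiny numbers. The primes, exponent, and message below are hypothetical values chosen for readability; keys of this size offer no security, and practical systems additionally apply padding, which is omitted here.

```python
# Two small primes (hypothetical, illustration only).
p, q = 61, 53
n = p * q                   # modulus shared by both keys
phi = (p - 1) * (q - 1)

e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)         # may be published freely
private_key = (d, n)        # need exist in only one place

message = 65                             # a message encoded as an integer below n
cyphertext = pow(message, e, n)          # anyone can encrypt with the public key
recovered = pow(cyphertext, d, n)        # only the private key holder can decrypt
```

Only the pair (e, n) ever needs to leave the key owner's machine, which is the property that lets such systems scale to many users without a shared secret.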
Digital Signatures
Referring now to FIGS. 1 and 2, wherein like reference characters or numbers represent like or corresponding parts throughout each of the several views, an exemplary process 100 for creating a digital signature is shown in FIG. 1. To sign a document, or for that matter any other digital data file, a “signer” must first delimit the borders of the digital data file to be signed. As used herein, the term signer refers to any person who creates a digital signature for a message, such as message 110. The information delimited by the signer, in turn, refers to that message 110. A hash function 120 in the signer's software is used to compute a hash result 130, which is unique for all practical purposes to the message 110. Thereafter, a signing function 140 is used to transform the hash result 130 into a digital signature 160, but only after input of the signer's private key 150.
This transformation is sometimes referred to as a process of encryption. However, such a characterization would be inaccurate, because message 110 itself may, or may not be confidential. Confidentiality may be provided as an optional feature in most digital signature technologies, but the separate and distinct security service of confidentiality is not central to the security services of signer authentication, document authentication, or digital data file authentication. In any case, the resulting digital signature 160 is unique to both the message 110 and the private key 150, which is used to create the digital signature 160.
Typically, most digital signatures 160 (i.e., the digitally-signed hash result of message 110) are used in one of two ways. They may be attached to their associated message 110 and, thereafter, simply stored. In the alternative, the message 110 may be copied 170 and coupled with its digital signature 160, in the form of a single data element 180 and, thereafter, transmitted 190 to a verifier.
This single data element 180 is, in some cases as will be described in greater detail herein below, referred to as a “digital certificate”. Furthermore, the digital signature 160 may be simply transmitted or stored as a separate data element, so long as it maintains a reliable association with its message 110. Each digital signature 160 is unique to the specific message 110, which has been used to create it. Otherwise, it would be counterproductive if the digital signature 160 was wholly disassociated from that message 110.
An exemplary verification process 200 for verifying digital signature 160 is shown in FIG. 2. Element 180, comprising digital signature 160 attached to message 110, is first received 190 from the signer. A new hash result 220 of the original message 110 is then computed by the verifier by means of the same hash function 120 used to create the digital signature 160.
It should be noted at this juncture that use of the term “to verify” herein, with respect to any given digital signature, message, and public key, refers to those processes of accurately determining that: (1) the digital signature 160 was created during the “operational period” of a valid certificate 180 by the private key 150 corresponding to the public key 260 listed in the certificate 180; and (2) the message 110 had not been altered since its digital signature 160 was created.
It should also be noted at this juncture that use of the term “operational period” herein refers to a period that begins on a date and time a certificate 180 is issued by a “certification authority”, or on a later date and time certain if stated in the certificate 180, and ends on a date and time it expires or is earlier revoked or suspended.
Then, by use of the public key 260 and such new hash result 220, the verifier can check: (1) whether the digital signature 160 was created using the signer's private key 150; and (2) whether the newly computed hash result 220 matches the original hash result 130, which was transformed into the digital signature 160 during the signing process.
Most known verification software will confirm the digital signature 160 as “verified” if two conditions are satisfied. One condition will be satisfied if the signer's private key 150 was used to digitally sign the message 110. This condition will be met if the signer's public key 260 was used to verify the digital signature 160, because the signer's public key 260 is capable of verifying only a digital signature 160 that is created with the signer's private key 150. The other condition will be satisfied if message 110 was received unaltered. This condition will be met if the hash result 220 that is computed by the verifier turns out to be identical to the hash result 130 that is extracted from digital signature 160 during the verification process. A verifier function 240 is used to make these comparisons, while further processing of the message 110 is dependent upon whether message 110 is determined to be valid at step 280.
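The signing process 100 and verification process 200 described above may be sketched as follows. The sketch assumes SHA-256 as the hash function and textbook RSA as the signing function; neither choice is dictated by the foregoing description. The key pair is built from two well-known Mersenne primes purely so that the example is self-contained, and is not secure. All function and variable names are hypothetical.

```python
import hashlib

# Toy key pair from two publicly known primes (illustration only, not secure).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
PUBLIC_EXPONENT = 65537
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))

def hash_result(message: bytes) -> int:
    """Hash function: SHA-256 of the message, reduced into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Signing function: transform the hash result using the private key."""
    return pow(hash_result(message), PRIVATE_EXPONENT, N)

def verify(message: bytes, signature: int) -> bool:
    """Verifier function: recompute the hash and compare it with the one
    extracted from the signature by means of the public key."""
    return pow(signature, PUBLIC_EXPONENT, N) == hash_result(message)

message = b"I agree to sell the property."
signature = sign(message)

accepted = verify(message, signature)
rejected_altered = verify(b"I agree to sell the property!", signature)
rejected_forged = verify(message, (signature + 1) % N)
```

Note that the message itself is never encrypted in this sketch; only its hash result is transformed, which is consistent with the observation above that signing is distinct from confidentiality.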
The term “digital certificate” as used herein generally refers to any message, which at least (1) identifies the certification authority (CA) issuing it; (2) names or identifies its “subscriber”; (3) contains the subscriber's public key; (4) identifies its operational period; and (5) is digitally signed by the CA issuing it. Metaphorically, digital certificates serve as electronic substitutes for a sealed envelope or a signer's signature. In one case, for example, VeriSign Digital ID™ (a trademark of VeriSign, Inc., Mountain View, Calif.) securely resides in a signer's Internet browser or e-mail software, and enables that signer to digitally sign and encrypt e-mail. Digital certificates can also be viewed as electronic equivalents of a driver's license or a passport. Containing information that uniquely identifies the signer, the digital certificate allows the signer to: (1) digitally sign a message so the recipient knows that a message actually originated from the signer; and (2) encrypt a message so the intended recipient can decrypt and read its contents and attachments. Most digital certificates are easy to use, with point-and-click interfaces in all of the popular browsers and e-mail packages. A person seeking to verify a digital signature needs, at a minimum, (1) the public key corresponding to the private key used to create the digital signature, and (2) reliable evidence that the public key (and thus the corresponding private key of the key pair) is identified with the signer. The basic purpose of the digital certificate is to serve both these needs in a reliable manner.
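The five elements of a digital certificate, together with the “operational period” defined above, may be modeled as a simple record. The field names, the example values, and the treatment of revocation as a single date are assumptions made for illustration; they do not reproduce any particular certificate format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple

@dataclass
class Certificate:
    issuer: str                      # (1) certification authority issuing it
    subscriber: str                  # (2) name or identity of the subscriber
    public_key: Tuple[int, int]      # (3) subscriber's public key
    valid_from: datetime             # (4) start of the operational period
    expires: datetime                #     scheduled end of the operational period
    ca_signature: int                # (5) digital signature of the issuing CA
    revoked_at: Optional[datetime] = None   # set if revoked or suspended early

    def in_operational_period(self, when: datetime) -> bool:
        """True if `when` falls after issuance and before expiry or earlier revocation."""
        end = self.expires
        if self.revoked_at is not None and self.revoked_at < end:
            end = self.revoked_at
        return self.valid_from <= when < end

# Hypothetical certificate, revoked before its scheduled expiry.
cert = Certificate(
    issuer="Example CA",
    subscriber="User A",
    public_key=(17, 3233),
    valid_from=datetime(1998, 1, 1),
    expires=datetime(1999, 1, 1),
    ca_signature=0,
    revoked_at=datetime(1998, 6, 1),
)

signed_while_valid = cert.in_operational_period(datetime(1998, 3, 15))
signed_after_revocation = cert.in_operational_period(datetime(1998, 9, 1))
```

A check of this kind is only as trustworthy as the date supplied to it, which is the difficulty that the remainder of this discussion addresses.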
Dual Signatures
As noted herein above, digital signatures and digital certificates have both been used in the past to provide some level of certainty as to the identity of a particular person accessing, creating, modifying, receiving, or transmitting a digital data file. E-commerce presents other challenges for securing digital data files. In particular, the process of providing secure electronic transactions has raised the concerns for maintaining a person's privacy. An approach that has been used in the past to provide such security is known as “dual signatures”, and is illustrated below.
User B wants to send User A an offer to purchase a piece of property that User A owns and an authorization to his bank to transfer the money if User A accepts the offer. Nevertheless, User B does not want the bank to see the terms of his outstanding offer to User A, nor does he want User A to see his bank account information. User B also wants to link his offer to the transfer such that the money will only be transferred if User A accepts his offer. According to the SET Secure Electronic Transaction Specification, User B accomplishes all of this by digitally signing both messages with a single signature operation that creates a dual signature.
Such a dual signature is generated in four steps. First, a message digest is created for each of the two messages sent by User B (i.e., one to User A, and one to the bank). The resulting pair of message digests is then concatenated together. Next, a message digest of the concatenated result is created. This third message digest is finally encrypted with User B's private signature key. User B must include the message digest of the other message in order for a recipient to verify his dual signature. The recipient of either message can then check its authenticity by generating the message digest on its copy of the message, concatenating it with the message digest of the other message (as provided by User B), and thereafter computing the message digest of the result. If the newly generated digest matches the decrypted dual signature, the recipient can trust the authenticity of the message.
In the event that User A accepts User B's offer, she sends a message to the bank indicating her acceptance and including the message digest of the offer. The bank can verify the authenticity of User B's transfer authorization, and ensure that the acceptance is for the same offer by using its digest of the authorization and the message digest presented by User A of the offer to validate the dual signature. On the one hand, the bank can therefore check the authenticity of the offer against the dual signature. It cannot, on the other hand, see the terms of the offer.
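The four steps of dual signature generation, and the check performed by each recipient, may be sketched as follows. SHA-256 and a toy textbook RSA key pair built from known Mersenne primes are assumptions of this example and are not the algorithms prescribed by the SET specification; the message texts and function names are likewise hypothetical.

```python
import hashlib

# User B's toy signature key pair (illustration only, not secure).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))

def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def as_int(data: bytes) -> int:
    return int.from_bytes(data, "big") % N

offer = b"Offer: 250,000 for the property"
authorization = b"Authorize transfer of 250,000 from account 12345"

# Step 1: digest each message. Step 2: concatenate the digests.
offer_digest, auth_digest = digest(offer), digest(authorization)
# Step 3: digest the concatenation. Step 4: sign it with User B's private key.
dual_signature = pow(as_int(digest(offer_digest + auth_digest)), D, N)

def verify_dual(own_message: bytes, other_digest: bytes,
                own_is_offer: bool, signature: int) -> bool:
    """A recipient sees one message in full and only the digest of the other."""
    own = digest(own_message)
    pair = own + other_digest if own_is_offer else other_digest + own
    return pow(signature, E, N) == as_int(digest(pair))

# User A sees the offer but only the digest of the bank authorization.
user_a_ok = verify_dual(offer, auth_digest, True, dual_signature)
# The bank sees the authorization but only the digest of the offer.
bank_ok = verify_dual(authorization, offer_digest, False, dual_signature)
# A substituted offer no longer matches the dual signature.
tampered_ok = verify_dual(b"Offer: 1 for the property", auth_digest, True, dual_signature)
```

Because each party holds only a digest of the message it is not meant to read, the two messages are linked by one signature without either recipient learning the other's content.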
Further details regarding such known processes may be found in the SET Secure Electronic Transaction Specification, Book 1: Business Description (Version 1.0), May 31, 1997, the contents of which are incorporated herein by reference. See also Book 2 (Programmer's Guide) and Book 3 (Formal Protocol Definition) of the SET Secure Electronic Transaction Specification, as well as the External Interface Guide to SET Secure Electronic Transaction, Sep. 24, 1997, each of which is incorporated herein by reference.
As is best illustrated by reference to FIG. 3, the process of creating such dual signatures will now be described in greater detail. User A runs the property description 305 through a one-way algorithm 310 to produce a unique value known as the message digest 315. This is a kind of digital fingerprint of the property description 305, and will be used later to test the integrity of the message. She then encrypts the message digest 315 with her private signature key 320 to produce her digital signature 325. Next, she generates a random symmetric key 330 and uses it to encrypt the combination of the property description 305, her signature 325 and a copy of her certificate 335 containing her public signature key 340 (collectively referred to as the message 345).
To decrypt the property description 305, user B will require a secure copy of this random symmetric key 330. User B's certificate 350, which user A must have obtained prior to initiating secure communication with him, contains a copy of his public key-exchange key 355. To ensure secure transmission of the symmetric key 330, user A encrypts it first using user B's public key-exchange key 355. The encrypted key, referred to as the digital envelope 360, will then be sent to user B along with the encrypted message 345 itself.
Likewise, the decryption process consists of the following steps. User B receives the message 345 from user A and decrypts the digital envelope 360 with his private key-exchange key 365 to retrieve the symmetric key 330. He uses the symmetric key 330 to decrypt the property description 305, user A's signature 325, and her certificate 335. He decrypts user A's digital signature 325 with her public signature key 340, which he acquires from her certificate 335. This recovers the original message digest 315 of the property description 305. He runs the property description 305 through the same one-way algorithm 310 used by user A and produces a new message digest 370 of the decrypted property description 305. Finally, he compares his message digest 370 to the message digest 315 recovered from her digital signature 325 by use of user A's public signature key 340. If both digests 315, 370 are exactly the same, user B then confirms that the message content has not been altered during transmission and that it was signed using user A's private signature key 320. On the other hand, if digests 315, 370 are not the same, then message 305 either originated somewhere else or was altered after it was signed. User B could then elect to take some appropriate action, such as notifying user A or discarding the message 305.
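The digital envelope portion of the exchange described above may be sketched as follows, leaving aside the signature and certificate for brevity. The XOR keystream cipher and the toy RSA key-exchange key pair are stand-ins assumed for this example only; they are not the algorithms of the SET specification and are not secure. All names are hypothetical.

```python
import hashlib
import secrets

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# User B's toy key-exchange key pair (illustration only, not secure).
P, Q = 2**127 - 1, 2**89 - 1
N_B = P * Q
E_B = 65537                                  # public key-exchange key
D_B = pow(E_B, -1, (P - 1) * (Q - 1))        # private key-exchange key

# User A: encrypt the message with a fresh random symmetric key, then
# seal that key under user B's public key to form the digital envelope.
property_description = b"Lot 7, 12 Elm Street, two-storey house"
symmetric_key = secrets.token_bytes(16)
encrypted_message = xor_stream(symmetric_key, property_description)
digital_envelope = pow(int.from_bytes(symmetric_key, "big"), E_B, N_B)

# User B: open the envelope with his private key, then decrypt the message.
recovered_key = pow(digital_envelope, D_B, N_B).to_bytes(16, "big")
recovered_description = xor_stream(recovered_key, encrypted_message)
```

The slow asymmetric operation is applied only to the short symmetric key, while the bulk of the data is handled by the faster symmetric cipher.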
Digital Time-Stamps
A digital time-stamping service (DTS) issues time-stamps, which associate a date and time with a digital document in a cryptographically strong way. The digital time-stamp can be used at a later date to prove that an electronic document existed at the time stated on its time-stamp. For example, a physicist who has a brilliant idea can write about it with a word processor and have the document time-stamped. The time-stamp and document together can later prove that the scientist deserves the Nobel Prize, even though an arch rival may have been the first to publish.
The manner in which such conventional time-stamping systems work is illustrated in FIG. 4. Hypothetically, a user at a computing means 400 signs a document and wants it time-stamped. The user first computes a message digest 420 of the document using a secure hash function, and second sends the message digest 420 (but not the document itself) to the DTS 440. The DTS 440 sends the user in return a digital time-stamp 460 consisting of the message digest, the date and time it was received at the DTS 440, and the signature 480 of the DTS 440. Since the message digest 420 does not reveal any information about the content of the document, the DTS 440 cannot eavesdrop on the documents it time-stamps. Thereafter, the user can ostensibly present the document and time-stamp 460 together to prove when the document was written. A verifier then computes the message digest 420 of the document, makes sure it matches the digest in the time-stamp 460, and verifies the signature 480 of the DTS 440 on the time-stamp 460.
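The exchange of FIG. 4 may be sketched as follows. The sketch assumes SHA-256 as the secure hash function and a toy textbook RSA key pair for the DTS; the record layout, function names, and example date are hypothetical and do not correspond to any particular time-stamping service.

```python
import hashlib
from datetime import datetime, timezone

# Toy DTS signing key pair (illustration only, not secure).
P, Q = 2**127 - 1, 2**89 - 1
DTS_N = P * Q
DTS_E = 65537
DTS_D = pow(DTS_E, -1, (P - 1) * (Q - 1))

def _to_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % DTS_N

def dts_issue(message_digest: bytes, received_at: datetime) -> dict:
    """DTS side: sign the digest together with the time it was received."""
    time_text = received_at.isoformat()
    signature = pow(_to_int(message_digest + time_text.encode()), DTS_D, DTS_N)
    return {"digest": message_digest, "time": time_text, "signature": signature}

def verify_timestamp(document: bytes, stamp: dict) -> bool:
    """Verifier side: recompute the digest and check the DTS signature."""
    digest_matches = hashlib.sha256(document).digest() == stamp["digest"]
    signed = stamp["digest"] + stamp["time"].encode()
    signature_ok = pow(stamp["signature"], DTS_E, DTS_N) == _to_int(signed)
    return digest_matches and signature_ok

document = b"A brilliant idea about physics."
# Only the digest is sent to the DTS, never the document itself.
stamp = dts_issue(hashlib.sha256(document).digest(),
                  datetime(1998, 10, 1, 9, 30, tzinfo=timezone.utc))

valid = verify_timestamp(document, stamp)
altered_document = verify_timestamp(b"A different idea.", stamp)
backdated = verify_timestamp(document, dict(stamp, time="1990-01-01T00:00:00+00:00"))
```

The time recorded in such a stamp is the time at which the digest reached the DTS, not the time at which the document was created.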
To be reliable, the time-stamps must not be forgeable. The DTS 440 itself must have a long key if the time-stamps are to be reliable for long periods of time (e.g., several decades). Moreover, the private key of the DTS 440 must be stored with utmost security, as in a tamperproof box. The date and time must come from a clock, also inside the tamperproof box, which cannot be reset and which will keep accurate time for years or perhaps for decades. It must also be infeasible to create time-stamps without using the apparatus in the tamperproof box.
All of the above requirements greatly complicate the process of obtaining legally sufficient proof of the date and time a digital data file was accessed, created, modified, or transmitted. In fact, time-stamping a document in the manner described above only certifies the date and time that the message digest 420 was received by the DTS. It provides no proof of the date and time that the document was accessed, created, modified, or transmitted. Moreover, because the DTS is located remotely relative to the user, there is no reliable way to provide a digital time-stamp locally at the user's site.
One cryptographically-strong DTS, first implemented by Bell Communications Research, Inc. (also known as “Bellcore”), only uses software and avoids many of the requirements just described such as tamperproof hardware. It essentially combines hash values of documents into data structures known as binary trees. The “root” values of such binary trees are then periodically published in the newspaper. In these Bellcore systems, the time-stamp consists of a set of hash values, which allow a verifier to recompute the root of the tree. Since the hash functions are one-way, the set of validating hash values cannot be forged. The time associated with the document by the time-stamp is the date of publication.
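The general idea of combining document hashes into a binary tree, and of a time-stamp consisting of the hash values needed to recompute the root, may be sketched as follows. This is a generic binary hash tree using SHA-256, offered under the assumption that it conveys the principle; it is not represented to be the data structure actually used in the Bellcore systems. The documents and function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2:                 # duplicate the last node if odd
            current = current + [current[-1]]
        levels.append([h(current[i] + current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels

def proof_for(levels, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def recompute_root(leaf, path):
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node

documents = [b"will", b"patent claim", b"invoice", b"lab notebook"]
leaves = [h(doc) for doc in documents]
levels = build_levels(leaves)
published_root = levels[-1][0]               # the value that would be published

time_stamp = proof_for(levels, 2)            # hash values issued for "invoice"
verified = recompute_root(h(b"invoice"), time_stamp) == published_root
forged = recompute_root(h(b"altered invoice"), time_stamp) == published_root
```

A verifier needs only the document, the short list of sibling hashes, and the published root; no secret key is involved.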
The following Bellcore patents are illustrative of the above-described approach: U.S. Pat. No. 5,136,646, for “Digital Document Time-Stamping With Catenate Certificate” (Haber et al.); U.S. Pat. No. 5,136,647, for a “Method for Secure Time-Stamping of Digital Documents” (Haber et al.); U.S. Pat. No. 5,373,561, for a “Method for Secure Time-Stamping of Digital Documents” (Haber et al.); and U.S. Pat. No. Re. 34,954, which is the reissue of the '647 patent noted above and is, likewise, directed to a “Method for Secure Time-Stamping of Digital Documents” (Haber et al.). Other patents which are illustrative of similar such approaches are U.S. Pat. No. 5,748,738, for a “System and Method for Electronic Transmission, Storage and Retrieval of Authenticated Documents” (Bisbee et al.), which is assigned to Document Authentications Systems, Inc.; and U.S. Pat. No. 5,781,629, for a “Digital Document Authentication System” (Haber et al.), which is assigned to Surety Technologies, Inc. The contents of each of the above patents are incorporated herein by reference.
While each of the above approaches uses software and avoids many of the requirements for tamperproof hardware, they still require a trusted source at a remote location. None of the patents listed above teach or suggest any system or method that is capable of providing a trustworthy time-stamp at the precise location where the user's digital data files are accessed, created, modified, or transmitted. Moreover, all of the methods described in the patents listed above still leave open the possibility that two individuals may collude to falsely state the value of a hash.
Undetected alterations may still be made with appropriate cryptographic techniques. For example, one may alter a document as desired and then make other suppressed changes, such as a carriage return followed by a space-up command. Both original document and altered document may, therefore, have the same hash value. See, for example, B. Schneier, Applied Cryptography, Chapter 3.8, “Timestamping Services”, pages 61-65 (John Wiley & Sons, Inc. 1994), the contents of which are incorporated herein by reference.
One approach seeking to avoid such possibilities is described in U.S. Pat. No. 5,781,630 (Huber et al.), which discloses a system including a cryptomodule that is coupled to a computer. A cryptomodule in accordance with the Huber et al. patent includes a processor; an interface coupling the processor to the computer; and memory containing algorithms and constants for three purposes: (1) encoding a document, (2) generating a digital signature to be appended, attached, connected, or coupled to the document, and (3) producing a time-stamp to be inserted into the document. The cryptomodule also includes a pair of clocks, one of which is a radio clock and the other of which is a “non-adjustable” quartz clock.
This system according to the '630 patent depends on a comparison of the two clocks before inserting a time-stamp into the document. That is, the time that the document was created, edited, received, or transmitted is retrieved from both clocks and compared. Any discrepancy between the times retrieved is then determined. If, and only if, those discrepancies are sufficiently small, will a time-stamp based on the radio clock be inserted into the document and the document then encoded.
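The two-clock comparison described above may be sketched as follows. The tolerance value is an assumption of this example, since the foregoing description states only that the discrepancy must be sufficiently small; the function name and the clock readings are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

def stamp_if_clocks_agree(radio_time: datetime,
                          quartz_time: datetime,
                          tolerance: timedelta = timedelta(seconds=2)
                          ) -> Optional[datetime]:
    """Return the radio-clock time if both clocks agree within the tolerance,
    otherwise return None so that no time-stamp is inserted."""
    if abs(radio_time - quartz_time) <= tolerance:
        return radio_time
    return None

radio = datetime(1998, 7, 14, 10, 0, 0)

# Clocks agree to within one second: a time-stamp is issued.
issued = stamp_if_clocks_agree(radio, radio + timedelta(seconds=1))

# Radio signal spoofed by an hour relative to the quartz clock: no stamp.
refused = stamp_if_clocks_agree(radio + timedelta(hours=1), radio)
```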
Another approach, which seeks to avoid problems of collusion and/or fraud, is described in U.S. Pat. No. 5,619,571 (Sandstrom et al.). Briefly summarized, Sandstrom et al. discloses an improved method of storing or retrieving electronic records, particularly those in the form of image streams (e.g., TIFF). An image identification code, time data provided by a trusted source, and a password are combined to generate a key. The image identification code and time data are stored in a public directory associated with the image data stream. Attributes of the image stream (e.g., its size and a hash of at least a segment of the image data) are also determined. The attributes are then used to generate a verification code. Subsequently, the verification code is first positioned within a private area associated with the data image stream, and then the private area is encrypted with the previously generated key.
This approach, however, suffers from two obvious disadvantages. Not only is it limited to image file formats having public and private areas, but it is also still dependent on a remote source for the time-stamp and the image identification code. It would be much more desirable to provide systems and methods of time-stamping digital data files locally and without the continuing reliance on a remote trusted source.
Still another approach to provide authenticated documents, with an authenticated time code, is described in U.S. Pat. No. 5,189,700 (Blandford). Blandford's device includes an RTC and an encryption means, which are together sealed in a tamperproof package. Powered by a battery that is located outside the tamperproof package, the RTC is used either (1) to supplant the system clock of a computer, such that the computer cannot be booted up with an incorrect time; or (2) to provide an encrypted authentication code of time. Such time code is derived from a time retrieved from the RTC, which is combined with a device identification number. A secret key contained within the encryption means then encrypts the combination.
While devices according to Blandford, in fact, meet the objective of providing a local source of trusted time, they nevertheless suffer from two major disadvantages. Both disadvantages arise out of the design requirements of such devices. First, Blandford requires the RTC to override the computer's system clock on boot up. It would be much more desirable to avoid changing system settings in the computer, particularly the setting of its system clock. Second, Blandford requires that the RTC be powered by a source (i.e., the battery) outside of the tamperproof package. This, it is suggested, is critical to assuring several objectives: (1) ensuring that the RTC cannot be reset, or it can be reset only under strict procedures; (2) allowing the battery to be replaced in the power-up state without affecting the RTC; and (3) disabling the device, and potentially even the computer, in the event that power from the source failed. Obviously, it would be much more desirable to avoid such inconveniences.