1. Field of the Invention
The present invention relates to a data system, in particular a system for the distribution of digital data across a communications network.
2. Background
Existing digital data systems do not provide a complete solution to the secure distribution of digital information. More specifically, there is intellectual resistance to the idea of distributing confidential information by electronic means, because closed systems such as e-mail are exposed to the possibility of interception, and open systems such as the world wide web are essentially broadcast systems onto which a restriction has been placed. As a result, the existing systems are not trusted to deliver confidentially or report accurately.
At present, the distribution of digital data through systems related to or connected to the internet requires a collection of technologies that may or may not be designed to work in concert and all require their own security measures. An example of a fragmented system would be the publication of a confidential digital document such as a business plan or the design for an as yet unannounced product being sent from one person to another via an electronic system.
The document may reside in a database or on a computer file system. In order for the document to be made available to another person, one of three methods would normally be used:
(1) Sending the document via an e-mail system;
(2) Publishing the document on a visual browsing system such as a company Intranet or the world wide web; or
(3) Placing the document on the other person's system (their virtual desktop).
These methods are likely to require the data to be handled by the following systems and in the following ways. In all cases below, the document will be encased in a computer file. A visual example of this system is given in FIG. 1.
(1) Email Protocol
In the first example, e-mail, the document is created using a proprietary application such as a word processor. This step is common in most systems where a message is complex, although it is also common to use the text editing function of the mail client application if the message is simple. The document is sent to an e-mail client application where it is placed in an electronic ‘outbox’ among other messages that are queued for dispatch. At the point of network connection, the messages in a time-ordered queue are dispatched over the network to an SMTP (Simple Mail Transfer Protocol) server on an Internet computer that is operated by the provider of the Internet gateway on behalf of the sender of the e-mail. This provider could be the employer of the person sending the e-mail or a contracting company. The SMTP server sends the message by transmitting it across the Internet to the recipient whose address is specified in the header of the e-mail.
The message arrives in the POP3 (Post Office Protocol version 3) mailbox at the recipient's Internet service provider. On the next occasion that the recipient uses their e-mail client application to receive the messages, the POP3 box is contacted and the messages copied to the mail client. Normally these are then deleted from the POP3 mail box. The messages can then be read.
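By way of illustration only, the store-and-forward path described above can be sketched as an in-memory simulation. The class names (MailClient, SmtpRelay, Pop3Mailbox) and the example addresses are hypothetical and do not implement the real SMTP or POP3 protocols; the sketch merely shows the time-ordered outbox, the hand-off to the server, and the copy-then-delete retrieval.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    recipient: str
    body: str


class Pop3Mailbox:
    """Hypothetical stand-in for a POP3 mailbox at the recipient's provider."""

    def __init__(self):
        self.stored = []

    def deliver(self, message):
        self.stored.append(message)

    def retrieve_all(self, delete_after=True):
        # Messages are copied to the client; deletion is usual but optional.
        copies = list(self.stored)
        if delete_after:
            self.stored.clear()
        return copies


class SmtpRelay:
    """Hypothetical stand-in for the sender's SMTP server."""

    def __init__(self, mailboxes):
        self.mailboxes = mailboxes  # maps recipient address -> Pop3Mailbox

    def send(self, message):
        # Routing relies only on the recipient address carried by the message.
        self.mailboxes[message.recipient].deliver(message)


class MailClient:
    """Hypothetical e-mail client holding a time-ordered outbox."""

    def __init__(self, relay):
        self.relay = relay
        self.outbox = deque()

    def queue(self, message):
        self.outbox.append(message)

    def dispatch(self):
        # Called at the point of network connection: oldest message first.
        while self.outbox:
            self.relay.send(self.outbox.popleft())


mailboxes = {"bob@example.com": Pop3Mailbox()}
client = MailClient(SmtpRelay(mailboxes))
client.queue(Message("alice@example.com", "bob@example.com", "Draft business plan"))
client.queue(Message("alice@example.com", "bob@example.com", "Revised business plan"))
client.dispatch()

received = mailboxes["bob@example.com"].retrieve_all()
print([m.body for m in received])
```

In this sketch the messages sit in the mailbox, readable by anything holding a reference to it, from the moment of delivery until the recipient's client collects them.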
The confidentiality of the document relies on a number of unconnected security measures. First, the e-mail client contains an address book that links the commonly recognized names of correspondents with the e-mail addresses that they have publicized. Should the security on this program be breached, a false address could be substituted and materials intended for a specific recipient could be forwarded to a third party who may stand to gain unfairly or do harm.
When a majority of users connect to the Internet through their conventional supplier, data is sent in an unencrypted format. These formats are often open Internet standards that can be intercepted and interpreted by others. The SMTP server in this example would be considered in this light. When the file arrives in the POP3 box it is vulnerable to anyone able to gain specific user access or supervisor privileges to access the system. It is waiting to be transferred and is vulnerable at this time.
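The exposure of an unencrypted message can be illustrated with a short sketch using the Python standard library. The addresses and message text are invented for illustration, and no network connection is made; the sketch only shows the form in which such a message would be handed to a mail server.

```python
from email.message import EmailMessage

# Build a message as an e-mail client might (all values are illustrative).
message = EmailMessage()
message["From"] = "alice@example.com"
message["To"] = "bob@example.com"
message["Subject"] = "Business plan"
message.set_content("Confidential: projected revenue figures attached.")

# This is the byte sequence that would be passed to a mail server.
wire_bytes = message.as_bytes()
print(wire_bytes.decode("ascii"))
```

Both the recipient's address and the body appear as readable text in the output, so any party able to observe or store these bytes could interpret them without further effort.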
The message is then transferred to the recipient's e-mail program where in normal terms it is available to any person using that machine. It is also freely available in case of theft of the machine. It is not always the case that the message is deleted from the POP3 server when it is transferred to the e-mail program.
(2) Intranet/Web Publishing
Turning to the second example, in order to publish the data on a visual browsing system such as the world-wide web, the following steps would typically be followed.
The document containing the information would first have to be created by the appropriate tool. This tool would create a file containing the information and place it on the file system of an appropriate machine that has a permanent connection to the Internet or the network to which the recipient has access.
This computer will have installed upon it a piece of software known as a web server. The purpose of this software is to create an area of the computer that is open to public access over the network or Internet. It is through this piece of software that access controls may be implemented. It also translates the location of the file on the host computer's file system to a system more accessible from the greater network or Internet.
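The translation between a network address and a location on the host file system can be sketched as follows. This is a minimal illustration under stated assumptions: the document root /srv/www, the host name, and the function name resolve are hypothetical, and a real web server performs many further checks.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = PurePosixPath("/srv/www")  # hypothetical publicly served area


def resolve(url):
    """Map a requested URL to a path under DOCUMENT_ROOT, or None if refused."""
    path = unquote(urlsplit(url).path)
    parts = [p for p in path.split("/") if p not in ("", ".")]
    if ".." in parts:
        # Refuse any request that tries to climb out of the public area.
        return None
    return DOCUMENT_ROOT.joinpath(*parts)


print(resolve("http://intranet.example/plans/q3.doc"))
print(resolve("http://intranet.example/../etc/passwd"))
```

The first request maps to a file inside the public area, while the second is refused because it attempts to reach a location outside that area.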
The location of the file should then be transmitted to the intended recipient. This would normally be done verbally or by e-mail in the case of simple one-to-one notification or by broadcast media in the case of mass transmittal.
The intended recipient would then utilize their computer by operating a piece of software called a web browser that is intended for viewing content ‘served’ to it by a web server that forms part of the structure of the world-wide web. The command to view the required content is actioned by typing the specific address provided into the appropriate section of the interface of the software. The browser then searches through whatever standardized addressing system is appropriate for the network being accessed and returns the file to the recipient's computer by calling it from the host's web server. The file is displayed on the recipient's display system and stored in a temporary cache in order to save time and possibly expensive bandwidth. This cache is only active for a specified period of time, after which the file is deleted.
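The behavior of a time-limited cache of this kind can be sketched as below. The class name ExpiringCache, the 60 second lifetime, and the injected clock are assumptions made for illustration; actual browsers apply considerably more elaborate rules.

```python
import time


class ExpiringCache:
    """Hypothetical cache that keeps fetched content for a limited period."""

    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self.lifetime = lifetime_seconds
        self.clock = clock
        self.entries = {}  # address -> (time stored, content)

    def get(self, address, fetch):
        now = self.clock()
        entry = self.entries.get(address)
        if entry is not None and now - entry[0] < self.lifetime:
            return entry[1]  # still fresh: no network transfer is needed
        content = fetch(address)  # missing or expired: call the server again
        self.entries[address] = (now, content)
        return content


# A controllable clock stands in for real time so the sketch runs instantly.
fake_now = [0.0]
fetches = []


def fetch(address):
    fetches.append(address)
    return f"contents of {address}"


cache = ExpiringCache(lifetime_seconds=60, clock=lambda: fake_now[0])
address = "http://intranet.example/plan.html"
cache.get(address, fetch)  # first request goes to the server
cache.get(address, fetch)  # second request is answered from the cache
fake_now[0] = 61.0
cache.get(address, fetch)  # the entry has expired, so the server is called again
print(len(fetches))
```

While an entry is held in the cache, a copy of the document remains on the recipient's machine, which is relevant to the confidentiality concerns discussed in this section.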
Transferring the file to an Internet computer is done by an open transmission system such as TCP/IP or FTP. The web server would typically have a firewall which detects unauthorized access while in contact with other computers on the Internet. Also, if the confidentiality of the document was to be ensured, the computer would require an intrusion detection system in order that unauthorized access via the Internet or other network to which the computer is connected could be detected. This system should also be aware of unofficial logins on the same site and through temporary connections including but not limited to Telnet or dial-up connections.
In cases where the web server holding the document is a cluster of computers, the maintenance of confidentiality becomes even more of a problem and an appropriate secure access control system would be needed.
By transmitting the address at which the file can be found over open networks such as e-mail or by embedding a link to the file in a web page, the possibilities of unauthorized or undesirable access increase dramatically.
When the intended recipient types the address of the file into the appropriate input area in the web browsing software of their computer, or if they click on a link to that file, they are being monitored by the organization or individual supplying them with their Internet access service, giving rise to a possible breach of confidentiality. Moreover, the recipient's computer will likely have a multitude of software programs installed, and neither the author of the document nor even the owner of the computer has any real control over the activities in which these software programs may participate. The user of the software rarely knows exactly how the software works, nor are they aware of the fact that most software programs are actually packages of many programs designed to do a specific job. The smaller programs within these packages are usually unknown to the user and their functionality is unclear. In order to be totally secure, therefore, the organization creating and/or distributing the file would have to control what software packages are installed on the computers of all of their intended recipients. This is impractical.
(3) File System Distribution
In the third example, assuming the creator of the file has access to the file system of the intended recipient, it would be feasible to place a copy or the original version of the file onto their system in order that the file may be opened by an appropriate editing or viewing software program. It is also possible to place a computer operating system-specific ‘shortcut’ (essentially the network address) that would enable a single copy of the file to be accessed. This copy of the shortcut or file would be likely to be placed on the recipient's virtual desktop, a metaphor used by most visual operating systems that sets aside an area for temporary work space.
Once the document has been created using an appropriate application, it would likely be placed on the creator's local computer for access in cases of no network service being available. This computer would need to have an access control system and some functionality that will identify potential users of the machine if the confidentiality of the document is to be secured.
Since the computer on which the file resides would have to be connected to a network, this network will have to have its own intrusion security and access control system.
In order for the file to be actioned by the intended recipient, they too will have to have access to a computer on the network. It is likely that this second machine will have the same security system as that being used by the author of the file. There would also be a system in place on the recipient computer that prevents unwanted programs from accessing the resources on the system. This would work regardless of the type of file being deposited as an extra precaution due to the fact that it is possible to disguise executable files that could harm the system as harmless executable or data files. It is also possible to embed executable functionality into otherwise benign data files.
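One simple precaution against a disguised executable can be sketched as a comparison between the type a file name claims and the leading bytes of the file. The function name looks_disguised and the list of data extensions are hypothetical; the two signatures shown (the "MZ" prefix of Windows executables and the 0x7F "ELF" prefix of ELF executables) are well known, but this is only a partial check and does nothing about executable functionality embedded inside genuine data files.

```python
# Leading byte sequences of two common executable formats.
EXECUTABLE_SIGNATURES = {
    b"MZ": "Windows executable",
    b"\x7fELF": "ELF executable",
}

# Illustrative list of extensions that claim to be plain data.
DATA_EXTENSIONS = {".txt", ".doc", ".jpg", ".pdf"}


def looks_disguised(filename, leading_bytes):
    """Return True if a file named as data begins like an executable."""
    name = filename.lower()
    claims_data = any(name.endswith(ext) for ext in DATA_EXTENSIONS)
    is_executable = any(
        leading_bytes.startswith(signature) for signature in EXECUTABLE_SIGNATURES
    )
    return claims_data and is_executable


print(looks_disguised("plan.doc", b"MZ\x90\x00\x03"))   # executable named as data
print(looks_disguised("plan.txt", b"Quarterly plan"))  # ordinary text
```

A check of this kind can produce false alarms, for example for a text file that happens to begin with the letters "MZ", which is one reason such measures are layered rather than relied upon individually.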
When actually placing either a copy or a link to the file on the second machine, access privileges would have to be set in order to allow that to happen. The file or link would then have to be transported across the network and placed upon the recipient's system. During this time the system should be monitored for unwanted outside interventions.
Thus it can be seen that in each of the above three examples a variety of systems and security measures must be adopted in order to provide something approaching secure distribution of the document in an attempt to ensure that the document could only be accessed by those to whom access has been granted.
These numerous steps performed by disparate and open tools require security that cannot easily be controlled by a single policy, nor can it be guaranteed as safe due to the complexity involved.
Another problem facing existing data systems is the creation of multiple document copies. When a document is sent in the form of a file, it is a copy that is sent to the point of use by the recipient, a copy being retained by the originator. This is for reasons of access by the originator and in order to increase functional speed of the system by avoiding the delays caused by remote accessing.
Publishing on an Internet or intranet server can cause a duplication problem if the document needs to be edited by the recipient.
In either case, if more than one copy of a document exists and one of those copies is modified, then there exists a problem where documents that are referred to and considered as one entity are in fact multiple, unconnected entities.
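The divergence of copies can be illustrated by comparing content fingerprints. This is a minimal sketch using SHA-256 from the Python standard library; the document text and variable names are invented for illustration.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a digest that changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()


original = b"Business plan v1: launch in March."

# Sending the file leaves one copy with the originator and one with the recipient.
originator_copy = original
recipient_copy = original
identical_at_send = fingerprint(originator_copy) == fingerprint(recipient_copy)

# The recipient edits their copy; nothing informs the originator.
recipient_copy = recipient_copy.replace(b"March", b"June")
diverged = fingerprint(originator_copy) != fingerprint(recipient_copy)

print(identical_at_send, diverged)
```

Both parties still refer to "the business plan", yet the fingerprints show that two distinct documents now exist, and neither copy carries any record of the other.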
With this system, serious errors are possible, even with the most refined human systems.
Another attribute of existing systems is that distributors of data have to build their own systems for access and use of the data. In particular, in order to be able to distribute data electronically to a wide audience, a full navigational interface would normally have to be designed, built and deployed on the world-wide web. This approach provokes unnecessary overheads in terms of time and money. At present there is no way to simply deploy data in an appropriate setting and have it available to a mass audience.
Moreover, end users often do not understand the method of delivery and therefore cannot apply the solution to their own situation. This is because with current data systems, data can only be extracted and utilized by the use of a computer language. For instance, in the case of database systems a common language for querying data is SQL (Structured Query Language, sometimes pronounced "sequel"). From the web, data can be extracted, but not queried, by the use of the underlying language, HTML (HyperText Markup Language). A more recent development of mark-up language, XML (Extensible Markup Language), offers the possibility of selecting, querying and outputting data in a similar manner to SQL.
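The similarity between querying a database with SQL and selecting data from XML can be sketched with the Python standard library. The table, element, and attribute names are invented for illustration, and the limited path syntax of xml.etree.ElementTree is used here merely as a stand-in for XML querying in general.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The same two records held in a relational table ...
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE documents (title TEXT, owner TEXT)")
connection.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("Business plan", "alice"), ("Product design", "bob")],
)
sql_titles = [
    row[0]
    for row in connection.execute(
        "SELECT title FROM documents WHERE owner = ?", ("alice",)
    )
]

# ... and in an XML document.
xml_source = """
<documents>
  <document owner="alice"><title>Business plan</title></document>
  <document owner="bob"><title>Product design</title></document>
</documents>
""".strip()
root = ET.fromstring(xml_source)
xml_titles = [
    document.findtext("title")
    for document in root.findall("document[@owner='alice']")
]

print(sql_titles, xml_titles)
```

Both approaches return the same record, but each requires the user to know a specialist syntax before any data can be obtained.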
The handling and manipulation of data is achieved by scripting within pre-built applications using the languages above. However, it would be necessary to build a proper application if the complexity of the solution required exceeded certain levels. These levels are generally reached where complex extractions require decisions to be made by an operator, or where the data has to be converted to another form before use.
The languages are complex and require a great deal of experience in the use and configuration of computer systems before they can be learned. They are specialist computer languages that do not necessarily follow natural ergonomic laws and are therefore not easily learned by those who have little technical interest. If a person wishes to create an automated system for their own use, they would have to learn some or all of the above-mentioned languages and systems and build their solution from these.