Keeping information hidden from hostile parties is essential in many contexts, including business, government, or the military. However, such organizations are relying more and more on the increased efficiencies provided by powerful computers and computer networks, which leaves their information resources vulnerable to theft and tampering attacks.
One particular target of such attacks is the information stored in various mass data forms. Mass data refers to the contents of arrays, large data structures, linked data structures, and data structures and arrays stored in memory allocated at run-time via calls to allocation facilities such as the C™ language standard utility function malloc(). Mass data also refers to data stored in mass storage devices other than main memory: file data stored on rotating media (hard disks, floppy disks, or magnetic drums) and on streaming media (magnetic tape drives), as well as other forms of mass storage such as CD-ROMs, EEPROMs (electrically erasable programmable read only memories) and other kinds of PROMs (programmable read only memories), and magnetic bubble memories. Other forms of mass storage media will be clear to those skilled in the art.
Much information about the purpose, intent, and manner of operation of a computer program can be obtained by observation of its mass data (arrays, data structures, linked structures with pointers, and files). Moreover, mass data is particularly vulnerable to tampering. The nature of the data and of typical programs makes thorough checking of the data for tampering impractical, so mass data is a common point of attack both for finding information about a program and for changing its behaviour. For example, many programs contain tables of information governing their behaviour, or access files which provide such tables.
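To make the point about behaviour-governing tables concrete, the following is a minimal sketch, assuming a hypothetical feature table held in memory. The table contents and the names `FEATURE_TABLE`, `is_allowed`, and `tamper` are illustrative assumptions and are not taken from any particular program.

```python
# Hypothetical table governing program behaviour (illustrative values only).
FEATURE_TABLE = {"export": False, "print": True, "save": True}


def is_allowed(table, feature):
    """Return True if the table enables the named feature."""
    return table.get(feature, False)


def tamper(table, feature):
    """Simulate an attacker overwriting a single table entry in place."""
    table[feature] = True
```

Because the program consults the table on each decision and does not verify the table itself, a single altered entry is enough to change what the program permits.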
As well, it is critical that certain data, such as biometric data, not be accessible to an attacker. A given user has only a finite number of biometric references such as a voice print, thumb print, retina print, signature ballistics via the mouse, and the like. Once these data are compromised, they are never again secure.
There are basically two ways to protect information: by means of physical security, and by means of obscurity.
Physical security keeps information from hostile parties by making it difficult to access. This can be achieved by restricting the information to a very small set of secret holders (as with passwords, biometric identification or smart cards, for example), by use of security guards, locked rooms or other facilities as repositories of the secrets, by ‘firewall’ systems in computer networks, and the like.
The weaknesses of physical security techniques are well known in the art. Passwords and other secrets can be lost or forgotten, so it is necessary to provide password recovery systems. These recovery systems are usually provided in the form of a human being who simply provides a new password to a user. The provision of passwords under human control presents many opportunities for the attacker.
Passwords which are complex enough to be secure from simple guessing or a “dictionary attack” (where an attacker simply makes successive access attempts using all words and combinations of words in a dictionary) are usually stored electronically, and could therefore be discovered by an attacker. Locked rooms and firewall systems cannot be perfectly secure, and present themselves as high value targets for attack; breaking through a firewall usually gives an attacker access to a great quantity of secure material.
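The dictionary attack mentioned above can be sketched roughly as follows. This is an illustration only: the `attempt` callable stands in for whatever access check the attacker is probing, and the function names and the two-word limit are assumptions made for the example.

```python
from itertools import product


def candidates(words, max_words=2):
    """Yield each word, then each ordered combination of up to max_words words."""
    for n in range(1, max_words + 1):
        for combo in product(words, repeat=n):
            yield "".join(combo)


def dictionary_attack(attempt, words, max_words=2):
    """Try successive guesses; return the first one accepted, or None."""
    for guess in candidates(words, max_words):
        if attempt(guess):
            return guess
    return None
```

A password built from dictionary words falls to this search, while one that is not composed of dictionary words does not, which is why such passwords tend to be written down or stored electronically.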
Another proposed method of protecting mass data is to encode password and encryption systems into microprocessors, so that the microprocessors communicate with external memory, for example, using data in an encrypted form. While such a system offers some physical protection, it cannot be implemented on the microprocessors currently used in computers and communication devices, as it requires a physical change to the architecture of the microprocessor. Clearly this is an impractical solution in view of the vastness of the existing computer infrastructure.
The other approach is protection by obscurity, that is, by making discovery of the secret information improbable, even if a hostile party can access the information physically.
For example, in cryptography, messages are concealed from attackers by encoding them in unobvious ways. The decoding function is concealed in the form of a key, which is generally protected by physical means. Without knowledge of the key, finding the decoding function among all of the various possibilities is generally infeasible.
In steganography, secret messages are buried in larger bodies of irrelevant information. For example, a secret text message might be concealed in the encoding of a video stream or still image. Steganography is effective because the exact information in the video stream can vary slightly without any significant effect being detectable by a human viewer, and without any noticeable stream-tampering being visible to a hostile party. Again, there is a dependence on a key to indicate the manner in which the information is encoded in the video stream, and this key must be protected by physical security.
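One common way to realise this idea is to hide message bits in the least significant bits of the cover data, with a key selecting which positions are used. The sketch below assumes that technique; the text above does not prescribe it. The cover is treated as a plain byte string rather than a real video or image format, and the names `embed` and `extract` and the use of an integer key as a random seed are illustrative assumptions.

```python
import random


def _positions(key, cover_len, nbits):
    """Derive the key-dependent byte positions that carry the message bits."""
    if nbits > cover_len:
        raise ValueError("cover too small for message")
    return random.Random(key).sample(range(cover_len), nbits)


def embed(cover: bytes, message: bytes, key: int) -> bytes:
    """Hide message bits in the least significant bits of selected cover bytes."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    stego = bytearray(cover)
    for pos, bit in zip(_positions(key, len(cover), len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)


def extract(stego: bytes, length: int, key: int) -> bytes:
    """Recover `length` message bytes using the same key."""
    out = bytearray(length)
    for i, pos in enumerate(_positions(key, len(stego), length * 8)):
        out[i // 8] |= (stego[pos] & 1) << (i % 8)
    return bytes(out)
```

Each cover byte changes by at most one unit, which corresponds to the slight variation a viewer would not notice. Without the key, an observer does not know which positions carry the message.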
In an attempt to protect information embodied in computer programs, several approaches have been used.
For example, it is common within a corporate LAN to protect third-party proprietary software tools by embedding licenses and license-processing in them. The license-processing checks the embedded license information by contacting a license server for validation. This approach is not generally viable outside such relatively safe environments (such as corporate intranets), because the license-processing is vulnerable to disablement. An attacker need only reverse engineer the software to locate the line of software code that tests whether an access attempt should be allowed, and alter this line to allow all access attempts.
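The weakness described above comes down to a single decision point. The following sketch is illustrative only: `check_with_server` is a hypothetical stand-in for contacting a license server, and `patched_check` models the attacker's alteration of the one test that guards access.

```python
def check_with_server(license_key, valid_keys):
    """Hypothetical stand-in for validating a license with a server."""
    return license_key in valid_keys


def run_tool(license_key, valid_keys, check=check_with_server):
    """Run the tool only if the single license test passes."""
    if not check(license_key, valid_keys):  # the one test an attacker looks for
        return "access denied"
    return "tool running"


def patched_check(license_key, valid_keys):
    """The attacker's replacement: every access attempt is allowed."""
    return True
```

Once the attacker has located the test, the rest of the license-processing is irrelevant, however elaborate the server-side validation may be.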
Software and data to be protected can be encrypted, and then decrypted for execution. This approach is quite vulnerable (and has resulted in security breaches in practice) because the software and data must be decrypted in order to execute. With appropriate tools, an attacker can simply access this decrypted image in virtual memory, thereby obtaining a ‘plain-text’ of the program.
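A minimal sketch of this weakness follows. It uses a repeating-key XOR purely as a placeholder for a real cipher, and the class and function names are hypothetical; the point is only that the decrypted image must exist in memory once the program is loaded.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher: XOR with a repeating key (not secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class ProtectedProgram:
    """Stores a program encrypted and decrypts it only when loaded."""

    def __init__(self, plaintext: bytes, key: bytes):
        self._key = key
        self.stored = xor_bytes(plaintext, key)  # encrypted form at rest
        self.memory_image = None                 # nothing decrypted yet

    def load(self):
        """Decrypt for execution; the plain text now resides in memory."""
        self.memory_image = xor_bytes(self.stored, self._key)


def attacker_dump(program):
    """Simulate a tool that reads the process memory image."""
    return program.memory_image
```

The attacker never needs the key: reading the memory image after loading yields the same plain text that the legitimate loader produced.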
Finally, software and data can be encoded in ways which make understanding them, and thereby extracting the information concealed within them, more difficult. For example, one can simply apply techniques which are contrary to the design principles of good software engineering: replacing mnemonic names with meaningless ones, adding spurious, useless code, and the like. These techniques do not, however, provide a rigorous solution to the problem. A patient observer will ultimately determine how the code is operating using tools which allow the attacker to access and analyse the state of the running program.
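The two techniques named above, meaningless names and spurious code, can be illustrated with a small sketch. Both functions are hypothetical examples; the second is the first after such a transformation.

```python
def average(values):
    """Readable original: mean of a non-empty list of numbers."""
    total = sum(values)
    return total / len(values)


def f1(a):
    """Same computation with meaningless names and spurious code added."""
    b = 0
    c = 17                 # spurious: never affects the result
    for d in a:
        b += d
        c = (c * 3) % 7    # spurious: computed and then ignored
    return b / len(a)
```

The two functions return identical results for every input, so an observer who traces the running program and notes which values actually reach the return statement can discard the spurious code and recover the original intent.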
There is therefore a need for a system and method which secures mass data from tampering and reverse engineering.