1. Technical Field
The invention disclosed broadly relates to data processing systems and methods and more particularly relates to systems and methods for finite state machine processing in a multimedia data communications environment and to systems and methods for use in a data processing system to enhance network security.
2. Related Patents and Patent Applications
This patent application is related to the copending U.S. patent application Ser. No. 08/024,572, filed Mar. 1, 1993, entitled "Information Collection Architecture and Method for a Data Communications Network," by J. G. Waclawsky, et al., assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 08/024,575, filed Mar. 1, 1993, entitled "Event Driven Interface for a System for Monitoring and Controlling a Data Communications Network," by P. C. Hershey, et al., assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 08/024,542, filed Mar. 1, 1993, entitled "System and Method for Configuring an Event Driven Interface and Analyzing Its Output for Monitoring and Controlling a Data Communications Network," by J. G. Waclawsky, et al., assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 08/138,045, filed Oct. 15, 1993, entitled "System and Method for Adaptive, Active Monitoring of a Serial Data Stream Having a Characteristic Pattern," by P. C. Hershey, et al., assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to U.S. Pat. No. 4,918,728, issued Apr. 17, 1990, entitled "Data Cryptography Operations Using Control Vectors," by S. M. Matyas, et al., assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 08/004,872, filed Jan. 19, 1993, entitled "An Automatic Immune System for Computers and Computer Networks," by W. C. Arnold, et al., assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to copending U.S. patent application, Ser. No. 08/004,871, filed Jan. 19, 1993, entitled "Methods and Apparatus for Evaluating and Extracting Signatures of Computer Viruses and Other Undesirable Software Entities," by J. O. Kephart, assigned to the IBM Corporation and incorporated herein by reference.
3. Background Art
Network security is largely concerned with (1) protecting information from unauthorized disclosure (i.e., information secrecy), (2) protecting information from unauthorized modification or destruction (i.e., information integrity), and (3) ensuring the reliable operation of the computing and networking resources. Cryptography is often used to protect the secrecy and integrity of stored and transmitted data. Ensuring the reliable operation of computing and networking resources is fundamentally a harder problem to solve--one must ensure the functional correctness of the system. The term "reliable operation" means that a system operates correctly, in the way it was intended.
A well-rounded approach to computer and network security will balance the use of cryptographic techniques--for protecting the secrecy and integrity of data--and monitoring techniques--for detecting anomalous network conditions (security events) that may signal the presence of an intruder or intruder agent. A network security administrator, so-notified of a potential intruder or intruder agent, may take one or more possible actions in response to such a notification. Provided that the response is timely, the possible harmful effects of an intruder or intruder agent may be prevented or minimized or localized. Detecting a problem, or potential problem, seems to be the first step in coping with the problem.
Today's high-speed (gigabit per second) multimedia networks consisting of WANs (wide area networks) and LANs (local area networks) can be thought of as a single computing resource comprised of many smaller computing resources spread over a large geographical area. The network as a whole provides network-wide services to its users, in a transparent fashion. It must be capable of communicating voice, image, and text, to name a few. In this new environment, the ability to monitor data flowing over a network, and the ability to react to anomalous conditions, in real time through appropriate network-level actions, seems fundamental to the maintenance of reliable network-wide services.
The copending patent application by Hershey et al. entitled "System and Method for Adaptive, Active Monitoring of a Serial Data Stream Having a Characteristic Pattern," Ser. No. 08/138,045, describes a programmable method for detecting characteristic data patterns of diverse size transmitted over high-speed data links. Unlike more traditional methods in which data is sampled and stored in a log, the finite state machine (FSM) information monitoring means of the Hershey invention, cited above, is programmed to "look" for patterns of concern. In this way, the FSM discards most of the high-speed information bits and concentrates only on patterns of interest. In short, the FSM signals a pattern match as opposed to collecting and storing data in a log, which must then be processed by some other network function. The FSM information monitoring means is coupled to the network, and in response to detecting a prescribed pattern, outputs a control signal to the network to alter communication characteristics thereof. How this control signal is handled depends on the application supported by the FSM information monitoring means. The Hershey et al. application, Ser. No. 08/138,045, is limited in its teaching of how systems can respond to the detection of prescribed patterns, concentrating more on the problem of pattern detection itself. Moreover, the Hershey application referred to above, as well as other prior art, does not teach a unified method for (a) monitoring of security events--virus patterns, natural language patterns, and intrusion detection patterns--on high-speed communication links, (b) reporting detected security events to a network security manager, and (c) responding at the network level to detected security events as a means to thwart, counter, minimize, or isolate their possible harmful effects.
3.1 Virus Detection
A virus is a computer program that (1) propagates itself through a system or network of systems and (2) appears to the user to perform a legitimate function but in fact carries out some illicit function that the user of the program did not intend. See M. Gasser, "Building a Secure Computer System," Van Nostrand Reinhold, N.Y., 1988. A computer virus has been defined by Frederick B. Cohen (A Short Course on Computer Viruses, page 11), as a program that can infect other programs by modifying them to include a, possibly evolved, version of itself. As employed herein, a computer virus is considered to include an executable assemblage of computer instructions or code that is capable of attaching itself to a computer program. The subsequent execution of the viral code may have detrimental effects upon the operation of the computer that hosts the virus. Some viruses have an ability to modify their constituent code, thereby complicating the task of identifying and removing the virus. Another type of undesirable software entity is known as a "Trojan Horse." A Trojan Horse is a block of undesired code that is intentionally hidden within a block of desirable code. A virus is typically identified by a `signature`, i.e., a sequence of data bits sufficient to distinguish the virus from other data, or sufficient to raise a warning flag that a virus may be present. In the latter case, further checking and identification of the candidate virus must be performed.
Viruses can be detected by two primary means: (1) modification detection and (2) pattern detection via a scanner. With modification detection, checksums or cryptographic hash values are used to detect changes in executable codes. These changes are reported to a system manager who then decides whether the change is expected (e.g., due to a recent software upgrade) or unexpected (e.g., due to viral infection or unauthorized modification). This method usually requires manual intervention to add, delete, or modify system files in order to ensure adequate coverage and to limit the number of false alarms. A list of checksums must be maintained for all files to be protected. This method is not practical in a high-speed communications environment for several reasons: (1) the overhead imposed by computing checksums, (2) the unpredictability of data flowing on the communications medium, and (3) the requirement for transporting and storing reference checksums for use in comparing with the computed checksums.
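By way of illustration only, modification detection as described above might be sketched as follows. The sketch assumes SHA-256 as the cryptographic hash and represents the protected files as an in-memory mapping from file name to contents; the function names `digest`, `build_reference`, and `detect_modifications` are hypothetical and do not come from any cited reference.

```python
import hashlib


def digest(data):
    """Return the SHA-256 hash of a block of executable code as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_reference(files):
    """Record a reference hash for every protected file (name -> contents)."""
    return {name: digest(data) for name, data in files.items()}


def detect_modifications(files, reference):
    """Compare current files with the reference list and report differences.

    Returns a list of (file name, change) pairs for the system manager,
    who must still decide whether each change was expected.
    """
    report = []
    for name in sorted(set(files) | set(reference)):
        if name not in reference:
            report.append((name, "added"))
        elif name not in files:
            report.append((name, "removed"))
        elif digest(files[name]) != reference[name]:
            report.append((name, "modified"))
    return report
```

The sketch makes the stated drawbacks concrete: every byte of every file is hashed on each check, and the reference list must be stored and kept current for all protected files.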
A widely-used method for the detection of computer viruses and other undesirable software entities is known as a scanner. A scanner searches through executable files, boot records, memory, and any other areas that might harbor executable code, for the presence of known undesirable software entities. Typically, a human expert examines a particular undesirable software entity in detail and then uses the acquired information to create a method for detecting it wherever it might occur. In the case of computer viruses, Trojan Horses, and certain other types of undesirable software entities, the detection method that is typically used is to search for the presence of one or more short sequences of bytes, referred to as signatures, which occur in that entity. The signature(s) must be chosen with care such that, when used in conjunction with a suitable scanner, they are highly likely to discover the entity if it is present, but seldom give a false alarm, known as a false positive. The requirement of a low false positive rate amounts to requiring that the signature(s) be unlikely to appear in programs that are normally executed on the computer. Typically, if the entity is in the form of binary machine code, a human expert selects signatures by transforming the binary machine code into a human-readable format, such as assembler code, and then analyzes the human-readable code. In the case where that entity is a computer virus, the expert typically discards portions of the code which have a reasonable likelihood of varying substantially from one instance of the virus to another. Then, the expert selects one or more sections of the entity's code which appear to be unlikely to appear in normal, legitimate programs, and identifies the corresponding bytes in the binary machine code so as to produce the signature(s). 
The expert may also be influenced in his or her choice by sequences of instructions that appear to be typical of the type of entity in question, be it a computer virus, Trojan horse, or some other type of undesirable software entity.
With pattern detection via a scanner, system files are periodically scanned for patterns, which consist of a set of pre-defined virus "signatures." Pattern matches are reported to the system manager who then decides whether the match represents a misdiagnosis or an actual viral infection. Each virus is represented by one or more fixed-length signature patterns, so the number of virus signatures is proportional to the number of viruses. A list of virus signatures must be maintained for each virus. The pattern search usually proceeds in a serial fashion, scanning the files one at a time and comparing the records of each file with each signature pattern in turn. This form of pattern detection is not suitable for a high speed communications environment because of the delay caused by the serial, fixed-signature search pattern. In a high speed communications environment, it would be desirable to search for many different signature patterns in parallel.
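The serial search just described can be sketched as below. This is a minimal illustration, not the implementation of any commercial scanner; the function name `serial_scan` and the data layout (files as a mapping from name to bytes, signatures as a mapping from virus name to a list of byte patterns) are assumptions made for the example.

```python
def serial_scan(files, signatures):
    """Scan each file in turn, comparing it with each signature in turn.

    files:      mapping of file name -> file contents (bytes)
    signatures: mapping of virus name -> list of signature patterns (bytes)
    Returns a list of (file name, virus name, offset of first match).
    """
    matches = []
    for file_name, contents in files.items():
        for virus_name, patterns in signatures.items():
            for pattern in patterns:
                # Each call to find() makes a separate pass over the file.
                offset = contents.find(pattern)
                if offset != -1:
                    matches.append((file_name, virus_name, offset))
    return matches
```

Because the data is traversed once per signature, the work grows with the product of the data size and the number of signatures, which is the source of the delay noted above.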
Currently, a number of commercial computer virus scanners are successful in alerting users to the presence of viruses that are already known. However, scanners may not be able to find computer viruses for which they have not been programmed explicitly. The problem of dealing with new viruses has typically been addressed by distributing updates of scanning programs and/or auxiliary files containing the necessary information about the latest viruses. However, the increasing rate at which new viruses are being written is widening the gap between the number of viruses that exist and the number of viruses that can be detected by an appreciable fraction of computer users. Thus it is becoming increasingly likely that a new virus will become wide-spread before any remedy is generally available.
It has become clear to many people in the industry that methods for automatically recognizing and eradicating previously unknown or unanalyzed viruses must be developed and installed on individual computers and computer networks. There are a number of articles addressing this problem. An article entitled "Automated Program Analysis for Computer Virus Detection," by W. C. Arnold, et al., IBM Technical Disclosure Bulletin, July 1991, page 415, is directed to analyzing the potential behavior of program objects to determine heuristically whether they may contain computer viruses or similar threats.
An article entitled "The SRI IDES Statistical Anomaly Detector," H. S. Javitz and A. Valdes, Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, pp. 316-326 is directed to a statistical approach to anomaly detection.
An article entitled "Towards a Testbed for Malicious Code Detection," by R. Lo, P. Kerchen, R. Crawford, W. Ho, and J. Crossley, Lawrence Livermore National Lab Report UCRL-JC-105792, 1991, describes static and dynamic analysis tools which have been shown to be effective against certain types of malicious code. This approach represents another form of anomaly detection.
Copending U.S. patent application Ser. No. 08/004,872, filed Jan. 19, 1993, entitled "An Automatic Immune System for Computers and Computer Networks," by W. C. Arnold, et al., cited above in Related Patents and Patent Applications, describes a method by which a computer can detect the presence of and respond automatically to a computer virus or other undesirable software entity. If the computer is connected to others via a network, it can warn its neighbors about that entity and inform them about how to detect it. The invention provides methods and apparatus to automatically detect and extract a signature from an undesirable software entity, such as a computer virus or worm. It further provides methods and apparatus for immunizing a computer system, and also a network of computer systems, against a subsequent infection by a previously unknown and undesirable software entity.
Copending U.S. patent application Ser. No. 08/004,871, filed Jan. 19, 1993, entitled "Methods and Apparatus for Evaluating and Extracting Signatures of Computer Viruses and Other Undesirable Software Entities," by J. O. Kephart, cited above in Related Patents and Patent Applications, describes an automatic computer implemented procedure for extracting and evaluating computer virus signatures. It further provides a statistical computer implemented technique for automatically extracting signatures from the machine code of a virus and for evaluating the probable effectiveness of the extracted signatures in identifying a subsequent instance of the virus.
The described methods of virus detection (modification detection and pattern detection) are based on identifying a viral infection in a stored form of the data--after the infection has already taken place. A different, highly parallel method is required in order to detect the transfer of viral agents across a high-speed communications link where one "looks" for a viral pattern as it flashes past a monitor attached to the bit stream. The Hershey, et al. adaptive, active monitor described in copending U.S. patent application, Ser. No. 08/138,045, filed Oct. 15, 1993, entitled "System and Method for Adaptive, Active Monitoring of a Serial Data Stream Having a Characteristic Pattern," cited above in Related Patents and Patent Applications, is particularly well-suited for scanning of virus signatures in a high-speed communications environment. The prior art with respect to virus detection, virus scanning, signature preparation, virus reporting, etc. is well defined and described as pointed out above. However, the prior art does not teach how to construct a virus scanning apparatus well-suited to very high-speed networks, and more particularly how such an apparatus could be constructed for attachment to a bit stream as a singular entity, integrated within a network-attached device or standing alone, whose purpose is to act as a monitoring and responding device for assuring the integrity and security of a high-speed communications network.
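As a software analogy only, the idea of looking for many signatures at once as data streams past a monitor can be sketched with a multi-pattern finite state machine. The sketch below uses the well-known Aho-Corasick construction, which is swapped in here for illustration and is not asserted to be the method of the Hershey et al. application; the class name `SignatureMonitor` and its `feed` method are hypothetical.

```python
from collections import deque


class SignatureMonitor:
    """Multi-pattern FSM that examines each byte of a stream exactly once."""

    def __init__(self, signatures):
        # signatures: mapping of signature name -> non-empty bytes pattern
        self.goto = [{}]   # per-state transitions: byte value -> next state
        self.fail = [0]    # fallback state when no transition matches
        self.out = [[]]    # signature names recognized on entering a state
        for name, pattern in signatures.items():
            if not pattern:
                raise ValueError("empty signature: " + name)
            state = 0
            for byte in pattern:
                nxt = self.goto[state].get(byte)
                if nxt is None:
                    nxt = len(self.goto)
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[state][byte] = nxt
                state = nxt
            self.out[state].append(name)
        # Breadth-first pass to compute the fallback links.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for byte, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and byte not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(byte, 0)
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]
        self.state = 0      # current FSM state, kept between chunks
        self.position = 0   # number of stream bytes consumed so far

    def feed(self, chunk):
        """Consume one chunk of the stream; return (name, end position) hits."""
        hits = []
        for byte in chunk:
            while self.state and byte not in self.goto[self.state]:
                self.state = self.fail[self.state]
            self.state = self.goto[self.state].get(byte, 0)
            self.position += 1
            for name in self.out[self.state]:
                hits.append((name, self.position))
        return hits
```

For example, a monitor built from two hypothetical signatures and fed the stream in arbitrary chunks reports a match as soon as the last byte of a signature arrives, even when the signature straddles a chunk boundary:

```python
monitor = SignatureMonitor({"A": b"\xde\xad", "B": b"\xad\xbe\xef"})
hits = []
for chunk in (b"\x00\xde", b"\xad\xbe", b"\xef\x00"):
    hits.extend(monitor.feed(chunk))
```

Only the current state and a byte counter are retained between chunks, so nothing from the stream itself needs to be logged.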
3.2 Natural Language Detection
Natural language detection admits a range of applications which include (1) detection of inappropriate word use in a business environment (e.g., 4-letter words), (2) detection of inappropriate discourse within a business environment (e.g., use of company computer resources for conducting personal business), (3) detection of sensitive words such as the words "Company Confidential" that may signal the transmission of clear information that should be encrypted or the words "copyright protected" that may signal a possible violation of copyright law, and (4) detection of clear versus encrypted data that may signal a possible violation of company policy requiring all traffic on a link to be encrypted.
Each of the natural language detection applications has a range of possible actions that may be taken in response to a detected offending pattern.
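Two of the applications listed above, detection of sensitive words and detection of clear versus encrypted data, can be illustrated with the following sketch. The phrase list is taken from the examples given above. The use of byte entropy to distinguish clear text from encrypted data, the 6.0 bits-per-byte threshold, and all function names are assumptions introduced for this example; the source does not specify how such detection is to be performed.

```python
import math
from collections import Counter

# Example phrases from the text above; a real list would be site-specific.
SENSITIVE_PHRASES = ("company confidential", "copyright protected")


def find_sensitive_phrases(text, phrases=SENSITIVE_PHRASES):
    """Return the sensitive phrases that occur in text, ignoring case."""
    lowered = text.lower()
    return [phrase for phrase in phrases if phrase in lowered]


def byte_entropy(data):
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_clear_text(data, threshold=6.0):
    """Heuristic: well-encrypted data has entropy near 8 bits per byte.

    The threshold is an assumed value, and the test is only meaningful
    for samples of at least a few hundred bytes.
    """
    return byte_entropy(data) < threshold
```

A monitor on a link that is required to carry only encrypted traffic could, under these assumptions, treat a sample for which `looks_like_clear_text` returns true as a possible policy violation.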
3.3 Intrusion Detection
One form of intrusion of particular concern to network security is an adversary who attempts to gain access to a system by issuing repeated login requests. In this case, intrusion detection is aimed at detecting a higher-than-normal frequency of login sequences, indicating that someone is repeatedly attempting to log in by guessing userids and passwords. In this security application, one does not merely detect the presence of a pattern but the presence of a higher-than-normal frequency of patterns.
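Detecting a higher-than-normal frequency of a pattern, as opposed to its mere presence, might be sketched with a sliding time window. The class name `LoginRateDetector`, the notion of "normal" as a fixed maximum count per window, and the parameter values in the usage example are assumptions made for illustration.

```python
from collections import deque


class LoginRateDetector:
    """Flags a higher-than-normal frequency of detected login sequences."""

    def __init__(self, max_attempts, window_seconds):
        self.max_attempts = max_attempts      # assumed "normal" ceiling
        self.window_seconds = window_seconds  # length of the sliding window
        self.timestamps = deque()             # times of recent detections

    def record(self, timestamp):
        """Record one detected login sequence at the given time in seconds.

        Returns True when the number of detections inside the window
        exceeds the configured maximum.
        """
        self.timestamps.append(timestamp)
        # Discard detections that have fallen out of the window.
        while timestamp - self.timestamps[0] > self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_attempts
```

With a ceiling of three login sequences per 60 seconds, a fourth sequence within the window raises the alarm, while a later isolated sequence does not:

```python
detector = LoginRateDetector(max_attempts=3, window_seconds=60.0)
alarms = [detector.record(t) for t in (0.0, 10.0, 20.0, 30.0, 200.0)]
# alarms == [False, False, False, True, False]
```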
Most security applications (including virus detection, natural language detection, and intrusion detection) consist of a detection step and a response step. The Hershey FSM information monitoring means described in copending U.S. patent application Ser. No. 08/138,045, cited above, is particularly suited as a pattern detection means in a high-speed communication environment. Yet for the Hershey FSM information monitoring means to be well-suited as a security device in a high-speed communication environment, it must be adapted to search for patterns particular to security applications and it must be extended to provide a capability for responding, in appropriate ways, to detected patterns. Such a security device (hereinafter called a security agent) must provide both an information monitoring function and a real-time responding function.