There has been a significant rise in threats which can lead to a processing system being infected with malware. One solution to this problem is the use of software which attempts to detect when a threat occurs in a processing system and to restrict the threat from performing malicious steps in the processing system. However, the drawback to such software is that it must be configured to detect the specific threat. If the threat is slightly varied by a malware author, the software may be unable to detect the threat and thus leave the processing system open to attack.
One solution to this drawback has been to collect information about the source of a threat and to then warn or block a user of the processing system from being exposed to the source of the threat. One implementation of this solution has been the use of “honeyclients” to perform a crawling analysis of entities such as webpages and the like so as to collect information about threats, wherein the results of the analysis are provided to users so that the users can avoid visiting particular malicious websites and the like.
However, this form of analysis is highly resource intensive, particularly due to the number of webpages that now exist on the Internet and the crawling technique which this solution implements. Additionally, the results of the analysis can be outdated. For example, a particular malicious website may be configured to modify or remove its malicious portion within a few hours or less. In such a situation, the honeyclient may analyse the website during a period of time when the malicious portion has been removed, such that the website avoids detection by the honeyclient. Therefore, such a system is difficult, if not impossible, to implement accurately due to the high variation that occurs with threats, particularly on such a short time scale.
Therefore, there is a need for a method, system and computer program product which enables data to be captured relating to a threat which overcomes or ameliorates one or more of the above-mentioned problems.
As used herein a “threat” includes malicious software, also known as “malware” or “pestware”, which includes software that is included or inserted in a part of a processing system for a harmful purpose. The term threat should be read to include possible, potential and actual threats. Types of malware can include, but are not limited to, malicious libraries, viruses, worms, Trojans, adware, malicious active content and denial of service attacks. In the case of invasion of privacy for the purposes of fraud or theft of identity, malicious software that passively observes the use of a computer is known as “spyware”.
A hook (also known as a hook procedure or hook function) generally refers to a function provided by a software application that receives certain data before the normal or intended recipient of the data. A hook function can thus examine or modify certain data before passing on the data. Therefore, a hook function allows a software application to examine data before the data is passed to the intended recipient.
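By way of illustration only, the following minimal sketch shows a hook function receiving data before the intended recipient, examining the data and then passing on a modified copy. The names `intended_recipient`, `install_hook` and `logging_hook` are hypothetical and are not drawn from any particular operating system or application.

```python
from typing import Callable, List


def intended_recipient(data: str) -> str:
    """Stands in for the normal recipient of the data."""
    return f"processed:{data}"


def install_hook(
    recipient: Callable[[str], str],
    hook: Callable[[str], str],
) -> Callable[[str], str]:
    """Return a callable that routes data through the hook first."""

    def hooked(data: str) -> str:
        # The hook sees the data before the recipient does.
        return recipient(hook(data))

    return hooked


seen: List[str] = []


def logging_hook(data: str) -> str:
    seen.append(data)    # examine: record the data exactly as received
    return data.strip()  # modify: pass on a trimmed copy


hooked_recipient = install_hook(intended_recipient, logging_hook)
result = hooked_recipient("  hello ")
```

In this sketch the recipient only ever receives the trimmed value, while the hook retains a record of the data as originally supplied.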
An API (“Application Programming Interface”) hook (also known as an API interception), a type of hook, refers to a callback function provided by an application that replaces functionality provided by an operating system's API. An API generally refers to an interface that is defined in terms of a set of functions and procedures, and enables a program to gain access to facilities within an application. An API hook can be inserted between an API call and an API procedure to examine or modify function parameters before passing parameters on to an actual or intended function. An API hook may also choose not to pass on certain types of requests to an actual or intended function.
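As a non-limiting sketch of this arrangement, the following example places an interceptor between an API call and an API procedure. The procedure `api_write_file`, the helper `install_api_hook` and the policy in `guard_system_paths` are assumptions made for illustration and do not correspond to any real operating system API. The hook may modify the parameters, or may decline to pass the request on.

```python
import functools
from typing import Any, Callable, Dict, Optional, Tuple

Call = Tuple[Tuple[Any, ...], Dict[str, Any]]


def api_write_file(path: str, content: str) -> str:
    """Hypothetical API procedure; it only reports what it would do."""
    return f"wrote {len(content)} bytes to {path}"


def install_api_hook(
    api_procedure: Callable[..., Any],
    hook: Callable[[Tuple[Any, ...], Dict[str, Any]], Optional[Call]],
) -> Callable[..., Any]:
    """Insert a hook between the API call and the API procedure."""

    @functools.wraps(api_procedure)
    def intercepted(*args: Any, **kwargs: Any) -> Any:
        decision = hook(args, kwargs)
        if decision is None:
            # The hook chose not to pass the request on.
            return None
        new_args, new_kwargs = decision
        return api_procedure(*new_args, **new_kwargs)

    return intercepted


def guard_system_paths(args: Tuple[Any, ...], kwargs: Dict[str, Any]) -> Optional[Call]:
    """Illustrative policy: block writes under /system/, lower-case other paths."""
    path = args[0]
    if path.startswith("/system/"):
        return None
    return (path.lower(),) + args[1:], kwargs


hooked_write = install_api_hook(api_write_file, guard_system_paths)
allowed = hooked_write("/Tmp/Note.txt", "abc")
blocked = hooked_write("/system/hosts", "x")
```

Here `allowed` holds the result of the procedure called with the modified path, and `blocked` is `None` because the request never reached the procedure.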
A process is at least one of a running software program or other computing operation, or a part of a running software program or other computing operation, that performs a task.
A hook chain is a list of pointers to special, application-defined callback functions called hook procedures. When a message occurs that is associated with a particular type of hook, the operating system passes the message to each hook procedure referenced in the hook chain, one after the other. The action of a hook procedure can depend on the type of hook involved. For example, the hook procedures for some types of hooks can only monitor messages, whereas others can modify messages or stop their progress through the chain, restricting them from reaching the next hook procedure or a destination window.
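The behaviour of a hook chain can be sketched as follows, purely as an illustration. The dispatcher `run_hook_chain` is a hypothetical stand-in for the operating system, a returned `None` is an assumed convention for stopping a message, and `str.upper` stands in for the destination window.

```python
from typing import Callable, List, Optional

# A hook procedure returns the (possibly modified) message, or None to stop it.
HookProcedure = Callable[[str], Optional[str]]


def run_hook_chain(
    message: str,
    hook_chain: List[HookProcedure],
    destination: Callable[[str], str],
) -> Optional[str]:
    """Pass the message to each hook procedure in turn, then to the destination."""
    for hook_procedure in hook_chain:
        result = hook_procedure(message)
        if result is None:
            # Progress through the chain is stopped here.
            return None
        message = result
    return destination(message)


observed: List[str] = []


def monitor(message: str) -> Optional[str]:
    observed.append(message)  # monitor only; the message is unchanged
    return message


def redact(message: str) -> Optional[str]:
    return message.replace("secret", "[redacted]")  # modify the message


def block_quit(message: str) -> Optional[str]:
    return None if message == "quit" else message  # stop selected messages


chain: List[HookProcedure] = [monitor, redact, block_quit]
delivered = run_hook_chain("secret key", chain, str.upper)
stopped = run_hook_chain("quit", chain, str.upper)
```

The first message reaches the destination in modified form, whereas the second is observed by the monitoring procedure but is stopped before reaching the destination.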
A kernel refers to the core part of an operating system, responsible for resource allocation, low-level hardware interfaces, security, etc.
An interrupt is at least one of a signal to a processing system that stops the execution of a running program so that another action can be performed, or a circuit that conveys a signal stopping the execution of a running program.
A system registry is a database used by modern operating systems, for example Windows™ platforms. The system registry includes information needed to configure the operating system. The operating system refers to the registry for information ranging from user profiles, to which applications are installed on the machine, to what hardware is installed and which ports are registered.
A hash function (for example a message digest algorithm such as MD5) can be used for many purposes, for example to establish whether a file transmitted over a network has been tampered with or contains transmission errors. A hash function applies a mathematical rule to a file to generate a hash value, i.e. a number, usually between 128 and 512 bits in length. This number is then transmitted with the file to a recipient who can reapply the mathematical rule to the file and compare the resulting number with the original number. If the two numbers differ, the file has been altered since the hash value was generated.
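The comparison described above can be sketched as follows using Python's standard `hashlib` module. SHA-256 is used here in place of MD5 as an assumption of this sketch; it produces a 256-bit hash value. The helper names `hash_value` and `is_unmodified` are hypothetical.

```python
import hashlib


def hash_value(data: bytes) -> str:
    """Apply the hash function to the data and return the hash value as hex."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(received: bytes, transmitted_hash: str) -> bool:
    """Reapply the hash function and compare with the transmitted hash value."""
    return hash_value(received) == transmitted_hash


# Sender side: compute the hash value and transmit it with the file.
original_file = b"example file contents"
transmitted_hash = hash_value(original_file)

# Recipient side: one copy arrives intact, another arrives altered.
intact_copy = b"example file contents"
altered_copy = b"example file c0ntents"

intact_ok = is_unmodified(intact_copy, transmitted_hash)
altered_ok = is_unmodified(altered_copy, transmitted_hash)
```

In this sketch `intact_ok` is true and `altered_ok` is false, since changing a single character of the file yields a different hash value.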
An entity can include, but is not limited to, a file, an object, a class, a collection of grouped data, a library, a variable, a process, and/or a device.
In a networked information or data communications system, a user has access to one or more terminals which are capable of requesting and/or receiving information or data from local or remote information sources. In such a communications system, a terminal may be a type of processing system, computer or computerised device, personal computer (PC), mobile, cellular or satellite telephone, mobile data terminal, portable computer, Personal Digital Assistant (PDA), pager, thin client, or any other similar type of digital electronic device. The capability of such a terminal to request and/or receive information or data can be provided by software, hardware and/or firmware. A terminal may include or be associated with other devices, for example a local data storage device such as a hard disk drive or solid state drive.
An information source can include a server, or any type of terminal, that may be associated with one or more storage devices that are able to store information or data, for example in one or more databases residing on a storage device. The exchange of information (i.e. the request and/or receipt of information or data) between a terminal and an information source, or other terminal(s), is facilitated by a communication means. The communication means can be realised by physical cables, for example a metallic cable such as a telephone line, semi-conducting cables, electromagnetic signals, for example radio-frequency signals or infra-red signals, optical fibre cables, satellite links or any other such medium or combination thereof connected to a network infrastructure.
The reference in this specification to any prior publication (or information derived from the prior publication), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that the prior publication (or information derived from the prior publication) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.