One major problem facing modern computing systems and communications systems is the prevalence of spam and/or scam electronic mail (e-mail), and/or other messages, that include malicious, unwanted, offensive, or nuisance content, such as, but not limited to: any content that promotes and/or is associated with fraud; any content that includes “work from home” or “be our representative” offers/scams; any content that includes money laundering or so-called “mule spam”; any content that promotes and/or is associated with various financial scams; any content that promotes and/or is associated with any other criminal activity; and/or any content that promotes and/or is associated with harmful and/or otherwise undesirable content, whether illegal in a given jurisdiction or not.
One particularly troublesome, and at times dangerous, form of scam e-mail is the so-called “Nigerian 419” message or “419 message”. A typical 419 message is a form of advance-fee fraud in which the target is persuaded to advance sums of money in the hope of realizing a significantly larger gain. The number “419” refers to the article of the Nigerian Criminal Code (part of Chapter 38: “Obtaining Property by false pretences; Cheating”) dealing with fraud. However, as discussed below, 419 messages are a global problem.
A 419 message scam usually begins with an e-mail, or other message, purportedly sent to a selected recipient, but in most cases actually sent to many recipients, making an offer that would result in a large payoff for the victim. The e-mail's subject line often says something like “From the desk of Mr. [Name]”, “Your assistance is needed”, and so on. The details vary, but the usual story is that a person, often a government or bank employee, knows of a large amount of unclaimed money or gold which he or she cannot access directly, usually because he or she has no right to it. The sums involved are usually in the millions of dollars, and the investor is promised a large share, typically ten to forty percent, for assisting the scam character in retrieving the money. While the vast majority of recipients do not respond to these e-mails, a very small percentage do, and this is often enough to make the fraud worthwhile because many millions of messages can be sent. Invariably, sums of money which are substantial, but very much smaller than the promised profits, are said to be required in advance for bribes, fees, etc. This is the money being stolen from the victim, who thinks he or she is investing to make a huge profit.
419 message scammers often make use of low-volume and/or hand-written e-mail messages, i.e., messages that are not automatically generated, to distribute the scam offer. In addition, 419 messages are often short-lived, i.e., have relatively short distribution times, and are often very similar in content and format to legitimate messages. As a result, identifying 419 messages and quarantining them, or otherwise taking preventative/protective action, is often quite difficult.
Currently, methods and procedures for identifying 419 messages typically involve “off-line” analysis. For instance, many current methods and procedures for identifying 419 messages rely on samples of 419 messages collected at one or more “honeypot” systems. A honeypot system is typically a decoy e-mail system established on a computing system, such as any computing system discussed herein, and/or known in the art at the time of filing, and/or as developed after the time of filing, to receive a large number of e-mails, and/or other messages, sent to decoy e-mail addresses. Generally, the decoy e-mail addresses do not belong, or no longer belong, to a genuine person or entity. Consequently, the e-mails received by the honeypot via the decoy e-mail addresses are typically not legitimate e-mails from legitimate senders. As a result, at a first cut, it generally is assumed that any e-mails sent to the decoy e-mail addresses, and received at the honeypot, are indeed spam. In operation, as the honeypot decoy e-mail addresses become known to spammers, more and more spammers typically add the honeypot decoy e-mail addresses to their user/victim e-mail address databases, and more and more spam e-mails are sent to the honeypot decoy e-mail addresses. Inevitably, a percentage of these spam e-mails are 419 messages and, once identified, these 419 messages are analyzed to identify common potential 419 message parameters. Currently, the potential 419 message parameters are then distributed to one or more security systems, and/or one or more real, live, e-mail systems, and/or one or more user computing systems, and the potential 419 message parameters are used to identify potential 419 messages and/or to initiate one or more actions to protect one or more users and/or user computing systems.
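By way of illustration only, the off-line honeypot workflow described above can be sketched as follows. The text does not specify what form the “potential 419 message parameters” take; this sketch assumes, purely hypothetically, that they are two-word phrases shared by a minimum fraction of the sampled 419 messages. The function names, the thresholds `min_fraction` and `min_hits`, and the choice of word pairs are all assumptions made for the example and are not drawn from any particular security system.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bigrams(tokens):
    """Return the set of adjacent word pairs found in a token list."""
    return {" ".join(pair) for pair in zip(tokens, tokens[1:])}


def extract_parameters(samples, min_fraction=0.5):
    """Off-line step: keep word pairs seen in at least min_fraction of samples.

    `samples` stands in for 419 messages already identified among the
    e-mails collected at the honeypot decoy addresses (an assumption).
    """
    counts = Counter()
    for text in samples:
        counts.update(bigrams(tokenize(text)))
    threshold = min_fraction * len(samples)
    return {phrase for phrase, n in counts.items() if n >= threshold}


def is_potential_419(message, parameters, min_hits=2):
    """Deployment step: flag a message sharing min_hits or more word pairs."""
    hits = bigrams(tokenize(message)) & parameters
    return len(hits) >= min_hits
```

In this sketch, `extract_parameters` would run off-line against the collected samples, and only its output would be distributed to the security systems and/or e-mail systems that call `is_potential_419`. A message format that never reaches the honeypot contributes nothing to the parameter set.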
Another off-line method currently used to identify 419 messages is the use of user input, or another source of input. In some cases, a user of a message system and/or one or more security systems, or some other third party source, provides examples of 419 messages they have received. These 419 messages are then analyzed to identify common potential 419 message parameters. Then, once again, the potential 419 message parameters are distributed to one or more security systems, and/or one or more real, live, e-mail systems, and the potential 419 message parameters are used to identify potential 419 messages and/or to initiate one or more actions to protect one or more users and/or user computing systems.
While the currently used off-line methods and procedures for identifying 419 messages can be effective, these off-line methods suffer from significant time delays between when a 419 message is distributed and when the potential 419 message parameters are used to identify potential 419 messages and/or to initiate one or more actions to protect one or more users and/or user computing systems. This time delay is often on the order of hours, and sometimes on the order of days. This is a significant problem given that the life span of a given 419 message format can be quite short, on the order of minutes, and that every minute between when a 419 message is distributed and when the potential 419 message parameters are used to identify potential 419 messages often means thousands of 419 messages being successfully delivered. In addition, using the current off-line methods and procedures for identifying 419 messages, if a given sample 419 message type is not provided for analysis, either via interception by the honeypot system or from a user or another source, then the given 419 message type is never identified and/or stopped.
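The effect of this time delay can be illustrated with a short calculation. The sketch below assumes, hypothetically, that a given 419 message format is sent at a constant rate for the whole of its life span and that no protection exists until the potential 419 message parameters are deployed. The function names and the figures in the usage note are illustrative assumptions, not measured data.

```python
def delivered_before_protection(rate_per_minute, lifespan_minutes, delay_minutes):
    """Messages delivered before parameters are deployed (constant-rate assumption)."""
    unprotected_minutes = min(lifespan_minutes, delay_minutes)
    return rate_per_minute * unprotected_minutes


def fraction_unprotected(lifespan_minutes, delay_minutes):
    """Share of the message format's life span that passes with no protection."""
    return min(lifespan_minutes, delay_minutes) / lifespan_minutes
```

For example, with an assumed rate of 1,000 messages per minute and an assumed life span of 30 minutes, a delay of 120 minutes leaves the entire distribution unprotected: `delivered_before_protection(1000, 30, 120)` returns 30000 and `fraction_unprotected(30, 120)` returns 1.0. Under the same assumptions, a delay of 5 minutes would reduce the delivered count to 5000.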
As noted above, to a large degree, the significant time delays between when a 419 message is distributed and when the potential 419 message parameters are used to identify potential 419 messages and/or to initiate one or more actions to protect one or more users and/or user computing systems are a result of the fact that current methods for identifying 419 messages rely on off-line analysis. However, these off-line methods are still currently used because live or “real-time” analysis of messages to identify potential 419 messages is currently considered too expensive in terms of capital equipment, hosting costs, processing costs, e.g., the processor time and/or cycles, time lag, i.e., the time lag associated with the processing and analysis, inconvenience costs associated with false positive results, database access and access time, disk access time, Input/Output (I/O) latencies, and/or various other costs associated with implementing a 419 identification system in a live e-mail or message stream/system.
As a result of the situation described above, currently, 419 messages remain extremely difficult to identify and isolate and, therefore, many of these harmful, and at times dangerous, e-mails still find their way to thousands of victims each year. Clearly, this is a far-from-ideal situation for the victims, but it is also a problem for all users of e-mail, who must suffer the delays caused by false positives and/or must be wary of all e-mails, even those of legitimate origin and intent.