Cloud-based computing is a software deployment model that hosts applications as services for users across the internet. Users' computers (also known as clients) in a cloud-based network communicate via conventional means such as email, short message service (SMS) and instant messaging.
In addition to desirable messages (also referred to as HAM), considerable network traffic is generated by unsolicited messages (SPAM) sent indiscriminately to large numbers of recipients (a practice known as spamming). Typically, SPAM messages advertise products and services, request charitable donations, or broadcast political or social commentary. SPAM is usually unwanted by recipients, wastes computing resources and network bandwidth, and reduces the productivity of its recipients. Servers and/or clients therefore include SPAM filters capable of separating SPAM from HAM. A SPAM filter can block or quarantine unknown messages based upon certain criteria, such as the inclusion of an unwarranted character string, or based on a personal review by the receiving client.
The SPAM filter may refer to a white list to check whether messages between a sender-receiver pair have already been explicitly identified as HAM. A white list can be a list of specific elements whose inclusion in a message guarantees that the message will pass the SPAM filter and be delivered. For example, an email white list might allow emails from a particular domain name, messages from identified senders, or messages whose subject contains a specific word or phrase. In a communication system, a white list can contain information about sender-receiver pairs that are allowed to communicate with each other. If the sender-receiver pair is on the white list, the SPAM filter considers the message HAM and transmits it to the recipient. Conversely, a black list contains elements, such as strings or phrases, whose inclusion in a message results in the message being blocked. Black lists are employed extensively to identify SPAM and divert such messages from the receiving client.
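The white-list and black-list checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `classify`, `WHITE_LIST`, and `BLACK_LIST`, the sample addresses and phrases, and the choice to consult the white list before the black list are all hypothetical and introduced only for this example.

```python
# Hypothetical sketch of a list-based SPAM filter; all names and data are illustrative.

# Sender-receiver pairs already identified as HAM (assumed sample data).
WHITE_LIST = {("alice@example.com", "bob@example.com")}

# Strings whose inclusion in a message causes it to be blocked (assumed sample data).
BLACK_LIST = {"free money", "act now"}


def classify(sender: str, receiver: str, body: str) -> str:
    """Return "HAM", "SPAM", or "UNKNOWN" for a single message."""
    # A white-listed sender-receiver pair is treated as HAM without further checks.
    if (sender, receiver) in WHITE_LIST:
        return "HAM"
    # Otherwise, any black-listed string in the body marks the message as SPAM.
    text = body.lower()
    if any(phrase in text for phrase in BLACK_LIST):
        return "SPAM"
    # Neither list matched; a fuller filter might quarantine or apply other criteria.
    return "UNKNOWN"
```

In this sketch a white-listed pair is delivered even if the body contains a black-listed phrase, which mirrors the statement that a white-list match causes the message to pass the filter.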
SPAM filters are, however, not foolproof and can generate a number of false positives and false negatives. Additionally, because SPAM keeps changing, SPAM filters need to be updated periodically. For cloud-based messaging services, the design and learning phases of new SPAM filters require a considerable amount of testing and user feedback. For correct identification of SPAM and HAM, manual inspection of the message is often necessary.
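As a rough illustration of how false positives and false negatives might be counted once messages have been manually labeled, consider the sketch below. It assumes the conventional definitions, which the text does not spell out: a false positive is a HAM message flagged as SPAM, and a false negative is a SPAM message passed as HAM. The function name `error_rates` and the label strings are hypothetical.

```python
# Hypothetical sketch: compare filter verdicts against manually assigned labels.

def error_rates(predicted: list[str], actual: list[str]) -> tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for "HAM"/"SPAM" labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = list(zip(predicted, actual))
    # False positive (assumed definition): HAM that the filter flagged as SPAM.
    false_pos = sum(1 for p, a in pairs if p == "SPAM" and a == "HAM")
    # False negative (assumed definition): SPAM that the filter passed as HAM.
    false_neg = sum(1 for p, a in pairs if p == "HAM" and a == "SPAM")

    ham_total = actual.count("HAM")
    spam_total = actual.count("SPAM")
    # Guard against division by zero when one class is absent from the sample.
    fp_rate = false_pos / ham_total if ham_total else 0.0
    fn_rate = false_neg / spam_total if spam_total else 0.0
    return fp_rate, fn_rate
```

Rates rather than raw counts are used here so that filters evaluated on different traffic volumes can be compared; that choice is an assumption of the example.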
It is highly desirable for cloud-based services, which often use multiple SPAM filters and frequent updates to the SPAM filter lists, to assess objectively the effectiveness of SPAM filters by using real-time traffic.