The Internet is a worldwide network of computers and computer networks arranged to allow the easy and robust exchange of information between users of computers. Hundreds of millions of people around the world have access to computers connected to the Internet via Internet Service Providers (ISPs). Content providers place multimedia information, i.e. text, graphics, sounds, and other forms of data, at specific locations on the Internet referred to as websites. The combination of all the websites and their corresponding webpages on the Internet is generally known as the World Wide Web (WWW) or simply the Web.
Websites may be created using HyperText Markup Language (HTML) to generate a standard set of tags that define how the webpages for the website are to be displayed. Users of the Internet may access content providers' websites using software known as an Internet browser, such as MICROSOFT INTERNET EXPLORER or NETSCAPE NAVIGATOR. After the browser has located the desired webpage, it requests and receives information from the webpage, typically in the form of an HTML document, and then displays the webpage content for the user. The user may then view other webpages at the same website or move to an entirely different website using the browser.
Websites allow businesses and individuals to share their information with a large number of Internet users. Further, many products and services are offered for sale on the Internet, thus elevating the Internet to an essential tool of commerce.
Electronic mail or email is another important part of the Internet. Email messages may contain, for example, text, images, links, and attachments. Email is one of the most widely used methods of communication over the Internet due to the variety of data that may be transmitted, large number of available recipients, speed, low cost and convenience.
Email messages may be sent, for example, between friends, family members, or coworkers, thereby substituting for traditional letters and office correspondence in many cases. This is possible because the Internet places very few restrictions on who may send emails, how many emails may be transmitted, and who may receive them. The only real hurdle to sending an email is that the sender must know the email address (also called a network mailbox) of the intended recipient.
Email messages travel across the Internet, typically passing from server to server, at amazing speeds achievable only by electronic data. The Internet provides the ability to send an email anywhere in the world, often in less than a few seconds. Delivery times are continually being reduced as the Internet's ability to transfer electronic data improves.
Most Internet users find emails to be much more convenient than traditional mail. Traditional mail requires stamps and envelopes to be purchased and a supply maintained, while emails do not require the costs and burden of maintaining a supply of associated products. Emails may also be sent with the click of a few buttons, while letters typically need to be transported to a physical location, such as a mailbox, before being sent.
Once a computer and an Internet connection have been purchased, there are typically few additional costs associated with sending emails. This remains true even if millions of emails, or more, are sent by the same user. Emails thus have the extraordinary power of allowing a single user to send one or more messages to a very large number of people at an extremely low cost.
The Internet has become a very valuable tool for business and personal communications, information sharing, commerce, etc. However, some individuals have abused the Internet. Among such abuses are phishing, spam, and posting of illegal content on a website (e.g. child pornography). Phishing is the luring of sensitive information, such as passwords, credit card numbers, bank accounts and other personal information, from an Internet user by masquerading as someone trustworthy with a legitimate need for such information. Spam, or unsolicited email, is the flooding of the Internet with many copies of an identical or nearly identical message, in an attempt to force the message on people who would not otherwise choose to receive it. Most spam is commercial advertising, often for dubious products, get-rich-quick schemes, or quasi-legal services.
A single spam message received by a user uses only a small amount of the user's email account's allotted disk space, requires relatively little time to delete, and does little to obscure the messages desired by the user. Even a small number of spam messages, while annoying, would cause relatively few real problems. However, the volume of spam transmitted over the Internet is growing at an alarming rate. A large number of spam messages can fill a user's email account's allotted disk space, thereby preventing the receipt of desired emails. A large number of spam messages can also take a significant amount of time to delete and can even obscure the presence of desired emails in the user's email account.
Spam currently comprises such a large portion of Internet communications that it actually causes data transmission problems for the Internet as a whole. Spam creates data log jams, thereby slowing the delivery of more desired data through the Internet. The larger volume of data created by spam also requires Internet providers to buy larger and more powerful, i.e. more expensive, equipment to handle the additional data flow caused by the spam.
Spam has a very poor response rate compared to other forms of advertisement. However, since almost all of the costs/problems for transmitting and receiving spam are absorbed by the recipient of the spam and the providers of the hardware for the Internet, spam is nevertheless commercially viable for a spammer due to the extremely low cost of transmitting the spam.
Various techniques are used to combat Internet abuses, among them secure certificates, spam filtering, and email challenge-response systems. Before issuing a secure certificate, a Certification Authority usually authenticates the owner of the domain name; the certificate then allows the owner of the domain name to employ an encryption protocol, e.g. SSL (Secure Sockets Layer), for Internet communications. Spam filtering may utilize keywords, various probability algorithms, or white and/or black lists of email addresses, domain names, and/or IP (Internet Protocol) addresses.
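As a rough illustration of list-based and keyword-based filtering, the following Python sketch checks a message against a white list, a black list, and a keyword set. The class name, the field names, the threshold of two keyword hits, and the rule that list entries take precedence over content are all assumptions made for this example; they are not taken from any particular filtering product.

```python
from dataclasses import dataclass, field


@dataclass
class SpamFilter:
    """Toy filter combining white/black lists with keyword matching (illustrative only)."""
    whitelist: set = field(default_factory=set)  # trusted sender addresses or domains
    blacklist: set = field(default_factory=set)  # blocked addresses, domains, or IP addresses
    keywords: set = field(default_factory=set)   # lowercase words that suggest spam
    threshold: int = 2                           # assumed number of keyword hits that flags a message

    def classify(self, sender: str, source_ip: str, body: str) -> str:
        sender = sender.lower()
        domain = sender.rsplit("@", 1)[-1]
        # List checks run first: an explicit listing overrides content analysis.
        if sender in self.whitelist or domain in self.whitelist:
            return "accept"
        if sender in self.blacklist or domain in self.blacklist or source_ip in self.blacklist:
            return "reject"
        # Count how many distinct suspicious keywords appear in the body.
        hits = len(set(body.lower().split()) & self.keywords)
        return "reject" if hits >= self.threshold else "accept"
```

A probability-based filter would replace the simple keyword count with a score learned from previously classified messages; that variant is not shown here.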
Below are a few examples of the systems (some reputation-based) that combat spam.
The SENDERBASE system keeps track of the amount of email messages originating from various domain names and IP addresses. IRONPORT SYSTEMS INC., a company that maintains SENDERBASE.ORG, explains how it works in this example: “If a sender has high global volumes of mail—say 200 Million messages per day—from a network of 5 different domains and 1,700 IP addresses that have only been sending mail for 15 days yet have a high end user complaint rate and they don't accept incoming mail, they will have a very low reputation score [. . . ]. If a sender is a Fortune 500 company, they will likely have much more modest global email volumes—say 500,000 messages per day—will have a smaller number of IPs and domains with a long sending history, they will accept incoming email and have low (or zero) end user complaint rates.”
The Bonded Sender Program maintains a white list-like service. The participants of the service must adhere to the rules and post a bond to be included on the white list.
SPAMCOP maintains a black list of IP addresses and allows users to report spam to a centralized database.
Multiple solutions have been created for establishing “societies” of trusted users. Some solutions keep track of user reputation or trust level.
CLOUDMARK, Inc. provides spam filtering and allows users to block or unblock messages manually. The users' votes on messages (blocking and unblocking) are reported to a centralized database, allowing for better spam filtering by reducing the number of false positives. Each CLOUDMARK user is assigned a reputation (trust rating). If a malicious user unblocks a spam message while a large number of other users block it, the malicious user's reputation will go down. If a user votes in line with the rest of the users, her/his reputation rises.
VERISIGN, Inc. maintains a list of domain names that were issued a VERISIGN SSL digital certificate, the so-called “Verified Domains List.” The company plans to make the list accessible to third parties.
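The general idea of vote-based reputation can be sketched in Python as follows. This is not CLOUDMARK's actual algorithm, which is not described here; the function name, the default rating of 0.5, the fixed step of 0.1, and the trust-weighted majority rule are all hypothetical choices made only to show how agreeing with the consensus could raise a rating and disagreeing could lower it.

```python
from collections import Counter


def update_reputations(votes: dict, reputations: dict, step: float = 0.1) -> str:
    """Adjust each voter's trust rating after one message has been voted on.

    votes maps user -> "block" or "unblock"; reputations maps user -> rating in [0, 1].
    Returns the trust-weighted consensus verdict for the message.
    """
    weight = Counter()
    for user, vote in votes.items():
        # Votes from more trusted users count for more (assumed weighting).
        weight[vote] += reputations.get(user, 0.5)
    consensus = "block" if weight["block"] >= weight["unblock"] else "unblock"
    for user, vote in votes.items():
        current = reputations.get(user, 0.5)
        delta = step if vote == consensus else -step
        # Keep every rating inside the [0, 1] range.
        reputations[user] = min(1.0, max(0.0, current + delta))
    return consensus
```

With three users voting "block" and one voting "unblock", all starting from the same rating, the consensus is "block", the three agreeing users gain trust, and the dissenting user loses it.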
Some systems suggest publishing reputation data in the DNS (Domain Name System) records, e.g. Mailbox Reputation Network.
For reputation-based systems to work properly, the sender's email address, or at least its domain name part, should be correct. Malicious users often forge (spoof) the sender's email address when they send out spam, viruses, or phishing email messages. Among the solutions to this problem are MICROSOFT's Sender ID and YAHOO's DomainKeys. The Sender ID proposal envisions publishing the IP addresses of the sender's outgoing mail servers in the DNS records of the sender's domain. This allows the receiver of an email message to compare the originating IP address of the email with the IP addresses published in the DNS. If they do not match, the sender's email address was likely forged. The DomainKeys proposal utilizes public-private key infrastructure. The sender publishes its public key in the DNS records and digitally signs outgoing email messages with its private key. The receiver can validate the sender's signature using the sender's public key published in the DNS records.
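The IP comparison at the heart of the Sender ID approach can be sketched in Python as below. The dictionary stands in for DNS lookups, and its name, the record layout, and the example domain and addresses are hypothetical; a real implementation would query the DNS and parse the published record format, which this sketch does not attempt.

```python
import ipaddress

# Hypothetical stand-in for DNS: domain -> networks the domain says may send its mail.
PUBLISHED_SENDERS = {
    "example.com": ["192.0.2.0/28", "198.51.100.25/32"],
}


def sender_ip_is_authorized(sender_address: str, connecting_ip: str,
                            records=PUBLISHED_SENDERS) -> bool:
    """Return True if connecting_ip lies inside a network published for the sender's domain."""
    domain = sender_address.rsplit("@", 1)[-1].lower()
    ip = ipaddress.ip_address(connecting_ip)
    # A domain with no published record authorizes nothing in this sketch.
    return any(ip in ipaddress.ip_network(net) for net in records.get(domain, []))
```

A receiver could treat a False result as a sign that the sender's address was forged and lower the message's standing accordingly, rather than rejecting it outright.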
A common mechanism for providing increased security is to encrypt transactions using digital certificates (also known as secure certificates). One widely used security protocol is the Secure Sockets Layer (SSL) protocol, which uses a hybrid public-key system in which public-key cryptography is used to allow a client and a server to securely agree on a secret session key.
SSL is a networking protocol developed by Netscape Communications Corp. and RSA Data Security, Inc. to enable secure network communications in a non-secure environment. More particularly, SSL is designed to be used in the Internet environment, where it operates as a protocol layer above the TCP/IP (Transmission Control Protocol/Internet Protocol) layers. The application code then resides above SSL in the networking protocol stack. After an application (such as an Internet browser) creates data to be sent to a peer in the network, the data is passed to the SSL layer, where various security procedures are performed on it, and the SSL layer then passes the transformed data to the TCP layer. On the receiver's side of the connection, after the TCP layer receives incoming data, it passes that data upward to the SSL layer, where procedures are performed to restore the data to its original form. That restored data is then passed to the receiving application. The SSL protocol is described in U.S. Pat. No. 5,657,390 entitled “Secure Socket Layer Application Program Apparatus and Method.” Multiple improvements to the SSL protocol were made in the Transport Layer Security (TLS) protocol, which is intended to gradually replace SSL.
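The layering described above, with the application on top, the security layer in the middle, and TCP underneath, can be sketched with Python's standard socket and ssl modules. The function names are hypothetical, the host passed in is the caller's choice, and current versions of the ssl module negotiate TLS rather than the original SSL versions; the sketch only shows where the security layer sits in the stack.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side context that verifies the server certificate and hostname."""
    return ssl.create_default_context()


def fetch_head(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, layer TLS on top of it, and send one HTTP request."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as tcp_sock:
        # wrap_socket performs the handshake; afterwards application data is
        # encrypted on send and decrypted on receive by the security layer.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            return tls_sock.recv(4096)
```

The application code reads and writes plain bytes on the wrapped socket and never handles ciphertext itself, which mirrors the way the application layer resides above the security layer in the protocol stack.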
The protocols underlying the Internet (TCP/IP, for example) were not designed to provide secure data transmission. The Internet was originally designed with the academic and scientific communities in mind, and it was assumed that users of the network would be working in a non-adversarial, cooperative manner. As the Internet began to expand into a public network, usage outside these communities was relatively limited, with most of the new users located in large corporations. These corporations had the computing facilities to protect their users' data with various security procedures, such as firewalls, that did not require security to be built into the Internet itself. In the past several years, however, Internet usage has skyrocketed. Millions of people now use the Internet and the Web on a regular basis. These users perform a wide variety of tasks, from exchanging electronic mail messages to searching for information to performing business transactions. These users may access the Internet from home, from their cellular phones, or from a number of other environments where security procedures are not commonly available. To support the growth of the Internet as a viable place for doing business, often referred to as “electronic commerce” or simply “e-commerce”, easily accessible and inexpensive security procedures had to be developed. SSL is one popular solution, and is commonly used with applications that send and receive data using the HyperText Transfer Protocol (HTTP). HTTP is the protocol most commonly used for accessing that portion of the Internet referred to as the Web. When HTTP is used with SSL to provide secure communications, the combination is referred to as HTTPS. Non-commercial Internet traffic can also benefit from the security SSL provides. SSL has been proposed for use with data transfer protocols other than HTTP, such as Simple Mail Transfer Protocol (SMTP) and Network News Transfer Protocol (NNTP).
SSL is designed to provide several different but complementary types of security. First is message privacy. Privacy refers to protecting message content from being readable by persons other than the sender and the intended receiver(s). Privacy is provided by using cryptography to encrypt and decrypt messages. SSL uses asymmetric cryptography, also known as public-key cryptography, at least for establishing the connection (the so-called “handshake”). A message receiver can decrypt an encrypted message only if the message creator used the message receiver's public key to encrypt the message and the message receiver uses the corresponding private key to decrypt it.
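The relationship between a public key and its private key can be illustrated with textbook RSA on deliberately tiny numbers. The primes, the exponent, and the function names below are choices made for this example only; the arithmetic is far too small to be secure and omits the padding that real implementations require, so it should be read as a worked illustration and not as how SSL itself is implemented.

```python
# Textbook RSA with tiny primes; illustrative only, never secure.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # used only to derive the private exponent
e = 17                     # public exponent, chosen coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)


def encrypt(message: int, public_key=(e, n)) -> int:
    """Anyone holding the receiver's public key can encrypt."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key=(d, n)) -> int:
    """Only the holder of the matching private key can recover the message."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)
```

Encrypting a small integer with the public pair (e, n) and decrypting the result with the private pair (d, n) returns the original integer, while the ciphertext alone does not reveal it without d.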
Second, SSL provides data integrity for messages being transmitted. Data integrity refers to the ability for a message recipient to detect whether the message content was altered after its creation (thus rendering the message untrustworthy). A message creator passes the message through an algorithm which creates what is called a “message digest”, or a “message authentication code”. The message digest is a large number produced by applying hash functions to the message. A digitally signed digest is sent along with the message. When the message is received, the receiver also processes the message through the same algorithm, creating another digest. If the digest computed by the receiver does not match the digest sent with the message, then it can be assumed that the message contents were altered in some way after the message was created.
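The compute, send, recompute, and compare procedure described above can be sketched with Python's standard hashlib and hmac modules. The sketch uses a keyed digest (HMAC with SHA-256) as one form of message authentication code; the choice of SHA-256, the function names, and the example key are assumptions for illustration and are not drawn from the SSL specification.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a keyed digest (HMAC-SHA256) of the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the digest and compare it with the one received."""
    # compare_digest runs in constant time to avoid leaking where the tags differ.
    return hmac.compare_digest(make_tag(key, message), tag)
```

If even one byte of the message changes in transit, the digest recomputed by the receiver no longer matches the digest sent with the message, and verify returns False.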
The third security feature SSL provides is known as authentication. Communications over the Internet take place as a sequence of electronic signals, without the communicating parties being able to see each other and visually determine with whom they are communicating. Authentication is a technique that helps to ensure that the parties are who they represent themselves to be, whether the party is a human user or an application program. For example, if a human user is buying goods over the Internet using a credit card, it is important for the human user to know that the application waiting on the other end of the connection for his credit card information is really the vendor he believes he is doing business with, and not an impostor waiting to steal his credit card information.
One advantage of SSL is that it is application protocol independent. A higher level protocol can layer on top of the SSL Protocol transparently. Thus, the SSL protocol provides connection security where encryption is used after an initial handshake to define a secret key for use during a session and where the communication partner's identity can be authenticated using, for example, a well known public certificate issuing authority. Examples of such well known Certification Authorities (CA) include Starfield Technologies, Inc. (a subsidiary of The Go Daddy Group, Inc.), RSA Data Security, Inc., VERISIGN, and EQUIFAX.
Authentication is important in establishing the secure connection as it provides a basis for the client to trust that the server, typically identified by its Uniform Resource Locator (URL), is the entity associated with the server public key provided to the client and used to establish the secret session key. As noted above, this authentication may be provided through the use of certificates obtained by the server from one of the well known Certification Authorities. The certificate (such as an X.509 certificate) typically includes an identification of the server (such as its hostname), the server's public key, and a digital signature which is provided by the well known Certification Authority. The digital signature is used by a client receiving the certificate from a server to authenticate the identity of the server before initiating a secured session. In particular, the application on the client initiating the secured communication session, such as an Internet browser, is typically installed with a public key ring including public keys for various well known Certification Authorities that allow the client to verify server certificates issued by these Certification Authorities.
Typically a Certification Authority verifies a subscriber (also known as a requester) before a secure certificate is issued. The verification may include checking the person's identity, address, telephone number, email address, ownership of a domain name, etc. Companies and organizations may be verified by checking if they are properly registered with the appropriate governmental agencies. A Certification Authority may access various databases to verify a person or organization, make phone calls to verify telephone numbers, send email messages to verify email addresses, request copies of a person's ID or of registration documents for companies and organizations, etc.
A Certification Authority may issue various levels (types) of secure certificates. The secure certificate level typically indicates the rigorousness with which the subscriber was verified.