A domain name usually consists of two or more parts (technically, labels) separated by dots, for example example.com.
The rightmost label conveys the top-level domain (for example, the address www.example.com has the top-level domain com).
Each label to the left specifies a subdivision, or subdomain, of the domain above it. Note: “subdomain” expresses relative dependence, not absolute dependence. For example, example.com is a subdomain of the com domain, and www.example.com is a subdomain of the domain example.com. In theory, this subdivision can go down to 127 levels deep. Each label can contain up to 63 characters, and the whole domain name cannot exceed a total length of 255 characters. In practice, some domain registries may have shorter limits.
A hostname refers to a domain name that has one or more associated IP addresses; that is, the “www.example.com” and “example.com” domains are both hostnames, whereas the “com” domain is not.
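The label structure and length limits described above can be illustrated with a short Python sketch. The function names are hypothetical, and the check counts characters of the textual name exactly as the text states the limits; it is an illustration under those assumptions rather than a complete validator.

```python
MAX_LABEL_LEN = 63   # per-label limit stated in the text
MAX_NAME_LEN = 255   # whole-name limit stated in the text


def split_labels(name: str) -> list[str]:
    """Split a domain name into labels, ignoring one trailing root dot."""
    if name.endswith("."):
        name = name[:-1]
    return name.split(".") if name else []


def check_name(name: str) -> bool:
    """Return True if every label and the whole name fit the stated limits."""
    labels = split_labels(name)
    if not labels:
        return False
    if any(not 0 < len(label) <= MAX_LABEL_LEN for label in labels):
        return False
    return len(name) <= MAX_NAME_LEN


def parent_domains(name: str) -> list[str]:
    """List the name followed by each enclosing domain, ending at the TLD."""
    labels = split_labels(name)
    return [".".join(labels[i:]) for i in range(len(labels))]
```

For instance, parent_domains("www.example.com") lists the name itself, then example.com, then com, mirroring the subdomain relationship described above.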
DNS Servers
The Domain Name System consists of a hierarchical set of DNS servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains “beneath” it. The hierarchy of authoritative DNS servers matches the hierarchy of domains. At the top of the hierarchy stand the root nameservers: the servers to query when looking up (resolving) a top-level domain name (TLD).
Users generally do not communicate directly with DNS. Instead, DNS resolution takes place transparently in client applications such as web browsers, mail clients, and other Internet applications. When an application makes a request that requires a DNS lookup, it sends a resolution request to the local DNS resolver in the local operating system, which in turn handles the communications required.
The DNS resolver likely has a cache containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to one or more designated DNS servers.
When a DNS client needs to look up a name used in a program, it queries DNS servers to resolve the name. Each query message the client sends contains three pieces of information, specifying a question for the server to answer:
A specified DNS domain name, stated as a fully qualified domain name (FQDN).
A specified query type, which can either specify a resource record by type or a specialized type of query operation.
A specified class for the DNS domain name.
For example, the name specified could be the FQDN for a computer, such as “host-a.example.com.”, and the query type specified could be a request for an address (A) resource record by that name. Think of a DNS query as a client asking a question, such as “Do you have any A resource records for a computer named ‘host-a.example.com.’?” When the client receives an answer from the server, it reads and interprets the answered A resource record, learning the IP address for the computer it asked for by name.
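To make the three parts of a question concrete, the following Python sketch encodes a name, a query type, and a class in the standard DNS wire layout: length-prefixed labels ending in a zero byte, followed by two 16-bit integers. The function names are hypothetical, and the message header that precedes the question in a real query is omitted, so this is a partial illustration only.

```python
import struct

QTYPE_A = 1     # type code for an address (A) record
QCLASS_IN = 1   # class code for the Internet (IN) class


def encode_qname(fqdn: str) -> bytes:
    """Encode a name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in fqdn.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_question(fqdn: str, qtype: int = QTYPE_A,
                   qclass: int = QCLASS_IN) -> bytes:
    """Return the question section: name, then 16-bit type and class."""
    return encode_qname(fqdn) + struct.pack("!HH", qtype, qclass)
```

Calling build_question("host-a.example.com.") yields the bytes a client would place in the question section when asking for A records for that name.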
DNS queries resolve in a number of different ways. A client can sometimes answer a query locally using cached information obtained from a previous query. The DNS server can use its own cache of resource record information to answer a query. A DNS server can also query or contact other DNS servers on behalf of the requesting client to fully resolve the name, then send an answer back to the client. This process is known as recursion.
In addition, the client itself can attempt to contact additional DNS servers to resolve a name. In general, the DNS query process occurs in two parts:
A name query begins at a client computer and is passed to a resolver, the DNS Client service, for resolution.
When the query cannot be resolved locally, DNS servers can be queried as needed to resolve the name.
In the initial steps of the query process, a DNS domain name is used in a program on the local computer. The request is then passed to the DNS service for resolution using locally cached information. If the queried name can be resolved, the query is answered and the process is completed. If the query does not match an entry in the cache, the resolution process continues with the client querying a DNS server to resolve the name.
Querying a DNS Server
A positive response can consist of the queried RR or a list of RRs (also known as an RRset) that fits the queried DNS domain name and record type specified in the query message. The resolver passes the results of the query, in the form of either a positive or a negative response, back to the requesting program and caches the response.
How Caching Works
As DNS servers process client queries using recursion or iteration, they discover and acquire a significant store of information about the DNS namespace. This information is then cached by the server.
Caching provides a way to speed the performance of DNS resolution for subsequent queries of popular names, while substantially reducing DNS-related query traffic on the network.
As DNS servers make recursive queries on behalf of clients, they temporarily cache resource records (RRs). Cached RRs contain information obtained from DNS servers that are authoritative for DNS domain names learned while making iterative queries to search and fully answer a recursive query performed on behalf of a client. Later, when other clients place new queries that request RR information matching cached RRs, the DNS server can use the cached RR information to answer them.
When information is cached, a Time-To-Live (TTL) value applies to all cached RRs. As long as the TTL for a cached RR has not expired, a DNS server can continue to cache the RR and use it again when answering matching queries from its clients. Caching TTL values used by RRs in most zone configurations are assigned the Minimum (default) TTL, which is set in the zone's start of authority (SOA) resource record. By default, the minimum TTL is 3,600 seconds (1 hour), but it can be adjusted or, if needed, individual caching TTLs can be set for each RR.
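The TTL behavior described above can be sketched as a small in-memory cache in Python. The class and method names are hypothetical, and the injectable clock is an assumption made so the expiry logic can be exercised without waiting; real resolvers are considerably more involved.

```python
import time


class TtlCache:
    """Tiny in-memory cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # (name, rtype) -> (expiry time, records)

    def put(self, name, rtype, records, ttl):
        """Store records until `ttl` seconds from now."""
        key = (name.lower(), rtype)
        self._store[key] = (self._clock() + ttl, list(records))

    def get(self, name, rtype):
        """Return cached records, or None on a miss or after expiry."""
        key = (name.lower(), rtype)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires, records = entry
        if self._clock() >= expires:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return records
```

A record stored with a TTL of 3,600 seconds is returned for lookups during the following hour and treated as a miss afterwards, at which point the server would query again.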
Other Applications
There are many uses of the domain name system (DNS) besides translating names to IP addresses. For example, mail transfer agents use DNS to find out where to deliver e-mail for a particular address. The domain to mail exchanger mapping provided by DNS MX records tells where to deliver email for a domain.
Sender Policy Framework and DomainKeys, instead of creating their own record types, were designed to take advantage of an existing DNS record type, the TXT record. In these cases the TXT record contains a policy or a public key.
Protocol Details
DNS primarily uses UDP on port 53 to serve requests. Almost all DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. TCP comes into play only when the response data size exceeds 512 bytes, or for such tasks as zone transfer. Some operating systems such as HP-UX are known to have resolver implementations that use TCP for all queries, even when UDP would suffice.
Important categories of data stored in DNS include the following:
An A record or address record maps a hostname to a 32-bit IPv4 address.
An AAAA record or IPv6 address record maps a hostname to a 128-bit IPv6 address.
A CNAME record or canonical name record is an alias of one name to another. The A record to which the alias points can be either local or remote, on a foreign name server. This is useful when running multiple services (such as an FTP server and a web server) from a single IP address. Each service can then have its own entry in DNS (like ftp.example.com. and www.example.com.).
An MX record or mail exchange record maps a domain name to a list of mail exchange servers for that domain.
A PTR record or pointer record maps an IPv4 address to the canonical name for that host. Setting up a PTR record for a hostname in the in-addr.arpa domain that corresponds to an IP address implements reverse DNS lookup for that address.
An NS record or name server record maps a domain name to a list of DNS servers authoritative for that domain. Delegations depend on NS records.
An SOA record or start of authority record specifies the DNS server providing authoritative information about an Internet domain, the email of the domain administrator, the domain serial number, and several timers relating to refreshing the zone.
An SRV record is a generalized service location record.
A TXT record was originally intended to carry arbitrary human-readable text in a DNS record. Since the early 1990s, however, this record has more often been used to carry machine-readable data, such as that specified by RFC 1464, opportunistic encryption data, Sender Policy Framework policies, and DomainKeys public keys.
An NAPTR record (“Naming Authority Pointer”) is a newer type of DNS record that supports regular-expression-based rewriting.
SMTP Background
The Simple Mail Transfer Protocol (SMTP), standardized as RFC 2821, is widely used in most stages of delivering e-mail across the Internet. SMTP is built on the Transmission Control Protocol (TCP) discussed in RFC 1180, and consists of commands, codes, parameters, and data exchanged between clients and servers. A TCP service transmits packets whose headers contain the Internet Protocol (IP) address of the sending host and the receiving host.
Although the SMTP protocol provides for relay through a serial chain of clients and servers, in practice today, the sender client makes a direct connection to the receiver's server. Thus the IP header used to establish the handshake cannot be forged.
The envelope sender email address (sometimes also called the return-path) is used during the transport of the message from mail server to mail server, e.g. to return the message to the sender in the case of a delivery failure. It is usually not displayed to the user by mail programs.
The header sender address of an e-mail message is contained in the “From” or “Sender” header and is what is displayed to the user by mail programs. Generally, mail servers do not care about the header sender address when delivering a message. Spammers can easily forge these.
DNSBL Background
An early and initially successful attempt to control unsolicited bulk messages transmitted by email, commonly called spam, was called RBL. Generally, RBLs can be thought of as lists of IP addresses which have been found to have a history of transmitting spam. There are more precise definitions of RBL and more generic terms which are not historical or trademarked, but common usage refers to queries that check lists of “bad” IP addresses as RBL-like.
Early attempts to block spam started with the development of a “blacklist” of IP addresses known to send spam. This blacklist would be referenced, and any email originating from one of the IP addresses on it would be rejected. The IP address is obtained from the TCP/IP packet information and cannot be forged. As people began to develop larger blacklists and share them amongst themselves, the need arose for a more dynamic method or a centralized blacklist. The answer was what is known as the traditional Remote Black List (RBL) or Domain Name System Black List (DNSBL). A DNSBL is a means by which an Internet site may publish a list of IP addresses that people may want to avoid, in a format which can be easily queried by computer programs on the Internet. The technology is built on top of the Internet Domain Name System (DNS). DNSBLs are chiefly used to publish lists of addresses associated with spamming. Most mail transport agent (mail server) software can be configured to reject or flag messages which have been sent from a site listed on one or more such lists. RBL originated as an abbreviation for “Real-time Blackhole List”; “RBL” was the trademarked name of the first system to use this strategy, the proprietary MAPS DNSBL.
Developers of mail software have adopted configuration parameters named “RBLs” or “RBL domains” even though any DNSBL can be used, not just the MAPS RBL. The term “rejectlist” has also been used, as has Right Hand Side Blacklist (RHSBL), which is similar to a DNSBL but lists domain names rather than IP addresses. The term comes from the “right-hand side” of an email address (the part after the @ sign), which clients look up in the RHSBL. Several services manage and maintain a list of domains used by spammers.
Unfortunately, RHSBL cannot address the growth of bots which has resulted in spammers infecting the domains of legitimate email senders and mixing their spam with non-spam from infected domains.
The first DNSBL was created in 1997 by Paul Vixie and Dave Rand as part of the Mail Abuse Prevention System (MAPS). Initially, there was a list of commands that could be used to program routers so that network operators could “blackhole” all TCP/IP traffic for machines used to send spam or host spam-supporting services, such as a website. The name was a reference to a theoretical physical phenomenon whose gravitational force is intense enough to absorb all incident light and emit no information, the ultimate black box of information theory. Vixie, an influential Internet programmer, network administrator and Chief Technology Officer, was able to install these blackhole routines in key routers so that people would not be able to connect to these machines, even if they wanted to. The purpose of the RBL was not simply to block spam; it was to educate Internet service providers and other Internet sites about spam and related problems, such as open SMTP relays and spamvertising. Before an address would be listed on the RBL, volunteers and MAPS staff would attempt repeatedly to contact the persons responsible for it and get its problems corrected. Such effort was considered an ethical prerequisite to blackholing all network traffic, but it also meant that spammers and spam-supporting ISPs could intentionally delay being put on the RBL.
Later, the RBL was also released in a DNSBL form and Paul Vixie encouraged the authors of sendmail and other mail software to implement RBL clients. These allowed the mail software to query the RBL and reject mail from listed sites on a per mail server basis instead of blackholing all traffic.
Soon after the advent of the RBL, others started developing their own lists with different policies. One of the first was Alan Brown's Open Relay Behavior-modification System (ORBS). This used automated testing to discover and list mail servers running as open mail relays, which are exploitable by spammers to carry their spam. ORBS was controversial at the time because many people felt running an open relay was acceptable, and that scanning the Internet for open mail servers could be abusive. In 2003, a number of DNSBLs came under denial-of-service attacks. Since no party has admitted to these attacks nor been discovered responsible, their purpose is a matter of speculation. However, many observers believe the attacks were perpetrated by spammers in order to interfere with the DNSBLs' operation or hound them into shutting down. In August 2003, the firm Osirusoft, an operator of several DNSBLs including one based on the SPEWS data set, shut down its lists after suffering weeks of near-continuous attack.
It is possible to serve a DNSBL using any general-purpose DNS server software. However this is typically inefficient for zones containing large numbers of addresses, particularly DNSBLs which list entire Classless Inter-Domain Routing netblocks. DNSBL-specific software—such as Michael J. Tokarev's rbldnsd, Daniel J. Bernstein's rbldns, or the DNS Blacklist Plug-In for Simple DNS Plus—is faster, uses less memory, and is easier to configure for this purpose.
The hard part of operating a DNSBL is populating it with addresses. DNSBLs intended for public use usually have specific, published policies as to what a listing means, and must be operated accordingly to attain or keep public confidence.
When a mail server receives a connection from a client, and wishes to check that client against a DNSBL (let's say, dnsbl.example.net), it does more or less the following:
Take the client's IP address (say, 192.168.42.23) and reverse the order of its octets, yielding 23.42.168.192. Append the DNSBL's domain name:
23.42.168.192.dnsbl.example.net.
Look up this name in the DNS as an address (“A”) record. This will return either an address, indicating that the client is listed, or an “NXDOMAIN” (“No such domain”) code, indicating that the client is not.
Optionally, if the client is listed, look up the name as a text record (“TXT” record). Most DNSBLs publish information about why a client is listed as TXT records.
There is an informal protocol for the addresses returned by DNSBL queries which match. Most DNSBLs return an address in the 127.0.0.0/8 IP loopback network. The address 127.0.0.2 indicates a generic listing. Other addresses in this block may indicate something specific about the listing—that it indicates an open relay, proxy, or spammer-owned host.
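The lookup steps and the return-address convention described above can be sketched in Python as follows. The function names are hypothetical, only IPv4 is handled, and the DNS lookup itself is left to whatever resolver the mail server uses; the sketch only builds the query name and interprets a result passed in by the caller.

```python
import ipaddress

LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")
GENERIC_LISTING = ipaddress.IPv4Address("127.0.0.2")


def dnsbl_query_name(client_ip: str, zone: str) -> str:
    """Reverse the IPv4 octets and append the DNSBL zone."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".") + "."


def interpret_dnsbl_answer(answer):
    """Map a lookup result to a rough meaning.

    `answer` is the returned address as a string, or None for NXDOMAIN.
    """
    if answer is None:
        return "not listed"
    addr = ipaddress.IPv4Address(answer)
    if addr not in LOOPBACK_NET:
        return "unexpected answer"
    if addr == GENERIC_LISTING:
        return "listed (generic)"
    return "listed (list-specific code)"
```

With the example values from the text, dnsbl_query_name("192.168.42.23", "dnsbl.example.net") produces 23.42.168.192.dnsbl.example.net. as the name to look up.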
Conventional real-time blackhole list (RBL) filtering comprises prepending a reversed IP address to an RBL domain, querying a Domain Name System (DNS) server, and receiving a result. That result may be used to take action such as blocking an email received from a certain IP address.
Other proposed solutions shift the burden of establishing credibility onto innocent senders. Examples include adding Sender Policy Framework policies or DomainKeys public keys to DNS TXT records.
DomainKeys
In DomainKeys (U.S. Pat. No. 6,986,049, assigned to Yahoo!), the receiving SMTP server uses the name of the domain from which the mail originated, the string “_domainkey”, and a selector from the header to perform a DNS lookup. The returned data includes the domain's public key. The receiver can then decrypt the hash value in the header field and at the same time recalculate the hash value for the mail body that was received, from the point immediately following the “DomainKey-Signature:” header. If the two values match, this cryptographically proves that the mail originated at the purported domain and has not been tampered with in transit. DomainKeys is primarily an authentication technology and does not itself filter spam. It also adds to the computational burden of both sender and receiver in encrypting/decrypting and computing/comparing hash values.
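The DNS lookup step can be illustrated with a short Python sketch that pulls the selector and domain out of a signature header value and forms the name to query. It assumes the header carries semicolon-separated tag=value pairs with the selector in an “s” tag and the domain in a “d” tag; the function names and the sample values are hypothetical, and signature verification itself is not shown.

```python
def parse_tags(header_value: str) -> dict[str, str]:
    """Split a 'tag=value; tag=value' header value into a dictionary."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip()] = value.strip()
    return tags


def domainkey_query_name(selector: str, domain: str) -> str:
    """Form the name under which the public key is looked up."""
    return f"{selector}._domainkey.{domain.strip('.')}."
```

For a header value containing s=mail and d=example.com, the receiver would request the TXT record at mail._domainkey.example.com. to obtain the public key.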
Sender Policy Framework
The Sender Policy Framework (SPF) is another emerging standard pertinent to security. Adopting SPF requires the owner of the example.org domain to designate which machines are authorized to send e-mail whose sender e-mail address ends with “@example.org”. Receivers checking SPF can reject messages from unauthorized machines before receiving the body of the message. SPF uses the authority delegation scheme of the Domain Name System. A syntax defines a policy in a domain's DNS records, typically TXT.
A proposal to merge Microsoft Caller ID and SPF was submitted to the IETF MARID working group. Caller ID and SPF aimed to prevent spoofing by confirming what domain a message came from and thereby increase the effectiveness of spam filters. Under the merged proposal, organizations would have published information about their outgoing e-mail servers, such as IP addresses, in the Domain Name System (DNS) using the industry-standard XML format. The converged specification included testing at both the message transport (SMTP) level, or envelope, as originally proposed in SPF, as well as in the message body headers, as originally proposed in Caller ID. Testing for spoofing at the message transport level was suggested to block some spam messages before they are sent. In cases in which a deeper examination of the message contents is required to detect spoofing and phishing attacks, the Caller ID-style header check would apply. However the MARID working group self-terminated without success.
The main benefit of SPF is to people whose e-mail addresses are forged in the Return-Paths. They receive a large mass of undeserved and worrisome error messages and other auto-replies, making it difficult to use e-mail normally. (Am I infected with a virus, did someone access my computer without authorization, shall I change all my passwords?) If such people use SPF to specify their legitimate sending IPs with a FAIL result for all other IPs, then receivers checking SPF can reject forgeries, possibly reducing the amount of back-scatter. This is an indirect benefit and has not been sufficiently motivating to cause adoption.
SPF may offer advantages beyond potentially helping identify unwanted e-mail. In particular, if a sender provides SPF information, then receivers can use SPF PASS results in combination with a white list to identify known reliable senders.
The Sender Policy Framework (SPF) standard specifies a technical method to prevent sender address forgery. Present implementations of the SPF concept protect the envelope sender address, which is used for the delivery of messages.
SPF allows the owner of an Internet domain to use a special format of DNS TXT records to specify which IP addresses are authorized to transmit e-mail for that domain. SPF allows software to identify and reject forged addresses in the SMTP MAIL FROM (Return-Path), a typical nuisance in e-mail spam. SPF is defined in RFC 4408. In using SPF, domains identify the machines authorized to send e-mail on their behalf. Domains do this by adding additional records to their existing DNS information. Some examples of policies:
TXT v=spf1 include:spf-a.hotmail.com include:spf-b.hotmail.com include:spf-c.hotmail.com include:spf-d.hotmail.com ~all
TXT spf2.0/pra ip4:152.163.225.0/24 ip4:205.188.139.0/24 ip4:205.188.144.0/24 ip4:205.188.156.0/23 ip4:205.188.159.0/24 ip4:64.12.136.0/23 ip4:64.12.138.0/24 ip4:64.12.143.99/32 ip4:64.12.143.100/32 ip4:64.12.143.101/32 ptr:mx.aol.com ?all
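How a receiver might evaluate such a policy can be sketched in Python. This is a deliberately reduced illustration: it handles only the ip4 and all mechanisms of v=spf1 records, skips every mechanism that needs further DNS lookups (include, ptr, and so on), and uses a hypothetical function name and documentation-range addresses rather than any real domain's policy.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_ip4_result(record: str, client_ip: str) -> str:
    """Evaluate the ip4 and all terms of a v=spf1 record, left to right."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF version 1 record
    addr = ipaddress.IPv4Address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.lower() == "all":
            return QUALIFIERS[qualifier]
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if addr in network:
                return QUALIFIERS[qualifier]
        # Other mechanisms need DNS lookups and are skipped in this sketch.
    return "neutral"
```

Given the policy "v=spf1 ip4:192.0.2.0/24 -all", a client at 192.0.2.55 evaluates to pass while any other address evaluates to fail, which is the FAIL result a receiver can use to reject a forgery.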
The format which has been adopted as a standard has been criticized as awkward. The distributed nature of DNS records could be advantageous if widely adopted but has limited value to early converts. SPF requires widespread adoption to yield results, and because of the cost and degree of effort involved it has gained limited penetration. Early adopters have not achieved enough critical mass to attract the mainstream.
One can see that the SPF solution which requires the publishing of IP addresses from which legitimate email can originate for a DOMAIN could eliminate the forged addresses that spammers use in email. SPF, however, requires that each individual domain owner publish such a list. This requires significant time for each of millions of people to adopt. Publishing SPF policies is complex and prone to error. Many DNS service providers do not support it.
What is Needed is
The blacklist solution is objectionable to legitimate email users sharing the same IP addresses used by spammers and makes RBL lists less than ideal by harming innocent users. Increasingly, spam is emitted from bot networks which consist of computers which have been penetrated by malicious senders. The email sent from a bot may contain a mixture of spam caused by the infection and legitimate mail. Unfortunately putting a bot infected IP address on an RBL punishes the victim more than the criminal. It would be desirable to block only the spam emitted from a bot network.
One can appreciate that a single entity might build such a list using techniques not discussed or disclosed in this application. However, if such a list were available, it would be extremely useful to have it available in real time to anyone who wanted to make a query. This could be accomplished using a database, a webpage, or something based on the domain name system (DNS) by those skilled in the art. It can be appreciated that the existing RBL systems cannot support this list because they only allow the lookup of a single IP address and because domains sharing an IP address are thereby indistinguishable.
Therefore it is one objective of this invention to provide an improved system for looking up domains and IP addresses in an efficient manner.
Thus it can be appreciated that what is needed is an efficient way to query a database from anywhere in the Internet, a high performance cachable storage of data which can reply to such queries, and a better way to look up the IP addresses of legitimate email senders so that their email can easily bypass filters. In more general terms, what is needed is a better way to distinguish legitimate email senders from spammers so that their email is efficiently delivered with less latency and resource consumption.
Summary of the Solution
The present solution has three parts which may operate independently or in combination. A general method for requesting a service or reply such as querying a database is disclosed. A general method of operating a database is disclosed. An application of the query-operation method is disclosed for facilitating the transmission of email.
The invention comprises a method for requesting a service such as querying a remote database on the Internet located at a host having a domain name, the method comprising the following steps: appending a suffix containing the host name to a first argument; prepending a second argument as a prefix to the first argument; and sending a DNS query to a DNS resolver comprising questiontype=A, questionname=the fully qualified domain name, and questionclass=IN, wherein prepending and appending include inserting a delimiter to form a fully qualified domain name. The invention further comprises appending at least one additional argument to the fully qualified domain name. The invention further comprises appending an authentication code as an argument whereby a service such as a database can track and control access.
The invention comprises a method for operating a service such as a database comprising the steps of transmitting an IP address to a sender of a DNS query; receiving a fully qualified domain name as the query name in a DNS query from a DNS client; and determining a first query argument and a second query argument from the fully qualified domain name.
The present invention selects email from legitimate senders and facilitates its transmission to receivers more efficiently while reducing the load on spam scanners. The method comprises: querying a database with a set of email parameters, and transmitting email according to the result of the query. The method further comprises transmitting the set of email parameters as concatenated labels in a string. The method further comprises extracting the email parameters by analyzing a TCP/IP header and a MAIL FROM command from an email envelope, where the email parameters comprise at least an IP address of a client and a sender which is at least one of a local-part and a domain. In other words, the argument of the MAIL FROM command correctly includes <local-part@domain>. The set of email parameters comprises “domain” and “IP address”. It may further comprise “local-part”.
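Extracting the sender parameters from the envelope can be sketched in Python as follows. The function name is hypothetical, and the parsing is simplified: it accepts a single command line, ignores any parameters after the closing angle bracket, and does not handle source routes or quoted local-parts.

```python
def parse_mail_from(command: str):
    """Return (local_part, domain) from a line like 'MAIL FROM:<a@b.org>'."""
    head, _, rest = command.partition(":")
    if head.strip().upper() != "MAIL FROM":
        raise ValueError("not a MAIL FROM command")
    start, end = rest.find("<"), rest.find(">")
    if start == -1 or end < start:
        raise ValueError("missing <reverse-path>")
    path = rest[start + 1:end]
    if path == "":
        return ("", "")  # null reverse-path "<>", used for bounces
    if "@" not in path:
        raise ValueError("reverse-path has no domain")
    local_part, _, domain = path.rpartition("@")
    return (local_part, domain.lower())
```

The client IP address, the other required parameter, would come from the accepted TCP connection rather than from the command text.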
In an embodiment, the query comprises the step of an RBL-style lookup over the domain name system (DNS). However, the content of the query is at least the domain of the email sender concatenated to the IP address of the client sending the MAIL FROM command. The domain or the entire email address is extracted from the argument of the MAIL FROM command. The method of the invention further comprises continuing the session to transfer the message body only if the reply from the reputation server determines the sender is not a spammer. In one embodiment, the database holds information on senders whose history does not include spam. In another embodiment, the email is transferred to an email filter for further analysis. In an alternate embodiment, the database holds information on senders who have a spam history, causing the email to be blocked. The invention is distinguished from conventional approaches which rely only on IP addresses.
The invention comprises transmitting the set of email parameters (sender domain or sender email address and the IP address of the sending email host) and receiving a status from a database. In an embodiment, concatenating the domain and IP address as labels to an RBL-like query elicits a status from a database.
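One possible reading of this concatenation, on both the querying side and the service side, is sketched below in Python. Several details are assumptions made only for illustration and are not stated in the text: the sender domain is placed as the leading prefix, the client IP address follows with its octets reversed as in a DNSBL lookup, and the service host name rep.example.net and the function names are hypothetical.

```python
import ipaddress


def build_query_name(sender_domain: str, client_ip: str,
                     service_host: str) -> str:
    """Join sender domain, reversed client IP, and service host with dots."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    reversed_ip = ".".join(reversed(octets))
    name = (f"{sender_domain.strip('.').lower()}.{reversed_ip}."
            f"{service_host.strip('.').lower()}.")
    labels = name.rstrip(".").split(".")
    if len(name) > 255 or any(len(label) > 63 for label in labels):
        raise ValueError("query name exceeds DNS length limits")
    return name


def parse_query_name(qname: str, service_host: str):
    """Service side: recover (sender_domain, client_ip) from a query name."""
    suffix = "." + service_host.strip(".").lower() + "."
    qname = qname.lower()
    if not qname.endswith(suffix):
        raise ValueError("query is not for this service")
    labels = qname[:-len(suffix)].split(".")
    if len(labels) < 5:
        raise ValueError("expected a domain plus four IP labels")
    client_ip = ".".join(reversed(labels[-4:]))
    ipaddress.IPv4Address(client_ip)  # raises ValueError if malformed
    return ".".join(labels[:-4]), client_ip
```

Under these assumptions, mail from example.org arriving from 192.0.2.25 would be checked by an A query for example.org.25.2.0.192.rep.example.net., and the service can split that name back into the two arguments before consulting its database.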