As one method for eliminating spam mail, a mail server that distributes e-mails to recipients stores determination information (for example, a keyword, a sender's address, or a URL) for judging whether or not an e-mail is spam mail. If a received e-mail includes the determination information, the mail server classifies it as spam mail and discards it without delivering it to the user. In addition, a user can set a filter rule on the user's own terminal to filter out e-mails that include a specific address or keyword as spam mail.
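The server-side check described above can be sketched as follows. This is a minimal illustration only, assuming the determination information is held as three sets of lowercase strings; the names `DeterminationInfo`, `is_spam`, and `URL_PATTERN` are hypothetical and do not come from any cited document.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DeterminationInfo:
    """Hypothetical container for determination information (all lowercase)."""
    keywords: set = field(default_factory=set)
    sender_addresses: set = field(default_factory=set)
    urls: set = field(default_factory=set)


# Simplified URL matcher, assumed sufficient for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def is_spam(sender: str, body: str, info: DeterminationInfo) -> bool:
    """Return True if the e-mail includes any stored determination information."""
    # 1. Sender address match.
    if sender.lower() in info.sender_addresses:
        return True
    # 2. Keyword match anywhere in the body.
    text = body.lower()
    if any(keyword in text for keyword in info.keywords):
        return True
    # 3. URL match, after stripping trailing punctuation from each URL found.
    found = {url.lower().rstrip(".,)") for url in URL_PATTERN.findall(body)}
    return bool(found & info.urls)
```

A server using such a check would discard an e-mail for which `is_spam` returns `True` and deliver the others; a user-side filter rule could apply the same test with the user's own address and keyword lists.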
For example, Japanese Patent No. 5121828 (corresponding to U.S. Patent Pub. No. 2010/161748) discloses an e-mail processing apparatus. According to this patent, the e-mail processing apparatus extracts outline information indicating appearance features of an e-mail (for example, the number of lines in the e-mail, attached files, the e-mail format, and the language used in the e-mail), excluding the body of the e-mail, the sender and its address, and the recipient and its address. The apparatus requests an external management center to send spam detection information for detecting spam mail based on the extracted outline information, and determines that the e-mail is spam mail if the content of the e-mail matches the spam detection information. If the e-mail is not determined to be spam mail, the apparatus sends the outline information to the external management center to request updated spam detection information.
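The extraction of outline information can be illustrated with a short sketch using Python's standard `email` package. The function name `extract_outline`, the dictionary keys, and the use of the `Content-Language` header as the language indicator are assumptions made for illustration; the cited patent may derive these features differently.

```python
from email import message_from_string
from email.policy import default


def extract_outline(raw: str) -> dict:
    """Collect appearance features only; body text, sender, and recipient are excluded."""
    msg = message_from_string(raw, policy=default)
    # Attachments are counted, but their contents are not inspected.
    attachment_count = sum(1 for _ in msg.iter_attachments())
    return {
        "line_count": len(raw.splitlines()),       # number of lines in the raw e-mail
        "attachment_count": attachment_count,      # attached files
        "format": msg.get_content_type(),          # e.g. text/plain or text/html
        "language": str(msg.get("Content-Language", "unknown")),
    }
```

In the arrangement described above, a dictionary of this kind would be what the apparatus sends to the external management center when requesting spam detection information.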
Also, Japanese Patent Publication No. 2011-90442 discloses an e-mail classification apparatus for reducing the processing load of eliminating spam mail and the operational burden on users. According to this publication, the e-mail classification apparatus extracts a feature vector indicating features of an e-mail from the header information of the e-mail, and uses the feature vector as learning data to create a classification rule for classifying whether or not the e-mail is spam mail.
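The idea of learning a classification rule from header-derived feature vectors can be sketched as follows. The particular header features and the use of a simple perceptron are assumptions chosen for brevity; the text above does not state which features or learning algorithm the cited publication uses, and all function names here are hypothetical.

```python
from email import message_from_string
from email.policy import default


def header_features(raw: str) -> list:
    """Build a feature vector from header information only (illustrative features)."""
    msg = message_from_string(raw, policy=default)
    subject = str(msg.get("Subject", ""))
    sender = str(msg.get("From", ""))
    reply_to = str(msg.get("Reply-To", ""))
    return [
        1.0,                                              # bias term
        float(len(msg.get_all("Received", []))),          # number of relay hops
        1.0 if reply_to and reply_to != sender else 0.0,  # Reply-To differs from From
        1.0 if subject.isupper() else 0.0,                # all-capitals subject
        0.0 if msg.get("Message-ID") else 1.0,            # Message-ID missing
    ]


def predict(weights: list, x: list) -> int:
    """Apply the learned rule: 1 means spam, 0 means not spam."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def train_perceptron(samples: list, epochs: int = 20) -> list:
    """Learn weights from (feature_vector, label) pairs with the perceptron update."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, label in samples:
            error = label - predict(weights, x)
            weights = [w + error * xi for w, xi in zip(weights, x)]
    return weights
```

Because only headers are parsed into a short numeric vector, the per-message cost stays small, which is consistent with the stated aim of reducing the processing load.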
Although conventional methods for detecting and classifying spam mail have been studied, the contents of spam mail change from day to day, and spam mail is sent indiscriminately and in large quantities from unspecified terminals connected to networks. It is therefore difficult to exclude spam mail completely and in real time. Meanwhile, improving the accuracy of detecting and classifying spam mail requires updating the determination information quickly, by processing a large volume of spam mail and extracting from it the determination information for judging spam mail. Therefore, a method for extracting the determination information for judging spam mail quickly and accurately is desired. Furthermore, it is also desired to retrieve information concerning the sender resources of spam mail and to utilize it as determination information.