The exemplary embodiment relates to electronic mail messages and finds particular application in connection with a system and method for detection of missing attachments.
When sending electronic mail messages (emails), the sender can attach one or more attachments to the message. The attachments can be documents, other email messages, and the like. In the body of the email, the sender may make a textual reference to the attachments. The email and its attachments are sent to a designated recipient. A problem arises in that an email is sometimes sent before the intended attachments have been attached to it. Current email applications may therefore include a missing attachment detector that warns the user writing the email that an intended attachment may have been omitted. The detector looks for a given set of keywords (such as “attached,” “document,” and the like) in the body of the email. When such words occur in an email that has no attachment, the detector determines that the sender may have forgotten to add the attachment.
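The keyword-based detection described above can be illustrated with a minimal sketch. It is not taken from any particular email application: the keyword list, the function name `may_be_missing_attachment`, and the `attachment_count` parameter are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list; an actual email application would define its own.
KEYWORDS = {"attach", "attached", "document"}


def may_be_missing_attachment(body: str, attachment_count: int) -> bool:
    """Return True when the body contains a keyword but nothing is attached."""
    if attachment_count > 0:
        # At least one attachment is present, so no warning is raised.
        return False
    # Split the lowercased body into alphabetic words and look for any keyword.
    words = re.findall(r"[a-z]+", body.lower())
    return any(word in KEYWORDS for word in words)
```

For instance, `may_be_missing_attachment("Please find the report attached.", 0)` would trigger a warning, whereas the same body with `attachment_count=1` would not.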
This functionality is useful because it can spare the sender the embarrassment of being asked for the attachments by the recipient, or more serious consequences, for example, when the attachment is due by a predetermined date. However, some problems can be identified which reduce the usefulness of such a detector. One problem is that the triggering set of keywords may have to be defined explicitly within a list by the user. This may involve entering all inflected forms of the keywords. Because English has relatively few morphological inflections, current English language attachment detectors can use a fixed set of keyword patterns, and this approach is relatively satisfactory. For example, the words “attach” and “attached” may be sufficient, in a keyword list, to cover commonly-used expressions of the verb “to attach.” The same approach for languages which are morphologically richer than English could multiply the number of entries. For example, in the case of the verb “joindre” in French (which partly corresponds to the English verb “attach”), five keywords would be needed to obtain the same coverage: “joins” (je joins), “joint” (j'ai joint), “jointe” (la pièce jointe), “joints” (les documents joints), and “jointes” (les pièces jointes). For languages with even richer morphological systems, even more entries could be needed.
Another problem is that the user may have occasion to write emails in different languages. While the user could enter keywords in each language used, this may cause ambiguity problems, where a word is indicative of an attachment in one language but the same word in another language is not. For example, the word “joint” may indicate an attachment in French but would not in English.
Moreover, in some cases, simple keyword detection is not sufficient to detect the sender's intent to attach a document. For example, the user may type in English: “I am very much attached to my wife,” which could trigger an incorrect warning because “attached” here does not refer to an attachment to the email. In French, similar problems could arise in the use of the word “attaché” (attached/endeavored).
The consequences of these problems are both noise (unwanted warnings) and silence (omitted helpful warnings) by the detector.
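The noise and silence described above can be reproduced with a small sketch of a plain keyword lookup. The merged keyword list (two English entries and a single French form) and the function name `keyword_hits` are hypothetical and chosen only to illustrate the three failure cases discussed.

```python
import re
from typing import List

# Hypothetical merged list: English entries plus one French form of "joindre".
KEYWORDS = {"attach", "attached", "joint"}


def keyword_hits(body: str) -> List[str]:
    """Return the keywords found in the body, in order of appearance."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", body.lower())
    return [word for word in words if word in KEYWORDS]


# Noise: "attached" is used in a sense unrelated to an email attachment.
print(keyword_hits("I am very much attached to my wife."))  # ['attached']
# Noise: "joint" signals an attachment in French but not in English.
print(keyword_hits("We opened a joint account."))           # ['joint']
# Silence: the inflected form "jointes" is missing from the list.
print(keyword_hits("Voir les pièces jointes."))             # []
```

Adding every inflected form to the list would remove the silence in the third case, but it would not address either source of noise.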