1. Field of the Invention
The present invention generally relates to a method and apparatus for the prioritization of E-mail messages, and more particularly to a method and apparatus for a multi-tiered approach to the prioritization of E-mail messages.
2. Description of the Related Art
Given the large number of messages that knowledge workers receive each day and the amount of time required to read and respond to each message, knowledge workers often seek to optimize the time spent on message processing by scanning their inbox and checking sender names and subjects in order to prioritize some messages for attention over others. When the number of new messages in a knowledge worker's inbox is large, sifting through the messages to identify high-priority messages quickly becomes a non-trivial and time-consuming task in itself. This task produces a daily feeling of “email overload” and occasionally causes key messages to be overlooked, since people find it difficult to create an efficient order when sorting based on elements such as sender, subject, or date.
It is generally understood that the action that a user takes on a message, e.g., read, reply, file or delete, largely depends on the user-perceived importance of the message. The main goal of email prioritization is thus to identify email messages with a high value of user-perceived importance.
There have been several proposed or suggested techniques for redesigning email interfaces to help users quickly identify important emails in their inbox. Most existing approaches prioritize emails using a classifier that is trained with supervised learning algorithms.
For example, some conventional approaches automatically group emails into conversational threads and prioritize users' incoming messages using linear logistic regression models with a variety of social, content, thread, and label features. Other conventional approaches use Support Vector Machine (SVM) classifiers over word-based, phrase-based, and meta-level features, e.g., message sender, recipients, length, time, and presence of attachments, to determine the importance of new unread emails. Still other conventional approaches use SVM classifiers, but with additional social importance features computed based on each user's personal social network derived from email data. The content-based features used by these approaches for classifier learning are words that occur in email content, which may not work well for very brief messages with too few words (sparse data) or long messages with too many words (noisy data).
For instance, conventional technologies train their classifier by looking at all of the words within the body of a message. This approach results in a high-dimensional classification problem, because each word is a dimension. Some conventional classifiers use this high-dimensional approach and then try to infer the importance of the message by counting the number of times that a particular word or words appear, while other conventional classifiers attempt to predict the importance of a message based on the location of one word relative to the location of another word. These approaches are very noisy due to their high dimensionality. As a result, it is very difficult for a user to ascertain why seemingly similar messages are classified differently by systems that employ conventional approaches.
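By way of illustration only, the word-per-dimension representation described above can be sketched as follows. The function names, the whitespace tokenization, and the two sample messages are hypothetical and are not taken from any particular conventional system; the sketch merely shows that the feature space grows with every distinct word and that a brief message populates very few of those dimensions.

```python
from collections import Counter


def build_vocabulary(messages):
    """Assign one feature index (dimension) to every distinct word seen."""
    vocab = {}
    for text in messages:
        for word in text.lower().split():
            # A word receives the next free index the first time it appears.
            vocab.setdefault(word, len(vocab))
    return vocab


def to_count_vector(text, vocab):
    """Represent a message as per-word occurrence counts over the vocabulary."""
    counts = Counter(text.lower().split())
    # dict preserves insertion order, so position i corresponds to index i.
    return [counts.get(word, 0) for word in vocab]


# Hypothetical sample data: one very brief message and one longer message.
messages = [
    "ok thanks",
    "please review the budget report before the board meeting on friday",
]
vocab = build_vocabulary(messages)
brief_vector = to_count_vector(messages[0], vocab)
long_vector = to_count_vector(messages[1], vocab)

dimensions = len(vocab)
brief_nonzero = sum(1 for value in brief_vector if value)
print(dimensions, brief_nonzero)
```

In this sketch the brief message sets only two of the available dimensions, which corresponds to the sparse-data case, while every additional distinct word in a longer message adds a further dimension.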
To increase the accuracy of the prioritization, some conventional approaches train a classifier through one-time batch processing of labeled training data and either do not consider dynamic user feedback, or simply use user feedback to incrementally update the feature weights of the classifier. For example, in conventional technologies that provide for user feedback, the feedback is merely folded into the classifier, which simply adjusts the existing weights of the classifier. However, since the classifier is updated only for each specific feedback instance, it is possible that this feedback is not reflected instantly in the classifier, e.g., even after a user indicates that a message from a sender is low priority, the user may still receive messages from that sender marked as high priority. In other words, it may take time for the weights of the classifier to be updated in a meaningful manner, e.g., in a manner that would cause the system to change the predicted priority of the message.
Furthermore, aggressively updating feature weights based on user feedback reduces the robustness of email prioritization, e.g., sacrifices the reliability provided by the classifier, while conservatively updating feature weights results in a slow response to user feedback.
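By way of illustration only, the trade-off between conservative and aggressive updating can be sketched with a single-feature logistic model. The model form, the gradient-style update rule, the initial weight of 2.0, and the two learning rates are assumptions chosen for the illustration; they do not describe any specific conventional system.

```python
import math


def sigmoid(z):
    """Logistic function mapping a score to a probability of high priority."""
    return 1.0 / (1.0 + math.exp(-z))


def feedback_until_flipped(weight, learning_rate, max_steps=1000):
    """Apply repeated low-priority feedback for one sender feature.

    Returns (steps, final_weight), where steps is the number of feedback
    events needed before the predicted probability falls below 0.5, or
    None if that does not happen within max_steps.
    """
    for step in range(1, max_steps + 1):
        predicted = sigmoid(weight)  # the sender feature value is 1.0
        # Log-loss gradient step toward label 0 (user says: low priority).
        weight -= learning_rate * (predicted - 0.0)
        if sigmoid(weight) < 0.5:
            return step, weight
    return None, weight


# Hypothetical starting point: the classifier has learned that messages
# from this sender are high priority (weight 2.0).
conservative_steps, conservative_weight = feedback_until_flipped(2.0, 0.1)
aggressive_steps, aggressive_weight = feedback_until_flipped(2.0, 5.0)
print(conservative_steps, aggressive_steps)
```

Under these assumptions, the small learning rate requires many feedback events before the predicted priority changes, whereas the large learning rate changes the prediction after a single event but does so by discarding the previously learned weight entirely, so that one mistaken click could reverse the classifier's behavior for that sender.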
Accordingly, the present inventors have recognized a need for improved email systems and methods that assist the user in his/her daily triage of incoming messages by quickly incorporating user-specific criteria for determining the priority of a received email message without sacrificing the reliability provided by the global (general) classifier.