Searching in email is a routine activity for many people. Search queries frequently contain spelling mistakes, which frustrates users and decreases productivity. Users are often forced to reformulate a query several times to find the object of the search. However, there have been few efforts to address this problem in email search, and most email clients do not provide spelling corrections for queries. In contrast, most web search providers offer spelling correction for misspelled web-search queries by employing machine learning strategies based on the logs of session data produced by millions of web users.
Web-query logs are expansive, and the sheer amount of data available allows machine learning strategies to provide corrections for most misspelled queries, i.e., by employing “the intelligence of the crowd” to generate spelling corrections and recover the query the user intended when entering the misspelled web query. In addition, for web search, the target document collection is the same for all users: all of the documents of the web. In contrast to web queries, search history for email is scarce, and the target document collection for searching in email is limited and generally private. Thus, traditional spelling correction algorithms, such as those used for web queries, which rely on machine learning strategies, are not effective on such sparse and personal data.