The proliferation of personal computing devices in recent years, especially mobile personal computing devices, combined with growth in the number of widely used communications formats (e.g., text, voice, video, image) and protocols (e.g., SMTP, IMAP/POP, SMS/MMS, XMPP, YMSG), has led to a communications experience that many users find fragmented and difficult to search for relevant information. Users desire a system that eases message threading by “stitching” together related communications across multiple formats and protocols, all seamlessly from the user's perspective. Such stitching may occur, e.g., by: 1) direct user action in a centralized communications application (e.g., a user clicking ‘Reply’ on a particular message); 2) semantic matching (or other search-style message association techniques); 3) element-matching (e.g., matching on subject lines, senders/recipients, or similar quoted text); and/or 4) “state-matching” (e.g., associating messages that a third-party service, such as a webmail provider or Instant Messaging (IM) service, has specifically tagged as being related to another message, sender, etc.). These techniques may be employed to provide a more relevant “search-based threading” experience for users.
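As a purely illustrative sketch of the element-matching approach named above, the following Python fragment groups messages from different protocols into threads when they share the same participants and either a matching subject line or quoted text. The `Message` fields, the helper names, and the assumption that sender and recipient identities have already been resolved to a common contact identifier across protocols are all hypothetical; none of them is taken from the disclosure.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical record; identities are assumed to be already resolved
    # to one canonical contact ID regardless of protocol.
    msg_id: str
    protocol: str        # e.g. "sms", "smtp", "xmpp"
    sender: str
    recipients: tuple
    subject: str
    body: str


_REPLY_PREFIX = re.compile(r"^\s*(?:(?:re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading Re:/Fw:/Fwd: prefixes and lowercase the remainder."""
    return _REPLY_PREFIX.sub("", subject).strip().lower()


def participants(msg):
    return frozenset({msg.sender, *msg.recipients})


def quoted_lines(body):
    """Return the non-empty text of lines quoted with a leading '>'."""
    quoted = (line.lstrip("> ").strip().lower()
              for line in body.splitlines() if line.startswith(">"))
    return {q for q in quoted if q}


def related(a, b):
    """Element match: same participants plus a subject or quoted-text match."""
    if participants(a) != participants(b):
        return False
    subj_a, subj_b = normalize_subject(a.subject), normalize_subject(b.subject)
    if subj_a and subj_a == subj_b:
        return True
    return (any(q in a.body.lower() for q in quoted_lines(b.body))
            or any(q in b.body.lower() for q in quoted_lines(a.body)))


def stitch(messages):
    """Merge pairwise-related messages into threads using union-find."""
    parent = list(range(len(messages)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(messages)):
        for j in range(i + 1, len(messages)):
            if related(messages[i], messages[j]):
                parent[find(j)] = find(i)

    threads = defaultdict(list)
    for i, msg in enumerate(messages):
        threads[find(i)].append(msg.msg_id)
    return sorted(threads.values())
```

In this sketch, an SMS with no subject line can still join an email thread if a later email quotes its text, and transitive links are preserved because related pairs are merged rather than keyed on a single field. The pairwise comparison is quadratic in the number of messages, so a practical system would likely restrict comparisons using an index.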
With current communications technologies, conversations remain “siloed” within particular communication formats or protocols. As a result, users cannot search uniformly from their computing devices across communications in multiple formats or protocols, across multiple applications, and across multiple other computing devices to find relevant communications (or even communications that a messaging system may predict to be relevant). This often results in inefficient communication workflows, and even lost business or personal opportunities. For example, a conversation between two people may begin over text messages (e.g., SMS) and then transition to email. When such a transition happens, the entire conversation can no longer be tracked, reviewed, searched, or archived by a single source, since it has ‘crossed over’ protocols. For example, if the user ran a search in their email search system for a particular topic that had come up only in the user's SMS conversations, the search might not turn up optimally relevant results, even though the SMS messages pertain to the same subject matter and “conversation.”
Further, a multi-format, multi-protocol, communication threading system, such as is disclosed herein, may also provide for the semantic analysis of conversations. For example, for a given set of communications between two users, there may be only a dozen or so keywords that are relevant and related to the subject matter of the communications, as determined by one or more associated algorithms designed to detect keyword importance. These dozen or so keywords may be used to generate an “initial tag cloud” to associate with the communication(s) being indexed. The initial tag cloud can be created based on multiple factors, such as the uniqueness of a word, the number of times a word is repeated, phrase detection, etc. These initial tag clouds may then themselves be used to generate an expanded “predictive tag cloud,” based on the use of Markov chains or other predictive analytics that draw on established language theory techniques and on data derived from existing communications data in a centralized communications server. Such data may include unique data derived from the communication patterns of one or more users utilizing the centralized communications server when interacting with one or more other users and non-users of the centralized communications server. These initial tag clouds and predictive tag clouds may be used to improve message indexing and provide enhanced relevancy in search results. In doing so, the centralized communications server may establish connections between individual messages that were sent/received using one or multiple communication formats or protocols and that may contain information relevant to the user's initial search query.
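One way the two tag clouds described above might be computed is sketched below in Python. The sketch is not the disclosed algorithm: it assumes, for illustration only, that keyword importance is scored as term frequency weighted by a smoothed inverse document frequency (standing in for "uniqueness" and "repetition"), and that the predictive step uses a first-order Markov chain over adjacent words in previously indexed communications. The function names, the stopword list, and the `top_n` and `per_tag` parameters are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Small illustrative stopword list (an assumption, not from the disclosure).
STOPWORDS = frozenset({"a", "an", "and", "the", "to", "of", "in", "on",
                       "for", "is", "are", "we", "you", "i", "it", "at"})


def tokenize(text):
    """Lowercase, split into words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]


def initial_tag_cloud(conversation, corpus, top_n=12):
    """Rank conversation words by repetition (TF) times uniqueness (IDF)."""
    tf = Counter(w for text in conversation for w in tokenize(text))
    df = Counter()
    for doc in corpus:
        df.update(set(tokenize(doc)))
    n_docs = len(corpus)
    scores = {w: count * (math.log((1 + n_docs) / (1 + df[w])) + 1.0)
              for w, count in tf.items()}
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]


def build_transitions(corpus):
    """Count word-to-next-word transitions (over stopword-filtered tokens)."""
    transitions = defaultdict(Counter)
    for doc in corpus:
        words = tokenize(doc)
        for current, following in zip(words, words[1:]):
            transitions[current][following] += 1
    return transitions


def predictive_tag_cloud(initial_tags, transitions, per_tag=2):
    """Expand the initial tags with their most probable Markov successors."""
    predicted = {}
    for tag in initial_tags:
        followers = transitions.get(tag, Counter())
        total = sum(followers.values())
        for word, count in followers.most_common(per_tag):
            if word not in initial_tags:
                probability = count / total
                predicted[word] = max(predicted.get(word, 0.0), probability)
    return predicted
```

Under these assumptions, a message could be indexed under both its initial tags and its predicted tags, so that a query for a predicted term may still surface a message that never contains that term literally.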
The subject matter of the present disclosure is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above. To address these and other issues, techniques that enable seamless, multi-format, multi-protocol communication threading are described herein.