Content discovery is a common user task for many computing devices. For example, when a user is performing research or drafting a document, the user may wish to reference relevant information from external web sites or other content sources. In typical systems, the user manually provides one or more search terms to a search engine and then evaluates the search results. Typically, the user must also manually synchronize or otherwise associate the search results with the relevant document content. Additionally, many word processing systems do not include a search feature, and thus the user typically uses an external application such as a web browser.
Key phrase extraction is a process used to reduce a text to short phrases, sentences, or other sequences of words which represent the most important parts of that text. Typical key phrase extraction algorithms syntactically analyze the text to produce a list of key phrases. For example, key phrase extraction algorithms may tokenize the input text, assign parts of speech to the tokens, and combine the tokens into key phrases using the assigned part-of-speech tags. A named entity recognition (NER) algorithm may assign additional weight to candidate key phrases that match entries in a dictionary of known noun phrases. The TextRank algorithm constructs and analyzes a graph based on the input text to extract key phrases.
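As an illustration of the graph-based approach mentioned above, the following is a minimal sketch in the spirit of TextRank, not a reference implementation. It makes several simplifying assumptions: a small hand-written stopword list stands in for part-of-speech filtering, co-occurrence is measured over the filtered token sequence, and the ranking runs a fixed number of PageRank-style iterations instead of testing for convergence. The names `STOPWORDS`, `tokenize`, `build_graph`, `rank_words`, and `top_keywords` are hypothetical and chosen only for this example.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (assumption for illustration;
# it stands in for the part-of-speech filter a fuller system would use).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on",
             "for", "with", "that", "by", "are"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_graph(tokens, window=2):
    """Link each token to the distinct tokens within `window` positions after it."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Apply a fixed number of PageRank-style updates on the undirected graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


def top_keywords(text, k=3):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    scores = rank_words(build_graph(tokenize(text)))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Calling `top_keywords(document_text, k=5)` on a passage returns its five highest-ranked words under these assumptions. A fuller system would typically merge adjacent high-ranking words into multi-word key phrases, a step this sketch omits.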