1. Field of the Invention
The present invention relates to techniques for detecting sensitive content in a document. More specifically, the present invention relates to a method and apparatus for detecting sensitive content in a document based on results from an Internet search engine.
2. Related Art
Identifying sensitive content in a document can be an arduous task. Content managers are often responsible for analyzing a document to determine which portions are associated with sensitive information. In doing so, these content managers must distinguish the types of content that are associated with a given sensitive topic from the types of content that are associated with public information.
To make matters worse, content managers have few tools at their disposal for determining the sensitivity of content. Existing tools tend to require a significant amount of training data and user input to recognize sensitive topics, and content managers often do not have sufficient training data to make use of them. As a result, identifying sensitive content in a document often involves laborious and error-prone human review.