The present invention generally relates to software applications and, more particularly, to a software application for identifying, classifying, and extracting hidden or embedded entities in a data file.
As is well known in the art, it is possible to hide or embed data in documents of various types. For example, U.S. Pat. No. 5,822,436 discloses a machine-readable marking provided on emulsion films, photographic papers, and the like. The marking encodes digital information, yet is essentially imperceptible to the human eye. Additionally, U.S. Pat. No. 6,289,108 discloses providing a photograph with supplemental data. This supplemental data is below a threshold of human perception (e.g., is essentially invisible) yet can extend throughout the image.
Furthermore, during the preparation of data files, data may be hidden in such files, whether deliberately or inadvertently. Hidden data includes data within an application or an application data file that is not visible during normal viewing of the data within the application. For example, during the preparation of a PowerPoint® presentation, a slide may include text whose font color matches the fill color of its text box, so that the text is present in the file but cannot be seen when the slide is displayed.
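By way of illustration only, the matching-color case described above may be sketched as follows. The sketch assumes that the text boxes of a slide have already been parsed into simple records. The `TextBox` record, its field names, the `tolerance` parameter, and the functions `is_hidden` and `find_hidden_text` are hypothetical and are not taken from any particular presentation file format or library.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[int, int, int]


@dataclass
class TextBox:
    """Hypothetical, already-parsed representation of one text box."""
    text: str
    font_rgb: RGB  # color of the font
    fill_rgb: RGB  # fill color of the text box


def is_hidden(box: TextBox, tolerance: int = 0) -> bool:
    """Return True when non-empty text has a font color matching the fill color.

    `tolerance` is an assumed per-channel allowance for near-matches
    (0 means the two colors must be identical).
    """
    if not box.text.strip():
        return False  # an empty box hides nothing
    return all(
        abs(font - fill) <= tolerance
        for font, fill in zip(box.font_rgb, box.fill_rgb)
    )


def find_hidden_text(boxes: List[TextBox], tolerance: int = 0) -> List[str]:
    """Collect the text of every box whose content is hidden by color."""
    return [box.text for box in boxes if is_hidden(box, tolerance)]


if __name__ == "__main__":
    slide = [
        TextBox("Quarterly results", (0, 0, 0), (255, 255, 255)),
        TextBox("internal draft figures", (255, 255, 255), (255, 255, 255)),
    ]
    print(find_hidden_text(slide))
```

A nonzero `tolerance` would also flag text that is nearly, but not exactly, the same color as its background; the appropriate value is a design choice left open here.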
Such hidden data raises security concerns should the data file reach an audience other than the intended audience. This is particularly so where proprietary or classified information is hidden in the data file.
As can be seen, there is a need for a system and method for identifying, classifying, and extracting hidden information from data files. Such a system and method preferably includes a means for resolving the hidden information issues so identified.