Cloud-based services and platforms continue to proliferate. The advent of cloud-based shared content storage systems has changed the way personal and corporate electronically stored information objects (e.g., files, images, videos, etc.) are stored, and has also changed the way such personal and corporate content is shared and managed. One benefit of using such cloud-based systems is the ability to share secure content (e.g., movie scripts, financial statements, product specifications, etc.) among trusted viewers for viewing and/or downloading. In some cases, however, such secure content can be captured (e.g., downloaded, screen captured, manually copied, etc.) and disclosed (e.g., leaked) in an unauthorized manner. For example, a confidential, unreleased movie script might be available to certain trusted users for download, yet disclosure of the movie script to individuals outside of the trusted group is prohibited. In some cases, one or more users might violate the restriction and disclose (“leak”) the content to unauthorized parties, either maliciously or unintentionally. Such users or recipients might also modify the content (e.g., change words and/or sentences, crop certain portions, etc.). When a disclosed portion of the content is discovered, the content owners might want to determine the source of the unauthorized disclosure or leak.
Various legacy techniques have been implemented to provide security to documents. Some legacy techniques enable content owners (e.g., enterprises) to insert or overlay a watermark onto a document to visually indicate a level of security (e.g., “CONFIDENTIAL”) and/or to indicate a level of authenticity (e.g., “OFFICIAL COPY”). Such techniques, however, do not provide a way to track the source (e.g., a user, viewer, non-owner, etc.) of unauthorized or illegal dissemination of the content. Further, watermarking a document that primarily comprises text has little deterrent effect, since the cost of defeating the watermark is small relative to the value of the content. For example, a movie script that has significant potential value can be watermarked (e.g., as “Confidential - Do Not Copy”), but the watermark can be defeated by simply retyping the text in another document. In other legacy approaches, natural language processing (NLP) is used to make changes to the text of an object without significantly impacting the meaning of the content. Such changes are known to the object owner, but not to the object user or viewer, such that the changes can serve as a watermark on the object. Such legacy approaches, however, apply the text changes using fixed algorithms determined by the object owners, which limits the ability to determine the source of a leak. For example, using such legacy techniques, multiple users might receive the same watermarked text, and/or certain users might modify the uniformly watermarked text so as to obfuscate the download source (e.g., the user invoking the download).
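The limitation of the fixed-algorithm approach described above can be illustrated with a minimal Python sketch. It is a sketch only, not an implementation of any particular legacy system: the substitution table and the function names (`FIXED_SUBSTITUTIONS`, `apply_fixed_watermark`, `candidate_sources`) are hypothetical and were chosen solely for illustration. Because the owner applies the same substitutions for every download, every user receives identical text, and a leaked copy matches all of them equally.

```python
import re

# Hypothetical, owner-chosen substitution table (an assumption for this sketch).
# A fixed algorithm applies the same table to every user's download.
FIXED_SUBSTITUTIONS = {"big": "large", "quickly": "rapidly", "begin": "start"}


def apply_fixed_watermark(text: str) -> str:
    """Replace each listed word with its fixed synonym, preserving initial case."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = FIXED_SUBSTITUTIONS.get(word.lower())
        if replacement is None:
            return word  # word is not in the table; leave it untouched
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


def candidate_sources(leaked: str, copies: dict) -> list:
    """Return every user whose downloaded copy matches the leaked text."""
    return [user for user, copy in copies.items() if copy == leaked]


if __name__ == "__main__":
    original = "They begin quickly on the big scene."
    users = ["alice", "bob", "carol"]
    # Every user receives the same watermarked text.
    copies = {user: apply_fixed_watermark(original) for user in users}
    leaked = copies["bob"]
    # All users are equally plausible sources of the leak.
    print(candidate_sources(leaked, copies))
```

In this sketch the watermark shows that the leaked text originated from a watermarked copy, but it cannot distinguish one downloading user from another.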
The problem to be solved is therefore rooted in technological limitations of legacy approaches. Improved techniques, in particular improved application of technology, are needed to address the problem of deterring unauthorized disclosure of electronically shared content that is readily modified and/or copied (e.g., text). More specifically, the technologies applied in the aforementioned legacy approaches fail to achieve the sought-after capabilities of the herein disclosed techniques for securing shared documents using dynamic natural language steganography in a manner that facilitates identification of the disclosing party. What is needed is a technique or techniques to improve the applicability and efficacy of various technologies as compared with legacy approaches.