Typically, organizations possess huge amounts of data related to various aspects of their business, such as employees, business partners, operations, and management, stored in databases of content sources. Data stored in the databases is often used for different purposes, such as testing, training, demonstration, and data research, and may be accessed by people within the organization as well as outside it. The data stored in the databases is also accessible as web content in web documents or through other interfaces. The web content presented in the web documents thus contains both sensitive and non-sensitive data. Accordingly, care needs to be taken to ensure that at least the sensitive data is inaccessible to unauthorized people, whether inside or outside the organization. A failure to do so may result in the theft of data or the unnecessary disclosure of sensitive information. For example, the sensitive data held by a bank may include customer data, such as the names, account numbers, credit card numbers, debit card numbers, and addresses of its customers. In many scenarios, revealing the identity of customers through the customer data is not acceptable, even in the course of carrying out the day-to-day operations of the bank or other organization. However, in many situations, such as for training and testing purposes, the customer data may have to be shared with other employees, even if those employees are not authorized to access the data. This may lead to disclosure of the sensitive data.
To overcome the above issues, existing techniques mask the sensitive data in the web document. However, the existing techniques may not be able to mask the sensitive data when the sensitive data changes and/or when the structure of the web document changes.