1. Field of Disclosure
This invention relates to facilitating file activity monitoring and file access control through performance of on-demand file content classification using an out-of-band communications channel.
2. Description of the Related Art
In modern computing, data files are often remotely stored on various file servers accessible over a variety of different networks. User devices retrieve such files by sending access requests over the networks to the file servers, and responsively receiving the requested files. In certain instances, those files stored by the file servers include sensitive types of data, such as financial records, health records, personal information, etc. In order to manage access to such sensitive files, many organizations employ various systems for monitoring and controlling access to files stored by their file servers.
Many current systems employ various file security and/or audit policies to monitor and control access to files. Such policies often are associated with various rules requiring classification of the contents of the files, and the performance of specific actions responsive to the classification. For example, a particular audit rule may specify that a system record an observed access to a file responsive to a determination that the contents of the file include certain types or classes of sensitive data. Current systems take one of two approaches in classifying the contents of files: exhaustive pre-classification of all the files on a file server, or on-the-fly classification performed during file access.
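By way of illustration only, the audit rule described above can be sketched as follows. The sketch assumes a single hypothetical class of sensitive data ("personal_information") detected by a simple pattern match; the names `classify`, `AuditRule`, and `on_access` are introduced here for illustration and are not taken from any particular system.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Assumption: one illustrative class of sensitive data, detected by a
# pattern resembling a U.S. Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def classify(contents: str) -> Set[str]:
    """Return the classes of sensitive data found in the file contents."""
    classes: Set[str] = set()
    if SSN_PATTERN.search(contents):
        classes.add("personal_information")
    return classes


@dataclass
class AuditRule:
    """Record an access when the file contents include a given data class."""
    required_class: str
    log: List[Tuple[str, str]] = field(default_factory=list)

    def on_access(self, user: str, path: str, contents: str) -> bool:
        # The action (recording the access) is performed only responsive
        # to the classification of the contents.
        if self.required_class in classify(contents):
            self.log.append((user, path))
            return True
        return False


rule = AuditRule(required_class="personal_information")
rule.on_access("alice", "/hr/records.txt", "SSN: 123-45-6789")
rule.on_access("bob", "/public/readme.txt", "Welcome to the share.")
```

In this sketch only the first access is recorded, because only the first file's contents fall within the class named by the rule.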
In the former approach, a system performs lengthy and complex classifications of the contents of each file stored by a file server. As a consequence, the system is incapable of using rules requiring file content classifications until such processing is completed, which may take many months. Furthermore, during periods of classification, server and network loads are dramatically increased. Moreover, as files and/or classification decision rules are updated and/or added, classification of at least some of the files must be repeated. Such repeated processing prolongs the increased file server and network loads. Further, file activity monitoring and file access control may be sluggish, because large databases must be searched in order to identify a file's classification information.
In the latter approach, a system performs a classification each time a file requiring classification is accessed through the network. In this approach, the system obtains the file to be classified by reading the file as it is transferred over a communications channel between a user device and a file server. One problem with such an approach is that, at times, only a portion of a file is transferred. Thus, the system is not always able to perform classification on the entirety of the file. As a result, the file may be inaccurately classified, which impairs the system's ability to correctly apply security and/or audit rules. Furthermore, some user devices perform multiple accesses to a file even in a single application transaction (e.g., a read of a document by a word processing application executed on a user device). As a result, performing a classification each time a file is accessed can impose a large computational burden on the system. Moreover, because content classification is performed on-the-fly for each file access, significant delays in applying security and/or audit rules may be introduced. For example, a file may be accessed numerous times before the system is able to determine that access to the file should be restricted.
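The partial-transfer problem described above can be illustrated with a minimal sketch. It assumes a hypothetical pattern-based classifier and a hypothetical file in which the sensitive data appears only near the end; the names and the 200-character transfer size are illustrative assumptions, not details of any existing system.

```python
import re

# Assumption: sensitive data is detected by a pattern resembling a
# U.S. Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def contains_sensitive_data(data: str) -> bool:
    """Classify the supplied data as sensitive or not sensitive."""
    return SSN_PATTERN.search(data) is not None


# Hypothetical file: the sensitive record appears only at the end.
full_file = "Quarterly summary.\n" * 50 + "Employee SSN: 123-45-6789\n"

# Only the first portion of the file crosses the communications channel
# during this access, so only that portion is visible to the classifier.
observed_portion = full_file[:200]

in_band_result = contains_sensitive_data(observed_portion)
full_result = contains_sensitive_data(full_file)
```

Here the classification based on the observed portion differs from the classification of the entire file, so a rule keyed to the in-band result would not be applied to this access.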