Entities often generate and use data that is important in some way to their operations. This data can include, for example, business data, financial data, and personnel data. If this data were lost or compromised, the entity could suffer significant adverse financial and other consequences. Accordingly, many entities have chosen to back up some or all of their data so that, in the event of a natural disaster, unauthorized access, or another such event, the entity can recover any data that was lost or compromised, and then restore that data to one or more locations, machines and/or environments.
Historically, many backups resided on-premises at the client or user. Because backup and restore operations were generally performed only within the confines of the enterprise, rather than over the wire to and from a cloud storage service for example, these on-premises backups were typically not encrypted.
With the advent of cloud and other remote storage systems, however, concerns have arisen about the vulnerability of stored data, and about the vulnerability of data as it is transmitted to and from the cloud storage system. Thus, many enterprises that have moved from on-premises storage to cloud storage now require their stored and transmitted data to be encrypted.
In recognition of the threat posed to data security by unauthorized personnel such as hackers, many entities have taken steps to encrypt their data backups. There are a variety of encryption solutions available, some of which are quite robust, and others of which are considerably less so. Some of the less robust approaches to encryption may involve, for example, word level encryption in which a token is generated for each word in a data backup. This approach provides a relatively low degree of protection, however, because the range of candidate words that could correspond to a given token is relatively small and, accordingly, it may not take long for an unauthorized person and/or system to exhaust the possibilities and arrive at the solution.
Moreover, once a word is decrypted, it is a relatively simple matter for an unauthorized user and/or system to search for the same token in the data, and then correlate any tokens found with the known word. This process can be easily repeated for each decrypted word, and complete decryption may be attained relatively quickly and easily. Such unauthorized access is made even easier by the inclusion, in some encrypted data, of positional information that indicates where a particular word appears relative to another word, or words.
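The correlation weakness described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each word's token is a truncated HMAC-SHA256 of the word under one fixed key; the actual token scheme is not specified here, and the names `KEY`, `token_for`, `tokenize`, and `positions_by_token` are hypothetical. The point is only that a deterministic per-word token lets one recovered word expose every other occurrence of that word, along with its position.

```python
import hashlib
import hmac
from collections import defaultdict

# Assumption: a single fixed secret key is used for every word.
KEY = b"hypothetical-demo-key"


def token_for(word: str) -> str:
    """Deterministic per-word token: the same word always maps to the same token."""
    digest = hmac.new(KEY, word.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def tokenize(text: str) -> list[str]:
    """Replace each whitespace-separated word with its token."""
    return [token_for(word) for word in text.split()]


def positions_by_token(tokens: list[str]) -> dict[str, list[int]]:
    """Group token positions, as anyone holding the tokenized data could do."""
    index = defaultdict(list)
    for position, token in enumerate(tokens):
        index[token].append(position)
    return dict(index)


backup = "transfer funds to account then transfer records to archive"
tokens = tokenize(backup)
index = positions_by_token(tokens)

# Suppose the token at position 0 becomes known to stand for "transfer".
known_token = tokens[0]
leaked_positions = index[known_token]
print(leaked_positions)
```

No key is needed for the final lookup: once a single token is matched to its word, a plain equality search over the tokenized data reveals every other place that word occurs.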
In light of concerns such as these, more robust approaches to encryption have been developed that may involve block encryption of large chunks of data in which large numbers of words are encrypted together, so as to produce a large mass of encrypted data. This type of encryption provides a high level of security because it produces a large mass of what appears to be random data. Thus, it may be nearly impossible for an unauthorized person and/or system to even search the encrypted data, much less parse out, and then decrypt, individual words from the mass of encrypted data.
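By way of contrast, the following sketch encrypts the same text as one unit. It is a toy only: a keystream derived from SHA-256 in counter fashion stands in for a real cipher, and it is not suitable for protecting actual data. The names `keystream`, `xor_crypt`, `KEY`, and `NONCE` are hypothetical. It shows that a repeated word no longer yields repeated ciphertext, so the equality search that works on word tokens has nothing to match.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: concatenated SHA-256 digests of key, nonce, and a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


# Assumptions: fixed demo key and nonce so the example is repeatable.
KEY = b"hypothetical-demo-key"
NONCE = b"demo-nonce-0001"

plaintext = b"transfer funds to account then transfer records to archive"
ciphertext = xor_crypt(KEY, NONCE, plaintext)

# Locate the second occurrence of the repeated word in the plaintext.
second_at = plaintext.index(b"transfer", 1)
first = ciphertext[0:8]
second = ciphertext[second_at:second_at + 8]

# Decryption with the same key and nonce restores the original text.
recovered = xor_crypt(KEY, NONCE, ciphertext)
```

Here `first` and `second` cover the same plaintext word yet hold different bytes, because each position is combined with a different part of the keystream. That is the property that frustrates both an attacker and a conventional search index.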
Although encryption has proven useful in the context of data security, its use has introduced some difficulties, one of which relates to the ability to search the encrypted data. In particular, some entities would like to be able to search their encrypted data while, at the same time, preserving an effective level of data security. However, typical search indexes are only effective when employed with unencrypted data. Thus, while data encryption can provide a significant measure of security, it also makes searching of the encrypted data relatively difficult. Moreover, it is generally the case that the ability to search data is degraded in tandem with increasing robustness of encryption.
Thus, entities that use cloud storage and/or other remote storage solutions are presented with a dilemma. In particular, a relatively high degree of data security can be obtained, but at the cost of losing the ability to effectively search the encrypted data. On the other hand, it is possible to attain a relatively high degree of effectiveness in searching encrypted data, but only so long as the encryption of that data is relatively weak.
In light of problems such as those noted above, it would be useful to provide systems, methods and devices capable of providing an acceptable level of encryption while also enabling effective searching of the encrypted data. It would also be useful to provide an encryption solution and search index that do not use positional information of the data that is encrypted. It would further be useful to provide a hybrid encryption solution that is neither a word level encryption nor a block encryption. Finally, it would be useful to provide an encrypted search index that enables effective searching of encrypted data.