Digital documents transmitted over networks or held in various forms of computer storage often contain data that should be protected from unauthorized readers. Further, due to the requirements of particular software architectures, various techniques are used to protect the data without breaking the algorithms intended to work with the unmodified documents. A common approach is to replace the protected piece of data with a token, i.e., a string that resembles the original data but prevents an unauthorized reader from accessing the original text. Thus, tokenization solutions provide the means of encoding documents by replacing the protected data with tokens and of subsequently reversing the process.
Most tokenization solutions use a secure vault or database to hold an encrypted copy of the original plaintext (i.e., clear-text) together with the associated token, so that the token can be mapped back to the plaintext during the decoding phase of the tokenization solution. For example, the token may be a random value that must also satisfy specific format requirements, such as having the form of a sixteen-digit credit card number including a valid checksum (e.g., a Luhn mod 10 checksum). In many solutions, the secure token database is a dynamic entity or structure that “grows” over time as new plaintext-token mappings are generated. It should be appreciated that solutions utilizing such a token vault face significant performance, data consistency, resource, and management challenges as the number of tokens increases within a cluster of machines and/or across the clusters of geographically distributed data centers needed to meet high application availability, throughput, and latency requirements.
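By way of illustration only, the vault-based approach described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular tokenization product: the names TokenVault, tokenize, detokenize, luhn_check_digit, and luhn_valid are hypothetical, the vault is a pair of in-memory dictionaries rather than a secure database, and the encryption of the stored plaintext is omitted for brevity.

```python
import secrets


def luhn_check_digit(payload):
    """Return the Luhn (mod 10) check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the digit adjacent to the check digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number):
    """True if the last digit of `number` is the Luhn check digit of the rest."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


class TokenVault:
    """Toy in-memory stand-in for a secure token database (hypothetical).

    Assumption: a real vault would store the plaintext encrypted and would
    persist the mappings; both are left out of this sketch.
    """

    def __init__(self):
        self._token_by_plain = {}
        self._plain_by_token = {}

    def tokenize(self, pan):
        # Reuse the existing token so one plaintext always maps to one token.
        if pan in self._token_by_plain:
            return self._token_by_plain[pan]
        while True:
            # 15 random digits plus a Luhn check digit yields a 16-digit token.
            payload = "".join(secrets.choice("0123456789") for _ in range(15))
            token = payload + luhn_check_digit(payload)
            # Reject the rare token that equals the input or is already in use.
            if token != pan and token not in self._plain_by_token:
                break
        self._token_by_plain[pan] = token
        self._plain_by_token[token] = pan
        return token

    def detokenize(self, token):
        # Reverse mapping used during the decoding phase.
        return self._plain_by_token[token]

    def __len__(self):
        return len(self._plain_by_token)


vault = TokenVault()
token = vault.tokenize("4111111111111111")  # well-known test card number
original = vault.detokenize(token)
```

Each previously unseen plaintext adds an entry to both mappings, so the vault grows with the number of distinct values tokenized. In a deployment spanning several machines or data centers, those entries would have to be replicated and kept consistent, which is the source of the challenges noted above.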