A variety of automated software applications and services need to recognize information contained within a stream of characters or a string. For example, password analysis programs may want to prohibit commonly used words from appearing in user-defined passwords. Word processors may want to detect when a user has forgotten to put a space between words, so as to assist the user or automatically correct the situation. Even automated services that take command inputs delineated by special characters may want to automatically detect when a user has omitted a delimiter in a command input string.
In the password analysis scenario, these programs often do little more than attempt to match the password against words in a dictionary, perhaps also considering a few acceptable mutations, such as replacing an “a” character with an “@” character. But a user can add a few characters at the front, back, or even in the middle of a password and essentially render the analysis program useless. Still further, by simply concatenating two words together in a password, the user can defeat exactly what the password analysis programs are designed to prevent.
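The weakness described above can be illustrated with a minimal sketch of such a dictionary check. The dictionary and substitution table here are illustrative assumptions, not taken from any particular product:

```python
# Illustrative stand-ins for a real dictionary and mutation table.
LEET_SUBSTITUTIONS = {"@": "a", "0": "o", "1": "l", "3": "e", "$": "s"}
DICTIONARY = {"password", "dragon", "monkey"}

def normalize(candidate: str) -> str:
    """Undo a few common character substitutions, e.g. '@' -> 'a'."""
    return "".join(LEET_SUBSTITUTIONS.get(ch, ch) for ch in candidate.lower())

def is_dictionary_word(candidate: str) -> bool:
    """Reject the password only if, after normalization, it exactly
    matches a dictionary word -- the naive check described above."""
    return normalize(candidate) in DICTIONARY

# A simple mutation is caught:
assert is_dictionary_word("p@ssw0rd")
# But padding or concatenating words slips past the exact-match test:
assert not is_dictionary_word("xp@ssw0rdx")
assert not is_dictionary_word("dragonmonkey")
```

Because the comparison is an exact match after normalization, any padding or concatenation moves the candidate outside the dictionary, which is precisely the evasion noted above.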
Present techniques used in the industry attempt to accomplish word recognition within strings by breaking a given string into all of its possible substrings; each substring is then compared against words housed in flat databases. However, long strings are extremely problematic and take a heavy toll on processor and memory resources. With conventional approaches, if the length of the string being processed is N, then the number of substrings, and thus database lookups, is on the order of N² (a string of length N has N(N+1)/2 contiguous substrings). It is apparent that for any string of meaningful length this approach is resource prohibitive.
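The conventional approach can be sketched as follows; the word set is an illustrative stand-in for a flat database:

```python
# Illustrative stand-in for a flat word database.
WORDS = {"pass", "word", "dragon"}

def find_words(s: str, min_len: int = 3) -> list[tuple[int, str]]:
    """Return (start_index, word) pairs for every dictionary hit.

    A string of length N has N*(N+1)/2 contiguous substrings, so this
    performs on the order of N^2 database lookups -- the resource cost
    noted above, which grows quickly for long strings.
    """
    hits = []
    n = len(s)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            if s[i:j] in WORDS:
                hits.append((i, s[i:j]))
    return hits

assert find_words("xxpasswordxx") == [(2, "pass"), (6, "word")]
```

Each of the two nested loops runs up to N times, so even with a constant-time lookup structure the enumeration itself is quadratic in the input length.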
Thus, what is needed is an improved mechanism for recognizing patterns within a string.