The popularity of the Internet and the availability of nearly unlimited data storage capacity have caused large amounts of data to be generated. This data may contain much valuable knowledge and information, provided that it can be located, for example, by computer-implemented statistical and data mining techniques that locate and categorize unknown data files.