The popularity of the Internet and the availability of nearly unlimited data storage capacity have led to the generation of large amounts of data. Within this data, much valuable knowledge and information may be available if it can be located, for example, by computer-implemented file classification techniques that categorize unknown data files.