Many security software providers attempt to combat the growing tide of malicious software with machine learning models trained to detect malware. Conventional malware detection models may be optimized strictly for detection efficacy, which tends to result in very large models. Large models may be difficult to ship because of bandwidth restrictions, causing significant delays in delivery. Such delays may be especially pronounced for enterprise clients. Additionally, the computation time and cost of running a malware detection model are often a function of the model's size. As such, smaller models may yield better performance on client machines.
Some malware detection models, such as neural networks, may be amenable to compression methods that reduce the size of a model after the model has been created and trained. However, such compression methods may not be available for malware detection models, such as tree-based classifiers, that may lack the weights, vectors, and matrices exploited by traditional compression methods. Another method may reduce the size of a malware detection model after the model's hyperparameters have been selected by modifying the model's size-determinant hyperparameters while holding all other hyperparameters constant. However, modifying size-determinant hyperparameters while holding constant the hyperparameters that do not influence model size may lead to results that are suboptimal given the model's constraints.
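By way of illustration only, the following minimal Python sketch contrasts the two approaches discussed above: shrinking only a size-determinant hyperparameter after tuning, versus searching all hyperparameters jointly within a size budget. Every name and value in it is hypothetical. The hyperparameters `n_trees` and `learning_rate`, the budget `MAX_TREES`, and the function `toy_score` are assumptions chosen for the example, and `toy_score` is a synthetic formula rather than a measurement of any real malware detection model.

```python
from itertools import product

# Hypothetical search space: n_trees determines model size, learning_rate does not.
N_TREES = [50, 100, 200]
LEARNING_RATES = [0.1, 0.2, 0.4]
MAX_TREES = 50  # assumed size budget, expressed as a tree count


def toy_score(n_trees, learning_rate):
    """Synthetic quality score (not measured data) in which the best
    learning rate depends on the number of trees."""
    mismatch = (n_trees * learning_rate - 20.0) / 20.0
    return 1.0 - mismatch ** 2 - 5.0 / n_trees


def best(candidates):
    """Return the (n_trees, learning_rate) pair with the highest toy score."""
    return max(candidates, key=lambda c: toy_score(*c))


grid = list(product(N_TREES, LEARNING_RATES))

# Step 1: tune for quality alone, ignoring size.
unconstrained = best(grid)

# Approach A: shrink only the size-determinant hyperparameter, keep the rest.
shrink_only = (MAX_TREES, unconstrained[1])

# Approach B: search all hyperparameters jointly within the size budget.
joint = best([c for c in grid if c[0] <= MAX_TREES])

for name, cfg in [("unconstrained", unconstrained),
                  ("shrink-only", shrink_only),
                  ("joint", joint)]:
    print(f"{name:>13}: n_trees={cfg[0]:>3}, lr={cfg[1]}, score={toy_score(*cfg):.4f}")
```

In this toy setting, the shrink-only configuration keeps a learning rate that was selected for a much larger model and scores well below the jointly searched configuration of the same size. Real models may behave differently; the sketch shows only why holding non-size hyperparameters constant can be suboptimal under a size constraint.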
Accordingly, the instant disclosure identifies and addresses a need for improved systems and methods for malware remediation that use malware detection models that balance preferences for size and efficacy.