It is increasingly important to verify a file that is installed on a computing system, especially an executable or a resource used by an executable, before the file is executed or loaded into memory. For example, verification prevents malware such as viruses from being executed, and also prevents the running of applications that an IT administrator does not want run. Verification ensures that only programs allowed to run on a machine are executed, thereby making the platform much more secure.
One way of achieving verification is file hashing, in which the contents of the file are hashed using a secure hash function, either by the developer of the software or by an operating system component at install time. The hash is stored in a tamperproof location. When the file is executed or loaded, its contents are hashed again and the result is compared to the stored hash for that file (or to a set of hashes for allowed files). If the computed hash value does not match an allowed hash, the file either has changed in some way (e.g., was tampered with) or is one that is not allowed to run, and loading into memory, and thus execution, can be prevented. Note that another method of verification is to require that code be signed, e.g., by X, Y or Z. In that case, hashing (producing H) is one step of the verification process, and a cryptographic signature (the private key of X applied to H) attests that X signed that code.
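The hash-and-compare flow described above can be sketched as follows. This is a minimal illustration only, assuming SHA-256 as the secure hash and an in-memory set standing in for the tamperproof store; the names `hash_file`, `is_allowed`, and `demo` are hypothetical and do not come from any particular operating system component.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file's full contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not held in memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_allowed(path, allowed_hashes):
    """Re-hash the file at load time and compare against the allowed set."""
    return hash_file(path) in allowed_hashes


def demo():
    """Record a hash at 'install time', then check before and after tampering."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.bin")
        with open(path, "wb") as f:
            f.write(b"original program bytes")

        # Assumption: this set stands in for the tamperproof hash store.
        allowed = {hash_file(path)}
        before = is_allowed(path, allowed)

        with open(path, "ab") as f:
            f.write(b"!")  # simulate tampering by appending one byte
        after = is_allowed(path, allowed)
        return before, after
```

In this sketch, `demo()` reports the unmodified file as allowed and the modified file as not allowed, since changing even one byte yields a different digest. The signature-based method mentioned above is not shown.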
Heavy use of file hashing (such as for verification when hash rules are used) can have a significant negative impact on system performance that is noticeable to users. This is because every executable that is loaded, including dynamic-link libraries (DLLs), needs to be hashed, and the computed hash needs to be looked up in a list of allowed hashes.
Computing the hash is costly in several ways. Most if not all of the file needs to be read from the disk in order to be hashed. Further, the content needs to be loaded into memory, which adds to memory paging pressure. Once the content is loaded, hashing it is also computationally intensive. Furthermore, the hashing operation can occur repeatedly for a given binary, since that binary may be loaded in many processes (e.g., ntdll.dll, kernel32.dll, advapi32.dll, user32.dll, shell32.dll in the Microsoft® Windows® operating system).
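The repeated cost can be made concrete with a rough sketch. The code below is illustrative only: it assumes SHA-256, uses a randomly filled temporary file as a stand-in for a shared library, and models each process load as one full re-hash. The names `hash_file_counting`, `simulate_loads`, and `demo_cost` are hypothetical.

```python
import hashlib
import os
import tempfile


def hash_file_counting(path, chunk_size=65536):
    """Hash a file and also report how many bytes had to be read."""
    digest = hashlib.sha256()
    bytes_read = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            bytes_read += len(chunk)
            digest.update(chunk)
    return digest.hexdigest(), bytes_read


def simulate_loads(path, process_count):
    """Model each process load of the same binary as a full re-hash."""
    total_bytes = 0
    digests = set()
    for _ in range(process_count):
        file_digest, bytes_read = hash_file_counting(path)
        digests.add(file_digest)
        total_bytes += bytes_read
    return digests, total_bytes


def demo_cost():
    """Hash one 200,000-byte stand-in library as if loaded by 5 processes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.dll")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(os.urandom(200_000))
        size = os.path.getsize(path)
        digests, total_bytes = simulate_loads(path, 5)
        return size, len(digests), total_bytes
```

Here every load reads the entire file, so five loads of a 200,000-byte file read 1,000,000 bytes in total while producing the same single digest each time. The sketch counts only bytes read; it does not measure paging pressure or CPU time.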