Software architects often continue to improve software after it has been deployed. The improvements may be implemented by modifying the deployed software system or by creating a new software system (e.g., a replacement system), where the modified or new software system is intended to replace or operate alongside the deployed (current) software system. Deployment of the modified or new software system may affect the hardware that supports the software system (e.g., by requiring more or less processing power and/or time), may affect outcomes resulting from user interaction (e.g., by satisfying, annoying, or frustrating users), or may have other consequences (e.g., the introduction of bugs). Therefore, prior to a full deployment of the modified or new software system, it is desirable to perform a comparison test that compares results of executing the modified or new software system against results of executing the deployed software system. However, such comparison tests may detect differences that are unimportant or otherwise not meaningful, for example, random differences.