Past solutions have failed to provide a mechanism for adequately evaluating the usability and functionality of computer applications prior to deployment in a live setting. Accordingly, applications are frequently released with functional and design flaws. Prior attempts to address this problem have included, for example, individual reviews. However, an individual's review of an application fails to account for contextual factors that influence the review. Additionally, an individual review commonly lacks details regarding the reviewer's perspective, such as culturally influenced interpretations, and leaves other ambiguities unresolved. Further, individual reviews have not been scalable, in that each captures only one individual's opinion regarding the quality of the application. Other attempts have included usability testing, which is costly and time-consuming, and usage analytics, which typically provide only passively tracked information.