Today, virtually every municipality, agency, educational institution, mass transportation center, financial institution, utility plant, and medical center uses video surveillance to protect property, employees, customers, citizens, and information technology (IT) infrastructure. Likewise, visual monitoring systems in general are increasingly prevalent, tracking behaviors or processes in settings such as nuclear power plants, infant intensive care, and remote tele-operation environments. However, the performance and effectiveness of video surveillance systems may vary widely depending on a number of factors. Current methods for evaluating surveillance systems typically focus on the performance of individual alert algorithms, measuring, for example, their precision and recall. Measuring individual alert algorithms does not provide adequate insight into the effect of these alerts on human detection, recognition, and performance. One prior art alternative to pure algorithm measurement is to measure the effectiveness of the whole system, including the human monitor, using a panel of human judges. Even with a panel of judges, however, no methodology exists for developing a set of conditions that would serve to test the system.