Conventional approaches to anomaly detection may use validation data in two forms. First, researchers have used real attack data to attempt to validate their approaches. This may be the best approach when such data is available, but network operators typically have little or no real attack data for their networks, because real attacks tend to be very rare. Even when such data is available, it is fixed: researchers cannot vary the parameters of the attacks, and so cannot test their methods on a variety of attacks. As such, researchers can usually only make substantiated claims about the specific attacks for which they have data.
Second, researchers have simulated both non-attack (nominal) data and attack data. Many of these simulations are generated from models whose parameters are estimated from real networks. This allows flexibility in the distributions of both nominal and attack data, so researchers can test methods against a variety of scenarios. This method, however, is limited by the assumptions of the models generating the data. In short, researchers are not using real data, and many differences exist between real data and simulated data. Accordingly, an improved method to simulate attacks may be beneficial.
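The fully simulated form of validation described above can be illustrated with a minimal sketch. It is not taken from any particular prior work. It assumes a Gaussian model of per-interval traffic counts and an attack modeled as an additive shift over a window. The function names (`fit_nominal_model`, `simulate_traffic`), the parameters, and the sample values in `observed` are all hypothetical placeholders.

```python
import random
import statistics


def fit_nominal_model(observed_counts):
    """Estimate the mean and standard deviation of observed per-interval counts."""
    return statistics.mean(observed_counts), statistics.stdev(observed_counts)


def simulate_traffic(mean, std, n_intervals, attack_start, attack_length,
                     attack_scale, seed=0):
    """Draw nominal counts from a Gaussian model, then add a simulated attack.

    Returns (counts, labels), where labels[i] is True inside the attack window.
    The Gaussian model and the additive attack are assumptions of this sketch.
    """
    rng = random.Random(seed)
    counts, labels = [], []
    for i in range(n_intervals):
        value = max(0.0, rng.gauss(mean, std))  # counts cannot be negative
        in_attack = attack_start <= i < attack_start + attack_length
        if in_attack:
            value += attack_scale * std  # shift expressed in units of std
        counts.append(value)
        labels.append(in_attack)
    return counts, labels


# Hypothetical placeholder values standing in for counts measured on a real network.
observed = [98, 105, 101, 97, 110, 102, 99, 104]
mean, std = fit_nominal_model(observed)
counts, labels = simulate_traffic(mean, std, n_intervals=50,
                                  attack_start=30, attack_length=5,
                                  attack_scale=4.0)
```

Varying `attack_start`, `attack_length`, and `attack_scale` gives the flexibility noted above, but every simulated sample inherits the Gaussian assumption, which is the limitation the paragraph describes.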