Next-generation sequencing (NGS) technology provides a powerful tool for producing vast amounts of biological data that will shed light on the path toward personalized medicine. While the cost of merely acquiring sequence data through high-throughput genome sequencing is decreasing, the analysis and interpretation of these large-scale sequencing data continue to pose a major challenge. To call variants from NGS data, many aligners and variant callers have been developed and combined into diverse pipelines. A typical pipeline contains an aligner and a variant caller: the former maps the sequencing reads to a reference genome, and the latter identifies variant sites and assigns a genotype to each subject. In running the pipeline, users often need to set many parameters in order to analyze the sequencing data properly. Importantly, some parameters need to be optimized to call variants accurately, e.g., on the basis of the cell type or the ethnic group from which the sample was obtained. However, because each run of the pipeline requires enormous computation, running the entire variant calling pipeline to test each parameter setting is practically infeasible. Therefore, there is a continuing need to develop new methods and systems to optimize parameter settings for analyzing NGS data.
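The scale of the problem can be illustrated with a rough calculation. The following is a minimal sketch, not a description of any particular pipeline: the parameter names, the candidate values, and the assumed cost of 24 hours per full run are all hypothetical placeholders chosen only to show how quickly an exhaustive search grows.

```python
from itertools import product

# Hypothetical parameter grid. The names and candidate values are
# illustrative assumptions, not options of any real aligner or caller.
param_grid = {
    "aligner_seed_length": [15, 19, 23, 27],
    "aligner_mismatch_penalty": [2, 4, 6],
    "caller_min_base_quality": [10, 20, 30],
    "caller_min_mapping_quality": [10, 20, 30, 40],
    "caller_min_confidence": [10.0, 30.0, 50.0],
}

# Assumed wall-clock cost of one full alignment + variant calling run.
ASSUMED_HOURS_PER_RUN = 24.0


def enumerate_settings(grid):
    """Return every combination of parameter values as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


def exhaustive_cost_hours(grid, hours_per_run):
    """Total compute time if the whole pipeline is rerun for every setting."""
    return len(enumerate_settings(grid)) * hours_per_run


settings = enumerate_settings(param_grid)
total_hours = exhaustive_cost_hours(param_grid, ASSUMED_HOURS_PER_RUN)
print(f"{len(settings)} settings, about {total_hours / 24:.0f} days of compute")
```

Under these assumptions, five parameters with only three or four candidate values each already yield 432 full pipeline runs. Adding parameters or candidate values multiplies that count, which is why testing every setting with a complete run does not scale.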