Next-generation sequencing enables researchers to obtain large amounts of data at a reduced cost and thus provides a tremendous opportunity to genotype an individual of any species in depth (Lai et al. “Genome-wide patterns of genetic variation among elite maize inbred lines” 2010, Nat Genet 42: 1027-1030). Recently, several genotyping-by-sequencing (GBS) approaches were developed to genotype hundreds of individuals simultaneously (Andolfatto et al. “Multiplexed shotgun genotyping for rapid and efficient genetic mapping” 2011, Genome Res 21: 610-617; Baird et al. “Rapid SNP discovery and genetic mapping using sequenced RAD markers” 2008, PLoS One 3: e3376; Elshire et al. “A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species” 2011, PLoS One 6: e19379).
Conventional genotyping is most often conducted using pre-defined SNP markers that must be discovered and validated in advance; these markers are often population-specific. These SNPs are typically detected via hybridization or by individual SNP-specific PCR-based assays. In contrast, GBS technology enables the detection of a wider range of polymorphisms than PCR-based assays (e.g., SNPs plus small insertions and/or deletions, i.e., “indels”) and eliminates the need to pre-discover and validate polymorphisms. Hence, GBS can be used in any polymorphic species and any segregating population.
However, conventional GBS methods share at least two drawbacks. First, conventional methods use double-stranded adaptors and consequently require stringent control of the template:adaptor concentration ratio during adaptor ligation. As a result, precisely quantified, high-quality input DNA is required as a starting material (see, e.g., Elshire et al.). Second, these methods survey hundreds of thousands or more sites and thus require numerous sequencing reads to generate enough coverage for each site in each sample.
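The scale of the second drawback can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from any of the cited methods; it assumes, for simplicity, that each read covers exactly one site and that reads are distributed uniformly across sites and samples. The function name and all numeric values (site counts, a depth of 10 reads per site, 96 samples) are hypothetical and chosen only for illustration.

```python
import math


def reads_required(num_sites: int, depth_per_site: float, num_samples: int) -> int:
    """Estimate the total reads needed for a target depth at every site.

    Simplifying assumptions (illustrative only): one read covers one site,
    and reads are spread evenly over all sites and all samples.
    """
    return math.ceil(num_sites * depth_per_site * num_samples)


# Hypothetical comparison: many surveyed sites versus a reduced set of sites,
# at the same target depth and the same number of multiplexed samples.
many_sites = reads_required(num_sites=200_000, depth_per_site=10, num_samples=96)
few_sites = reads_required(num_sites=5_000, depth_per_site=10, num_samples=96)

print(f"200,000 sites: {many_sites:,} reads")
print(f"  5,000 sites: {few_sites:,} reads")
```

Under these assumptions the read requirement scales linearly with the number of surveyed sites, so a method that surveys fewer sites needs proportionally fewer reads for the same per-site coverage. Real libraries have uneven coverage, so actual requirements would typically be higher than this estimate.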