Behavioral targeting is intrinsically a large-scale machine learning problem, for the following reasons. First, fitting a BT predictive model with low generalization error and a desired level of statistical confidence requires vast amounts of data; experiments have shown that prediction accuracy increases monotonically with data size, up to the entire user data of Yahoo!®. Second, the dimensionality of the feature space for a BT model is very high, ranging, for example, from several hundred thousand to several million. Third, the number of BT models to be built is large. For a company like Yahoo!®, there may be over 450 BT-category models for browser and login cookies (a.k.a. b-cookie and l-cookie, respectively) that need to be trained on a regular basis. Furthermore, training BT models has to be very efficient, because (1) user interests and behavioral patterns change over time and (2) cookies and features (e.g., ads and pages) are volatile objects. Fourth, scientific experimentation and technical breakthroughs in BT require a scalable and flexible platform to enable a high speed of innovation.