User computing devices such as, for example, smart phones, tablets, and/or other mobile computing devices continue to become increasingly: (a) ubiquitous; (b) computationally powerful; (c) endowed with significant local storage; and (d) privy to potentially sensitive data about users, their actions, and their environments. In addition, applications delivered on mobile devices are also increasingly data-driven. For example, in many scenarios, data that is collected from user computing devices is used to train and evaluate new machine-learned models, personalize features, and compute metrics to assess product quality.
Many of these tasks have traditionally been performed centrally (e.g., by a server computing device). In particular, in some scenarios, data can be uploaded from user computing devices to the server computing device. The server computing device can train various machine-learned models on the centrally collected data and then evaluate the trained models. The trained models can be used by the server computing device or can be downloaded to user computing devices for use at the user computing device. In addition, in some scenarios, personalizable features can be delivered from the server computing device. Likewise, the server computing device can compute metrics across users on centrally logged data for quality assessment.
However, frequently it is not known exactly how data will be useful in the future and, particularly, which data will be useful. Thus, without a sufficient history of logged data, certain machine-learned models or other data-driven applications may not be realizable. Stated differently, if a certain set or type of data that is needed to train a model, personalize a feature, or compute a metric of interest was not logged, then even after determining that a certain kind of data is useful and should be logged, there is still a significant wait time until enough data has been generated to enable such training, personalization, or computation.
One possible response to this problem would be to log any and all data centrally. However, this response comes with its own drawbacks. In particular, users use their mobile devices for all manner of privacy-sensitive activities. Mobile devices are also increasingly sensor-rich, which can give these devices access to further privacy-sensitive data streams from their surroundings. Thus, privacy considerations suggest that logging should be done prudently, rather than wholesale, to minimize the privacy risks to the user.
Beyond privacy, the data streams these devices can produce are becoming increasingly high bandwidth. Thus, in many cases it is simply infeasible to stream any and all user data to a centralized database, even if doing so were desirable.