Intelligent automated assistants (or digital assistants) can provide a beneficial interface between human users and electronic devices. Such assistants can allow users to interact with devices or systems using natural language in spoken and/or text forms. For example, a user can provide a speech input containing a user request to a digital assistant operating on an electronic device. The digital assistant can interpret the user's intent from the speech input and operationalize the user's intent into tasks. The tasks can then be performed by executing one or more services of the electronic device, and a relevant output responsive to the user request can be returned to the user.
Digital assistants can utilize various statistical systems for processing and responding to user requests. For example, digital assistants can utilize speech recognition systems, machine translation systems, natural language understanding systems, and speech synthesis systems. The accuracy and robustness of these statistical systems can be enhanced through personalization of the systems. In particular, the underlying statistical models utilized by the statistical systems can be tailored towards a specific user by training the statistical models with user data. For example, text input received from a user can be used to generate a personalized language model for a speech recognition system. This can enable the speech recognition system to better recognize unique words or phrases (e.g., specific names or locations) that may be less common in general speech, but frequently used by the user.
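As a purely illustrative sketch of the personalization described above, the following interpolates a general unigram language model with word frequencies estimated from a user's text input. The function names, the interpolation weight, and the toy vocabulary are hypothetical and are not taken from any particular speech recognition system; a practical system would likely use a far richer model.

```python
from collections import Counter


def unigram_probs(tokens):
    """Estimate unigram probabilities from a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def personalize(general_model, user_tokens, weight=0.3):
    """Linearly interpolate a general model with a user-specific model.

    `weight` (an assumed value) is the share of probability mass
    assigned to the user's own word frequencies.
    """
    user_model = unigram_probs(user_tokens)
    vocabulary = set(general_model) | set(user_model)
    return {
        word: (1 - weight) * general_model.get(word, 0.0)
        + weight * user_model.get(word, 0.0)
        for word in vocabulary
    }


# Toy general model in which an uncommon name has low probability.
general_model = {"call": 0.5, "john": 0.4, "siobhan": 0.1}

# Hypothetical text input previously received from the user.
user_text = "call siobhan text siobhan".split()

personal_model = personalize(general_model, user_text)
```

In this sketch, a name that is rare in general speech but frequent in the user's text receives a higher probability in the personalized model than in the general model, which is the effect the personalization is intended to achieve.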
Personalizing statistical systems with user data can, however, raise privacy concerns. For example, users may not want their personal attributes or characteristics reflected in the personalized statistical models to be shared with a third party. One solution for preserving the user's privacy can be to embed the personalized statistical systems on the user's device. In particular, the personalized statistical models can be generated and stored on the user's device. Further, the personalized results obtained from the personalized statistical models can remain on the user's device. Third-party access to the user's personal data can thus be restricted, which can preserve the user's privacy. However, such restricted access can make it difficult to evaluate embedded personalized statistical systems. For example, it can be difficult to tune the underlying models and algorithms of the embedded personalized statistical systems for optimal performance when access to the embedded personalized statistical systems is restricted.