This specification relates to training machine learning models.
Machine learning models can be trained using a stochastic gradient descent procedure. In stochastic gradient descent, a machine learning model training system operates iteratively to determine values of the model parameters by finding a minimum of an objective function of parameters of the model.