Computer learning models can process large volumes of user-item interaction data (such as data reflective of user interactions with an electronic catalog of items) to provide relevant recommendations for users. For example, a model may be implemented as an artificial neural network. Artificial neural networks are artificial in the sense that they are computational entities, analogous to biological neural networks in animals, but implemented by computing devices. A neural network typically comprises an input layer, one or more hidden layers, and an output layer. The nodes in each layer connect to nodes in the subsequent layer, and the strengths of these interconnections are typically learned from data during the training process. In recommendation systems, such as systems designed to recommend items (e.g., goods and/or services) to users based on the purchase or acquisition histories of the users, neural network models may generate probability scores indicating the probability that a user will purchase or otherwise acquire each item during a time period.
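The layered structure described above can be illustrated with a minimal sketch. It assumes a single hidden layer, a binary vector marking which items a user has previously acquired, and a sigmoid output per item so that each score falls between 0 and 1. The names `init_params`, `dense`, and `score_items`, the layer sizes, the activation functions, and the random (untrained) weights are all hypothetical choices made for illustration; they are not taken from any particular system.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per node."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_params(n_items, n_hidden, seed=0):
    """Hypothetical helper: small random weights standing in for trained ones."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_hidden": matrix(n_hidden, n_items),  # input layer -> hidden layer
        "b_hidden": [0.0] * n_hidden,
        "w_out": matrix(n_items, n_hidden),     # hidden layer -> output layer
        "b_out": [0.0] * n_items,
    }


def score_items(history, params):
    """Forward pass: acquisition history in, one probability score per item out."""
    hidden = dense(history, params["w_hidden"], params["b_hidden"], relu)
    return dense(hidden, params["w_out"], params["b_out"], sigmoid)


# Assumed toy catalog of 5 items; this user previously acquired items 0 and 3.
params = init_params(n_items=5, n_hidden=3)
history = [1.0, 0.0, 0.0, 1.0, 0.0]
scores = score_items(history, params)
```

Because the weights here are random rather than learned, the resulting scores carry no meaning; the sketch only shows how values flow from the input layer through the hidden layer to per-item output scores.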
The parameters of a neural network can be set in a process referred to as training. For example, a neural network can be trained using training data that includes input data and the correct or preferred output of the model for the corresponding input data. Sets of individual input vectors (“mini-batches”) may be processed at the same time by using an input matrix instead of a single input vector, which may speed up training. The neural network can repeatedly process the input data, and the parameters (e.g., the weight matrices) of the neural network can be modified in what amounts to a trial-and-error process until the model produces (or “converges” on) the correct or preferred output. The modification of weight values may be performed through a process referred to as “back propagation.” Back propagation includes determining the difference between the expected model output and the obtained model output, and then determining how to modify the values of some or all parameters of the model to reduce that difference.
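The training procedure described above can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the text: one hidden layer with tanh activations, a single sigmoid output, a cross-entropy measure of the difference between expected and obtained output, plain gradient descent as the update rule, and a four-example toy data set. The names `train_step`, `train`, and `mean_loss` are hypothetical. For readability, the loop over the examples in each mini-batch stands in for the single matrix operation a practical implementation would use.

```python
import math
import random


def forward(x, p):
    """Return the hidden activations and the output probability for one input."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["w1"], p["b1"])]
    z = sum(w * hj for w, hj in zip(p["w2"], h)) + p["b2"]
    return h, 1.0 / (1.0 + math.exp(-z))


def mean_loss(data, p):
    """Average cross-entropy between expected and obtained outputs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        _, out = forward(x, p)
        total -= y * math.log(out + eps) + (1 - y) * math.log(1 - out + eps)
    return total / len(data)


def train_step(batch, p, lr):
    """Back propagation over one mini-batch, then one parameter update."""
    n_hidden, n_in = len(p["w1"]), len(p["w1"][0])
    g_w1 = [[0.0] * n_in for _ in range(n_hidden)]
    g_b1 = [0.0] * n_hidden
    g_w2 = [0.0] * n_hidden
    g_b2 = 0.0
    for x, y in batch:
        h, out = forward(x, p)
        d_out = out - y  # difference between obtained and expected output
        g_b2 += d_out
        for j in range(n_hidden):
            g_w2[j] += d_out * h[j]
            # Propagate the error back through the tanh hidden node.
            d_h = d_out * p["w2"][j] * (1.0 - h[j] ** 2)
            g_b1[j] += d_h
            for i in range(n_in):
                g_w1[j][i] += d_h * x[i]
    scale = lr / len(batch)  # average the gradients over the mini-batch
    p["b2"] -= scale * g_b2
    for j in range(n_hidden):
        p["w2"][j] -= scale * g_w2[j]
        p["b1"][j] -= scale * g_b1[j]
        for i in range(n_in):
            p["w1"][j][i] -= scale * g_w1[j][i]


def train(data, n_hidden=3, batch_size=2, epochs=300, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    p = {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }
    initial = mean_loss(data, p)
    for _ in range(epochs):  # repeatedly process the training data
        order = data[:]
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            train_step(order[start:start + batch_size], p, lr)
    return p, initial, mean_loss(data, p)


# Assumed toy training data: (input vector, expected output) pairs.
toy_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
params, loss_before, loss_after = train(toy_data)
```

Comparing `loss_before` with `loss_after` gives a rough indication of whether the repeated updates are moving the model toward the expected outputs. The learning rate, batch size, and epoch count are arbitrary values chosen for the toy data and would need tuning for any real data set.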